Simple linear regression models the straight-line relationship between a dependent variable and a single independent variable.
In a multiple linear regression, we may determine the effect of two or more independent variables (also called explanatory variables or regressors) on a particular dependent variable. In other words, we use more than one independent variable to make predictions about the dependent variable.
To compute the predicted value of the dependent variable, we obtain estimates of the regression coefficients (using matrix algebra) and assume values for the independent variables. A slope coefficient gives the estimated change in the dependent variable when its independent variable changes by one unit, holding all other independent variables constant.
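The estimation and prediction steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the calculator's actual implementation: the function names `fit_ols` and `predict` and the demo data are hypothetical, and the data are constructed so that the true coefficients are known.

```python
import numpy as np

def fit_ols(X, y):
    """Estimate [b0, b1, ..., bk] by ordinary least squares.

    X: (n, k) array of independent variables; y: (n,) dependent variable.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept b0.
    design = np.column_stack([np.ones(len(X)), X])
    # lstsq solves the same least-squares problem as the normal equations
    # (X'X) b = X'y, without explicitly inverting X'X.
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, x_values):
    """Predicted Y for one set of assumed independent-variable values."""
    return coef[0] + np.dot(coef[1:], x_values)

# Hypothetical data constructed so that y = 1 + 2*x1 - 3*x2 exactly.
X_demo = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 5.0]])
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 1]

coef_demo = fit_ols(X_demo, y_demo)
# Assumed values x1 = 2, x2 = 2 give 1 + 2*2 - 3*2 = -1.
y_hat_demo = predict(coef_demo, [2.0, 2.0])
```

Here the slope on x1 is 2: raising x1 by one unit while holding x2 fixed raises the predicted Y by 2.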
We may also test hypotheses about the relationship between these variables, in addition to quantifying the strength of that relationship.
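One common hypothesis test asks whether an individual coefficient differs from zero. The sketch below computes coefficient standard errors and t-statistics under the classical assumptions; the function name `coefficient_t_stats` and the simulated data are hypothetical and serve only to illustrate the calculation.

```python
import numpy as np

def coefficient_t_stats(X, y):
    """Return (coefficients, standard errors, t-statistics) for H0: b_j = 0."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])
    n, p = design.shape  # p = k + 1 estimated coefficients
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    # Estimate of the error variance: SSE / (n - k - 1).
    s2 = residuals @ residuals / (n - p)
    # Estimated covariance matrix of the coefficients: s^2 (X'X)^{-1}.
    cov = s2 * np.linalg.inv(design.T @ design)
    std_err = np.sqrt(np.diag(cov))
    return coef, std_err, coef / std_err

# Simulated data (not real returns): x1 has a true effect, x2 has none.
rng = np.random.default_rng(0)
X_sim = rng.normal(size=(60, 2))
y_sim = 0.5 + 1.2 * X_sim[:, 0] + rng.normal(scale=0.5, size=60)

coef_sim, se_sim, t_sim = coefficient_t_stats(X_sim, y_sim)
```

Each t-statistic would then be compared with a critical value from the t-distribution with n - k - 1 degrees of freedom.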
We may run regression using two primary types of data: time series and cross-sectional.
Cross-sectional data involve many observations (relating to different asset classes, companies, people, countries or other entities) on X and Y for the same time period.
Time-series data, by contrast, involve many observations from different time periods for the same asset class, company, person, country or other entity. A mix of time-series and cross-sectional data is known as panel data.
Equation: $Y_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \dots + b_k x_{ki} + \epsilon_i, \quad i = 1, 2, \dots, n.$
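For reference, the same model is often written in matrix form, and the least-squares estimate of the coefficients has a standard closed-form expression. The bold symbols below are notation introduced here for illustration: X is assumed to be the n × (k + 1) matrix whose first column is all ones and whose remaining columns hold the independent variables.

```latex
\mathbf{Y} = \mathbf{X}\mathbf{b} + \boldsymbol{\epsilon},
\qquad
\hat{\mathbf{b}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}
```

The inverse exists only when no column of X is an exact linear combination of the others.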
1- The relationship between the dependent variable Y and the independent variables $x_1, x_2, \dots, x_k$ is linear
2- The independent variables $x_1, x_2, \dots, x_k$ are not random. Further, no exact linear relation exists between two or more of the independent variables
3- The expected value of the error term is zero
4- The variance of the error term is the same for all observations
5- The error term is uncorrelated across observations
6- The error term is normally distributed
If these assumptions are violated, the estimated regression coefficients ($\hat{b}_0, \hat{b}_1, \dots, \hat{b}_k$) may be biased and inconsistent, and inferences based on them may be unreliable.
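The requirement that no exact linear relation exists among the independent variables can be checked before estimation. A minimal sketch, assuming the data are held in a NumPy array; the helper name `has_exact_linear_relation` and the example vectors are hypothetical.

```python
import numpy as np

def has_exact_linear_relation(X):
    """True if the regressors (together with the intercept) are perfectly collinear."""
    X = np.asarray(X, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])
    # A rank below the number of columns means at least one column is an
    # exact linear combination of the others.
    return bool(np.linalg.matrix_rank(design) < design.shape[1])

# Hypothetical regressors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

ok = has_exact_linear_relation(np.column_stack([x1, x2]))
# Adding x3 = 2*x1 + x2 introduces an exact linear relation.
bad = has_exact_linear_relation(np.column_stack([x1, x2, 2 * x1 + x2]))
```

This check only detects exact linear relations; strongly but not perfectly correlated regressors would pass it.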
Suppose the Fidelity Select Technology Fund (FSPTX) is under consideration for one of our investment strategies. FSPTX, a US mutual fund, is an actively managed portfolio specializing in domestic technology stocks.
We want to know whether the FSPTX behaves more like a large-cap growth fund or a large-cap value fund. Such return-based style analysis is one of the most frequent applications of regression analysis in the investment field.
Using monthly data from January 2012 through December 2016 (shown as default values in our calculator), we include the S&P 500 Growth Index (SGX) and the S&P 500 Value Index (SVX) as independent variables in a regression.
The equation to be estimated is:
$Y_t = b_0 + b_1 x_{1t} + b_2 x_{2t} + \epsilon_t$
$Y_t$ = the monthly return to the FSPTX.
$x_{1t}$ = the monthly return to the S&P 500 Growth Index.
$x_{2t}$ = the monthly return to the S&P 500 Value Index.
Create a regression model to determine how much of the variation in FSPTX returns is explained by the S&P 500 Growth and Value indices between 2012 and 2016.
Also predict the return to the FSPTX if the return to the SGX is assumed to be 1% and the return to the SVX is assumed to be -1% in a given month.
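The two tasks above, measuring explained variation (R-squared) and predicting a return for assumed index returns, can be sketched as follows. The returns in this example are simulated placeholders, not the actual FSPTX, SGX or SVX data, so the numbers it produces say nothing about the real fund; substitute the actual monthly returns to reproduce the analysis.

```python
import numpy as np

rng = np.random.default_rng(42)
n_months = 60  # January 2012 through December 2016

# SIMULATED monthly returns in percent; placeholders, not actual market data.
sgx = rng.normal(1.0, 3.0, n_months)
svx = rng.normal(0.9, 3.0, n_months)
fsptx = 0.1 + 1.1 * sgx + 0.1 * svx + rng.normal(0.0, 1.5, n_months)

# Estimate Y_t = b0 + b1*x_1t + b2*x_2t + error by least squares.
design = np.column_stack([np.ones(n_months), sgx, svx])
coef, *_ = np.linalg.lstsq(design, fsptx, rcond=None)
b0, b1, b2 = coef

# R-squared: share of the variation in the fund's returns explained by the model.
residuals = fsptx - design @ coef
sse = residuals @ residuals
sst = np.sum((fsptx - fsptx.mean()) ** 2)
r_squared = 1.0 - sse / sst

# Predicted fund return for assumed SGX = 1% and SVX = -1%.
predicted = b0 + b1 * 1.0 + b2 * (-1.0)
```

A slope on the growth index that is much larger than the slope on the value index would suggest that the fund behaves more like a large-cap growth fund.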
Assume X Variables