Simple linear regression models the straight-line relationship between a dependent variable and a single independent variable.

In multiple linear regression, we estimate the effect of two or more independent variables (also called explanatory variables or regressors) on a particular dependent variable. In other words, we use more than one independent variable to make predictions about the dependent variable.

To compute the predicted value of the dependent variable, we obtain estimates of the regression coefficients (using matrix algebra) and assume values of the independent variables. A slope coefficient gives the estimated change in the dependent variable for a one-unit change in the corresponding independent variable, holding all other independent variables constant.
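As a minimal sketch of this step, the coefficient estimates can be obtained with ordinary least squares in numpy; the data below are hypothetical, generated from an assumed relationship Y ≈ 1 + 2x₁ + 0.5x₂ plus small noise:

```python
import numpy as np

# Hypothetical data: n = 6 observations, k = 2 independent variables.
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 6.0],
    [6.0, 5.0],
])
y = np.array([4.1, 5.4, 9.2, 10.4, 14.1, 15.4])

# Prepend a column of ones so the intercept b0 is estimated as well.
X1 = np.column_stack([np.ones(len(y)), X])

# Ordinary least squares: the same estimates as the normal equations
# b = (X'X)^(-1) X'y, but computed in a numerically safer way.
b, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Predicted value of Y for assumed values x1 = 3.5 and x2 = 2.5.
y_hat = b @ np.array([1.0, 3.5, 2.5])
print(b)      # estimated [b0, b1, b2]
print(y_hat)  # predicted Y
```

The estimates land close to the coefficients the data were generated from, which is the "holding all other variables constant" interpretation in action.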

We may also test hypotheses about the relationship between these variables, in addition to quantifying the strength of the relationship.
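A common such test is whether a slope coefficient differs from zero. The sketch below, on hypothetical data, computes the t-statistic for H₀: b₁ = 0 from the residual variance and the covariance matrix of the estimates; the data and the assumed relationship are illustrative only:

```python
import numpy as np

# Hypothetical sample: does x1 help explain y once x2 is included?
X = np.column_stack([
    np.ones(8),
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],   # x1
    [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0],   # x2
])
y = np.array([4.1, 5.4, 9.2, 10.4, 14.1, 15.4, 19.0, 20.6])

n, kp1 = X.shape                      # n observations, k + 1 coefficients
b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b
s2 = resid @ resid / (n - kp1)        # residual variance estimate
cov = s2 * np.linalg.inv(X.T @ X)     # covariance matrix of the estimates

# t-statistic for H0: b1 = 0, with n - (k + 1) degrees of freedom.
t_b1 = b[1] / np.sqrt(cov[1, 1])
print(b, t_b1)
```

The t-statistic is then compared with a critical value from the t-distribution with n − (k + 1) degrees of freedom.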

We may run regressions using two primary types of data: time series and cross-sectional.

Cross-sectional data involve many observations (relating to different asset classes, companies, people, countries, or other entities) on X and Y for the same time period.

Time-series data, by contrast, involve many observations from different time periods for the same asset class, company, person, country, or other entity. A mix of time-series and cross-sectional data is known as panel data.

**Equation:** Y_{i} = b_{0} + b_{1}x_{1i} + b_{2}x_{2i} + . . . + b_{k}x_{ki} + ε_{i}, i = 1, 2, . . . , n

Where,

| Symbol | Meaning |
| --- | --- |
| Y_{i} | the i-th observation of the dependent variable Y |
| x_{ji} | the i-th observation of the independent variable x_{j}, j = 1, 2, . . . , k |
| b_{0} | the intercept of the equation |
| b_{1}, . . . , b_{k} | the slope coefficients for each of the independent variables |
| ε_{i} | the error term |
| n | the number of observations |

Assumptions:

1- Linear relationship between Y and the X variables. This assumption states that the parameters (b_{0}, b_{1}, . . . , b_{k}) are raised to the first power only, and that no parameter is multiplied or divided by another parameter. Linear regression is possible as long as the regression is linear in the parameters.

2- The independent variables, x_{1}, x_{2}, . . . , x_{k}, are not random. This assumption is clearly often untrue. For example, we frequently use returns on benchmark stock indices as independent variables to explain the returns on a particular stock, and it is unrealistic to assume that such returns are not random.
Even if the independent variables are random, we can still rely on the regression estimates under the important assumption that the error term is uncorrelated with the independent variables.

Further, two or more independent variables have no exact linear relationship with one another (no perfect multicollinearity).

3- The expected value of the error term is zero.

4- The variance of the error term is the same for all observations (homoskedasticity).

5- The error term is uncorrelated across observations.

6- The error term is normally distributed. For large samples, we may be able to drop the normality assumption by appeal to the central limit theorem, which states that the sum (as well as the mean) of a large number of independent random variables is approximately normally distributed. However, we may also apply a normality test, such as the Anderson-Darling test.
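One way to check the normality assumption on the residuals is a moment-based test. The Anderson-Darling test mentioned above is available in scipy (`scipy.stats.anderson`); the self-contained sketch below instead computes the closely related Jarque-Bera statistic from sample skewness and kurtosis, on simulated residuals standing in for the regression's:

```python
import numpy as np

rng = np.random.default_rng(0)
resid = rng.normal(0.0, 1.0, size=500)   # stand-in for regression residuals

n = len(resid)
z = resid - resid.mean()
s = np.sqrt((z @ z) / n)
skew = np.mean(z**3) / s**3              # sample skewness
kurt = np.mean(z**4) / s**4              # sample kurtosis

# Jarque-Bera statistic: approximately chi-squared with 2 degrees of
# freedom under the null hypothesis that the residuals are normal.
jb = n / 6.0 * (skew**2 + (kurt - 3.0)**2 / 4.0)
print(jb)
```

Values well below the chi-squared critical value (about 5.99 at the 5% level) are consistent with normality; residuals from a real regression would be substituted for the simulated ones.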

If these assumptions are violated, the estimated regression coefficients (b̂_{0}, b̂_{1}, . . . , b̂_{k}) may be biased and inconsistent.