In order to predict the future behavior of a time series based on causal relationships with other time series, we may fit a linear regression. If a time series appears to have a unit root, it is not covariance stationary. If any time series in the model contains a unit root, the OLS parameter estimates and regression test statistics may be invalid.
How do we model more than one time series in a linear regression?
To answer this question, let us start with two time series, one corresponding to the independent variable and one corresponding to the dependent variable. We then extend the discussion to multiple time series to finish this overview.
We first run a unit root test, such as the Dickey–Fuller test, on each of the two time series to find out whether either of them has a unit root. The possible scenarios arising from the outcomes of these tests are covered in the table below:
| Outcome | Unit Root Test (Independent Variable) | Unit Root Test (Dependent Variable) | Action |
| --- | --- | --- | --- |
| 1 | Reject* | Reject* | We can safely use linear regression to test the relation between the two time series. |
| 2 | Reject* | Fail to reject** | The estimated regression coefficients and standard errors would be inconsistent, so we should not use linear regression. |
| 3 | Fail to reject** | Reject* | The estimated regression coefficients and standard errors would be inconsistent, so we should not use linear regression. |
| 4 | Fail to reject** | Fail to reject** | We need to establish whether the two time series are cointegrated before we can rely on regression analysis. If they are cointegrated, the estimated regression coefficients and standard errors will be consistent and we may use linear regression, but with caution, because the cointegrating regression estimates only the long-term relation between the time series. If they are not cointegrated, the estimated regression coefficients and standard errors would be inconsistent, so we should not use linear regression. |

\* Reject: we reject the null hypothesis in favor of the alternative hypothesis that the time series does not have a unit root and is stationary.
\** Fail to reject: we fail to reject the null hypothesis that the time series has a unit root and is nonstationary.
Now let us focus on how we can test for cointegration between two time series that each have a unit root, as in the fourth scenario above. Engle and Granger (1987) proposed the following test: if x_{t} and y_{t} are both time series that each contain a unit root, we should do the following.
1. Estimate the regression y_{t} = b_{0} + b_{1}x_{t} + ε_{t}.
2. Test whether the error term from the above regression has a unit root using a (Engle–Granger) Dickey–Fuller test.
We can formulate the following set of hypotheses:
H_{0}: No cointegration versus H_{a}: Cointegration
The t-statistic on the slope coefficient from this residual regression is the (Engle–Granger) Dickey–Fuller test statistic, which is the basis for deciding whether or not to reject the null hypothesis. Because the residuals are estimated rather than directly observed, we compare it with the (Engle–Granger) Dickey–Fuller critical values rather than the standard Dickey–Fuller critical values.
The critical value we choose depends on the level of significance selected, which reflects how much sample evidence we require to reject the null. Three conventional significance levels are used to conduct hypothesis tests: 0.10, 0.05, and 0.01.
If the calculated value of the test statistic is less than the (Engle–Granger) Dickey–Fuller critical value, we reject the null hypothesis of no cointegration.
How do we model more than two time series in a linear regression?
We now extend our discussion to multiple (three or more) time series.
1- The simplest outcome is that none of the time series in the regression has a unit root. Then, we can safely use a multiple regression model to test the relation among the time series.
2- Similar to scenarios 2 and 3 in the two-series case, if at least one of the time series (dependent or independent) contains a unit root while at least one other does not, the error term cannot be covariance stationary. Therefore, we should not use a multiple linear regression model to analyze the relation among the time series.
3- Another possibility is that each time series in the regression has a unit root. In this case, we need to establish whether the time series are cointegrated.
The cointegration testing procedure is similar to that for a regression model with one independent variable.
1. Estimate the regression y_{t} = b_{0} + b_{1}x_{1t} + b_{2}x_{2t} + . . . + b_{k}x_{kt} + ε_{t}.
2. Test whether the error term from the above regression has a unit root using a (Engle–Granger) Dickey–Fuller test.
If we fail to reject the null hypothesis of no cointegration, the estimated regression coefficients and standard errors would be inconsistent, so we should not use linear regression.
If we reject the null hypothesis in favor of the alternative hypothesis of cointegration, the estimated regression coefficients and standard errors will be consistent and we may use linear regression, but with caution, because the cointegrating regression estimates only the long-term relation between the time series.