To predict the future behavior of a time series from causal relationships with other time series, we may fit a linear regression. If a time series appears to have a unit root, it is not covariance stationary, and if any time series in the regression contains a unit root, the OLS regression test statistics may be invalid.
How to model more than one time series in a linear regression?
To answer this question, let us start with two time series, one corresponding to the independent variable and one corresponding to the dependent variable. We then extend the discussion to multiple time series to finish this overview.
We first run a unit root test, such as the Dickey–Fuller test, on each of the two time series to find out whether either of them has a unit root. The possible scenarios relating to the outcome of these tests are summarized below:

1. Neither series has a unit root: we can safely use linear regression.
2. The dependent series has a unit root but the independent series does not: we should not use linear regression.
3. The independent series has a unit root but the dependent series does not: we should not use linear regression.
4. Both series have a unit root: we can use linear regression only if the two series are cointegrated.
Now let us focus on how we can test for cointegration between two time series that each have a unit root, as in the fourth scenario above. Engle and Granger (1987) proposed the following test: if x_t and y_t are both time series that contain a unit root, we should do the following.
1. Estimate the regression y_t = b_0 + b_1 x_t + ε_t.
2. Test whether the error term from this regression has a unit root, using an (Engle–Granger) Dickey–Fuller test.
We can formulate the following hypotheses:

H0: no cointegration versus Ha: cointegration
The test statistic is the t-statistic on the slope coefficient in the Dickey–Fuller regression run on the residuals. To decide whether to reject the null hypothesis, we compare this t-statistic with (Engle–Granger) critical values rather than the usual Dickey–Fuller critical values, because the test is run on estimated residuals rather than on an observed series.
The critical value we choose depends on the level of significance selected, which reflects how much sample evidence we require to reject the null. Three conventional significance levels are used to conduct hypothesis tests: 0.10, 0.05, and 0.01.
If the calculated value of the test statistic is less than (more negative than) the (Engle–Granger) Dickey–Fuller critical value, we reject the null hypothesis and conclude that the two series are cointegrated.
How to model more than two time series in a linear regression?
We now extend our discussion to multiple (three or more) time series.
1- The simplest outcome is that none of the time series in the regression has a unit root. In that case, we can safely use a multiple regression model to test the relation among the time series.
2- As in outcomes 2 and 3 for two time series, if at least one of the time series (whether corresponding to the dependent variable or to an independent variable) contains a unit root while at least one does not, the error term cannot be covariance stationary. Therefore, we should not use a multiple linear regression model to analyze the relation among the time series.
3- Another possibility is that each time series in the regression has a unit root. In this case, we need to establish whether the time series are cointegrated.
The cointegration testing procedure is similar to that for a regression model with one independent variable.
1. Estimate the regression y_t = b_0 + b_1 x_1t + b_2 x_2t + . . . + b_k x_kt + ε_t.
2. Test whether the error term from this regression has a unit root, using an (Engle–Granger) Dickey–Fuller test.
If we fail to reject the null hypothesis of no cointegration, the estimated regression coefficients and standard errors are inconsistent, so we should not use linear regression.
If we reject the null hypothesis in favor of the alternative hypothesis of cointegration, the estimated regression coefficients and standard errors are consistent, and we may use linear regression, though with caution: the cointegrated regression estimates only the long-term relation between the time series.