
Autocorrelation

Recap:

Regression equation:

Y_i = β1 + β2 X_i + u_i

Y_i depends on X_i and u_i. If we know nothing about how X_i and u_i are generated, there is no way we can make any statistical inference about Y_i, or about β1 and β2. The assumptions made about the variables X_i and the error term u_i are therefore critical to the valid interpretation of the regression estimates.

Assumption about autocorrelation

No autocorrelation between the disturbances: given any two values X_i and X_j (i ≠ j), the correlation between the corresponding disturbances u_i and u_j is zero, i.e.

Cov(u_i, u_j | X_i, X_j) = 0

However, in some cases this assumption fails:

In time series data, members of the series of observations ordered in time may be correlated. For example, in a regression of output on labour and capital inputs using quarterly data, if a labour strike affects output in one quarter, there is no a priori reason to expect the disruption to be carried over to the next quarter; if it is, the disturbances are correlated over time.

In cross-sectional data, members of a series of observations ordered in space may be correlated. For example, in a regression of family consumption expenditure on income, the consumption expenditure of one family is not expected to affect that of another family; if it does, the disturbances are correlated across units.

If either situation occurs, we have autocorrelation. Autocorrelation is most often found in time series data.

Nature of autocorrelation

1. Inertia: the most common feature of economic time series (e.g. GNP, price
indexes, production, employment, unemployment), reflecting business cycles.
2. Specification bias: excluded-variables case. A researcher starts with a plausible
regression model which may not be the correct one. Suppose the true regression is

Y_t = β1 + β2 X_2t + β3 X_3t + β4 X_4t + u_t

but the researcher fits

Y_t = β1 + β2 X_2t + β3 X_3t + v_t

where v_t = β4 X_4t + u_t, so the systematic movement of the omitted X_4t shows up in the error term.
3. Specification bias: incorrect functional form. Suppose the true/correct model in a cost-output study is

MCost_i = β1 + β2 Output_i + β3 Output_i² + u_i

but we fit the following regression function:

MCost_i = β1 + β2 Output_i + v_i
4. Cobweb phenomenon: supply reacts to price with a lag of one time period (supply decisions take time to implement):

Supply_t = β1 + β2 P_{t-1} + u_t
5. Lags. Example: current consumption depends on consumption in the previous period:

Consumption_t = β1 + β2 Income_t + β3 Consumption_{t-1} + u_t

6. Manipulation of data. Examples:
1. Quarterly time series data derived from monthly data by averaging the three monthly observations.
2. Interpolation or extrapolation of data, e.g. inter-census data interpolated on the basis of ad hoc assumptions.
The techniques used in manipulating the data may impose on them a systematic pattern which does not exist in the original data.

Estimation in the levels versus first differences

Consider the following model:

Y_t = β1 + β2 X_t + u_t

where Y is consumption expenditure and X is income. Assuming the model holds true at time t, it also holds in the previous time period (t − 1):

Y_{t-1} = β1 + β2 X_{t-1} + u_{t-1}

where Y_{t-1}, X_{t-1} and u_{t-1} are known as the lagged values of Y, X and u respectively.

Subtracting the second equation from the first:

ΔY_t = β2 ΔX_t + Δu_t

where Δu_t = (u_t − u_{t-1}).

Note: the equation Y_t = β1 + β2 X_t + u_t is known as the level form, while the equation ΔY_t = β2 ΔX_t + Δu_t is known as the (first) difference form.
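The transformation above can be sketched numerically. This is a minimal illustration with made-up data; the variable names and the slope value 0.8 are assumptions, not from the text. Differencing eliminates the intercept β1, and β2 is estimated from ΔY and ΔX alone:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical income (X) and consumption (Y) series in level form,
# generated from Y_t = 2 + 0.8 X_t + u_t
X = np.cumsum(rng.normal(1.0, 0.5, size=50))
Y = 2.0 + 0.8 * X + rng.normal(0, 0.3, size=50)

# First-difference form: the intercept drops out,
#   Delta Y_t = beta2 * Delta X_t + Delta u_t
dY = np.diff(Y)
dX = np.diff(X)

# OLS slope through the origin on the differenced data
beta2_hat = (dX @ dY) / (dX @ dX)
print(round(beta2_hat, 2))
```

The estimate should land close to the true slope of 0.8 used to generate the data.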

Effect of AR(1) errors on OLS estimates

Recall: Y_t = β1 + β2 X_t + u_t

Assume the model satisfies all the other OLS assumptions, but introduce autocorrelation into the disturbance term, so that

E(u_t u_{t+s}) ≠ 0   (s ≠ 0)

In particular, assume

u_t = ρ u_{t-1} + ε_t,   −1 < ρ < 1

where ρ is the coefficient of autocovariance and ε_t is a stochastic disturbance term satisfying the usual OLS assumptions:

E(ε_t) = 0
Var(ε_t) = σ_ε²
Cov(ε_t, ε_{t+s}) = 0   (s ≠ 0)

Note: u_t = ρ u_{t-1} + ε_t is known as the Markov first-order autoregressive scheme (first-order autoregressive scheme), denoted AR(1); first-order because only u_t and its immediate past value are involved (the maximum lag is 1).

If the model were

u_t = ρ1 u_{t-1} + ρ2 u_{t-2} + ε_t

it would be a second-order autoregressive scheme, AR(2). The third-order scheme, AR(3), is

u_t = ρ1 u_{t-1} + ρ2 u_{t-2} + ρ3 u_{t-3} + ε_t

and in general the pth-order scheme, AR(p), is

u_t = ρ1 u_{t-1} + ... + ρp u_{t-p} + ε_t

From u_t = ρ u_{t-1} + ε_t, the coefficient of autocovariance ρ can be interpreted as the first-order coefficient of autocorrelation (the coefficient of autocorrelation at lag 1):

ρ = E[(u_t − E(u_t))(u_{t-1} − E(u_{t-1}))] / [√Var(u_t) · √Var(u_{t-1})]
  = E(u_t u_{t-1}) / Var(u_{t-1})

since E(u_t) = 0 for each t and Var(u_t) = Var(u_{t-1}) (retaining the assumption of homoscedasticity).

Given the AR(1) scheme:

Var(u_t) = E(u_t²) = σ_ε² / (1 − ρ²)

Cov(u_t, u_{t+s}) = E(u_t u_{t+s}) = ρ^s σ_ε² / (1 − ρ²)

Cor(u_t, u_{t+s}) = ρ^s

where Cov(u_t, u_{t+s}) means the covariance between error terms s periods apart and Cor(u_t, u_{t+s}) means the correlation between error terms s periods apart.
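These moments can be checked by simulation. The sketch below is illustrative only; the choices ρ = 0.8 and σ_ε = 1 are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
rho, sigma_eps, n = 0.8, 1.0, 200_000

# Generate AR(1) disturbances u_t = rho * u_{t-1} + eps_t
eps = rng.normal(0, sigma_eps, size=n)
u = np.empty(n)
u[0] = eps[0] / np.sqrt(1 - rho**2)   # start in the stationary distribution
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]

# Var(u_t) should match sigma_eps^2 / (1 - rho^2)
var_theory = sigma_eps**2 / (1 - rho**2)
print(round(u.var(), 2), round(var_theory, 2))

# Sample autocorrelation at lag s should be close to rho**s
for s in (1, 2, 3):
    r_s = np.corrcoef(u[:-s], u[s:])[0, 1]
    print(s, round(r_s, 2), round(rho**s, 2))
```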

Consider the two-variable regression model

Y_t = β1 + β2 X_t + u_t

The OLS estimator of the slope coefficient is (in deviation form)

β̂2 = Σ x_t y_t / Σ x_t²

and its variance is given by

Var(β̂2) = σ² / Σ x_t²

Under the AR(1) scheme, the variance of this estimator becomes

Var(β̂2)_AR1 = (σ² / Σ x_t²) [1 + 2ρ (Σ x_t x_{t+1} / Σ x_t²) + 2ρ² (Σ x_t x_{t+2} / Σ x_t²) + ... + 2ρ^{n-1} (x_1 x_n / Σ x_t²)]

where Var(β̂2)_AR1 means the variance of β̂2 under the first-order autoregressive scheme.

Note: if ρ = 0, the two formulas coincide.

Assuming that the regressor X also follows the first-order autoregressive scheme with coefficient of autocorrelation r, then

Var(β̂2)_AR1 = (σ² / Σ x_t²) · (1 + rρ)/(1 − rρ) = Var(β̂2)_OLS · (1 + rρ)/(1 − rρ)

Example: if r = 0.6 and ρ = 0.8, then Var(β̂2)_AR1 = Var(β̂2)_OLS × 1.48/0.52 ≈ 2.85 Var(β̂2)_OLS.

Note: when there is autocorrelation, the usual OLS formula underestimates the true variance Var(β̂2)_AR1.

Durbin-Watson test

The presence of first-order autocorrelation is tested using the table of Durbin-Watson statistics at the 5 or 1% level of significance for n observations and k explanatory variables. The Durbin-Watson statistic is calculated as the ratio of the sum of squared differences in successive residuals to the residual sum of squares (RSS):

d = Σ_{t=2}^{n} (ê_t − ê_{t-1})² / Σ_{t=1}^{n} ê_t²

The d statistic is based on the estimated residuals, which are routinely computed in regression analysis.

Assumptions underlying the d statistic:

1. The regression model includes an intercept term.
2. The explanatory variables, the X's, are non-stochastic, or fixed in repeated sampling.
3. The disturbances u_t are generated by the first-order autoregressive scheme u_t = ρ u_{t-1} + ε_t.
4. The error term u_t is assumed to be normally distributed.
5. The regression model does not include lagged value(s) of the dependent variable among the explanatory variables.
6. There are no missing observations in the data.
Estimation of the d statistic

Expanding the numerator,

d = Σ (ê_t² + ê_{t-1}² − 2 ê_t ê_{t-1}) / Σ ê_t²

Since Σ ê_t² ≈ Σ ê_{t-1}² (they differ only by one observation),

d ≈ 2 [1 − (Σ ê_t ê_{t-1} / Σ ê_t²)]

Defining ρ̂ = Σ ê_t ê_{t-1} / Σ ê_t²,

d ≈ 2(1 − ρ̂)

Since −1 ≤ ρ ≤ 1, this implies 0 ≤ d ≤ 4.
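The computation of d, and its link to ρ̂, can be sketched in a few lines (the residual series below are made up for illustration):

```python
import numpy as np

def durbin_watson(e):
    """d = sum_{t=2}^n (e_t - e_{t-1})^2 / sum_{t=1}^n e_t^2."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(1)

# Serially uncorrelated residuals: rho_hat is near 0, so d is near 2(1 - 0) = 2
e_white = rng.normal(size=5_000)
print(round(durbin_watson(e_white), 1))

# Strongly positively autocorrelated residuals (a random walk): d is near 0
e_pos = np.cumsum(rng.normal(size=5_000))
print(durbin_watson(e_pos) < 1.0)
```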

Durbin and Watson succeeded in deriving a lower bound dL and an upper bound dU such that if the computed d lies outside these critical values, a decision can be made regarding the presence of positive or negative serial correlation.

To apply the Durbin–Watson test (assuming that the assumptions underlying the test
are fulfilled):
1. Run the OLS regression and obtain the residuals

2. Compute d-statistics (Most computer programs now do this routinely)

3. For the given sample size and given number of explanatory variables, find out the
critical dL and dU values.

4. Now follow the decision rules of d-statistics

Decision rules:

These limits depend only on the number of observations n and the number of explanatory variables, and do not depend on the values taken by those explanatory variables. The limits, for n from 6 to 200 and up to 20 explanatory variables, have been tabulated by Durbin and Watson.

Example of reading the Durbin-Watson table:

Consider a wage-productivity regression with an estimated d value of 1.520, 40 observations and one explanatory variable. At the 5 percent level,

dL = 1.44
dU = 1.54

Since d lies in the indecisive zone, one cannot conclude that (first-order) autocorrelation does or does not exist. However, it has been found that the upper limit dU is approximately the true significance limit, so when d lies in the indecisive zone one can use the following modified d test. Given the level of significance α:

1. H0: ρ = 0 versus H1: ρ > 0. Reject H0 at level α if d < dU; there is statistically significant positive autocorrelation.
2. H0: ρ = 0 versus H1: ρ < 0. Reject H0 at level α if (4 − d) < dU; there is statistically significant evidence of negative autocorrelation.
3. H0: ρ = 0 versus H1: ρ ≠ 0. Reject H0 at level 2α if d < dU or (4 − d) < dU; there is statistically significant evidence of autocorrelation, positive or negative.
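The modified d test can be written as a small helper. This is a sketch combining the three rules above; the function name and return strings are my own, not from the text:

```python
def modified_d_test(d, d_U):
    """Modified Durbin-Watson test, used when d falls in the indecisive
    zone: compare d (or 4 - d) against the upper limit d_U only."""
    if d < d_U:
        return "reject H0: significant positive autocorrelation"
    if (4 - d) < d_U:
        return "reject H0: significant negative autocorrelation"
    return "do not reject H0: no evidence of first-order autocorrelation"

# Wage-productivity example from the text: d = 1.520, n = 40, k = 1,
# with d_L = 1.44 and d_U = 1.54 at the 5 percent level
print(modified_d_test(1.520, 1.54))
```

For the example, d = 1.520 < dU = 1.54, so the modified test rejects H0 in favour of positive autocorrelation even though d sits in the indecisive zone of the standard test.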
Test for higher-order autocorrelation: the LM test

The LM (Lagrange Multiplier) statistic is useful in identifying serial correlation not only of the first order but of higher orders as well.

CASE 1: CONSIDER THE AR(1) SCHEME

Y_t = β1 + β2 X_t + u_t
u_t = ρ u_{t-1} + ε_t

Step 1: Estimate the regression model by OLS and compute its estimated residuals e_t.

Step 2: Regress e_t against a constant, X_t1, ..., X_tk and e_{t-1}, using T − 1 observations. The LM statistic is then (T − 1)R_e², where R_e² is the R-squared from this auxiliary regression; T − 1 is used because the effective number of observations is T − 1.

Step 3: Reject the null hypothesis of zero autocorrelation in favour of the alternative ρ ≠ 0 if (T − 1)R_e² > χ²_{1,(1−α)}, the 1 − α quantile of the chi-square distribution with 1 d.f., where α is the significance level.
CASE 2: CONSIDER THE AR(p) SCHEME

Y_t = β1 + β2 X_t + u_t
u_t = ρ1 u_{t-1} + ... + ρp u_{t-p} + ε_t

The null hypothesis is that each of the ρ's is zero, against the alternative that at least one of them is not zero.

Step 1: Estimate the original regression by OLS and obtain the residuals e_t.

Step 2: Regress e_t against all the independent variables X_t1, ..., X_tk plus e_{t-1}, e_{t-2}, ..., e_{t-p}. The effective number of observations used in the auxiliary regression is T − p.

Step 3: Compute (T − p)R_e² from Step 2. If it exceeds χ²_{p,(1−α)}, the 1 − α quantile of the chi-square distribution with p d.f., then reject H0: ρ1 = ρ2 = ... = ρp = 0 in favour of H1: at least one of the ρ's is significantly different from zero.
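The steps above can be sketched in numpy. This is a minimal illustration, not a production implementation: the function name and the simulated data are my own assumptions, and the auxiliary R² is computed assuming X already contains a constant column:

```python
import numpy as np

def lm_autocorr_test(y, X, p):
    """LM test for AR(p) errors (sketch). X must include a constant
    column. Returns (LM statistic, degrees of freedom = p)."""
    X = np.asarray(X, dtype=float)
    # Step 1: original OLS regression, keep residuals e_t
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    # Step 2: auxiliary regression of e_t on X and e_{t-1}, ..., e_{t-p},
    # using the T - p usable observations
    T = len(y)
    lags = np.column_stack([e[p - j - 1:T - j - 1] for j in range(p)])
    Z = np.column_stack([X[p:], lags])
    g = np.linalg.lstsq(Z, e[p:], rcond=None)[0]
    resid = e[p:] - Z @ g
    dev = e[p:] - e[p:].mean()
    R2 = 1 - (resid @ resid) / (dev @ dev)
    # Step 3: LM = (T - p) * R^2, compared with chi-square(p)
    return (T - p) * R2, p

# Errors that genuinely follow AR(1) with rho = 0.8: LM should be large
rng = np.random.default_rng(3)
T = 300
x = rng.normal(size=T)
u = np.empty(T); u[0] = rng.normal()
for t in range(1, T):
    u[t] = 0.8 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(T), x])
lm, df = lm_autocorr_test(y, X, p=2)
print(lm > 5.99)   # 5.99 is the chi-square(2) 5% critical value -> reject H0
```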


Remarks:

- The LM test does not suffer from the inconclusiveness of the DW test.
- The LM test is a large-sample test and needs at least 30 observations to be meaningful.
- The regressors included in the regression model may contain lagged values of the regressand Y, that is, Y_{t-1}, Y_{t-2}, etc.
- A drawback of the LM test is that the value of p, the length of the lag, cannot be specified a priori; some experimentation with p is inevitable.

Remedial Measures

(1) Find out whether the autocorrelation is pure autocorrelation and not the result of mis-specification of the model.
(2) If it is pure autocorrelation, use an appropriate transformation of the original model so that the transformed model does not have the problem of (pure) autocorrelation (the generalized least squares, GLS, method).
(3) In large samples, use the Newey-West method to obtain standard errors of the OLS estimates that are corrected for autocorrelation (an extension of White's heteroscedasticity-consistent standard errors method).
(4) In some situations we can continue to use the OLS method.

The method of generalized least squares (GLS)

The remedy depends on the knowledge about the nature of interdependence among the disturbances.

Consider the following model:

Y_t = β1 + β2 X_t + u_t

u_t = ρ u_{t-1} + ε_t   (the error term follows the AR(1) scheme, −1 < ρ < 1)

Case 1: ρ is known

If the model holds true at time t, it also holds at time t − 1, hence:

Y_{t-1} = β1 + β2 X_{t-1} + u_{t-1}

Multiplying both sides by ρ:

ρY_{t-1} = ρβ1 + ρβ2 X_{t-1} + ρu_{t-1}

Subtracting this from the original equation:

Y_t − ρY_{t-1} = β1(1 − ρ) + β2(X_t − ρX_{t-1}) + ε_t

where ε_t = (u_t − ρu_{t-1}). We can express this as

Y*_t = β*1 + β*2 X*_t + ε_t

Since the error term ε_t satisfies the usual OLS assumptions, we can apply OLS to the transformed variables Y*_t and X*_t and obtain estimators with all the optimum (BLUE) properties.

In this differencing procedure we lose one observation, because the first observation has no antecedent. To avoid this loss, the first observations on Y and X are transformed as

Y_1 √(1 − ρ²) and X_1 √(1 − ρ²)

This transformation is known as the Prais-Winsten transformation.
Note:

GLS is nothing but OLS applied to the transformed model that satisfies the classical
assumptions.
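Under the assumption that ρ is known, the generalized difference transform with the Prais-Winsten treatment of the first observation can be sketched as follows (simulated data; the function name and all parameter values are illustrative assumptions):

```python
import numpy as np

def gls_ar1_known_rho(y, x, rho):
    """Generalized difference transform for known rho, keeping the first
    observation via the Prais-Winsten transformation (sketch)."""
    w = np.sqrt(1 - rho**2)
    # First rows: w*y_1, w*x_1; remaining rows: generalized differences
    y_star = np.concatenate([[w * y[0]], y[1:] - rho * y[:-1]])
    x_star = np.concatenate([[w * x[0]], x[1:] - rho * x[:-1]])
    # Transformed intercept column: w for t = 1, (1 - rho) thereafter
    c_star = np.concatenate([[w], np.full(len(y) - 1, 1 - rho)])
    Z = np.column_stack([c_star, x_star])
    # OLS on the transformed data = GLS in the original parameterisation
    return np.linalg.lstsq(Z, y_star, rcond=None)[0]

# Simulated model Y_t = 1 + 0.5 X_t + u_t with AR(1) errors, rho = 0.7
rng = np.random.default_rng(7)
T, rho = 2_000, 0.7
x = rng.normal(size=T)
u = np.empty(T); u[0] = rng.normal() / np.sqrt(1 - rho**2)
for t in range(1, T):
    u[t] = rho * u[t - 1] + rng.normal()
y = 1.0 + 0.5 * x + u
b1, b2 = gls_ar1_known_rho(y, x, rho)
print(round(b1, 2), round(b2, 2))   # should be close to 1.0 and 0.5
```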

Case 2: ρ is not known

When ρ is not known, several methods can be used to estimate it.

Based on the Durbin-Watson d statistic: from the relationship between d and ρ, the value of ρ can be estimated as

ρ̂ ≈ 1 − d/2

Estimated from the residuals: if the AR(1) scheme is valid, run the regression

e_t = ρ̂ e_{t-1} + v_t
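Both estimates of ρ are easy to compute from a residual series. The sketch below simulates AR(1) residuals with a true ρ of 0.6 (an arbitrary illustrative choice) and applies each formula:

```python
import numpy as np

# Two simple estimates of rho from residuals e_t:
#   (a) from the Durbin-Watson d:  rho_hat = 1 - d/2
#   (b) from the regression e_t = rho * e_{t-1} + v_t (no intercept)
rng = np.random.default_rng(5)
T, rho = 10_000, 0.6
e = np.empty(T); e[0] = rng.normal() / np.sqrt(1 - rho**2)
for t in range(1, T):
    e[t] = rho * e[t - 1] + rng.normal()

d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
rho_from_d = 1 - d / 2
rho_from_reg = (e[:-1] @ e[1:]) / (e[:-1] @ e[:-1])
print(round(rho_from_d, 1), round(rho_from_reg, 1))   # both close to 0.6
```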
The first-difference method

Since ρ lies between −1 and +1, one could start from the extreme positions:

ρ = 0 => no (first-order) serial correlation
ρ = ±1 => perfect positive or negative correlation

From

Y_t − ρY_{t-1} = β1(1 − ρ) + β2(X_t − ρX_{t-1}) + ε_t

if ρ = 1, the equation reduces to the first-difference form:

ΔY_t = β2 ΔX_t + (u_t − u_{t-1})

This may be appropriate if the coefficient of autocorrelation is very high. A rough rule of thumb: use the first-difference form whenever d < R².

Use the g statistic to test the hypothesis that ρ = 1 (the Berenblutt-Webb test):

g = Σ_{t=2}^{n} ê_t² / Σ_{t=1}^{n} û_t²

where û_t are the OLS residuals from the original (level-form) regression and ê_t are the OLS residuals from the first-difference regression. Use the Durbin-Watson tables, except that now the null hypothesis is that ρ = 1 rather than ρ = 0. If g lies below the lower limit of d, we do not reject the hypothesis that the true ρ = 1.

Once an estimate of ρ is obtained, it is used in the GLS transformation in place of the true ρ. All such methods of estimation are known as feasible GLS (FGLS) or estimated GLS (EGLS) methods.

The Newey-West method of correcting the OLS standard errors

- Use OLS but correct the standard errors for autocorrelation.
- It is an extension of White's heteroscedasticity-consistent standard errors.
- The corrected standard errors are known as HAC (heteroscedasticity- and autocorrelation-consistent) standard errors.
- The Newey-West procedure is strictly valid in large samples and handles both the heteroscedasticity and the autocorrelation problems.
- HAC standard errors are typically much greater than OLS standard errors, so HAC t-ratios are much smaller than OLS t-ratios.
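A hand-rolled sketch of the Newey-West covariance, using Bartlett kernel weights, is below. The lag length L = 8, the function name and the simulated data are all illustrative assumptions; in practice one would use a packaged implementation:

```python
import numpy as np

def hac_se(y, X, L):
    """Newey-West (HAC) standard errors with Bartlett weights and
    truncation lag L (a minimal numpy sketch)."""
    X = np.asarray(X, dtype=float)
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    Xe = X * e[:, None]                  # rows are x_t' * e_t
    S = Xe.T @ Xe                        # lag-0 ("White") term
    for l in range(1, L + 1):
        w = 1 - l / (L + 1)              # Bartlett kernel weight
        G = Xe[l:].T @ Xe[:-l]           # lag-l cross-products
        S += w * (G + G.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    cov = XtX_inv @ S @ XtX_inv          # sandwich covariance
    return b, np.sqrt(np.diag(cov))

# With positively autocorrelated errors (and regressor), HAC standard
# errors come out larger than the usual OLS standard errors
rng = np.random.default_rng(9)
T, rho = 1_000, 0.8
x = np.empty(T); x[0] = rng.normal()
u = np.empty(T); u[0] = rng.normal()
for t in range(1, T):
    x[t] = 0.8 * x[t - 1] + rng.normal()
    u[t] = rho * u[t - 1] + rng.normal()
y = 1.0 + 0.5 * x + u
X = np.column_stack([np.ones(T), x])
b, se_hac = hac_se(y, X, L=8)
e = y - X @ b
se_ols = np.sqrt(e @ e / (T - 2) * np.diag(np.linalg.inv(X.T @ X)))
print(np.all(se_hac > se_ols))   # HAC se larger -> smaller t-ratios
```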
