Lecture 38 - Serial correlation
1) Serial correlation occurs most frequently in time series data.
Serial correlation implies that order matters (a
positive error follows a positive error, or a negative error follows a positive, etc.), since it implies that an error in one
time period depends on the error from another time, i.e., the error term has a systematic component.
There are two categories of serial correlation.
a) The first is known as pure serial correlation.
Pure serial correlation occurs when the model is correctly specified,
yet the error term has a systematic component.
So, pure serial correlation occurs when the systematic component is
due purely to the error term.
In other words, there are no omitted variables whose effect is being captured in the
random error component.
When serial correlation does occur we know that $E(r_{\varepsilon_t \varepsilon_{t-1}}) \neq 0$, violating our classical
assumptions.
To say that an error term has a systematic component means that it follows a pattern of some type.
An infinite
number of patterns are possible, but what is known as first order serial correlation is the most common pattern
assumed by econometricians. First order serial correlation
means that the error term in one period depends upon the
error term in the previous period. We write this in the following form:
$\varepsilon_t = \rho \varepsilon_{t-1} + v_t$.
Here $\rho$ is the correlation coefficient between $\varepsilon_t$ and $\varepsilon_{t-1}$, so it measures the degree to which the error term in one
period depends upon the error term in the previous period.
Since it is a correlation coefficient, we know that it takes
on a value between -1 and +1.
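A short derivation shows why $\rho$ can be read as a correlation coefficient. This is only a sketch, and it rests on two assumptions that the notes do not state explicitly: $|\rho| < 1$ (so that the variance of the error term is the same in every period, written $\sigma_\varepsilon^2$), and $v_t$ has constant variance $\sigma_v^2$ and is uncorrelated with $\varepsilon_{t-1}$.

```latex
\begin{align*}
\operatorname{Var}(\varepsilon_t)
  &= \rho^2 \operatorname{Var}(\varepsilon_{t-1}) + \sigma_v^2
  \quad\Rightarrow\quad
  \sigma_\varepsilon^2 = \frac{\sigma_v^2}{1-\rho^2}, \\
\operatorname{Cov}(\varepsilon_t, \varepsilon_{t-1})
  &= \operatorname{Cov}(\rho\,\varepsilon_{t-1} + v_t,\ \varepsilon_{t-1})
   = \rho\,\sigma_\varepsilon^2, \\
\operatorname{Corr}(\varepsilon_t, \varepsilon_{t-1})
  &= \frac{\rho\,\sigma_\varepsilon^2}{\sigma_\varepsilon\,\sigma_\varepsilon}
   = \rho .
\end{align*}
```

Under these assumptions the correlation between adjacent errors is exactly $\rho$, which is why it must lie between -1 and +1.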
When $\rho$ is positive, we say that there is positive serial correlation.
This means that,
on average, a positive error in one time period will be followed by a positive error term in the succeeding period,
and a negative error term will on average be followed by a negative one. If $\rho$ is negative, we say there is negative serial
correlation.
This means that, on average, a positive error term in one period will be followed by a negative error
term in the following period.
If $\rho$ is equal to 0, then there is no first order serial correlation.
The equation above
says that the error term in time t depends upon two elements: the error term in the previous period, and a purely
random component, $v_t$.
Thus, $v_t$ is assumed to follow all the classical assumptions, i.e., it does not contain any
serial correlation.
This is important to remember, as we will see.
Finally, the closer $\rho$ is to either -1 or +1, the
greater the degree of serial correlation.
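The behavior of first order serial correlation can be illustrated with a small simulation. The sketch below is not part of the lecture: the function names, the sample size, the seed, and the choice of standard normal draws for the random component are all assumptions made for illustration. It generates errors from the equation above for a positive, a zero, and a negative value of rho, then compares the sample correlation between adjacent errors with the true value.

```python
import numpy as np

def simulate_ar1_errors(rho, n, seed=0):
    """Generate errors e_t = rho * e_{t-1} + v_t (hypothetical helper).

    v_t is drawn as standard normal white noise, an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)      # purely random component v_t
    e = np.empty(n)
    e[0] = v[0]                     # arbitrary starting value
    for t in range(1, n):
        e[t] = rho * e[t - 1] + v[t]
    return e

def lag1_correlation(e):
    """Sample correlation between e_t and e_{t-1}."""
    return np.corrcoef(e[1:], e[:-1])[0, 1]

for rho in (0.8, 0.0, -0.8):
    e = simulate_ar1_errors(rho, n=5000)
    print(f"true rho = {rho:+.1f}, "
          f"sample lag-1 correlation = {lag1_correlation(e):+.3f}")
```

With a sample this large, the sample correlation should land close to the chosen rho in each case, positive for positive serial correlation and negative for negative serial correlation.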
b) Impure serial correlation.
Impure serial correlation is caused by specification error, usually an omitted variable.
So, the best cure for this type of serial correlation is to include the omitted variable.
Of course, we may not know
what this is!
If there is an omitted variable, then we have the following:
$Y_t = B_0 + B_1 X_{1t} + B_2 X_{2t} + \varepsilon_t$
is the true regression model. Suppose, however, that $X_2$ is omitted from the
estimated equation.
Then the model we actually estimate is:
$Y_t = B_0 + B_1 X_{1t} + \varepsilon_t^*$, where $\varepsilon_t^* = B_2 X_{2t} + \varepsilon_t$.
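A simulation can show how an omitted variable produces impure serial correlation. The sketch below is an illustration under assumed values, not something taken from the lecture: the coefficients (1, 2, 3), the sample size, the seed, and the choice to make the omitted variable follow a first order pattern with coefficient 0.9 are all hypothetical. The true error term is white noise, so any serial correlation in the residuals of the misspecified model comes from the omitted variable.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# X2 is itself serially correlated (coefficient 0.9 is a hypothetical choice).
x2 = np.empty(n)
x2[0] = rng.standard_normal()
for t in range(1, n):
    x2[t] = 0.9 * x2[t - 1] + rng.standard_normal()

x1 = rng.standard_normal(n)
eps = rng.standard_normal(n)             # true error: no serial correlation
y = 1.0 + 2.0 * x1 + 3.0 * x2 + eps      # true model with assumed coefficients

def ols_residuals(y, *regressors):
    """Residuals from an OLS fit of y on a constant and the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def lag1_correlation(e):
    """Sample correlation between e_t and e_{t-1}."""
    return np.corrcoef(e[1:], e[:-1])[0, 1]

r_full = lag1_correlation(ols_residuals(y, x1, x2))  # correctly specified
r_omit = lag1_correlation(ols_residuals(y, x1))      # X2 omitted

print(f"lag-1 residual correlation, full model:  {r_full:+.3f}")
print(f"lag-1 residual correlation, X2 omitted:  {r_omit:+.3f}")
```

In this setup the residuals of the correctly specified model should show almost no lag-1 correlation, while the residuals of the model that omits X2 should be strongly positively correlated, because they absorb the serially correlated omitted variable.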