Lecture 29  Things to be careful about when doing regression analysis
1) Doing what theory suggests.
We have seen that including irrelevant variables increases the variance of the estimators and hence lowers their t-statistics. Thus, when a researcher encounters low t-statistics, he or she may be tempted to simply leave out those variables, concluding that they are irrelevant.
But note that there are reasons for obtaining low t-values other than the inclusion of irrelevant variables. For instance, we could have multicollinearity, which means the independent variables are correlated with one another. We will study multicollinearity in a few lectures from now.
As we have seen, leaving out a variable, even one with a low t-value, will cause bias in the remaining coefficients if the omitted variable is correlated with other independent variables in the regression equation. The example on page 183 of your textbook is an excellent illustration of this point.
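The bias from leaving out a correlated variable can be seen in a small simulation. This is only a sketch under assumed values: the true model y = 1 + 2*x1 + 3*x2 + e, the correlation rho, and the function names `ols` and `simulate` are all hypothetical and chosen for illustration, not taken from the textbook example.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients for y on X (X already contains a constant column)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def simulate(rho, n=200, reps=200, seed=0):
    """Average estimated slope on x1 with x2 included and with x2 omitted.

    Assumed true model (illustrative only): y = 1 + 2*x1 + 3*x2 + e,
    where x1 and x2 have correlation rho.
    """
    rng = np.random.default_rng(seed)
    full, short = [], []
    for _ in range(reps):
        x1 = rng.standard_normal(n)
        x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
        y = 1 + 2 * x1 + 3 * x2 + rng.standard_normal(n)
        const = np.ones(n)
        full.append(ols(np.column_stack([const, x1, x2]), y)[1])
        short.append(ols(np.column_stack([const, x1]), y)[1])
    return float(np.mean(full)), float(np.mean(short))

if __name__ == "__main__":
    for rho in (0.0, 0.6):
        full, short = simulate(rho)
        print(f"rho={rho}: x2 included -> {full:.2f}, x2 omitted -> {short:.2f}")
```

With rho = 0 the slope on x1 should stay close to its true value of 2 whether or not x2 is included. With rho = 0.6 the slope from the short regression should drift toward 2 + 3*0.6, because x1 picks up part of the effect of the omitted x2.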
Suppose we let:
$C$ = quantity of Brazilian coffee demanded
$P_{bc}$ = the price of Brazilian coffee
$P_{cc}$ = the price of Colombian coffee
$Y_{d}$ = income of U.S. citizens
$P_{t}$ = the price of tea
Suppose we estimate and obtain the following regression results:
$$\hat{C} = 9.1 + 7.8\,P_{bc} + 2.4\,P_{t} + 0.0035\,Y_{d}$$
$$SE(\hat{\beta}):\quad 15.6,\quad 1.2,\quad 0.001$$
$$t:\quad 0.5,\quad 2,\quad 3.5$$
$$\bar{R}^2 = 0.6,\qquad n = 25$$
Because the coefficient on the price of Brazilian coffee has an insignificant t-value (0.5), the researcher decides to drop the variable, believing that coffee demand is inelastic with respect to price. So he or she drops the variable and reestimates the equation, obtaining:
$$\hat{C} = 9.3 + 2.6\,P_{t} + 0.0036\,Y_{d}$$
$$SE(\hat{\beta}):\quad 1.0,\quad 0.0009$$
$$t:\quad 2.6,\quad 4$$
$$\bar{R}^2 = 0.61,\qquad n = 25$$
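As a quick arithmetic check, each t-value is the coefficient divided by its standard error. The sketch below takes the coefficients and standard errors of the two estimated equations to be the values shown in the dictionaries. The names `eq1`, `eq2`, and `t_ratios` are hypothetical and used only for this illustration.

```python
# Coefficient and standard error pairs, as read from the two reported equations.
# Keys: P_bc = price of Brazilian coffee, P_t = price of tea, Y_d = income.
eq1 = {"P_bc": (7.8, 15.6), "P_t": (2.4, 1.2), "Y_d": (0.0035, 0.001)}
eq2 = {"P_t": (2.6, 1.0), "Y_d": (0.0036, 0.0009)}

def t_ratios(eq):
    """Return the t-ratio (coefficient / standard error) for each variable."""
    return {name: coef / se for name, (coef, se) in eq.items()}

if __name__ == "__main__":
    for label, eq in (("with P_bc", eq1), ("without P_bc", eq2)):
        ratios = {k: round(v, 2) for k, v in t_ratios(eq).items()}
        print(label, ratios)
```

Only the price of Brazilian coffee has a t-ratio below 1 in absolute value, and that is the variable the researcher is tempted to drop.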
Was it a good idea to drop the variable? Let us see how this decision fits with our four criteria for judging regression results.
1) Theory: it is possible that the demand for coffee is price inelastic (though price not mattering at all is a bit unlikely!).
2) t-test: the t-value is insignificant.
3) R-bar squared (adjusted R-squared): R-bar squared does increase when the variable is dropped, indicating that the variable is irrelevant. (Since the t-value is less than 1 in absolute value, this is to be expected. Recall that R-bar squared penalizes you when you add a variable with low explanatory power; in practice this means an independent variable with a t-value below 1. If the t-value is greater than 1, R-bar squared will increase when the independent variable is added, even if the variable is statistically insignificant.)
4) Bias: the remaining coefficients change only a small amount when the price of Brazilian coffee is dropped, suggesting that there is little if any bias caused by excluding the variable.
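The rule in criterion 3, that R-bar squared rises when a variable is added exactly when that variable's t-value exceeds 1 in absolute value, can be checked numerically. The following is a minimal sketch on simulated data. The data-generating process and the helper names `fit` and `check_rule` are assumptions made for illustration, with n = 25 chosen only to match the sample size of the coffee example.

```python
import numpy as np

def fit(X, y):
    """OLS fit: return coefficients, their standard errors, and adjusted R-squared."""
    n, k = X.shape  # k counts the constant column as well
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    rss = resid @ resid
    tss = np.sum((y - y.mean()) ** 2)
    s2 = rss / (n - k)  # estimated error variance
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    adj_r2 = 1 - s2 / (tss / (n - 1))
    return beta, se, adj_r2

def check_rule(trials=200, n=25, seed=0):
    """Count how often 'adjusted R-squared rises' coincides with '|t| > 1'."""
    rng = np.random.default_rng(seed)
    agree = 0
    for _ in range(trials):
        x = rng.standard_normal(n)
        z = rng.standard_normal(n)  # candidate variable with a weak effect
        y = 1 + 2 * x + 0.2 * z + rng.standard_normal(n)
        const = np.ones(n)
        _, _, adj_short = fit(np.column_stack([const, x]), y)
        beta, se, adj_full = fit(np.column_stack([const, x, z]), y)
        t_z = beta[2] / se[2]
        agree += int((adj_full > adj_short) == (abs(t_z) > 1))
    return agree, trials

if __name__ == "__main__":
    agree, trials = check_rule()
    print(f"rule held in {agree} of {trials} simulated samples")
```

The weak coefficient on z is deliberate: it produces t-values on both sides of 1, so both directions of the rule are exercised.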
However, this is a case of poorly thought-out theory. Neither of the above equations contains a variable for the price of a competing coffee, such as Colombian coffee. Theory would always suggest that the prices of competing goods be included in the specification of the model. And, if the price of Brazilian coffee and the price of Colombian coffee are correlated, as they almost certainly are, then leaving out the price of Colombian coffee will have biased the estimated coefficients, in particular the coefficient on the price of Brazilian coffee in the first equation.
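The direction of this bias can be sketched with the standard omitted-variable result. The symbols $\beta_1$, $\beta_2$, and $\alpha_1$ below are introduced only for this illustration. For simplicity the other regressors are set aside, and the assumed signs are what theory would suggest, not estimates. Suppose the true model contains both prices, while the estimated model omits $P_{cc}$:

$$C = \beta_0 + \beta_1 P_{bc} + \beta_2 P_{cc} + \varepsilon, \qquad P_{cc} = \alpha_0 + \alpha_1 P_{bc} + u$$

$$E[\hat{\beta}_1] = \beta_1 + \beta_2\,\alpha_1$$

If Colombian coffee is a substitute for Brazilian coffee ($\beta_2 > 0$) and the two prices move together ($\alpha_1 > 0$), the bias term $\beta_2 \alpha_1$ is positive. That would push the estimated coefficient on $P_{bc}$ upward, which is one possible explanation for a positive estimate where theory expects a negative own-price effect.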
Spring '11
YongJinPark
Econometrics, Regression Analysis, researcher