Department of Economics
W3412
Columbia University
Spring 2010
SOLUTIONS TO Problem Set 3
Introduction to Econometrics
Prof. Marcelo J. Moreira and Seyhan E Arkonac, PhD
for all sections
1. The following question is a continuation of problem set 2. The data set for this problem, growth.dta, is described at the end of this problem.
Using STATA, compute the sample mean and standard deviation of growth and tradeshr.
a) Estimate a regression of growth on tradeshr, using the “robust” option. Graph the data points and the estimated regression line. Does the regression error appear to be homoskedastic or heteroskedastic?
Unfortunately, it is rather hard to tell. One could argue that the spread is greater for moderate values of tradeshr than for low or high values, but there is no clear pattern. On the other hand, one would not want to conclude from the graph alone that the errors are homoskedastic.
b) Run the regression again without the “robust” option. Compare the results to what you obtained with the “robust” option. What is different?
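. reg growth tradeshr;

      Source |       SS       df       MS              Number of obs =      65
-------------+------------------------------           F(  1,    63) =    8.89
       Model |  28.4885066     1  28.4885066           Prob > F      =  0.0041
    Residual |  201.851551    63  3.20399287           R-squared     =  0.1237
-------------+------------------------------           Adj R-squared =  0.1098
       Total |  230.340057    64   3.5990634           Root MSE      =    1.79

------------------------------------------------------------------------------
      growth |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    tradeshr |   2.306434    .773485     2.98   0.004     .7607473     3.85212
       _cons |   .6402653   .4899767     1.31   0.196    -.3388749    1.619405
------------------------------------------------------------------------------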
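The entries in this output are tied together by simple identities, which gives a quick sanity check on the table. The sketch below uses Python rather than Stata and recomputes R-squared, the F statistic, Root MSE, and the t-statistic on tradeshr from the reported sums of squares and coefficient. The variable names are illustrative and not part of the problem set.

```python
import math

# Values copied from the regression output in part (b).
ss_model, df_model = 28.4885066, 1
ss_resid, df_resid = 201.851551, 63
coef, se = 2.306434, 0.773485

ss_total = ss_model + ss_resid               # total sum of squares
r2 = ss_model / ss_total                     # R-squared
ms_resid = ss_resid / df_resid               # residual mean square
f_stat = (ss_model / df_model) / ms_resid    # F(1, 63)
root_mse = math.sqrt(ms_resid)               # Root MSE
t_stat = coef / se                           # t-statistic on tradeshr

print(round(r2, 4), round(f_stat, 2), round(root_mse, 2), round(t_stat, 2))
```

The printed values should agree with the reported 0.1237, 8.89, 1.79, and 2.98 up to rounding. With a single regressor and homoskedasticity-only standard errors, the F statistic is also the square of the slope's t-statistic.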

The standard errors, t-statistics, p-values, and confidence intervals are different, but the OLS estimates, R², and RMSE are not. The heteroskedasticity-robust standard error of .66 is substantially smaller than the homoskedasticity-only standard error of .77, so with this large a difference one should rely on the heteroskedasticity-robust standard errors.
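To see why only the standard errors change, it can help to write out both formulas for a single-regressor model. The following is a minimal sketch in plain Python, not the Stata implementation. It assumes the robust option corresponds to the usual HC1 estimator with an n/(n-2) degrees-of-freedom correction. The function name and the small data set are made up for illustration and are not taken from growth.dta.

```python
import math

def ols_with_both_se(x, y):
    """Simple OLS of y on x: returns slope, homoskedasticity-only SE, HC1 robust SE."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Homoskedasticity-only: a single error variance s^2 is assumed for all observations.
    s2 = sum(u * u for u in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)
    # Robust (HC1): each squared residual is weighted by its own (x - xbar)^2.
    meat = sum(d * d * u * u for d, u in zip(dx, resid))
    se_robust = math.sqrt(n / (n - 2) * meat) / sxx
    return slope, se_classical, se_robust

# Made-up illustrative data, NOT the growth.dta sample.
x_demo = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1]
y_demo = [1.0, 2.5, 0.5, 3.5, 1.5, 3.0]
slope, se_c, se_r = ols_with_both_se(x_demo, y_demo)
print(slope, se_c, se_r)
```

The slope is computed once and is shared by both cases. Only the variance formula changes, which is why the coefficient estimates, R², and RMSE are the same with and without the option.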
c) You should see an outlier in the data set. Rerun the regression (with the “robust” option), dropping the outlier. Does dropping the outlier make a qualitative difference to your results? Explain.
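. reg growth tradeshr if tradeshr<1.5, r;

Linear regression                                      Number of obs =      64
                                                       F(  1,    62) =    3.77
                                                       Prob > F      =  0.0567
                                                       R-squared     =  0.0447
                                                       Root MSE      =  1.7894

------------------------------------------------------------------------------
             |               Robust
      growth |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    tradeshr |   1.680905   .8656171     1.94   0.057    -.0494392    3.411249
       _cons |   .9574107   .5360579     1.79   0.079    -.1141537    2.028975
------------------------------------------------------------------------------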

Dropping the outlier makes a big difference! The estimated slope falls considerably, from 2.31 to 1.68, and the coefficient is no longer significant at the 5% level (p = 0.057).
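The loss of significance can be checked directly from the reported numbers in part (c). The sketch below uses Python rather than Stata. It backs out the critical value implied by the upper confidence bound, uses the symmetry of the interval around the coefficient to recover the lower bound, and measures how far the slope fell relative to the full-sample estimate. The variable names are illustrative.

```python
# Values copied from the part (c) output (robust SEs, outlier dropped).
coef, se, ci_upper = 1.680905, 0.8656171, 3.411249

# A t-based interval is coef +/- crit * se, so it is symmetric around coef.
crit = (ci_upper - coef) / se     # implied critical value, about 2 for 62 df
ci_lower = coef - crit * se       # same as 2 * coef - ci_upper
t_stat = coef / se                # t-statistic on tradeshr

# Slope from the full-sample regression in part (b), for comparison.
slope_full = 2.306434
pct_drop = 100 * (slope_full - coef) / slope_full

print(round(crit, 3), round(ci_lower, 4), round(t_stat, 2), round(pct_drop, 1))
```

Because the t-statistic is below the implied critical value, the lower bound of the 95% interval is negative and the interval covers zero, which matches the p-value of 0.057. The slope falls by roughly a quarter once the outlier is removed.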
2. For the following questions, use data set CPS04.dta. Each month the Bureau of Labor Statistics in the U.S. Department of Labor conducts the “Current Population Survey” (CPS), which provides data on labor force characteristics of the population, including the level of employment, unemployment, and earnings. Approximately 65,000 randomly selected U.S. households are surveyed each month. The sample is chosen by randomly selecting addresses from a database composed of addresses from the most recent decennial census augmented with data on new housing units constructed after the last census. The exact random sampling scheme is rather complicated (first small geographical areas are randomly selected, then housing units within these areas are randomly selected); details can be found in the Handbook of Labor Statistics and are described on the Bureau of Labor Statistics website (www.bls.gov). The survey conducted each March is more detailed than in other months and