Department of Economics
Fall 2011
University of California
Prof. Woroch
Economics 140:
Problem Set 2
ANSWER SHEET
True/False/Uncertain and Explain.
Below are two statements that may be true or false or possibly
ambiguous.
State which one you believe to be the case, and more importantly, give a detailed explanation
for your answer.
Supply additional assumptions if you think they are necessary for your answer.
1
“If a 5% significance level is used to test the null that the slope coefficient of a large sample bivariate
regression is zero, then the probability that the t-statistic is larger than 1.96 is 2.5% if the null is true.”
True.
As a result of the CLT, under the null hypothesis (that is, when the true slope coefficient is
zero), the OLS estimator of the slope coefficient has an asymptotic normal distribution, and so the
t-statistic has an asymptotic N(0,1) distribution under $H_0$. A two-sided 5% test rejects when
|t| > 1.96, which splits the 5% equally between the two tails. Therefore,
$\Pr(t > 1.96) = 1 - \Phi(1.96) = 2.5\%$.
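As a quick numerical check of the 2.5% figure, the sketch below evaluates the standard normal upper tail and then simulates the t-statistic when the null is true. The sample size, number of replications, seed, and the homoskedasticity-only standard error are illustrative assumptions and are not part of the problem.

```python
import math
import random
from statistics import NormalDist

# Upper-tail probability of a standard normal beyond 1.96.
upper_tail = 1 - NormalDist().cdf(1.96)


def slope_t_stat(x, y):
    """t-statistic for H0: slope = 0 in a bivariate OLS regression.

    Uses the homoskedasticity-only standard error (an assumption made
    here to keep the sketch short).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(ssr / (n - 2) / sxx)
    return b1 / se_b1


def rejection_share(n=200, reps=2000, seed=140):
    """Share of simulated samples with t > 1.96 when the true slope is 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]  # y is unrelated to x: the null holds
        if slope_t_stat(x, y) > 1.96:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print(f"1 - Phi(1.96) = {upper_tail:.4f}")
    print(f"simulated Pr(t > 1.96) = {rejection_share():.4f}")
```

The simulated share should land close to 2.5%, with some noise from the finite number of replications and the finite sample size.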
2
“When a researcher is trying to estimate the causal effect of X on Y, and finds that the $R^2$ of her
bivariate regression model is around 0.04, she should conclude that her estimates are not precise
enough to confirm a causal relationship.”
Uncertain/Cannot decide from these.
A researcher interested in the causal effect of X on Y is interested in the magnitude of the slope
coefficient: if the slope coefficient is significantly different from 0 (that is, we can reject the
null that $\beta_1 = 0$) and the assumptions of OLS are satisfied (particularly,
$E(u_i \mid X_i) = 0$), X has a significant causal effect on Y, while if we cannot reject
$H_0: \beta_1 = 0$, we have no evidence that X causally affects Y. If the slope coefficient is
significant but $E(u_i \mid X_i) \neq 0$, we cannot interpret the effect as being causal (it is only
a correlation).
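To illustrate the last point, here is a minimal simulation sketch in which X has no causal effect on Y, yet the slope is strongly significant because a hypothetical omitted variable `z` drives both. The data-generating process, sample size, and seed are assumptions chosen for illustration, and the t-statistic uses the homoskedasticity-only standard error.

```python
import math
import random


def ols_slope_and_t(x, y):
    """Return (slope, t-statistic) from a bivariate OLS regression of y on x.

    The t-statistic uses the homoskedasticity-only standard error.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(ssr / (n - 2) / sxx)
    return b1, b1 / se_b1


def confounded_sample(n=1000, seed=2011):
    """Simulate data where X has NO causal effect on Y.

    A hypothetical omitted variable z drives both x and y, so the
    regression error (which contains z) is correlated with x.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 1) for zi in z]
    y = [zi + rng.gauss(0, 1) for zi in z]  # x does not enter this equation
    return x, y


if __name__ == "__main__":
    x, y = confounded_sample()
    slope, t_stat = ols_slope_and_t(x, y)
    print(f"slope = {slope:.3f}, t = {t_stat:.1f}")
```

In this design the population regression slope is cov(x, y)/var(x) = 1/2 even though the causal effect is zero, so a large t-statistic alone does not establish causality.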
On the other hand, $R^2$ measures the goodness-of-fit of the model, that is, the proportion of
variation in Y explained by the variation in X. It is usually of interest when we are not after the
causal effect of X on Y but want to do prediction with our econometric model instead. If $R^2$ is
small, our predictions are poor, while if $R^2$ is large, we do a good job of prediction.
As an example, if we look at a regression of test scores on the student-teacher ratio, the slope
coefficient may be significantly different from 0. Now, if students were randomly assigned to
classrooms with different student-teacher ratios (which would ensure that $E(u_i \mid X_i) = 0$), we
could say that a 1 unit decrease in the student-teacher ratio causes test scores to increase by the
magnitude of the estimated slope coefficient. However, if we wanted to use this very simple model to
predict average test scores in California schools that are not in this sample, we would do poorly
because, by itself, the student-teacher ratio explains only about 5% of the variation in test scores
(the $R^2$ is 0.051).
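The coexistence of a significant slope and a low $R^2$ can be reproduced with simulated data. The sketch below does not use the California test score data; the true slope, noise level, sample size, and seed are assumptions picked so that the population $R^2$ is close to 0.05, and the t-statistic uses the homoskedasticity-only standard error.

```python
import math
import random


def bivariate_ols(x, y):
    """Return (slope, t-statistic, R-squared) for a regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))  # unexplained variation
    tss = sum((yi - y_bar) ** 2 for yi in y)                     # total variation
    se_b1 = math.sqrt(ssr / (n - 2) / sxx)
    return b1, b1 / se_b1, 1 - ssr / tss


def noisy_sample(n=2000, true_slope=1.0, noise_sd=4.5, seed=140):
    """Simulate y = true_slope * x + u with x exogenous and u very noisy."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [true_slope * xi + rng.gauss(0, noise_sd) for xi in x]
    return x, y


if __name__ == "__main__":
    x, y = noisy_sample()
    slope, t_stat, r2 = bivariate_ols(x, y)
    print(f"slope = {slope:.3f}, t = {t_stat:.1f}, R^2 = {r2:.3f}")
```

Because x is generated independently of the error, the slope here has a causal interpretation and is precisely estimated, while most of the variation in y remains unexplained.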
[On the other hand, if we run a regression of the number of births at the settlement level on the