Solutions to Practice Questions for Midterm 2

1. Suppose you have the linear model $Y_i = b_1 + b_2 X_{2i} + b_3 X_{3i} + u_i$.

a. If $\mathrm{Var}(u_i) = k X_{2i}^2$, where $k$ is some constant, how can you transform the equation and perform the least squares procedure to get estimates of $b_2$ and $b_3$?
Describe all the steps.

Divide the equation by $\sqrt{k}\,X_{2i}$, so that the new equation is:

$$\frac{Y_i}{\sqrt{k}\,X_{2i}} = \frac{b_2}{\sqrt{k}} + b_1 \frac{1}{\sqrt{k}\,X_{2i}} + b_3 \frac{X_{3i}}{\sqrt{k}\,X_{2i}} + \frac{u_i}{\sqrt{k}\,X_{2i}}$$

The new equation is homoskedastic, since $\mathrm{Var}\big(u_i / (\sqrt{k}\,X_{2i})\big) = k X_{2i}^2 / (k X_{2i}^2) = 1$, and we can perform OLS regression on the new transformed variables. The OLS estimated constant term of the regression multiplied by $\sqrt{k}$ gives $b_2$, and the coefficient on the transformed $X_{3i}$ term is $b_3$.

b. If you decide to ignore the heteroskedasticity in a. above, are your OLS estimates of $b_2$ and $b_3$ still BLUE? Are the standard hypothesis tests still valid? Explain carefully.

The estimates are still unbiased and consistent but no longer efficient, so no, they are no longer BLUE. The estimates of the standard errors are inconsistent, so the hypothesis tests based on those standard errors are no longer valid.

c. Now you are not sure that there is heteroskedasticity in the model but you suspect it is a linear function of the variables $X_2$ and $X_3$. How would you test this hypothesis using White's test? Describe all the steps.

For the White test we would run the regression equation as originally specified.
That is, regress $Y_i$ on a constant, $X_{2i}$ and $X_{3i}$. Then save the estimated residuals and square them. Estimate a second regression $e_i^2 = \gamma + \delta_1 X_{2i} + \delta_2 X_{3i} + v_i$. Form the statistic $nR^2$, using the $R^2$ from this second regression, which will have a $\chi^2(2)$ distribution under the null hypothesis.

Consider the following model: $y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$, where $x_i = 0$ or $1$ is an indicator variable. In order to estimate a more flexible model, we fit a 2nd order polynomial of $x$:

$$y_i = \beta_1 + \beta_2 x_i + \beta_3 x_i^2 + \varepsilon_i$$
When we run the regression, Excel will not provide an estimate for $\beta_3$. Why?

The model exhibits perfect multicollinearity because when $x_i = 0$ then $x_i^2 = 0$, and similarly when $x_i = 1$ then $x_i^2 = 1$. As a result, the $n \times 3$ $X$ matrix has rank $2 < 3$ and $(X'X)^{-1}$ does not exist.

ii. Which model would have a higher $R^2$?

They would be the same because the squared term adds no information.

2. Listed below is the output from two regressions. The data is from the California polls on www.fivethirtyeight.com. The first (on the left) is a regression of Obama's margin on time. The second (on the right) regresses the squared residuals from the first regression on time. Use this output to calculate a White test for heteroskedasticity. What distribution does the statistic have under the null hypothesis of homoskedasticity? Would we reject in this case?

SUMMARY OUTPUT
Regression Statistics
Multiple R          0.759498
R Square            0.576838
Adjusted R Square   0.52982
Standard Error      3.823393
Observations        11

ANOVA
             df   SS         MS         F          Significance F
Regression   1    179.3441   179.3441   12.26844   0.0066957
Residual     9    131.565    14.61833
Total        10   310.9091

             Coefficients   Standard Error   t Stat      P-value    Lower 95%
Intercept    -8679.567      2482.34          -3.495822   0.006758   -14296.142
date         0.218955       0.062512         3.502633    0.006696   0.0775441

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.168072
R Square            0.028248
Adjusted R Square   -0.079724
Standard Error      11.12087
Observations        11

ANOVA
             df   SS         MS         F          Significance F
Regression   1    32.3562    32.3562    0.261626   0.621318
Residual     9    1113.0     123.6737
Total        10   1145.419

             Coefficients   Standard Error   t Stat      P-value    Lower 95%   Upper 95%
Intercept    3705.804       7221.684         0.51335     0.620205   -12630.78   20042.39
date         -0.093002      0.131324         -0.511493   0.621318   -0.504315   0.318312

Here $nR^2 = (11)(0.028248) = 0.310728$, which is distributed as $\chi^2(1)$ under the null hypothesis. The standard 95% critical value is 3.84, so we fail to reject homoskedasticity in this case.

ECON 140, Fall 2008
Alex Rothenberg
Midterm #2 Solutions

Problem 1

Substantial research documents a statistically significant relationship between an individual's height, $H_i$, and the log of his or her wages, $\log W_i$. This bivariate relationship could be estimated using OLS on a random sample of $N$ individuals:
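$$\log W_i = \alpha + \beta H_i + \varepsilon_i \qquad (1)$$

1. Suppose the variable $H_i$ is NOT objectively measured but is instead self-reported. That is, we observe a noisy measure of height, $\tilde{H}_i$:

$$\tilde{H}_i = H_i + v_i$$

where $v_i$ denotes some random measurement error. If shorter people are more likely than taller people to overstate their height (but no one understates their height), will $\hat{\beta}$ be an unbiased estimate of $\beta$? Explain. (5 points)

ANSWER: $\hat{\beta}$ will clearly not be an unbiased estimator of $\beta$. To see this, we start by rewriting the true model as follows:

$$\begin{aligned}
\log W_i &= \alpha + \beta H_i + \varepsilon_i \\
&= \alpha + \beta(\tilde{H}_i - v_i) + \varepsilon_i \\
&= \alpha + \beta\tilde{H}_i + (\varepsilon_i - \beta v_i) \\
&= \alpha + \beta\tilde{H}_i + \tilde{\varepsilon}_i
\end{aligned}$$

From the usual formula for the expected value of our least squares estimator, we have:

$$E[\hat{\beta}] \approx \beta + \frac{\mathrm{Cov}(\tilde{H}, \tilde{\varepsilon})}{\mathrm{Var}(\tilde{H})}$$

Now, note that we have:

$$\begin{aligned}
\mathrm{Cov}(\tilde{H}, \tilde{\varepsilon}) &= \mathrm{Cov}(\tilde{H}, \varepsilon - \beta v) \\
&= \mathrm{Cov}(\tilde{H}, \varepsilon) - \beta\,\mathrm{Cov}(\tilde{H}, v) \\
&= \underbrace{\mathrm{Cov}(H, \varepsilon)}_{(1)} + \underbrace{\mathrm{Cov}(v, \varepsilon)}_{(2)} - \underbrace{\beta\,\mathrm{Cov}(H, v)}_{(3)} - \underbrace{\beta\,\mathrm{Cov}(v, v)}_{(4)}
\end{aligned}$$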
So, we've written the covariance between our mismeasured height and the error term in the regression we estimate as the sum of four covariance terms. By assumption, (1) and (2) are zero. The other covariance terms, however, are nonzero. Consider (3):

$$\mathrm{Cov}(H, v) \neq 0$$

This is the covariance between the true height and the measurement error. Because we assume that taller people are less likely to overstate their height than shorter people, this covariance is not equal to zero. Furthermore, (4) is equal to:

$$\mathrm{Cov}(v, v) = \mathrm{Var}(v) \equiv \sigma_v^2 \neq 0$$

So, going back to our expression for the expected value of $\hat{\beta}$, we have:

$$E[\hat{\beta}] = \beta\left(1 - \frac{\mathrm{Cov}(H, v) + \sigma_v^2}{\mathrm{Var}(\tilde{H})}\right)$$
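The bias expression above can be checked numerically. The sketch below is an illustration only and is not part of the original solution: it assumes a hypothetical misreporting rule $v_i = \gamma(200 - H_i)$ (so shorter people overstate more and no one understates), true heights drawn uniformly between 150 and 200, and made-up values for $\alpha$, $\beta$ and the error spread. It compares the OLS slope from regressing log wages on reported height with the value implied by the formula.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def simulate(gamma, beta=0.01, alpha=2.0, n=20000, seed=0):
    """Return (OLS slope on reported height, slope implied by the bias formula).

    All parameter values and the misreporting rule are hypothetical.
    """
    rng = random.Random(seed)
    h_true = [rng.uniform(150.0, 200.0) for _ in range(n)]
    # Assumed rule: overstatement shrinks linearly with true height, never negative.
    v = [gamma * (200.0 - h) for h in h_true]
    h_obs = [h + err for h, err in zip(h_true, v)]
    log_w = [alpha + beta * h + rng.gauss(0.0, 0.1) for h in h_true]

    var_obs = cov(h_obs, h_obs)
    slope = cov(h_obs, log_w) / var_obs
    implied = beta * (1.0 - (cov(h_true, v) + cov(v, v)) / var_obs)
    return slope, implied


for gamma in (0.2, 1.5):
    slope, implied = simulate(gamma)
    print(f"gamma={gamma}: OLS slope={slope:.5f}, formula={implied:.5f}, true beta=0.01")
```

Under this assumed rule, a small `gamma` yields a slope above the true $\beta$, while a `gamma` above 1 flips the sign of the slope, so the direction of the bias depends on how strongly misreporting varies with height.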
Since this expected value is clearly not equal to $\beta$ (unless $\mathrm{Cov}(H, v) = -\sigma_v^2$), our OLS estimator will be biased.

2. If $\hat{\beta}$ is biased, will OLS overestimate or underestimate the relationship between $H_i$ and $\log W_i$? Explain. (5 points)

ANSWER: Looking again at the bias formula we worked out in part 1:

$$E[\hat{\beta}] = \beta\left(1 - \frac{\mathrm{Cov}(H, v) + \sigma_v^2}{\mathrm{Var}(\tilde{H})}\right)$$

The sign of the bias will clearly depend on whether the term in parentheses is greater or less than 1. Because taller people are less likely to report their heights with error than shorter people, we will have $\mathrm{Cov}(H, v) < 0$. Whether this negative covariance is large enough to counter the positive $\sigma_v^2$ and make the whole expression greater than 1 is unclear from the given information. Consider the following two cases, illustrated by figures 1 and 2. The points represented by asterisks, *, represent the true observations, while the points represented by circles are the measured data.

In figure 1, shorter people only lie a little bit about their height, and $\hat{\beta}$ overestimates $\beta$. In figure 2, shorter people lie A LOT about their height, and $\hat{\beta}$ (which is negative) underestimates $\beta$. I think that the first case is more likely, but we were generous when grading this problem, and any well-defended answer (one that made assumptions and argued conclusively from those assumptions) received full credit.

3. Now, let's assume there is no measurement error for $H_i$. Can you think of any omitted variables that are likely to cause us to overestimate or underestimate the effect of height on wages? Explain.

ANSWER: We've clearly omitted LOTS of variables from this regression equation that might help to explain variation in log wages. However, we only get biased estimates of $\beta$ if the variables we omit are correlated with both the dependent variable, wages, and the variable whose effect we're interested in estimating, height. Some good examples of omitted variables are:

- weight: Obese people might get lower wages than skinnier people, and height and weight are positively correlated.
- industry: Basketball players in the NBA typically earn more wages than non-players. Taller people play basketball.
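The weight example can be illustrated with a small simulation. This is a sketch under made-up assumptions (the coefficients, the height-weight relationship and the noise levels are all hypothetical, not estimates from any data): log wages rise with height and fall with weight, weight rises with height, and weight is left out of the regression. The short-regression slope on height should then land near $\beta_H + \beta_W \delta$, where $\delta$ is the slope from regressing weight on height.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def short_regression_slope(beta_h=0.01, beta_w=-0.005, n=20000, seed=1):
    """Return (slope on height with weight omitted, beta_h + beta_w * delta).

    beta_h, beta_w and the data-generating numbers below are hypothetical.
    """
    rng = random.Random(seed)
    height = [rng.gauss(170.0, 10.0) for _ in range(n)]
    # Assumed positive relationship between height and weight.
    weight = [-15.0 + 0.5 * h + rng.gauss(0.0, 8.0) for h in height]
    log_w = [2.0 + beta_h * h + beta_w * w + rng.gauss(0.0, 0.1)
             for h, w in zip(height, weight)]

    var_h = cov(height, height)
    slope = cov(height, log_w) / var_h      # regression that omits weight
    delta = cov(height, weight) / var_h     # slope of weight on height
    return slope, beta_h + beta_w * delta


slope, implied = short_regression_slope()
print(f"slope with weight omitted: {slope:.5f}, omitted-variable formula: {implied:.5f}")
```

With a negative wage effect of weight and a positive height-weight correlation, the slope on height comes out below the assumed true value of 0.01, which is the underestimate the weight example describes.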
To get full credit, you needed to explain how your omitted variable was correlated with BOTH height and wages. You lost points if you missed explaining one (or both) of these correlations.

Problem 3

Suppose a researcher provides a study of demand for Ferrari cabriolet cars in three large cities of California: Los Angeles, San Francisco, and San Diego. She collects data from local car dealers on car prices $p$ and demand quantities $q$ (in units per year). She thinks that demand can depend on whether cars are sold in rich or poor neighborhoods, so she collects demographic data and constructs a variable RICH that is equal to 1 if the dealer's shop is in a rich neighborhood and zero otherwise. Suppose she knows that the average price of a Ferrari is equal to 100,000 dollars, and that average $q$ is equal to 100. She provides results of the following regression (standard errors are in parentheses):

$$\log(q_i) = \beta_0 + \beta_1 \log(p_i) + \beta_2 RICH_i + \varepsilon_i$$

$$\hat{\beta}_0 = 100 \;(15), \qquad \hat{\beta}_1 = -0.05 \;(0.005), \qquad \hat{\beta}_2 = 0.70 \;(0.25)$$

1. Interpret the coefficients $\beta_1$ and $\beta_2$ from the regression above.

ANSWER: To interpret $\beta_1$, note that we can write the price elasticity of demand as follows:
$$\frac{d \log(q_i)}{d \log(p_i)} = \beta_1$$

So, $\beta_1$ is the price elasticity of demand, and the estimation output tells us that $\hat{\beta}_1 = -0.05$. In words, this means that an increase in the price of Ferrari cars by 1% leads to a decrease in demand by 0.05%. To interpret $\beta_2$, note that our model implies:

$$E[\log(q_i)] = \begin{cases} \beta_0 + \beta_1 \log(p_i) & \text{if } i \text{ is in a poor neighborhood} \\ \beta_0 + \beta_1 \log(p_i) + \beta_2 & \text{if } i \text{ is in a rich neighborhood} \end{cases}$$

So, the difference in expected log quantity demanded in rich neighborhoods relative to poor neighborhoods is given by:

$$E[\log(q_i) \mid \text{Rich}] - E[\log(q_i) \mid \text{Poor}] = \beta_2$$

This implies that, holding other things constant, demand for cars in rich neighborhoods is 70% higher than in poor neighborhoods. (This reads the log difference of 0.70 directly as a percentage; the exact proportional difference is $e^{0.70} - 1 \approx 101\%$.)

2. What is the elasticity of the demand curve? What is the slope of the demand curve (i.e. $dq/dp$) at the average price?

ANSWER: The elasticity of demand is given by $\beta_1$, from the previous problem. To derive the slope of the demand curve, we note the following:

$$q_i = \exp(\log q_i) = \exp(\beta_0 + \beta_1 \log(p_i) + \beta_2 RICH_i + \varepsilon_i)$$

This implies that the slope is given by:

$$\frac{dq_i}{dp_i} = \exp(\beta_0 + \beta_1 \log(p_i) + \beta_2 RICH_i + \varepsilon_i) \cdot \frac{\beta_1}{p_i} = q_i \frac{\beta_1}{p_i}$$

So, plugging in the average price and quantity, we get the following:
$$\text{slope} = -0.05 \times \frac{100}{100000} = -0.00005$$

3. To investigate whether demand varies across cities, our researcher decides to add three variables to the regression as well as interact them with $\log p$. They are:

- LA = 1 if city is Los Angeles and 0 otherwise
- SF = 1 if city is San Francisco and 0 otherwise
- SD = 1 if city is San Diego and 0 otherwise

The new regression is:

$$\begin{aligned}
\log(q_i) = {} & \beta_0 + \beta_1 \log(p_i) + \beta_2 RICH_i \\
& + \beta_3 LA_i + \beta_4 SF_i + \beta_5 SD_i \\
& + \beta_6 (LA_i \times \log p_i) + \beta_7 (SF_i \times \log p_i) + \beta_8 (SD_i \times \log p_i) + \varepsilon_i
\end{aligned}$$

What can go wrong (if anything) when estimating the parameters of this regression? What do you recommend she do in this case?

ANSWER: We have TWO multicollinearity problems in this regression. Note the following:
$$LA_i + SF_i + SD_i = 1 \quad \text{for all } i$$

So, we need to either drop the constant, $\beta_0$, or one of the dummy variables. We also have:

$$(LA_i \times \log p_i) + (SF_i \times \log p_i) + (SD_i \times \log p_i) = \log p_i$$

So, we need to either drop one of the interaction terms, or the $\log p_i$ term, from the regression equation.

4. Provide a test that there is no difference in demand among these three cities. For full credit, you should provide an appropriate test statistic and specify what regressions need to be run and which regression results you would use to test this hypothesis (including a critical value to look up). That is, directions your assistant could carry out to come to the correct conclusion.

ANSWER: There are several possible correct answers to this question, depending on how you set up the problem. Here is one example of a correct answer. The unrestricted regression is:

$$\begin{aligned}
\log(q_i) = {} & \beta_2 RICH_i + \beta_3 LA_i + \beta_4 SF_i + \beta_5 SD_i \\
& + \beta_6 (LA_i \times \log p_i) + \beta_7 (SF_i \times \log p_i) + \beta_8 (SD_i \times \log p_i) + \varepsilon_i
\end{aligned}$$

This is the model from part 3 without the collinearity problems. The null hypothesis we wish to test is the following:

$$H_0: \beta_6 = \beta_7 = \beta_8 \quad \text{and} \quad \beta_3 = \beta_4 = \beta_5$$

Imposing the restrictions in the null hypothesis ($q = 4$), we get the restricted model:

$$\log(q_i) = \beta_0 + \beta_1 \log(p_i) + \beta_2 RICH_i + \varepsilon_i$$

This is just the model from part 1. We compute the F-statistic using the regression results from the restricted and unrestricted model fits:

$$F = \frac{(ESS_R - ESS_{UR})/4}{ESS_{UR}/(N - 7)} \sim F_{4,\,N-7} \quad \text{under } H_0$$

...
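The F-statistic above is simple arithmetic once the two regressions have been run. The sketch below is a minimal helper, assuming that ESS denotes the sum of squared residuals from each fit (as the formula requires); the function name, the input numbers and the sample size are hypothetical and chosen only to show the mechanics.

```python
def f_statistic(ess_restricted, ess_unrestricted, num_restrictions,
                n_obs, n_params_unrestricted):
    """F statistic comparing restricted and unrestricted OLS fits.

    ess_* are sums of squared residuals; the denominator degrees of
    freedom are n_obs - n_params_unrestricted.
    """
    if ess_unrestricted <= 0 or n_obs <= n_params_unrestricted:
        raise ValueError("need positive ESS_UR and positive residual degrees of freedom")
    numerator = (ess_restricted - ess_unrestricted) / num_restrictions
    denominator = ess_unrestricted / (n_obs - n_params_unrestricted)
    return numerator / denominator


# Hypothetical inputs: these sums of squares and the sample size are invented.
f_stat = f_statistic(ess_restricted=120.0, ess_unrestricted=100.0,
                     num_restrictions=4, n_obs=107, n_params_unrestricted=7)

# Approximate 5% critical value for F(4, 100) from a standard table; look up
# the value that matches the actual degrees of freedom before drawing conclusions.
critical_value = 2.46
print(f_stat, "reject H0" if f_stat > critical_value else "fail to reject H0")
```

With 4 restrictions and 7 parameters in the unrestricted model, the degrees of freedom match the $F_{4,\,N-7}$ distribution in the text; a statistic above the looked-up critical value would lead to rejecting the hypothesis of no difference across cities.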
This note was uploaded on 02/17/2010 for the course ECON 141 taught by Professor Staff during the Fall '08 term at Berkeley.