Assignment III
Due at 5:30pm on April 24 (Monday)

This assignment is related to Chapter 6 (problems 1, 2 and 3), Chapter 7 (problems 4 and 5) and Chapter 8 (problems 6, 7 and 8). You can use any formula taught in class without proof. If you want to use any result from other courses, please provide a proof or justification.

Note:
(i) You need to print out and turn in your STATA code for all empirical exercises.
(ii) All STATA outputs should be reported in the standard format as required in Assignment II, Problem 5(i).

1. (10 points) Read "Beta Coefficients" of Section 6-1a before answering this question. For a random variable $X$, define the z-score of $X$ as $(X - \bar{X})/\hat{\sigma}_X$, where $\bar{X}$ is the sample mean and $\hat{\sigma}_X$ is the sample standard deviation of $X$. In the simple linear regression
$$y = \beta_0 + \beta_1 x + u, \qquad E[u \mid x] = 0,$$
we first calculate the z-scores of $y$ and $x$, denoted as $\tilde{y}$ and $\tilde{x}$, and then run the regression of $\tilde{y}$ on 1 and $\tilde{x}$. Show that the resulting slope estimator is the sample correlation of $x$ and $y$.

2. (10 points) Problem 2 of Chapter 6 in the textbook. Note: Use the method suggested in the hint, which is different from what was taught in class.

3. (10 points) Computer Exercise C2 of Chapter 6 in the textbook.

4. (10 points) Problem 5 of Chapter 7 in the textbook.

5. (20 points) Use the data in SLEEP75.dta for this exercise. The equation of interest is
$$sleep = \beta_0 + \beta_1\, totwrk + \beta_2\, educ + \beta_3\, age + \beta_4\, age^2 + \beta_5\, yngkid + u,$$
where $sleep$ and $totwrk$ (total work) are measured in minutes per week, $educ$ and $age$ are measured in years, and $yngkid$ is a dummy variable for the presence of children less than 3 years old.
(i) Estimate this equation separately for men and women and report the results in the standard format. Are there notable differences in the two estimated equations?
(ii) Compute the Chow test for equality of the parameters in the sleep equation for men and women. Use the form of the test that adds $male$ and the interaction terms $male \cdot totwrk, \ldots, male \cdot yngkid$ and uses the full set of observations.
What are the relevant df for the test? Should you reject the null at the 5% level?
(iii) Now, allow for a different intercept for males and females and determine whether the interaction terms involving $male$ are jointly significant.
(iv) Given the results from parts (ii) and (iii), what would be your final model?

6. (5 points) In the simple linear regression, suppose
$$y_i = \beta_0 + \beta_1 x_i + u_i, \qquad E[u_i \mid x_i] = 0, \qquad Var(u_i \mid x_i) = \sigma_i^2,$$
where $\sigma_i^2$ is known. Derive the formulae of the WLS estimator of $(\beta_0, \beta_1)$.

7. (15 points) Suppose
$$
\begin{array}{rllll}
\widehat{lbwght} = & 7.96 & -\,.0023\, cigs & +\,.0121\, npvis & -\,.00024\, npvis^2 \\
 & (0.05) & (.0012) & (.0037) & (.00012) \\
 & [0.05] & [.0012] & [.0051] & [.00014] \\[4pt]
 & -\,.00098\, mage & +\,.0022\, fage & -\,.0014\, meduc & +\,.0027\, feduc \\
 & (.0015) & (.0012) & (.0030) & (.0027) \\
 & [.0016] & [.0012] & [.0028] & [.0027]
\end{array}
$$
$$n = 1624, \qquad R^2 = .0194,$$
where $lbwght$ is the log of the birth weight, $cigs$ is the number of cigarettes, $npvis$ is the number of prenatal visits, $mage$ is mother's age, $fage$ is father's age, $meduc$ is mother's education, and $feduc$ is father's education. The usual standard errors are in parentheses and the heteroskedasticity-robust standard errors are in brackets.
(i) Interpret the coefficient on $cigs$. Does the 95% confidence interval for $\beta_{cigs}$ depend on which standard error you use?
(ii) Comment on the statistical significance of $npvis^2$, using both the usual and heteroskedasticity-robust standard errors.
(iii) If the four age and education terms are dropped from the regression (and the same set of observations is used), the $R^2$ becomes $.0162$. Develop the homoskedasticity-only test of $H_0$: $\beta_{mage} = 0$, $\beta_{fage} = 0$, $\beta_{meduc} = 0$, $\beta_{feduc} = 0$.

8. (20 points) Use the data in R&D_Sales_Profits.dta for this exercise. We want to explain the research and development (R&D) expenditures incurred by 18 industries. All data are in millions of US dollars.
Consider the following regressions:
Model (1): $R\&D_i = \beta_0 + \beta_1\, Profits_i + u_i$
Model (2), variables in logs: $\ln R\&D_i = \beta_0 + \beta_1 \ln Profits_i + u_i$
(i) Estimate both regressions and present your results in the standard format.
(ii) Using a graphical method, do you detect any evidence of heteroskedasticity in either regression? What does this suggest about the log transformation?
(iii) Verify your qualitative conclusion in part (ii) by the White test.
(iv) Considering only Model (1), if there is evidence of heteroskedasticity,
(a) Name and apply a method to obtain robust standard errors, without changing efficiency. Compare with the linear model results in (i).
(b) Name and apply a method to obtain efficient estimators. Explain how you proceed and your choices. Compare the results with (a).

9. (Bonus) In this problem, we try to settle some issues that were raised by students or left unsolved in class.
(i) (10 points) In slide 22 of Chapter 6, if we run a regression of $y$ on $1$, $x_1$, $x_2$ and $(x_1 - \bar{x}_1)(x_2 - \bar{x}_2)$, then what is the relationship between the estimated intercept and the estimated coefficients on $x_1$ and $x_2$ in this regression and the original regression coefficients $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2$? Justify your answer.
(ii) (10 points) In slide 36 of Chapter 6, we calculate $\tilde{R}^2$ by first obtaining $\hat{y}_i = \hat{m}_i \hat{\alpha}_0$, where
$$\hat{m}_i = \exp\{\hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \cdots + \hat{\beta}_k x_{ik}\} \quad \text{and} \quad \hat{\alpha}_0 = n^{-1} \sum_{i=1}^{n} \exp(\hat{u}_i),$$
and then calculating $\widehat{Corr}(y, \hat{y})^2$. An alternative goodness-of-fit measure is
$$\frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}.$$
We know that these two measures are equivalent in linear regression. Show that these two measures are not equivalent in this scenario. (Hint: show that the first measure is invariant to the value of $\hat{\alpha}_0$ while the second measure is not.)
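
As a numerical sanity check for Problem 9(ii), and not a substitute for the requested argument, the sketch below compares the two goodness-of-fit measures when the fitted values are rescaled by a constant. Everything in it is an assumption made for illustration: the data, the coefficients inside the exponential, and the two scale factors standing in for different values of the scaling estimate.

```python
import math

def mean(v):
    return sum(v) / len(v)

def corr_squared(y, yhat):
    """Squared sample correlation between y and yhat."""
    my, mh = mean(y), mean(yhat)
    cov = sum((a - my) * (b - mh) for a, b in zip(y, yhat))
    var_y = sum((a - my) ** 2 for a in y)
    var_h = sum((b - mh) ** 2 for b in yhat)
    return cov ** 2 / (var_y * var_h)

def variance_ratio(y, yhat):
    """sum (yhat_i - mean(yhat))^2 divided by sum (y_i - mean(y))^2."""
    my, mh = mean(y), mean(yhat)
    return sum((b - mh) ** 2 for b in yhat) / sum((a - my) ** 2 for a in y)

# Hypothetical data and made-up coefficients (not estimated from anything).
x = [0.1, 0.5, 0.9, 1.3, 1.7, 2.1]
y = [1.2, 1.9, 2.4, 4.1, 5.0, 8.3]
m = [math.exp(0.1 + 0.9 * xi) for xi in x]  # unscaled predictions m_i

def measures(scale):
    """Both measures when the fitted values are scale * m_i."""
    yhat = [scale * mi for mi in m]
    return corr_squared(y, yhat), variance_ratio(y, yhat)

r2_a, ratio_a = measures(1.0)
r2_b, ratio_b = measures(1.2)

print("squared correlation:", r2_a, r2_b)
print("variance ratio:     ", ratio_a, ratio_b)
```

In this sketch the squared correlation is the same for both scale factors up to floating-point error, while the variance ratio changes by the square of the scale factor.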