# The histogram for the residuals from this equation

The histogram for the residuals from this equation, with the best-fitting normal distribution overlaid, is given below:
[Figure: histogram of the residuals (horizontal axis: uhat; vertical axis: Fraction), with the best-fitting normal density overlaid.]

This edition is intended for use outside of the U.S. only, with content that may be different from the U.S. Edition. This may not be resold, copied, or distributed without the prior consent of the publisher.

(iii) The residuals from the log(wage) regression appear to be more normally distributed. Certainly the histogram in part (ii) fits under its comparable normal density better than in part (i), and the histogram for the wage residuals is notably skewed to the left. In the wage regression there are some very large residuals (roughly equal to 15) that lie almost five estimated standard deviations ($\hat{\sigma} = 3.085$) from the mean of the residuals, which is identically zero, of course. Residuals far from zero do not appear to be nearly as much of a problem in the log(wage) regression.

C5.2 (i) The regression with all 4,137 observations is

$$\widehat{colgpa} = \underset{(0.072)}{1.392} - \underset{(.00055)}{.01352}\,hsperc + \underset{(.00007)}{.00148}\,sat$$

$$n = 4{,}137, \quad R^2 = .273.$$

(ii) Using only the first 2,070 observations gives
$$\widehat{colgpa} = \underset{(0.098)}{1.436} - \underset{(.00072)}{.01275}\,hsperc + \underset{(.00009)}{.00147}\,sat$$

$$n = 2{,}070, \quad R^2 = .283.$$

(iii) The ratio of the standard error on hsperc using 2,070 observations to that using 4,137 observations is about 1.31 (.00072/.00055). From (5.10) we compute $\sqrt{4{,}137 / 2{,}070} \approx 1.41$, which is somewhat above the ratio of the actual standard errors.

C5.3 We first run the regression of bwght on cigs, parity, and faminc using only the 1,191 observations with nonmissing values for motheduc and fatheduc. After obtaining the residuals, $\tilde{u}_i$, we regress them on $cigs_i$, $parity_i$, $faminc_i$, $motheduc_i$, and $fatheduc_i$, where, of course, we can only use the 1,191 observations with nonmissing values for both motheduc and fatheduc. The R-squared from this regression, $R_u^2$, is about .0024. With 1,191 observations, the chi-square statistic is $(1{,}191)(.0024) \approx 2.86$. The p-value from the $\chi_2^2$ distribution is about .239, which is very close to .242, the p-value for the comparable F test.

C5.4 (i) The measure of skewness for inc is about 1.86. When we use log(inc), the skewness measure is about .360. Therefore, there is much less skewness in the log of income, which means inc is less likely to be normally distributed. (In fact, the skewness in income distributions is a well-documented fact across many countries and time periods.)

(ii) The skewness for bwght is about −.60. When we use log(bwght), the skewness measure is about −2.95. In this case, there is much more skewness after taking the natural log.

(iii) The example in part (ii) clearly shows that this statement cannot hold generally. It is possible to introduce skewness by taking the natural log. As an empirical matter, for many economic variables, particularly dollar values, taking the log often does help to reduce or eliminate skewness. But it does not have to.
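The arithmetic behind C5.2(iii) and C5.3 can be checked with a short script. This is only a sketch using the figures reported in the solutions: the variable names are illustrative, and the standard errors used are the hsperc pair, on the assumption that these are the ones behind the reported ratio of 1.31. Because the LM test here has 2 degrees of freedom, the chi-square tail probability has the closed form $e^{-x/2}$, so no statistics library is needed.

```python
import math

# C5.2(iii): compare the actual ratio of standard errors with the
# square-root-of-n rate suggested by (5.10).
n_full, n_half = 4137, 2070
se_full, se_half = 0.00055, 0.00072   # assumed: the hsperc standard errors

actual_ratio = se_half / se_full              # ratio of the reported standard errors
predicted_ratio = math.sqrt(n_full / n_half)  # ratio implied by the sample sizes

# C5.3: LM statistic = n * R-squared from the auxiliary regression
# of the restricted residuals on all explanatory variables.
n_obs, r2_u = 1191, 0.0024
lm_stat = n_obs * r2_u

# With 2 degrees of freedom, P(chi2 > x) = exp(-x / 2) exactly.
p_value = math.exp(-lm_stat / 2)

print(f"actual SE ratio:    {actual_ratio:.2f}")
print(f"predicted SE ratio: {predicted_ratio:.2f}")
print(f"LM statistic:       {lm_stat:.2f}")
print(f"p-value (2 df):     {p_value:.4f}")
```

The computed values agree with the reported ones up to rounding; the p-value differs slightly in the third decimal depending on whether the LM statistic is rounded to 2.86 before the tail probability is taken.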