hw02 - Soc 63993 Advanced Social Statistics II Homework No...

Soc 63993, Advanced Social Statistics II
Homework No. 2
Multicollinearity/Missing Data

I. Multicollinearity

[The following problem is adapted from Greene, Econometric Analysis, Fourth Edition.]

The data in longley.dta (available at http://www.nd.edu/~rwilliam/xsoc63993/index.html) were collected by James W. Longley ("An Appraisal of Least Squares Programs for the Electronic Computer from the Point of View of the User," Journal of the American Statistical Association, Vol. 62, No. 319 (Sep. 1967), pp. 819-841) for the purpose of assessing the accuracy of least squares computations by computer programs. (If you want to see how they did things before the advent of modern computers, the article is available on JSTOR in the statistics journals.) Economic data were collected for the US for each of the years 1947-1962. The variables are:

Variable   Description
employ     Number of people employed (in thousands). This is the dependent
           variable in the analysis.
price      Gross National Product Implicit Price Deflator. This is an
           adjustment for inflation. It equals 100 in the base year, 1954.
           Because of inflation, it is higher in years after 1954, and lower
           in years before that. A value of 110 would mean that, in that
           particular year, it cost $110 to buy the same goods that cost
           $100 in 1954.
gnp        Gross National Product (in millions of dollars).
armed      Size of armed forces (in thousands).
year       Year the data are from.

A. Diagnosis. Analyze these data with Stata. First, give the commands

. list
. summarize

just so you can get a feel for the characteristics of the data. Then give the command

. regress employ price gnp armed year

Then, do further examination to determine what evidence, if any, suggests that multicollinearity may or may not be present in these data.
Estimate and examine the bivariate correlations, tolerances/VIFs, condition numbers, the sample size, and anything else that you think would help to diagnose a problem of multicollinearity if it existed. For everything you do, be sure to explain what it means and how it applies to multicollinearity; don’t just give numbers without explanation. If you find that multicollinearity is present, offer a substantive explanation for it, i.e. why are these variables so highly correlated with each other?
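
The diagnostics listed above could be requested in Stata along the following lines. This is only a sketch, not a worked answer: it assumes longley.dta has already been downloaded into the current working directory, and it assumes the user-written collin command (not part of official Stata) has been installed. The output still has to be interpreted and explained as the assignment asks.

```stata
* Sketch only: assumes longley.dta is in the current working directory.
use longley.dta, clear

* Get a feel for the data, then fit the model given in the assignment.
list
summarize
regress employ price gnp armed year

* Variance inflation factors and tolerances (reported as 1/VIF)
* for the regression just estimated.
estat vif

* Bivariate correlations among all of the variables.
correlate employ price gnp armed year

* Condition numbers. collin is a user-written command, so this line
* assumes it has been installed beforehand (e.g. located via "findit collin").
collin price gnp armed year
```

Note that estat vif must be run immediately after regress, since it uses the most recently estimated model, whereas collin takes the list of independent variables directly.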