Sociology 63993—Exam 1—Page 1

Sociology 63993 Exam 1 Answer Key
February 18, 2011

I. True-False. (20 points) Indicate whether the following statements are true or false. If false, briefly explain why.

1. A data set contains a few extreme outliers. It is usually best to use Stata's rreg (Robust Regression) routine to deal with the problem.

False. Indeed, this may be one of the worst options. Check the coding first, consider adding new vars to the model, try running the analysis with and without the outlier, or try some other robust regression technique (e.g. qreg).

2. The independent variables in an analysis include X1, X2, and X1X2 (i.e. X1 * X2). X1 has missing data (and hence X1X2 does too). If multiple imputation is being used, you should first compute X1X2, and then impute the missing values for X1 and X1X2.

True. Passive imputation, where you impute X1 first and then compute X1X2, may seem more intuitive to some. But, as Allison and others note, it can bias correlations toward zero. [Note: I think I was more definitive about this in class than I was in the notes, so I will show a little leeway when grading if you show you understand the issues and concepts.]

3. Cronbach's Alpha is used to test for serial correlation.

False. Cronbach's Alpha assesses the reliability of a scale. The Durbin-Watson statistic can be used for serial correlation.

4. The less true variability there is in a population, the higher the reliability of measures will tend to be.

False. Reliability = True Variance / Total Variance, so the higher the true variability, the higher the reliability tends to be.

5. The most extreme outliers on Y (i.e. the cases where Y is furthest from the mean) will always have the most influence on the regression line.

False. Influence = discrepancy * leverage. A highly discrepant case can still have little or no influence on the regression line if its X values are at or near the means of X.

II. Short answer.
Discuss all three of the following problems. (15 points each, 45 points total.) In each case, the researcher has used Stata to test for a possible problem, concluded that there is a problem, and then adopted a strategy to address that problem. Explain

(a) what problem the researcher was testing for, and why she concluded that there was a problem,
(b) the rationale behind the solution she chose, i.e. how does it try to address the problem, and
(c) one alternative solution she could have tried, and why.

(NOTE: a few sentences on each point will probably suffice – you don't have to repeat everything that was in the lecture notes.)

II-1.

. sum income white male age fathered

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      income |       812    16.96983    8.464258         .5         25
       white |       812     .864532    .3424337          0          1
        male |       812    .4864532    .5001245          0          1
         age |       812    38.53695    11.92651         18         81
    fathered |       695    11.44173    3.838113          0         20

. fre fathered

fathered -- HIGHEST YEAR SCHOOL COMPLETED, FATHER...
