Stat 104 Section 11
Josh Zagorsky <zagorsky@fas.harvard.edu>

Concepts

1. Tests

    Stata Command    Test For
    ovtest           Ramsey RESET test for nonlinearity (if significant, add polynomial terms)
    hettest          Heteroskedasticity
    swilk res        Whether the residuals are normally distributed

2. Interaction terms

    x1*x2

The p-value of the coefficient on x1*x2 tells you whether the interaction is significant.

3. Logistic Regression

We don't use a simple linear probability model

$$P(Y = 1) = \alpha + \beta_1 x_1 + \dots + \beta_k x_k$$

because it's heteroskedastic and noisy, and its predicted probabilities can be less than 0 or greater than 1. To predict a binary y variable, we instead model

$$P(Y = 1) = \frac{e^{\alpha + \beta_1 x_1 + \dots + \beta_k x_k}}{1 + e^{\alpha + \beta_1 x_1 + \dots + \beta_k x_k}}$$

In Stata,

    logit y x1 x2 x3

HW #9 Comments:

- Adjusted $R^2$ is not a percentage.
- The sign of association in a multiple regression is not the same thing as the sign of correlation. In stats, "correlation" specifically means the pairwise correlation, and its sign may not agree with the sign of the coefficient in the multiple regression.
- Recall, the interpretation of a slope coefficient in multiple regression is that for a 1-unit increase in X, we expect a b-unit increase in Y, holding the other X-variables in the model constant.
- If Stata reports a p-value to be 0.000, write it up as p-value < 0.001; probabilities cannot be zero!

Practice Problems

1) In 1975, a paper was published in the journal Science about possible sex discrimination in the fall 1973 admissions to the graduate schools of Cal-Berkeley. In the data used here, 46% of male applicants were admitted to the school, while only 30.4% of women were. We will look at these data to try to determine if there truly was discrimination in acceptance to Cal's graduate schools. We will try to explain whether or not each applicant is accepted into the school (variable called admitted: 1 if the person was admitted, 0 if not) based on 2 predictors: sex (called female: 1 if female, 0 if male) and which school they applied to.
Let's start with the two-way table of whether someone was admitted based on sex:

    . tabulate admitted female

               |        female
      admitted |         0          1 |     Total
    -----------+----------------------+----------
             0 |     1,399      1,278 |     2,677
             1 |     1,191        557 |     1,748
    -----------+----------------------+----------
         Total |     2,590      1,835 |     4,425

a) Based on this table, is there any statistical evidence that sex is related to whether an applicant was admitted into Cal grad school or not? What test would you run?

b) What proportion of women were admitted into the school? What proportion of men? What are the odds that a woman is admitted into the school? How about for a man? What is the odds ratio of women to men?

Three logistic models are shown below (all the results will follow eventually). First, the single-predictor model using sex as the x-variable was fit:

Model A

    . logit admitted female

    Logistic regression                     Number of obs   =       4425
                                            LR chi2(1)      =     111.30
                                            Prob > chi2     =     0.0000
    Log likelihood = -2913.2788             Pseudo R2       =     0.0187
    ...
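Parts (a) and (b) of Problem 1 can be worked by hand from the four counts in the two-way table. The following is a minimal sketch in Python rather than Stata, not part of the original handout. The helper names `pearson_chi2` and `logistic` are made up for illustration, and a Pearson chi-square test of independence is assumed as one reasonable choice for part (a). The last step plugs the log-odds back into the logistic formula from Concept 3 to recover the two admission proportions.

```python
import math

# Counts copied from the two-way table (rows: admitted 0/1, columns: female 0/1).
rejected_men, rejected_women = 1399, 1278
admitted_men, admitted_women = 1191, 557


def pearson_chi2(table):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def logistic(z):
    """Inverse logit: turns log-odds z into a probability."""
    return math.exp(z) / (1 + math.exp(z))


# (a) Test of independence between sex and admission (1 degree of freedom).
chi2 = pearson_chi2([[rejected_men, rejected_women],
                     [admitted_men, admitted_women]])
# With 1 df, the chi-square upper tail is P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

# (b) Proportions, odds, and the odds ratio of women to men.
p_women = admitted_women / (admitted_women + rejected_women)
p_men = admitted_men / (admitted_men + rejected_men)
odds_women = p_women / (1 - p_women)
odds_men = p_men / (1 - p_men)
odds_ratio = odds_women / odds_men

# Log-odds form: intercept = log odds for men (female = 0),
# slope = log odds ratio of women to men.
intercept = math.log(odds_men)
slope = math.log(odds_ratio)

print(f"chi2 = {chi2:.2f}, p-value = {p_value:.3g}")
print(f"P(admit | woman) = {p_women:.4f}, P(admit | man) = {p_men:.4f}")
print(f"odds (woman) = {odds_women:.4f}, odds (man) = {odds_men:.4f}")
print(f"odds ratio (women to men) = {odds_ratio:.4f}")
print(f"logistic(intercept)         = {logistic(intercept):.4f}")
print(f"logistic(intercept + slope) = {logistic(intercept + slope):.4f}")
```

With a single 0/1 predictor, a maximum-likelihood logit fit reproduces the group proportions exactly, so the coefficient on female in Model A should equal `slope` up to rounding. The truncated output above does not show the coefficients, so that match is not checked against the handout here. In Stata, `tabulate admitted female, chi2` reports the same Pearson statistic.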
 Fall '11
 MichaelParzen
