section11_handout_stat104_josh

Stat 104 Section 11
Josh Zagorsky <zagorsky@fas.harvard.edu>

Concepts

1. Tests

    Stata Command    Test For
    ovtest           Ramsey RESET test for nonlinearity (if significant, add polynomial terms)
    hettest          Heteroskedasticity
    swilk res        Whether the residuals are normally distributed

2. Interaction terms

x1*x2: The p-value of the coefficient on x1*x2 tells you whether the interaction is significant.

3. Logistic Regression

We don't use a simple linear probability model

$$P(Y = 1) = \alpha + \beta_1 x_1 + \dots + \beta_k x_k$$

because it's heteroskedastic and noisy, and its predicted probabilities can be less than 0 or greater than 1. To predict a binary y variable, we instead model

$$P(Y = 1) = \frac{e^{\alpha + \beta_1 x_1 + \dots + \beta_k x_k}}{1 + e^{\alpha + \beta_1 x_1 + \dots + \beta_k x_k}}$$

In Stata:

    logit y x1 x2 x3

HW #9 Comments:
- Adjusted R^2 is not a percentage.
- The sign of a coefficient in a multiple regression is not the same thing as the sign of the correlation. In statistics, "correlation" means specifically the pairwise correlation, which may not agree with the sign in the multiple regression.
- Recall, the interpretation of a slope coefficient in multiple regression is that for a 1-unit increase in X, we expect a b-unit increase in Y, holding the other X-variables in the model constant.
- If Stata reports a p-value to be 0.000, write it up as p-value < 0.001; probabilities cannot be zero!

Practice Problems

1) In 1973, a paper was published in the journal Science about possible sex discrimination in admissions to the graduate schools of Cal-Berkeley. The authors reported that 46% of male applicants were admitted to the school, while only 30.4% of women were. We will look at these data to try to determine if there truly was discrimination in acceptance to Cal's graduate schools. We will try to explain whether or not each applicant is accepted into the school (variable called admitted: 1 if the person was admitted, 0 if not) based on 2 predictors: sex (called female: 1 if female, 0 if male) and which school they applied to.
Let's start with the two-way table of whether someone was admitted based on sex:

    . tabulate admitted female

               |        female
      admitted |         0          1 |     Total
    -----------+----------------------+----------
             0 |     1,399      1,278 |     2,677
             1 |     1,191        557 |     1,748
    -----------+----------------------+----------
         Total |     2,590      1,835 |     4,425

a) Based on this table, is there any statistical evidence that sex is related to whether an applicant was admitted into Cal grad school or not? What test would you run?

b) What proportion of women were admitted into the school? What proportion of men? What are the odds that a woman is admitted into the school? How about for a man? What is the odds ratio of women to men?

Three logistic models are shown below (all the results will follow eventually). First the single predictor model using sex as an x-variable was fit:

Model A

    . logit admitted female

    Logistic regression                               Number of obs   =       4425
                                                      LR chi2(1)      =     111.30
                                                      Prob > chi2     =     0.0000
    Log likelihood = -2913.2788                       Pseudo R2       =     0.0187
    ------------------------------------------------------------------------------...
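As a rough cross-check on parts (a) and (b) and on the Model A header, the relevant quantities can be computed directly from the four cell counts in the two-way table. The following is a minimal Python sketch (Python rather than Stata, used here only for illustration; the variable and function names are hypothetical and do not come from the handout). It assumes two standard facts: with a single 0/1 predictor, the fitted logit intercept equals the log-odds for the baseline group (men) and the slope equals the log odds ratio; and for a chi-square statistic x with 1 degree of freedom, the upper-tail p-value is erfc(sqrt(x/2)).

```python
import math

# Cell counts copied from the `tabulate admitted female` output.
# (Dictionary layout and names are illustrative, not from the handout.)
counts = {
    "male":   {"admitted": 1191, "rejected": 1399},
    "female": {"admitted": 557,  "rejected": 1278},
}

def inv_logit(z):
    """Logistic function: converts a log-odds value z into a probability."""
    return math.exp(z) / (1 + math.exp(z))

def log_lik(groups):
    """Binomial log likelihood when each group is fit by its own proportion."""
    return sum(a * math.log(a / (a + r)) + r * math.log(r / (a + r))
               for a, r in groups)

# Part (b): proportion admitted and odds of admission, by sex.
prop = {s: c["admitted"] / (c["admitted"] + c["rejected"]) for s, c in counts.items()}
odds = {s: c["admitted"] / c["rejected"] for s, c in counts.items()}
odds_ratio = odds["female"] / odds["male"]

# With one 0/1 predictor, the logit fit reproduces the group proportions:
# intercept = log-odds for men, slope = log odds ratio (women vs. men).
intercept = math.log(odds["male"])
slope = math.log(odds_ratio)

# Log likelihoods of the one-predictor model and the intercept-only model.
total_adm = sum(c["admitted"] for c in counts.values())
total_rej = sum(c["rejected"] for c in counts.values())
ll_model = log_lik([(c["admitted"], c["rejected"]) for c in counts.values()])
ll_null = log_lik([(total_adm, total_rej)])
lr_chi2 = 2 * (ll_model - ll_null)      # likelihood-ratio chi-square, 1 df
pseudo_r2 = 1 - ll_model / ll_null      # McFadden's pseudo R-squared

# Part (a): Pearson chi-square test of independence on the 2x2 table.
n = total_adm + total_rej
pearson = 0.0
for c in counts.values():
    row_total = c["admitted"] + c["rejected"]
    for key, col_total in (("admitted", total_adm), ("rejected", total_rej)):
        expected = row_total * col_total / n
        pearson += (c[key] - expected) ** 2 / expected
p_value = math.erfc(math.sqrt(pearson / 2))  # upper tail, chi-square with 1 df

print(f"proportion admitted: men {prop['male']:.4f}, women {prop['female']:.4f}")
print(f"odds of admission:   men {odds['male']:.4f}, women {odds['female']:.4f}")
print(f"odds ratio (women to men): {odds_ratio:.4f}")
print(f"P(admit | female) via logistic formula: {inv_logit(intercept + slope):.4f}")
print(f"log likelihood {ll_model:.4f}, LR chi2 {lr_chi2:.2f}, pseudo R2 {pseudo_r2:.4f}")
print(f"Pearson chi2 {pearson:.2f}, p-value {p_value:.3g}")
```

If the arithmetic is right, the printed log likelihood, LR chi2, and pseudo R2 should agree with the Model A header (-2913.2788, 111.30, 0.0187) up to rounding, and the proportions should match the 46% and 30.4% quoted in the problem statement.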