Stat 104, Section 9 Handout
Benny, Thursday, 4pm, SC309

Key Concepts

Inference for Simple Regression
  o R^2
  o s (std. dev. of residuals)
  o t-test for a regression slope (and related conf. interval)
  o Confidence and prediction intervals at a particular x*
  o t-test for correlation (equivalent to t-test for slope)
Multiple Regression
  o Omnibus F-test, t-test for a specific β_j
  o Interpretation of coefficients
  o Adjusted R^2

Practice Problems

1. For this problem, we will run a regression analysis to see whether a nation's per capita GDP (in American dollars) is associated with that country's literacy rate. In fact, we will look at log(GDP) vs. literacy rate (in percentage points) for 172 different countries (found on www.wikipedia.com). Why log(GDP)? Because it is more normally distributed and makes the fit better.

[Figure: scatterplot of logGDP (y-axis, 2 to 5) against Literacy (x-axis, 20 to 100), with the fitted values overlaid]

a) Does the above show a good fit? What else should we look at?

The regression line follows the general trend of the data, but several data points have large residuals, suggesting that not all of the variance in the data can be explained by the model. We should also look at R^2 to see whether this is a good fit.

b) What is the correlation between literacy rate and logGDP? Is this relationship significant?

. regress loggdp literacy

      Source |       SS       df       MS              Number of obs =     171
-------------+------------------------------           F(  1,   169) =  133.88
       Model |  37.9164244     1  37.9164244           Prob > F      =  0.0000
    Residual |  47.8639022   169  .283218356           R-squared     =  0.4420
-------------+------------------------------           Adj R-squared =  0.4387
       Total |  85.7803266   170  .504590157           Root MSE      =  .53218

------------------------------------------------------------------------------
      loggdp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    literacy |    .023024   .0019899    11.57   0.000     .0190957    .0269522
       _cons |    1.58357   .1673684     9.46   0.000     1.253168    1.913972
------------------------------------------------------------------------------

The correlation is the square root of R^2, or .6648 (positive, because the slope is positive). To test the significance of this value, I could look up the critical value of r that corresponds to my df and significance level (~.15).
Alternatively, we can look at the F statistic, which tells us how our model compares to a model with no predictor (intercept only). Here the p-value of the F statistic is reported as 0.0000, meaning it is smaller than the output can display, so we reject the null hypothesis and conclude that the relationship is significant. Lastly, we can do our standard t-test of the coefficient on the literacy variable and note that t = 11.57 is highly significant.

c) Using the regression above, what is the predicted logGDP for a country with a 90% literacy rate? How about the predicted GDP?...
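
The numbers used in parts (b) and (c) can be checked by hand from the regression output. The following is a minimal Python sketch of that arithmetic, not part of the original handout. All inputs are copied from the Stata table. Two values are assumptions: the two-sided 5% t critical value for 169 df is taken as approximately 1.974, and the log is assumed to be base 10 (the handout does not say; the plot's logGDP range of 2 to 5 suggests it). The variable names are illustrative.

```python
import math

# Values copied from the Stata output for: regress loggdp literacy
ss_model = 37.9164244
ss_resid = 47.8639022
df_resid = 169
r_squared = 0.4420
slope = 0.023024          # coefficient on literacy
slope_se = 0.0019899      # its standard error
intercept = 1.58357       # _cons

# R^2 is the share of total variation explained by the model.
r_squared_check = ss_model / (ss_model + ss_resid)

# In simple regression, r = sqrt(R^2), carrying the sign of the slope.
r = math.copysign(math.sqrt(r_squared), slope)

# t statistic for the slope, and the equivalent t statistic for r.
t_slope = slope / slope_se
t_corr = r * math.sqrt(df_resid) / math.sqrt(1 - r_squared)

# Omnibus F = MS_model / MS_residual; with one predictor, F = t^2.
f_stat = (ss_model / 1) / (ss_resid / df_resid)

# Critical |r| implied by a t critical value.
# ASSUMPTION: t_crit ~ 1.974 (two-sided, alpha = .05, 169 df).
t_crit = 1.974
r_crit = t_crit / math.sqrt(t_crit**2 + df_resid)

# Part (c): point prediction at a 90% literacy rate.
pred_log_gdp = intercept + slope * 90
# ASSUMPTION: logGDP is log base 10; use math.exp instead if it is a natural log.
pred_gdp = 10 ** pred_log_gdp

print(f"R^2 check      = {r_squared_check:.4f}")
print(f"r              = {r:.4f}")
print(f"t (slope)      = {t_slope:.2f}")
print(f"t (corr)       = {t_corr:.2f}")
print(f"F              = {f_stat:.2f}")
print(f"critical |r|   = {r_crit:.3f}")
print(f"pred. logGDP   = {pred_log_gdp:.4f}")
print(f"pred. GDP      = {pred_gdp:.0f}")
```

The two t statistics agree up to rounding, which is the sense in which the t-test for correlation is equivalent to the t-test for the slope. Back-transforming the predicted logGDP gives a point prediction on the dollar scale only under the stated log-base assumption.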
This note was uploaded on 03/27/2012 for the course STATS 104 taught by Professor Michael Parzen during the Fall '11 term at Harvard.