section9sol

section9sol - Stat 104, Section 9 Handout Benny, Thursday,...


Stat 104, Section 9 Handout
Benny, Thursday, 4pm, SC309

Key Concepts

Inference for Simple Regression
  o R^2
  o s (std. dev. of residuals)
  o t-test for a regression slope (and related conf. interval)
  o Confidence and prediction intervals at a particular x*
  o t-test for correlation (equivalent to t-test for slope)

Multiple Regression
  o Omnibus F-test, t-test for a specific β_j
  o Interpretation of coefficients
  o Adjusted R^2

Practice Problems

1. For this problem, we will do a regression analysis of whether a nation's per capita GDP (in American dollars) is associated with that country's literacy rate. Specifically, we will look at log(GDP) vs. literacy rate (in percentage points) for 172 different countries (found on www.wikipedia.com). Why log(GDP)? Because it is more normally distributed and makes the fit better.

[Figure: scatterplot of logGDP (vertical axis, 2 to 5) against Literacy (horizontal axis, 20 to 100), with the fitted values overlaid.]

a) Does the above show a good fit? What else should we look at?

The regression line follows the general trend of the data, but several data points have large residuals, suggesting that not all of the variance in the data is explained by the model. We should also look at R^2 to judge whether this is a good fit.

b) What is the correlation between literacy rate and logGDP? Is this relationship significant?

    . regress loggdp literacy

          Source |       SS       df       MS              Number of obs =     171
    -------------+------------------------------           F(  1,   169) =  133.88
           Model |  37.9164244     1  37.9164244           Prob > F      =  0.0000
        Residual |  47.8639022   169  .283218356           R-squared     =  0.4420
    -------------+------------------------------           Adj R-squared =  0.4387
           Total |  85.7803266   170  .504590157           Root MSE      =  .53218

    ------------------------------------------------------------------------------
          loggdp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
        literacy |    .023024   .0019899    11.57   0.000     .0190957    .0269522
           _cons |    1.58357   .1673684     9.46   0.000     1.253168    1.913972
    ------------------------------------------------------------------------------

The correlation is the square root of R^2, or .6648 (positive, because the slope is positive). To test the significance of this value, I could look up the critical value of r that corresponds to my df and significance level (~.15). Alternatively, we can look at the F statistic, which tells us how our model compares to a model with no predictor variable. Here the p-value of the F statistic is reported as 0.0000, which means we reject the null hypothesis and conclude that the relationship is significant. Lastly, we can do our standard t-test of the coefficient of the literacy variable and note that t = 11.57 is highly significant.

c) Using the regression above, what is the predicted logGDP for a country with a 90% literacy rate? How about the predicted GDP?...
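
The figures quoted in part (b), and the point prediction asked for in part (c), can be rechecked by hand from the printed regression output. The following is a minimal Python sketch of that arithmetic, not part of the original handout. All variable names are illustrative. The critical t value of 1.974 is an approximate table value for 169 df at the 5% level, and the base-10 back-transformation is an assumption, since the handout does not state which logarithm was used.

```python
import math

# Values copied from the Stata output in part (b).
ss_model = 37.9164244     # Model sum of squares
ss_resid = 47.8639022     # Residual sum of squares
df_resid = 169            # Residual degrees of freedom (n - 2)
slope = 0.023024          # Coefficient on literacy
slope_se = 0.0019899      # Standard error of that coefficient
intercept = 1.58357       # _cons

# R^2 is the share of total variation explained by the model.
r_squared = ss_model / (ss_model + ss_resid)

# In simple regression |r| = sqrt(R^2); r takes the sign of the slope.
r = math.copysign(math.sqrt(r_squared), slope)

# s (Root MSE) is the standard deviation of the residuals.
root_mse = math.sqrt(ss_resid / df_resid)

# t-test for the slope, and the equivalent t-test for the correlation.
t_slope = slope / slope_se
t_corr = r * math.sqrt(df_resid / (1 - r_squared))

# Omnibus F statistic; with one predictor it equals t^2 up to rounding.
f_stat = (ss_model / 1) / (ss_resid / df_resid)

# Critical |r| at the 5% level. ASSUMPTION: t_crit is an approximate
# table value for 169 df, not something printed in the output.
t_crit = 1.974
r_crit = t_crit / math.sqrt(t_crit ** 2 + df_resid)

# Part (c): point prediction at literacy = 90.
pred_log_gdp = intercept + slope * 90

# ASSUMPTION: logGDP is a base-10 log. If it is a natural log,
# use math.exp(pred_log_gdp) instead.
pred_gdp_if_base10 = 10 ** pred_log_gdp

print(f"R^2 = {r_squared:.4f}, r = {r:.4f}, s = {root_mse:.5f}")
print(f"t(slope) = {t_slope:.2f}, t(corr) = {t_corr:.2f}, F = {f_stat:.2f}")
print(f"critical |r| ~ {r_crit:.2f}")
print(f"predicted logGDP at 90% literacy = {pred_log_gdp:.5f}")
print(f"predicted GDP if log is base 10 = {pred_gdp_if_base10:.0f}")
```

The two t statistics agree only up to the rounding in the printed coefficients, which is why the sketch compares them rather than treating them as identical.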

This note was uploaded on 03/27/2012 for the course STATS 104 taught by Professor Michael Parzen during the Fall '11 term at Harvard.
