STOR 155, Section 2
Tuesday, February 2, 2010
Section 2.4

Section 2.4: Cautions about correlation and regression
• Residuals
• Outliers and influential observations
• Lurking variables
• Correlations based on averaged data
• Restricted-range problem (omitted)

Residuals
In a regression, there are three numbers connected with every individual:
• x, the value of the explanatory variable,
• y, the observed value of the response variable, and
• ŷ, the predicted value of the response variable.
The observed and predicted values y and ŷ are almost never equal. The difference y − ŷ is called the residual for that individual (or for that x).

residual for a value x in the dataset = y − ŷ

Residuals: Example
Parents' heights as predictor of daughter's height for n = 10 women. The regression line is ŷ = 0.69 + 0.96x.

x = parents'      y = daughter's   ŷ = predicted          residual
average height    height           daughter's height      y − ŷ
63.5              60               61.90                  −1.90
67.0              66               65.27                  +0.73
65.5              65               63.83                  +1.17
69.5              66               67.68                  −1.68
67.5              67               65.75                  +1.25
65.5              63               63.82                  −0.82
70.0              69               68.16                  +0.84
63.0              63               61.42                  +1.58
63.0              61               61.42                  −0.42
67.5              65               65.75                  −0.75

For x = 63.5: ŷ = 0.69 + 0.96 × 63.5 ≈ 61.90 (computed with the unrounded coefficients; the rounded values shown give 61.65).
The residual for the first x is y − ŷ = 60 − 61.90 = −1.90.

Residuals
The residuals are the vertical distances that least-squares regression makes as small as possible: the line is chosen to minimize the sum of their squares.

Residuals
The residuals appear as vertical distances on the plot of the regression line. Or, we can plot them by themselves. Why would we want to plot them by themselves?

Residual plots
r² is high and the fit looks good in the scatterplot. But the residual plot does a better job of showing that the fit is much better for values of x near the center than at the two ends...
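The residual calculation in the heights example can be checked with a short script. This is a minimal sketch, not part of the original slides: it assumes the regression line is the ordinary least-squares fit to the ten (x, y) pairs listed in the example, and the helper name `fit_line` is made up for illustration.

```python
# Data copied from the heights example: parents' average height (x)
# and daughter's height (y) for n = 10 women.
x = [63.5, 67.0, 65.5, 69.5, 67.5, 65.5, 70.0, 63.0, 63.0, 67.5]
y = [60, 66, 65, 66, 67, 63, 69, 63, 61, 65]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    slope = sxy / sxx
    # The least-squares line passes through the point of means.
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(x, y)

# Predicted value y-hat and residual y - y-hat for every individual.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - yhat for yi, yhat in zip(y, predicted)]

print(f"y-hat = {intercept:.2f} + {slope:.2f} x")
for xi, yi, yhat, r in zip(x, y, predicted, residuals):
    print(f"{xi:5.1f} {yi:3d} {yhat:6.2f} {r:+6.2f}")
```

Under that assumption the fitted coefficients round to 0.69 and 0.96, and the first residual rounds to −1.90, matching the example. The residuals of a least-squares line with an intercept also sum to zero up to floating-point error, which is one reason a residual plot is centered on the horizontal line at 0.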
This note was uploaded on 06/12/2010 for the course STOR 155 taught by Professor Andrew B. Nobel during the Spring '08 term at UNC.