Chapter2.45

# Chapter2.45 - 2.4 Cautions about Regression and Correlation...

Key words in Section 2.4: residuals; lurking variables and influential observations.

Plots of the residuals, which are the differences between the observed and predicted values of the response variable, are very useful for examining the fit of a regression line. Features to look for in a residual plot are unusually large residuals (outliers), nonlinear patterns, and uneven variation about the horizontal line through zero (corresponding to uneven variation about the regression line).

## Residuals

A residual is the difference between an observed value of the response variable and the value predicted by the regression line. That is,

$$\text{residual} = \text{observed } y - \text{predicted } y = y - \hat{y}$$

Example: When $x = 24$ months in Table 2.7, the observed mean height of Kalama children was 79.9 cm. The least-squares regression line predicts

$$\hat{y} = 64.93 + (0.635 \times 24) = 80.17 \text{ cm}$$

so the residual is

$$\text{residual} = y - \hat{y} = 79.9 - 80.17 = -0.27 \text{ cm}$$
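The arithmetic in the Kalama example can be checked with a few lines of Python. This is only a sketch: the intercept (64.93), slope (0.635), and observed height (79.9 cm) come from the example itself, while the function names `predict_height` and `residual` are made up here for illustration.

```python
# Coefficients and the observed value are taken from the worked example.
INTERCEPT = 64.93  # cm
SLOPE = 0.635      # cm per month


def predict_height(age_months: float) -> float:
    """Predicted mean height (cm) from the least-squares line."""
    return INTERCEPT + SLOPE * age_months


def residual(observed: float, predicted: float) -> float:
    """Residual = observed y - predicted y."""
    return observed - predicted


predicted_24 = predict_height(24)
residual_24 = residual(79.9, predicted_24)
print(f"predicted = {predicted_24:.2f} cm, residual = {residual_24:.2f} cm")
# prints: predicted = 80.17 cm, residual = -0.27 cm
```

The negative sign means the observed point lies below the regression line at 24 months.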
Figure: (a) The Kalama growth data with the least-squares line. (b) Plot of the residuals from the regression in (a) against the explanatory variable.

## Residual Plots

A residual plot is a scatterplot of the regression residuals against the explanatory variable. Residual plots help us assess the fit of a regression line.
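The points of a residual plot can be computed as follows. This is a minimal sketch in plain Python: the data values are made up for illustration (they are not the Table 2.7 data), and the helper names `least_squares_line` and `residuals` are assumptions of this example.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(xs, ys):
    """Residuals (observed y - predicted y) from the least-squares line."""
    intercept, slope = least_squares_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up illustrative data, NOT the Kalama data from Table 2.7.
ages = [18, 20, 22, 24, 26, 28]
heights = [76.0, 77.5, 78.4, 79.8, 81.3, 82.1]

res = residuals(ages, heights)

# Each (x, residual) pair is one point of the residual plot.
for x, r in zip(ages, res):
    print(f"x = {x:2d}  residual = {r:+.3f}")
```

The pairs `(x, residual)` could be passed to any plotting tool to draw the scatterplot. For a least-squares line with an intercept, the residuals always sum to zero, which is why the horizontal line through zero is the natural reference in the plot.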

Figure: Simplified patterns in plots of least-squares residuals.

## Lurking Variable

A lurking variable is a variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of relationships among those variables. Because a lurking variable may also affect the response, its effects can often be seen by plotting the residuals against it.
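A small constructed example can show how residuals reveal a lurking variable. This is a sketch under stated assumptions: the data are made up, `z` is a hypothetical lurking variable that affects `y` but is left out of the regression, and `fit_line` and `mean` are helpers written only for this illustration.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Made-up data, constructed for illustration only.
x = [0, 1, 2, 3, 4, 5, 6, 7]  # explanatory variable
z = [0, 1, 1, 0, 0, 1, 1, 0]  # hypothetical lurking variable, not used in the fit
y = [10 + 2 * xi + 4 * zi for xi, zi in zip(x, z)]  # response depends on both

# Regress y on x alone, ignoring z.
intercept, slope = fit_line(x, y)
res = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Compare the residuals across the two values of the lurking variable.
mean_res_z0 = mean([r for r, zi in zip(res, z) if zi == 0])
mean_res_z1 = mean([r for r, zi in zip(res, z) if zi == 1])
print(f"mean residual when z = 0: {mean_res_z0:+.2f}")
print(f"mean residual when z = 1: {mean_res_z1:+.2f}")
```

In this constructed data set the residuals are systematically negative when `z = 0` and positive when `z = 1`, so a plot of the residuals against `z` would separate into two clear groups even though `z` never entered the regression.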
