Chapter 19 - Simple Linear Regression

Simple Linear Regression

We were introduced to the notion that a categorical variable could depend on different levels of another variable when we discussed contingency tables. We'll extend this idea to the case of predicting a continuous response variable from different levels of another variable. We say the variable Y is the response variable, dependent on an explanatory (predictor) variable X. There are many examples of such a situation in the life sciences: height to predict weight, dose of an algaecide to predict algae growth, skin fold measurements to predict total body fat, and so on. Often, several predictors are used together to predict one variable (e.g., height, weight, age, smoking status, and gender can all be used to predict blood pressure). We focus on the special case of using one predictor variable for a response, where the relationship is linear.

Example 12.3
In a study of a free-living population of the snake Vipera bertis, researchers caught and measured nine adult females. The goal is to predict weight (Y) from length (X). The data and a scatterplot of the data are below. Notice that the data come in pairs; for example, (x_1, y_1) = (60, 136).

Snake   Length (cm)   Weight (g)
1       60            136
2       69            198
3       66            194
4       64            140
5       54             93
6       67            172
7       59            116
8       65            174
9       63            145

[Figure: Scatterplot of Weight (g) vs Length (cm), female Vipera bertis]

First, we look at a scatterplot of the data. We'd like to fit a (straight) line to the data. Why linear? Does fitting a straight line seem reasonable?
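As an illustration (not part of the original handout), here is a minimal Python sketch that enters the nine (length, weight) pairs and draws the scatterplot; the variable names and the use of matplotlib are my own choices.

```python
import matplotlib.pyplot as plt

# Nine adult female snakes: length in cm (X) and weight in g (Y)
length = [60, 69, 66, 64, 54, 67, 59, 65, 63]
weight = [136, 198, 194, 140, 93, 172, 116, 174, 145]

# Scatterplot of Weight vs Length
plt.scatter(length, weight)
plt.xlabel("Length (cm)")
plt.ylabel("Weight (g)")
plt.title("Weight vs Length, female Vipera bertis")
plt.show()
```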
Simple Linear Model (Regression Equation)

The simple linear model relating Y and X is

    Y = b_0 + b_1 X

b_0 is the intercept, the point where the line crosses the Y axis.
b_1 is the slope, the change in Y over the change in X (rise over run).

Definition: A predicted value (or fitted value) is the predicted value of y_i for a given x_i based on the regression equation, b_0 + b_1 x_i.
Notation: ŷ_i = b_0 + b_1 x_i

A residual is the departure of an observed value y_i from its fitted value ŷ_i.
Notation: resid_i = y_i - ŷ_i

Which line do we fit? We will fit the line that goes through the data in the best way possible, based on the least squares criterion.

Definition: The residual sum of squares (a.k.a. SS(resid) or SSE) is

    SS(resid) = SSE = Σ_{i=1}^{n} (y_i - ŷ_i)^2

The least squares criterion states that the optimal fit of a model to data occurs when SS(resid) is as small as possible. Note that under our model

    SS(resid) = SSE = Σ_{i=1}^{n} (y_i - ŷ_i)^2 = Σ_{i=1}^{n} (y_i - (b_0 + b_1 x_i))^2

Refer to the applet at http://standards.nctm.org/document/eexamples/chap7/7.4/
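To make the least squares criterion concrete, here is a small Python sketch (my own illustration, not from the handout) that computes SS(resid) for any candidate intercept and slope on the snake data; the function name sse is hypothetical.

```python
length = [60, 69, 66, 64, 54, 67, 59, 65, 63]
weight = [136, 198, 194, 140, 93, 172, 116, 174, 145]

def sse(b0, b1, x=length, y=weight):
    """Residual sum of squares for the candidate line y = b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Try two candidate lines; the least squares line is the one, over all
# possible (b0, b1) pairs, that makes SS(resid) as small as possible.
print(sse(0.0, 2.4))      # a poor guess gives a large SS(resid)
print(sse(-301.0, 7.19))  # close to the least squares fit, much smaller
```

Minimizing this quantity over all (b_0, b_1) pairs is exactly what the least squares criterion asks for.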
[Figure: Scatterplot of Weight (g) vs Length (cm) with the fitted regression line]

Using calculus to minimize the SSE, we find the coefficients for the regression equation.
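The remaining pages are not shown in this preview, but the calculus minimization leads to the standard least squares formulas b_1 = Σ(x_i - x̄)(y_i - ȳ) / Σ(x_i - x̄)^2 and b_0 = ȳ - b_1 x̄. The sketch below (my own addition, using numpy) applies those formulas to the snake data.

```python
import numpy as np

length = np.array([60, 69, 66, 64, 54, 67, 59, 65, 63], dtype=float)
weight = np.array([136, 198, 194, 140, 93, 172, 116, 174, 145], dtype=float)

x_bar, y_bar = length.mean(), weight.mean()

# Standard least squares formulas for simple linear regression:
#   b1 = sum((x_i - x_bar) * (y_i - y_bar)) / sum((x_i - x_bar)^2)
#   b0 = y_bar - b1 * x_bar
b1 = np.sum((length - x_bar) * (weight - y_bar)) / np.sum((length - x_bar) ** 2)
b0 = y_bar - b1 * x_bar

print(b0, b1)  # roughly b0 = -301, b1 = 7.19
```

For these data the fitted line comes out to approximately weight = -301 + 7.19 * length, so each additional centimeter of length is associated with roughly a 7.2 g increase in predicted weight.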