11 - STAT E-50 - Introduction to Statistics Inferences for...

STAT E-50 - Introduction to Statistics
Inferences for Regression
The fitted regression line has the equation $\hat{y} = b_0 + b_1 x$. Now we can find confidence intervals and perform hypothesis tests for the slope.
The idealized regression line is $\mu_y = \beta_0 + \beta_1 x$, and each data point (x, y) follows $y = \beta_0 + \beta_1 x + \varepsilon$, where $\varepsilon$ (“epsilon”) is the error $y - \mu_y$ for that point.
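For reference, inference for the slope usually takes the standard textbook form sketched below. The symbols $s_e$ (standard deviation of the residuals), $SE(b_1)$, $n$ (number of data points), and $t^*_{n-2}$ (a critical value from the t-distribution) do not appear in the text above and are introduced here as assumptions.

```latex
s_e = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}, \qquad
SE(b_1) = \frac{s_e}{\sqrt{\sum (x - \bar{x})^2}}

% Test of H_0: \beta_1 = 0, using n - 2 degrees of freedom
t = \frac{b_1 - 0}{SE(b_1)}

% Confidence interval for the slope
b_1 \pm t^*_{n-2} \cdot SE(b_1)
```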
Assumptions for the model and the errors:
1. Linearity Assumption
   Straight Enough Condition: does the scatterplot appear linear?
   Check the residuals to see if they appear to be randomly scattered
2. Independence Assumption: the errors must be mutually independent
   Randomization Condition: the individuals are a random sample
   Check the residuals for patterns, trends, clumping
3. Equal Variance Assumption: the variability of y should be about the same for all values of x
   Does The Plot Thicken? Condition: Is the spread about the line nearly constant in the scatterplot?
   Check the residuals for any patterns
4. Normal Population Assumption: the errors follow a Normal model at each value of x
   Nearly Normal Condition: Look at a histogram or NPP (normal probability plot) of the residuals
If all assumptions are true, the idealized regression model gives a distribution of y-values for each x-value; these distributions are Normal, with equal variance and with means that lie along the regression line.
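In symbols, this picture can be summarized as below. The symbol $\sigma$, the common standard deviation of the errors, is introduced here as an assumption, and $N(\cdot, \cdot)$ is written with the standard deviation as its second argument.

```latex
y \mid x \;\sim\; N(\beta_0 + \beta_1 x,\ \sigma)
\qquad\text{equivalently}\qquad
\varepsilon = y - \mu_y \;\sim\; N(0,\ \sigma)
```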
Where do you start?
1. Create a scatterplot to see if the data is “straight enough”.
2. Fit a regression model and find the residuals (e) and predicted values ($\hat{y}$).
3. Draw a scatterplot of the residuals vs. x or $\hat{y}$; this should have no pattern or bend, or thickening, or thinning, or outliers.
4. If the data is measured over time, plot the residuals vs. time to check for patterns that would suggest they are not independent.
5. If the scatterplot is “straight enough”, create a histogram and NPP of the residuals to check the “nearly normal” condition.
6. Continue with the inference if all conditions are reasonably satisfied.
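Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the course's own software; the function names `fit_line` and `residual_table` are hypothetical, and the five data points are made-up toy values used only to show the mechanics. The printed columns are the values one would plot (residuals against x or against the predicted values).

```python
from statistics import mean


def fit_line(x, y):
    """Return the least-squares intercept b0 and slope b1 for y-hat = b0 + b1*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residual_table(x, y):
    """Return predicted values (y-hat) and residuals (e = y - y-hat)."""
    b0, b1 = fit_line(x, y)
    predicted = [b0 + b1 * xi for xi in x]
    residuals = [yi - y_hat for yi, y_hat in zip(y, predicted)]
    return predicted, residuals


# Toy data (hypothetical, for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

predicted, residuals = residual_table(x, y)
for xi, y_hat, e in zip(x, predicted, residuals):
    print(f"x = {xi:4.1f}   y-hat = {y_hat:6.3f}   e = {e:+.3f}")
```

Least-squares residuals always sum to zero, so the plot of residuals is centered on zero by construction; what matters for the conditions is whether it shows a bend, changing spread, or outliers.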
[Figure panels: Satisfactory plot; Not a linear model]
[Figure: Variance is not consistent]
[Figure panels: Plot showing a large residual; Plot showing a potentially influential point]
Is cardiovascular fitness (as measured by time to exhaustion running on a treadmill) related to an athlete’s performance in a 20-km ski race? The data shown below, where
x = treadmill run time to exhaustion (min.)
y = 20-km ski time (min.)
was taken from the article “Physiological Characteristics and Performance of Top U.S. Biathletes” (Medicine and Science in Sports and Exercise (1995)).

x    7.7   8.4   8.7   9.0   9.6   9.6  10.0  10.2  10.4  11.0  11.7
y   71.0  71.4  65.0  68.7  64.4  69.4  63.0  64.6  66.9  62.6  61.7

What do you want to know? What is the parameter you want to estimate?
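As a rough sketch of where this example leads, the Python below fits the least-squares line to the eleven data points above and computes the usual standard error and t-statistic for the slope, presumably the parameter of interest here. It uses only the standard library and assumes the conditions have already been checked. The critical value `T_STAR = 2.262` is an assumption taken from a standard t-table (95% confidence, 9 degrees of freedom); the variable names are illustrative, not from the course materials.

```python
from math import sqrt

# Data from the table above (minutes).
x = [7.7, 8.4, 8.7, 9.0, 9.6, 9.6, 10.0, 10.2, 10.4, 11.0, 11.7]      # treadmill time
y = [71.0, 71.4, 65.0, 68.7, 64.4, 69.4, 63.0, 64.6, 66.9, 62.6, 61.7]  # 20-km ski time

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual standard deviation and standard error of the slope (n - 2 df).
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
s_e = sqrt(sum(e ** 2 for e in residuals) / (n - 2))
se_b1 = s_e / sqrt(sxx)

# t-statistic for H0: beta_1 = 0.
t_stat = b1 / se_b1

# 95% confidence interval for the slope.
T_STAR = 2.262  # assumed table value for df = 9
ci = (b1 - T_STAR * se_b1, b1 + T_STAR * se_b1)

print(f"y-hat = {b0:.3f} + ({b1:.4f}) x")
print(f"SE(b1) = {se_b1:.4f}, t = {t_stat:.2f}, df = {n - 2}")
print(f"95% CI for slope: ({ci[0]:.3f}, {ci[1]:.3f})")
```

A negative slope whose interval excludes zero would suggest that longer treadmill times go with shorter ski times, but that reading is only as trustworthy as the condition checks behind it.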