# lect4-2012 - Lecture 4 Stat102 Spring 2012 Chapter 3.1 3.2...

1 Lecture 4 Stat102 - Spring 2012

• Chapter 3.1 – 3.2:
  – Introduction to regression analysis
  – Linear regression as a descriptive technique
  – The least-squares equations
• Chapter 3.3
  – Sampling distribution of $b_0$, $b_1$
  – Continued in next lecture

2 Regression Analysis

Galton's classic data on the heights of parents and their children (952 pairs).

Describes the relationship between the child's height (y) and the parents' (mid)height (x).

Predict the child's height given the parents' height.

[Figure: scatterplot of child ht (vertical axis) against parent ht (horizontal axis)]

| Parent ht | Child ht |
|---|---|
| 73.60 | 72.22 |
| 72.69 | 67.72 |
| 72.85 | 70.46 |
| 71.68 | 65.13 |
| 70.62 | 61.20 |
| 70.23 | 63.10 |
| 70.74 | 64.96 |
| 70.73 | 66.43 |
| 69.47 | 63.10 |
| 68.26 | 62.00 |
| 65.88 | 61.31 |
| 64.90 | 61.36 |
| 64.80 | 61.95 |
| 64.21 | 64.96 |

And more

3 Uses of Regression Analysis

• Description: Describe the relationship between a dependent variable y (child's height) and explanatory variables x (parents' height).
• Prediction: Predict the dependent variable y based on the explanatory variables x.

4 Model for Simple Regression

• Consider a population of units on which the variables (y, x) are recorded.
• Let $\mu_{y|x}$ denote the conditional mean of y given x.
• The goal of regression analysis is to estimate $\mu_{y|x}$.
• Simple linear regression model: $\mu_{y|x} = \beta_0 + \beta_1 x$

5 Simple Linear Regression Model

• Model (more details later): $y = \beta_0 + \beta_1 x + e$
  – y = dependent variable
  – x = independent variable
  – $\beta_0$ = y-intercept
  – $\beta_1$ = slope of the line
  – e = error (normally distributed)

[Figure: the regression line plotted against x, with intercept $\beta_0$ at x = 0 and slope $\beta_1$ = Rise/Run]

$\beta_0$ and $\beta_1$ are unknown population parameters and are therefore estimated from the data.
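To make the roles of the intercept, slope, and error term concrete, here is a minimal sketch that simulates observations from a straight-line model with normally distributed errors. The function name `simulate_simple_regression`, the parameter values (intercept 26.0, slope 0.6, error standard deviation 2.0), and the x values are all assumptions chosen for illustration. None of them come from the lecture.

```python
import random


def simulate_simple_regression(xs, beta0, beta1, sigma, seed=0):
    """Draw one y = beta0 + beta1 * x + e for each x, with e ~ Normal(0, sigma)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]


# Hypothetical parameter values and x values, for illustration only.
xs = [64.0, 66.0, 68.0, 70.0, 72.0]
ys = simulate_simple_regression(xs, beta0=26.0, beta1=0.6, sigma=2.0)

for x, y in zip(xs, ys):
    mean_y = 26.0 + 0.6 * x  # point on the line: the mean of y at this x
    print(f"x = {x:.1f}  mean of y = {mean_y:.2f}  simulated y = {y:.2f}")
```

Each simulated y scatters around the point on the line for its x. In practice the intercept and slope are unknown, and only the (x, y) pairs are observed.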

6 Interpreting the Coefficients

The slope is the change in the mean of y that is associated with a one-unit change in x. For example, for each extra inch of parents' height, the average height of the child increases by 0.6 inch.

The intercept is the estimated mean of y for x = 0. However, when working from data, this interpretation should only be used when the data contain observations with x near 0. Otherwise it is an extrapolation of the model, and so can be unreliable (Section 3.7.2).

EXAMPLE: child ht = 26.46 + 0.6 parent ht

[Figure: scatterplot of child ht against parent ht with the fitted line]
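The fitted line in the example can be used directly for prediction. The sketch below plugs a parent height into child ht = 26.46 + 0.6 parent ht. The function name `predict_child_height` and the input heights are assumptions for illustration.

```python
def predict_child_height(parent_height):
    """Estimated mean child height (inches) from the fitted line in the example."""
    b0 = 26.46  # estimated intercept from the example
    b1 = 0.6    # estimated slope from the example
    return b0 + b1 * parent_height


# Two hypothetical parent heights one inch apart.
h68 = predict_child_height(68.0)
h69 = predict_child_height(69.0)
print(f"predicted child ht at parent ht 68: {h68:.2f}")
print(f"predicted child ht at parent ht 69: {h69:.2f}")
print(f"difference for one extra inch: {h69 - h68:.2f}")
```

The difference between the two predictions equals the slope, which is the one-unit-change interpretation. Evaluating the function at a parent height of 0 would return the intercept, but that is far outside the observed range and is the kind of extrapolation the slide warns against.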
7 Least Squares Regression Line

• What is a good estimate of the line?
• A good estimated line should predict y well based on x.
• We use the least squares regression line: the line that minimizes the sum of squared prediction errors in the sample.
  – Good criterion and easy to compute.
  – Has other nice mathematical properties.
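As a rough illustration of "easy to compute", the sketch below uses the standard textbook closed form for the least squares line: slope = S_xy / S_xx and intercept = ȳ − slope · x̄. The visible slides do not show these formulas, so treat them as an assumption about what the lecture goes on to derive. The data are the four points from the small example that appears later in these slides, and the function names are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) minimizing sum((y - b0 - b1 * x) ** 2) over the sample."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx          # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept
    return b0, b1


def sum_squared_errors(xs, ys, b0, b1):
    """Sum of squared prediction errors for the line y = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# The four points (1, 2), (2, 4), (3, 1.5), (4, 3.2) from the later example.
xs = [1, 2, 3, 4]
ys = [2, 4, 1.5, 3.2]

b0, b1 = least_squares_line(xs, ys)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"sum of squared errors = {sum_squared_errors(xs, ys, b0, b1):.3f}")
```

Nudging either coefficient away from the returned values can only increase the sum of squared errors, which is what makes this line the least squares line.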

8 The Least Squares (Regression) Line

Here is a scatterplot of data (n = 4) with two possible lines of fit: Y = X and Y = 2.5 (a horizontal line). Which is a better fit?

[Figure: scatterplot of the points (1, 2), (2, 4), (3, 1.5), (4, 3.2) with the lines Y = X and Y = 2.5]

Sum of squared differences for Y = X:

$$(2 - 1)^2 + (4 - 2)^2 + (1.5 - 3)^2 + (3.2 - 4)^2 = 7.89$$

Sum of squared differences for Y = 2.5:

$$(2 - 2.5)^2 + (4 - 2.5)^2 + (1.5 - 2.5)^2 + (3.2 - 2.5)^2 = 3.99$$

The smaller the sum of squared differences, the better the fit of the line to the data.
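The comparison above can be checked with a few lines of code. This sketch recomputes the sum of squared differences for the two candidate lines using the four points from the slide. The helper name `sum_squared_differences` is an assumption for illustration.

```python
# The four data points from the slide, as (x, y) pairs.
points = [(1, 2), (2, 4), (3, 1.5), (4, 3.2)]


def sum_squared_differences(points, line):
    """Sum of (observed y - line's y) squared over all points."""
    return sum((y - line(x)) ** 2 for x, y in points)


sse_identity = sum_squared_differences(points, lambda x: x)  # line Y = X
sse_flat = sum_squared_differences(points, lambda x: 2.5)    # line Y = 2.5

print(f"Y = X:   {sse_identity:.2f}")
print(f"Y = 2.5: {sse_flat:.2f}")
print("better fit:", "Y = 2.5" if sse_flat < sse_identity else "Y = X")
```

Passing the line as a function keeps the helper reusable for any other candidate line.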
9 The Estimated Coefficients To calculate the estimates of the coefficients of the line that minimizes the
