
# ISYE2028 Spring 2009, Lecture 14: The Linear Model, OLS Regression (cont.)


Dr. Kobi Abayomi, April 13, 2009

## 1 Covariance and the Correlation Coefficient

Recall the form of the sample variance (of a variable $x$):

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

I can write it in this equivalent form, for illustration,

$$s_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})}{n-1}$$

to emphasize that we are taking a measurement of the distribution about the sample mean of the variable $x$. For a variable $y$ we would have:

$$s_y^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(y_i - \bar{y})}{n-1}$$

To extend this to two variables, $x$ and $y$ say:

$$s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}$$

Here we have a measurement of the distribution about the sample means of two variables at a time, i.e. of the bivariate distribution. I'll rewrite the equation with its more common name:
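The definitional formulas above can be computed directly. A minimal sketch (the data values here are made up for illustration):

```python
def sample_var(x):
    # Sample variance: sum of squared deviations from the mean, over n - 1.
    n = len(x)
    xbar = sum(x) / n
    return sum((xi - xbar) ** 2 for xi in x) / (n - 1)

def sample_cov(x, y):
    # Sample covariance: cross-products of deviations from each mean, over n - 1.
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)

# Illustrative data (not from the lecture)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

print(sample_var(x))      # s_x^2
print(sample_cov(x, y))   # s_xy
```

Note that `sample_cov(x, x)` reduces exactly to `sample_var(x)`, which is the point of writing the variance in the two-factor form above.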

$$\mathrm{cov}(x,y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}$$

the sample covariance between $x$ and $y$.

It is worth highlighting the relationship between the sample covariance and the estimate of the slope parameter in the regression model:

$$\hat{\beta}_1 = \frac{\mathrm{cov}(x,y)}{s_x^2}$$

As well, we can calculate $\mathrm{cov}(x,y)$ more easily (just as we did with the sample variance):

$$\mathrm{cov}(x,y) = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i y_i - \frac{\sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n}\right]$$

The covariance describes the way two random variables jointly vary. If the two variables are independent, then they are uncorrelated, but not necessarily conversely. The covariance is a generalization of the variance. The general rule for the covariance of linear combinations is:

$$\mathrm{cov}(a + bx, c + dy) = bd\,\mathrm{cov}(x,y)$$

Often a scale-free version of the covariance is used in its place. This is

$$\rho = \mathrm{corr}(x,y) = \frac{\mathrm{cov}(x,y)}{\sqrt{\sigma_x^2 \sigma_y^2}},$$

the correlation between two random variables $x$ and $y$, and its sample estimate is

$$r = \hat{\rho} = \widehat{\mathrm{corr}}(x,y) = \frac{\mathrm{cov}(x,y)}{\sqrt{s_x^2 s_y^2}}$$

Recall the generalization of the variance of linear combinations of non-independent random variables:

$$\mathrm{var}(a + bx + cy) = b^2\,\mathrm{var}(x) + c^2\,\mathrm{var}(y) + 2bc\,\mathrm{cov}(x,y)$$

## 2 Correlation and Regression

The notions of linear correlation and regression are closely intertwined. Regression is a linear relationship based on correlation, which is the measure of the strength of the linear relationship.
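The identities above can be checked numerically: the shortcut covariance formula should match the definitional one, and the OLS slope should equal $\mathrm{cov}(x,y)/s_x^2$. A sketch using NumPy, with made-up data:

```python
import numpy as np

# Illustrative data (not from the lecture)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

# Definitional form of the sample covariance
cov_def = ((x - x.mean()) * (y - y.mean())).sum() / (n - 1)

# Computational shortcut: (1/(n-1)) [ sum(x_i y_i) - sum(x_i) sum(y_i) / n ]
cov_short = ((x * y).sum() - x.sum() * y.sum() / n) / (n - 1)

s2_x = x.var(ddof=1)                 # sample variance (n - 1 denominator)
s2_y = y.var(ddof=1)

slope = np.polyfit(x, y, 1)[0]       # OLS slope from a degree-1 fit
r = cov_def / np.sqrt(s2_x * s2_y)   # sample correlation

print(cov_def, cov_short)            # the two forms agree
print(slope, cov_def / s2_x)         # beta_1_hat = cov(x, y) / s_x^2
print(r)
```

The `ddof=1` argument matters: NumPy's default variance divides by $n$, while the sample formulas in this lecture divide by $n-1$.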
What we do when we fit a regression (i.e., estimate the parameters for a regression line: the slope and the intercept) is characterize the linear relationship by drawing a line. This is called 'fitting' a regression, or fitting a regression line. A regression line is the 'best fit' line in the sense that it is the line which minimizes the difference between the observed values of the dependent variable and the predictions given the independent variable.

We should never stop at just fitting the line; very few relationships are perfect lines. We should always examine the 'fit' of the line. We do this by considering: is the relationship actually linear?
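Examining the fit, rather than stopping at the fitted coefficients, usually starts with the residuals. A minimal sketch, again with made-up data:

```python
import numpy as np

# Illustrative data (not from the lecture)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 2.1, 2.9, 4.2, 4.8, 6.1])

b1, b0 = np.polyfit(x, y, 1)   # OLS slope and intercept
fitted = b0 + b1 * x
residuals = y - fitted

# With an intercept in the model, the residuals sum to (numerically) zero;
# for an adequate linear fit they should also show no pattern against x.
print(residuals.sum())
print(residuals)
```

Plotting `residuals` against `x` (e.g. with matplotlib) is the usual visual check for whether the relationship is actually linear.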

