Correlation & Regression, III
9.07
4/6/2004

Review
• Linear regression refers to fitting a best-fit line y = a + bx to the bivariate data (x, y), where
  a = m_y - b m_x
  b = cov(x, y)/s_x^2 = ss_xy/ss_xx
• Correlation, r, is a measure of the strength and direction (positive vs. negative) of the relationship between x and y.
  r = cov(x, y)/(s_x s_y)
  (There are various other computational formulas, too.)

Outline
• Relationship between correlation and regression, along with notes on the correlation coefficient
• Effect size, and the meaning of r
• Other kinds of correlation coefficients
• Confidence intervals on the parameters of correlation and regression

Relationship between r and regression
• r = cov(x, y)/(s_x s_y)
• In regression, the slope, b = cov(x, y)/s_x^2
• So we could also write b = r·(s_y/s_x)
• This means b = r when s_x = s_y

Notes on the correlation coefficient, r
1. The correlation coefficient is the slope (b) of the regression line when both the X and Y variables have been converted to z-scores, i.e. when s_x = s_y = 1. Or more generally, when s_x = s_y.
For a given s_x and s_y, the larger the size of the correlation coefficient, the steeper the slope.

Invariance of r to linear transformations of x and y
• A linear change in scale of either x or y will not change r.
• E.g. converting height to meters and weight to kilograms will not change r.
• This is just the sort of nice behavior we’d like from a measure of the strength of the relationship.
  – If you can predict height in inches from weight in lbs, you can just as well predict height in meters from weight in kilograms.

Notes on the correlation coefficient, r
2. The correlation coefficient is invariant under linear transformations of x and/or y.
• (r is the average of z_x z_y, and z_x and z_y are invariant to linear transformations of x and/or y)

How do correlations (=r) and regression differ?
• While in regression the emphasis is on predicting one variable from the other, in correlation the emphasis is on the degree to which a linear model may describe the relationship between two variables.
• The regression equation depends upon which variable we choose as the explanatory variable, and which as the variable we wish to predict.
• The correlation equation is symmetric with respect to x and y – switch them and r stays the same.

Correlation is symmetric wrt x & y, but regression is not
a = m_y - b m_x            x ↔ y    a = m_x - b m_y
b = cov(x, y)/s_x^2                 b = cov(x, y)/s_y^2
r = cov(x, y)/(s_x s_y)             r = cov(x, y)/(s_x s_y)

To look out for, when calculating r:
• In regression, we had to watch out for outliers and extreme points, because they could have an undue influence on the results....
• ... regression)

[Figure: Correlation over a normal range]
[Figure: Correlation over a narrow range of heights]
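The relationships between r and the regression slope stated in these slides can be checked numerically. The sketch below is only an illustration: the weight and height values are made up, the helper names (`cov`, `sd`, `corr`, `regress`, `zscores`) are hypothetical, and the n - 1 denominator is an assumption (the ratios tested here come out the same with an n denominator).

```python
from math import sqrt

def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    """Sample covariance (n - 1 denominator, an assumption of this sketch)."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)

def sd(v):
    """Sample standard deviation, using the same denominator as cov."""
    return sqrt(cov(v, v))

def corr(x, y):
    """r = cov(x, y) / (s_x s_y)."""
    return cov(x, y) / (sd(x) * sd(y))

def regress(x, y):
    """Return (a, b) for the line y = a + b*x, with b = cov(x, y)/s_x^2."""
    b = cov(x, y) / cov(x, x)
    a = mean(y) - b * mean(x)
    return a, b

def zscores(v):
    m, s = mean(v), sd(v)
    return [(vi - m) / s for vi in v]

# Made-up illustrative data: predict height (y, inches) from weight (x, lbs).
weight_lb = [110, 125, 140, 150, 172, 180, 205]
height_in = [60, 62, 65, 67, 70, 72, 74]

r = corr(weight_lb, height_in)
a, b = regress(weight_lb, height_in)

# b = r * (s_y / s_x)
b_from_r = r * sd(height_in) / sd(weight_lb)

# On z-scores (s_x = s_y = 1) the regression slope equals r.
_, b_z = regress(zscores(weight_lb), zscores(height_in))

# Changing units (lbs -> kg, inches -> m) leaves r unchanged.
weight_kg = [w * 0.45359237 for w in weight_lb]
height_m = [h * 0.0254 for h in height_in]
r_metric = corr(weight_kg, height_m)

# Swapping x and y leaves r unchanged but changes the regression slope.
r_swapped = corr(height_in, weight_lb)
_, b_swapped = regress(height_in, weight_lb)

print("r:", r, "metric units:", r_metric, "swapped:", r_swapped)
print("b:", b, "r*(s_y/s_x):", b_from_r, "slope on z-scores:", b_z)
print("slope after swapping x and y:", b_swapped)
```

The first printed line should show three matching values of r. The slope b should match r·(s_y/s_x), and the swapped slope should differ from b because s_x and s_y differ in this data.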
This note was uploaded on 11/11/2011 for the course BIO 9.07 taught by Professor Ruth Rosenholtz during the Spring '04 term at MIT.