Week5 - Chapter 7. Correlation and Simple Linear Regression...


Chapter 7. Correlation and Simple Linear Regression

Section 7.1. Correlation

Scatterplots
  Plot of Y vs X: two quantitative variables, measured on the same individual.
  X = explanatory variable, Y = response variable.

Interpreting scatterplots
  Direction (positive, negative).
  Strength (strong, moderate, or weak).
  Look for outliers.

Example: Interpret the following scatterplot.
  X = percent taking SAT
  Y = median SAT math score for each state

Correlation
  A numerical measure of the strength and direction of the linear association between two quantitative variables.
  Why? Because it is very hard to determine the strength of a linear association by eye.

Facts about the correlation coefficient (r)
  Correlation does not distinguish between explanatory and response variables.
  r is always between -1 and +1.
  Interpretation: positive/negative, strong/weak.
  r measures only the strength of the linear relationship between x and y.
  Outliers can have a strong effect on r.
  Correlation is a pure number; in other words, it is unitless.
  r = +1 means a perfect positive linear relationship.
  r = -1 means a perfect negative linear relationship.
  It measures the linear relationship between two quantitative variables.
  r = 0 doesn't mean there is no relationship between the two variables. It means there is no linear relationship between them.

CAUTION: Correlation measures the strength of the linear association between two variables. Therefore two variables can have a low correlation and still have a strong association that is not linear.

Example: The variables X and Y are highly correlated, and we can see a clear linear pattern. On the other hand, the correlation between the variables X and Z is small, but they also have a strong association.
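The caution above can be checked numerically. The following is a minimal sketch, not part of the original notes: the data are made up for illustration, and the helper name `pearson_r` is hypothetical. It computes r with the usual sample formula (average product of z-scores, dividing by n - 1) for one perfectly linear pair and one perfectly curved pair.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation: sum of z-score products divided by (n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    products = (((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Hypothetical data: x is symmetric around 0.
x = list(range(-5, 6))
y_linear = [2 * v + 3 for v in x]   # exact straight line
z_curved = [v ** 2 for v in x]      # exact parabola, fully determined by x

r_xy = pearson_r(x, y_linear)
r_xz = pearson_r(x, z_curved)

print("r(x, y_linear) =", r_xy)
print("r(x, z_curved) =", r_xz)
```

In this sketch the curved variable is completely determined by x, yet its correlation with x is essentially zero, because the positive and negative z-score products cancel. A low r therefore rules out only a linear pattern.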
[Figure: Scatterplot of Y vs X]
Correlations: X, Y
Pearson correlation of X and Y = 1.000

[Figure: Scatterplot of Z vs X]
Correlations: X, Z
Pearson correlation of X and Z = 0.191

[Figure: Examples]

Formula: Correlation Coefficient

Suppose you have n pairs of data points: $(x_1, y_1), \ldots, (x_n, y_n)$.

Compute the sample means $\bar{x}$ and $\bar{y}$ and the sample standard deviations $s_x$ and $s_y$, where

\[
s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2},
\qquad
s_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(y_i-\bar{y})^2}
\]

Compute z-scores:

\[
z_{x_i} = \frac{x_i-\bar{x}}{s_x},
\qquad
z_{y_i} = \frac{y_i-\bar{y}}{s_y}
\]

Correlation coefficient:

\[
r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right)
\]

Note: You don't need to remember this formula. You can use your calculator (it should have 2-variable functions) or Minitab (for homework problems).

Why does the formula of r give the same sign (negative/positive) as the graph?...
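As a hand check of the formula, here is a small worked example. The three data points are hypothetical, chosen only so that the arithmetic stays simple; they do not come from the notes.

\[
(x_i, y_i):\ (1,1),\ (2,3),\ (3,2), \qquad n = 3, \qquad \bar{x} = 2, \quad \bar{y} = 2
\]

\[
s_x = \sqrt{\frac{(-1)^2 + 0^2 + 1^2}{3-1}} = 1,
\qquad
s_y = \sqrt{\frac{(-1)^2 + 1^2 + 0^2}{3-1}} = 1
\]

\[
z_{x_i} = -1,\ 0,\ 1, \qquad z_{y_i} = -1,\ 1,\ 0
\]

\[
r = \frac{1}{3-1}\bigl[(-1)(-1) + (0)(1) + (1)(0)\bigr] = \frac{1}{2}
\]

One way to read the sign: a product $z_{x_i} z_{y_i}$ is positive when a point lies above both means or below both means, and negative when it lies above one mean and below the other. Here the only nonzero product comes from the point (1,1), which lies below both means, so r comes out positive.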