Week5 - Chapter 7 Correlation and Simple Linear Regression...

Chapter 7. Correlation and Simple Linear Regression

Section 7.1. Correlation

Scatterplots
  Plot of Y vs X: two quantitative variables, measured on the same individual.
  X = explanatory variable, Y = response variable

Interpreting scatterplots
  Direction (positive, negative)

  Strength (strong, moderate, or weak)
  Look for outliers.
Example: Interpret the following scatterplot.
  X = percent taking SAT
  Y = median SAT math score for each state

Correlation
  A numerical measure of the strength and direction of the linear association between two quantitative variables.
  Why? Because it is very hard to judge the strength of a linear association by eye.

Facts about the correlation coefficient (r)
  Correlation does not distinguish between explanatory and response variables.
  r is always between -1 and +1.
  Interpretation: the sign gives the direction (positive/negative); the magnitude gives the strength (strong/weak).
  r measures only the strength of the linear relationship between two quantitative variables x and y.
  Outliers can have a strong effect on r.
  Correlation is a "pure number"; in other words, it is unitless.
  r = +1 means a perfect positive linear relationship.
  r = -1 means a perfect negative linear relationship.
  r = 0 does not mean there is no relationship between the two variables. It means there is no linear relationship between them.
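Several of these facts can be checked numerically. The sketch below is only an illustration and is not part of the original notes: the helper name pearson_r and the height/weight values are made up for the example.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data (not from the notes): heights in inches, weights in pounds.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [115, 120, 135, 150, 160, 175]

r = pearson_r(heights_in, weights_lb)

# Swapping explanatory and response variables leaves r unchanged.
r_swapped = pearson_r(weights_lb, heights_in)

# Changing units (inches to centimeters) leaves r unchanged: r is unitless.
heights_cm = [h * 2.54 for h in heights_in]
r_cm = pearson_r(heights_cm, weights_lb)

# A single outlier (a tall, very light individual) pulls r down sharply.
r_outlier = pearson_r(heights_in + [74], weights_lb + [100])

print(f"r = {r:.3f}, swapped = {r_swapped:.3f}, in cm = {r_cm:.3f}")
print(f"r with one outlier = {r_outlier:.3f}")
```

The helper divides the sum of cross-products by the square root of the product of the two sums of squares, which is algebraically the same quantity as the z-score form of r.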
CAUTION: Correlation measures the strength of the linear association between two variables. Two variables can therefore have a "low" correlation and still have a strong association that is not linear.

Example: The variables X and Y are highly correlated, and we can see a clear linear pattern. The correlation between X and Z, on the other hand, is small, yet they also have a strong association.

[Figure: Scatterplot of Y vs X]
Correlations: X, Y
Pearson correlation of X and Y = 1.000

[Figure: Scatterplot of Z vs X]
Correlations: X, Z
Pearson correlation of X and Z = 0.191

Examples:
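The contrast between a strong linear association and a strong nonlinear one can be reproduced with made-up data. The X, Y, and Z values behind the plots are not given in the notes, so the sketch below assumes a straight line for Y and a parabola for Z. With this symmetric choice the correlation of X and Z comes out at essentially 0 rather than the 0.191 reported above.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Assumed data, chosen only to mimic the shapes described in the text.
x = list(range(0, 21))             # 0, 1, ..., 20
y = [1.5 * xi + 2 for xi in x]     # exact straight line
z = [(xi - 10) ** 2 for xi in x]   # parabola: Z is fully determined by X

r_xy = pearson_r(x, y)
r_xz = pearson_r(x, z)

print(f"correlation of X and Y: {r_xy:.3f}")
print(f"correlation of X and Z: {r_xz:.3f}")
```

Z depends on X exactly, yet its correlation with X is near zero, because r only detects the linear part of a relationship.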
Formula: Correlation Coefficient

Suppose you have n pairs of data points: $(x_1, y_1), \ldots, (x_n, y_n)$.

Compute the sample means $\bar{x}$ and $\bar{y}$ and the sample standard deviations $s_x$ and $s_y$, where

$$s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}, \qquad s_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(y_i-\bar{y})^2}$$

Compute the z-scores:

$$z_{x_i} = \frac{x_i-\bar{x}}{s_x}, \qquad z_{y_i} = \frac{y_i-\bar{y}}{s_y}$$

Correlation coefficient:

$$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right)$$

Note: You don't need to remember this formula. You can use your calculator (it should have 2-variable functions) or Minitab (for homework problems).

Why does the formula for r give the same sign (negative/positive) as the graph?
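The z-score form of the formula can be followed step by step in code. The sketch below is an illustration rather than the course's own procedure (the notes point to a calculator or Minitab); the function name and the five data points are made up. It also hints at the sign question: each product of z-scores is positive when a point is above both means or below both means, and negative when it is above one mean and below the other, so an upward-sloping cloud gives mostly positive products and a positive r.

```python
from statistics import mean, stdev


def correlation_from_z_scores(xs, ys):
    """Compute r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample s.d.s (n - 1 in the denominator)
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]
    products = [zx * zy for zx, zy in zip(z_x, z_y)]
    return sum(products) / (n - 1), products


# Hypothetical data, not taken from the notes.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r, products = correlation_from_z_scores(xs, ys)

print(f"r = {r:.3f}")
for x, y, p in zip(xs, ys, products):
    print(f"point ({x}, {y}): z-score product = {p:+.3f}")
```

For these points no product is negative, which matches the upward trend in the data; r works out to about 0.775.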

Section 7.2. Least Squares Regression

A method for finding the "best-fitting" line through a set of (x, y) points.
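The notes do not show the fitting formulas at this point, so the sketch below uses the standard least-squares solution as an assumption about what follows: the slope is the sum of cross-products of deviations divided by the sum of squared x-deviations, and the intercept makes the line pass through the point of means. The function name and data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line minimizing the sum of squared vertical errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # line passes through (x_bar, y_bar)
    return intercept, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared vertical distances from the points to a line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, not taken from the notes.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = least_squares_line(xs, ys)
sse_fit = sum_squared_errors(xs, ys, intercept, slope)

# Any other line, for example a slightly steeper one, has a larger squared error.
sse_other = sum_squared_errors(xs, ys, intercept, slope + 0.1)

print(f"fitted line: y = {intercept:.2f} + {slope:.2f} x")
print(f"SSE of fitted line = {sse_fit:.3f}, SSE of steeper line = {sse_other:.3f}")
```

Comparing the two error sums shows what "best-fitting" means here: among all straight lines, the least-squares line has the smallest sum of squared vertical errors.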