ch16correlationcoefficient


Chapter 16: Correlation Coefficient

A. Construction of Measure of Reliability

Given data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, we can always find the regression line $\hat{y} = a + bx$, where

$$b = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2} \quad \text{and} \quad a = \bar{y} - b\bar{x},$$

and use it to predict y from x. Is this prediction reliable? In case A, where the experimental data lie more or less along a straight line, the answer is possibly "yes". In case B, where the experimental data are scattered around, we might think that the prediction may not be reliable.

Question: How can we tell whether the data will lead to reliable predictions or not, i.e. whether the scatter diagram is of Case A, Case B, or something else?

Answer: We shall construct a measure of the linear link between x and y. For this, consider Case A:

$$r = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sqrt{\left(\sum x_i^2 - n\bar{x}^2\right)\left(\sum y_i^2 - n\bar{y}^2\right)}}$$

is called the correlation coefficient between x and y. It is unitless. This "r" is the measure of linear link we have been seeking.

Example 1: Find r for the data of Example 3 of Ch15: (15, 0.4), (50, 3.1), (35, 1.2), (90, 4.3)

Solution:
ON  MODE COMP  MODE REG Lin
15 , 0.4 DT, ......, 90 , 4.3 DT
REG Output: SHIFT S-VAR VAR r gives 0.9611 = r

Alternatively, using the formula with

n = 4, $\bar{x} = 47.5$, $\bar{y} = 2.25$, $\sum x_i y_i = 590$, $\sum x_i^2 = 12050$, $\sum y_i^2 = 29.7$:

$$r = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sqrt{\left(\sum x_i^2 - n\bar{x}^2\right)\left(\sum y_i^2 - n\bar{y}^2\right)}} = \frac{590 - 4 \times 47.5 \times 2.25}{\sqrt{(12050 - 4 \times 47.5^2)(29.7 - 4 \times 2.25^2)}} = 0.9611$$

Is this value of r large or small? How do I interpret r?

B. Theorem: $-1 \le r \le 1$

Proof:

C. Interpretation of r:
If r =        We say that x and y are:
0.9394        Strongly and positively correlated
-0.9413       Strongly but negatively correlated
0.7086        Moderately and positively correlated
0.5354        Moderately and positively correlated
±0.1314       Weakly and positively/negatively correlated
0             Uncorrelated
±1            Strongly and positively/negatively correlated

Note 1: The correlation between x and y is the same as the correlation between y and x.

Note 2: In a study of correlation, take n to be as large as possible (usually n ≥ 20). Otherwise the value of r cannot reflect the true correlation well. For example, in the calculation of Example 1, although r = 0.9611 is quite high, one should not be too excited, because n is only 4. In the extreme case n = 2, r must be +1 or -1 (as two points must lie on a straight line), which exaggerates the correlation. Therefore, the larger the n, the better.

However, in applying the above interpretation, one must also note the following points:

Example 2: Suppose x = ice-cream consumption, i.e. the total amount of ice-cream consumed by the inhabitants of a city, and y = number of people drowned at the beach. Then it is likely that the correlation between x and y is high and positive. But that does not mean that eating ice-cream causes a person to drown. In fact, they just happen to move together because of the weather.

Note 3: Correlation means co-relation. It only indicates how x and y move together. x and y need NOT have a cause-effect relationship.

Example 3: Brothers' heights are also highly and positively correlated. But one is not the cause and the other is not the effect. In fact, both heights are the effect of inheritance from their parents.

Example 4: However, the high correlation between smoking and lung cancer led to the conclusion that smoking causes cancer. The high correlation was noticed long ago, but the cause-effect relationship was only proved in 1996. Since then, cigarette commercials have been completely banned.
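The value in Example 1 and the claims in Notes 1 and 2 can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the helper name `pearson_r` and the two-point data set `(1, 5), (3, 2)` are assumptions chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient via the sum formula used in this chapter."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    sxx = sum(x * x for x in xs) - n * x_bar ** 2
    syy = sum(y * y for y in ys) - n * y_bar ** 2
    return sxy / math.sqrt(sxx * syy)

# Data of Example 1.
xs = [15, 50, 35, 90]
ys = [0.4, 3.1, 1.2, 4.3]

r_xy = pearson_r(xs, ys)   # correlation between x and y
r_yx = pearson_r(ys, xs)   # Note 1: swapping the roles gives the same value

# Note 2: any two distinct points lie on a line, so r is forced to +1 or -1.
# These two points are hypothetical, chosen only to illustrate the remark.
r_two = pearson_r([1, 3], [5, 2])

print(round(r_xy, 4), round(r_yx, 4), round(r_two, 4))
```

Here `r_xy` rounds to 0.9611, in agreement with Example 1, and `r_two` comes out as -1 because the two illustrative points happen to slope downward.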
D. Use of LR-Mode in 50F

Example 5: Find $\hat{y} = a + bx$ and r for the following data:

Hence predict y when x = 50.

Solution:
ON  MODE COMP  MODE REG Lin
15 , 4.7 DT
54 , 3.3 SHIFT ; 2 DT
……………….
95 , (−) 0.5 DT

REG Output:

Keys                      Item                 Display
SHIFT S-SUM EXE           $n$                  8 = n
SHIFT S-SUM EXE           $\sum x_i^2$         23383 = $\sum x_i^2$
SHIFT S-SUM EXE           $\sum x_i y_i$       878.9 = $\sum x_i y_i$
SHIFT S-SUM EXE           $\sum y_i^2$         77.37 = $\sum y_i^2$
SHIFT S-VAR VAR EXE       $\bar{x}$            49.375 = $\bar{x}$
SHIFT S-VAR VAR EXE       $\bar{y}$            2.7875 = $\bar{y}$
SHIFT S-VAR VAR EXE       $a$                  5.6147 = a
SHIFT S-VAR VAR EXE       $b$                  −0.05726 = b
SHIFT S-VAR VAR EXE       $r$                  −0.91457 = r
50 SHIFT S-VAR VAR EXE    $\hat{y}$            2.7517 = $\hat{y}$

Hence ...
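The calculator outputs of Example 5 can be cross-checked against the formulas of Section A using only the summary statistics listed in the REG output, since the full data set is not reproduced here. This is a rough Python sketch under that assumption; the variable names are illustrative and not from the notes.

```python
import math

# Summary statistics as listed in the REG output of Example 5.
n = 8
sum_x2 = 23383
sum_xy = 878.9
sum_y2 = 77.37
x_bar = 49.375
y_bar = 2.7875

# Building blocks shared by the slope and the correlation coefficient.
sxy = sum_xy - n * x_bar * y_bar
sxx = sum_x2 - n * x_bar ** 2
syy = sum_y2 - n * y_bar ** 2

b = sxy / sxx                     # slope of the regression line
a = y_bar - b * x_bar             # intercept
r = sxy / math.sqrt(sxx * syy)    # correlation coefficient
y_hat_50 = a + b * 50             # predicted y at x = 50

print(round(a, 4), round(b, 5), round(r, 4), round(y_hat_50, 4))
```

Because the inputs are themselves rounded displays, the recomputed values should be expected to agree with the calculator output only to about four significant figures.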