Chapter 17: More Computational Tricks for Regression and Correlation


Example 1:

The data below give the lengths (x) and breadths (y) of five cuckoos' eggs, measured in mm.

| x | 22.3 | 24.2 | 20.8 | 25.9 | 23.5 |
|---|---|---|---|---|---|
| y | 16.5 | 17.8 | 10.4 | 19.3 | 15.8 |

(a) Fit a regression line $\hat{y} = a + bx$ to these data. Find r.

Solution:

ON  MODE COMP  MODE REG  Lin

22.3 , 16.5 DT  ......  23.5 , 15.8 DT

SHIFT S-SUM n EXE gives n = 5

SHIFT S-VAR VAR a EXE gives a = −21.2344

SHIFT S-VAR VAR b EXE gives b = 1.5936

SHIFT S-VAR VAR r EXE gives r = 0.9078

$\therefore \hat{y} = a + bx = -21.2344 + 1.5936x$, r = 0.9078

(b) It was later found that the pair (20.8, 10.4) was incorrectly recorded. It should have been (21.8, 14.0). Modify the regression line (R.L.) and r found in (a).

Solution: Scroll to the third data pair, overwrite it, and recall the statistics again.

∇∇∇ ∇∇∇ ∇  21.8 EXE  ∇  14 EXE  AC

SHIFT S-VAR VAR a EXE gives a = −9.6923

SHIFT S-VAR VAR b EXE gives b = 1.1203

SHIFT S-VAR VAR r EXE gives r = 0.9076

$\therefore \hat{y} = a + bx = -9.6923 + 1.1203x$, r = 0.9076

(c) Find the R.L. of x on y (i.e. $\hat{x} = c + dy$) for the corrected data. Also find r. Is the correlation the same as in (b)?

Solution:

Method 1: Simply reverse x and y and find the regression line as usual.

| y | 16.5 | 17.8 | 14.0 | 19.3 | 15.8 |
|---|---|---|---|---|---|
| x | 22.3 | 24.2 | 21.8 | 25.9 | 23.5 |

$\therefore \hat{x} = c + dy = 11.2754 + 0.7353y$, r = 0.9076

Note the following relationship:

$$b = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}, \qquad d = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum y_i^2 - n\bar{y}^2}, \qquad r = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sqrt{\left(\sum x_i^2 - n\bar{x}^2\right)\left(\sum y_i^2 - n\bar{y}^2\right)}}$$

$$\therefore r^2 = \frac{\left(\sum x_i y_i - n\bar{x}\bar{y}\right)^2}{\left(\sum x_i^2 - n\bar{x}^2\right)\left(\sum y_i^2 - n\bar{y}^2\right)} = bd$$

Method 2: Continue the above program (i.e. r = 0.9076 is showing on the calculator screen). Since

$$bd = r^2 \Rightarrow d = \frac{r^2}{b} \qquad \text{and} \qquad \hat{x} = c + dy \Rightarrow c = \bar{x} - d\bar{y},$$

we have the following steps:

x²  ÷  SHIFT S-VAR VAR b  EXE gives d = 0.7353

(−) Ans  ×  SHIFT S-VAR VAR ȳ  +  SHIFT S-VAR VAR x̄  EXE gives c = 11.2754

Therefore $\hat{x} = c + dy = 11.2754 + 0.7353y$ and r = 0.9076.

There is NO change in the value of the correlation.
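The relationship $r^2 = bd$ and both methods of part (c) can be cross-checked away from the calculator. The following is a minimal Python sketch, not part of the original notes; the function name `fit_line` and the variable names are illustrative assumptions, and the sums are the same corrected sums that appear in the formulas for b, d and r.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares line of ys on xs; returns (intercept, slope, r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Corrected sums: sum(xy) - n*xbar*ybar, sum(x^2) - n*xbar^2, sum(y^2) - n*ybar^2.
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * mean_x * mean_y
    sxx = sum(x * x for x in xs) - n * mean_x ** 2
    syy = sum(y * y for y in ys) - n * mean_y ** 2
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r


# Corrected data from part (b).
lengths = [22.3, 24.2, 21.8, 25.9, 23.5]
breadths = [16.5, 17.8, 14.0, 19.3, 15.8]

a, b, r = fit_line(lengths, breadths)          # y on x
c, d, r_swapped = fit_line(breadths, lengths)  # x on y, Method 1

# Method 2: recover d and c from b, r and the two means only.
d_from_r = r ** 2 / b
c_from_r = sum(lengths) / len(lengths) - d_from_r * sum(breadths) / len(breadths)

print(f"y on x: a = {a:.4f}, b = {b:.4f}, r = {r:.4f}")
print(f"x on y: c = {c:.4f}, d = {d:.4f}, r = {r_swapped:.4f}")
print(f"Method 2: c = {c_from_r:.4f}, d = {d_from_r:.4f}")
```

Swapping the two lists is all Method 1 requires, which is why the same function serves both regressions.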
(d) Find $\hat{y}$ at x = 21.5 using $\hat{y} = a + bx$ of (b); find $\hat{x}$ at y = 18 using $\hat{x} = c + dy$ of (c).

Solution: For the first part, we continue the above program:

21.5  SHIFT S-VAR VAR ŷ  EXE gives ŷ = 14.3946

$\therefore \hat{y} = 14.3946$

For the second part, we need direct computation:

$\therefore \hat{x} = 11.2754 + 0.7353 \times 18 = 24.5108$

Example 2:

Results of 40 students who took a numerical test (x) and an aptitude test (y) are shown below. Find $\hat{y} = a + bx$ and r.

| Numerical test x \ Aptitude test y | 601-650 | 551-600 | 501-550 | 451-500 | 401-450 |
|---|---|---|---|---|---|
| 21-40 | | | 1 | 2 | 2 |
| 41-60 | 1 | 2 | 4 | 3 | |
| 61-80 | 2 | 5 | 8 | 2 | |
| 81-100 | 2 | 5 | 1 | | |

Solution: In terms of class marks, the data are as below:

| Numerical test x \ Aptitude test y | 625.5 | 575.5 | 525.5 | 475.5 | 425.5 |
|---|---|---|---|---|---|
| 30.5 | | | 1 | 2 | 2 |
| 50.5 | 1 | 2 | 4 | 3 | |
| 70.5 | 2 | 5 | 8 | 2 | |
| 90.5 | 2 | 5 | 1 | | |

ON  MODE COMP  MODE REG  Lin

30.5 , 525.5 DT

30.5 , 475.5 SHIFT ; 2 DT

30.5 , 425.5 SHIFT ; 2 DT

50.5 , 625.5 DT

50.5 , 575.5 SHIFT ; 2 DT

50.5 , 525.5 SHIFT ; 4 DT

50.5 , 475.5 SHIFT ; 3 DT

......

90.5 , 625.5 SHIFT ; 2 DT

90.5 , 575.5 SHIFT ; 5 DT

90.5 , 525.5 DT

SHIFT S-SUM n EXE gives n = 40

SHIFT S-VAR VAR a EXE gives a = 430.03

SHIFT S-VAR VAR b EXE gives b = 1.6933

SHIFT S-VAR VAR r EXE gives r = 0.5991

$\therefore \hat{y} = a + bx = 430.03 + 1.6933x$, r = 0.5991

Example 3:

Construct (any number of) paired data so that the correlation coefficient between x and y is

(a) 0.4

(b) −0.35

Solution: Trick: Assign frequencies 1+r, 1−r, etc. to the points (1, 1), (−1, 1), etc. as in the table below.
| x | y | f |
|---|---|---|
| 1 | 1 | 1+r |
| −1 | 1 | 1−r |
| −1 | −1 | 1+r |
| 1 | −1 | 1−r |

Equivalently, as a two-way table:

| y \ x | −1 | 1 | Total |
|---|---|---|---|
| 1 | 1−r | 1+r | 2 |
| −1 | 1+r | 1−r | 2 |
| Total | 2 | 2 | 4 |

Then $\bar{x} = 0$, $\bar{y} = 0$ and, with all sums weighted by the frequencies, we have:

$$\text{Correlation} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sqrt{\left(\sum x_i^2 - n\bar{x}^2\right)\left(\sum y_i^2 - n\bar{y}^2\right)}} = \frac{4r - 0}{\sqrt{4 - 0}\cdot\sqrt{4 - 0}} = r$$

(a) For r = 0.4, we construct the data set as below:

| x | y | f |
|---|---|---|
| 1 | 1 | 1.4 |
| −1 | 1 | 0.6 |
| −1 | −1 | 1.4 |
| 1 | −1 | 0.6 |

| y \ x | −1 | 1 |
|---|---|---|
| 1 | 0.6 | 1.4 |
| −1 | 1.4 | 0.6 |

This yields r = 0.4.

(b) For r = −0.35, we construct the data set as below:

| x | y | f |
|---|---|---|
| 1 | 1 | 0.65 |
| −1 | 1 | 1.35 |
| −1 | −1 | 0.65 |
| 1 | −1 | 1.35 |

| y \ x | −1 | 1 |
|---|---|---|
| 1 | 1.35 | 0.65 |
| −1 | 0.65 | 1.35 |

This yields r = −0.35.

Note: The above method only takes care of the r value. If we want to take care of other quantities such as the standard deviations $s_1$ and $s_2$, we need more tricks. See Example 4.

Example 4:

Two samples from the same experiment give the following results:

| Item | n | $\bar{x}$ | $\bar{y}$ | $s_1$ | $s_2$ | r |
|---|---|---|---|---|---|---|
| Sample 1 | 13 | 30 | 125 | 3 | 6 | 0.7 |
| Sample 2 | 10 | 32 | 130 | 4 | 5 | 0.8 |

The two samples are now combined. Find the R.L. of y on x and the correlation, based on these 23 pairs of data.

Solution: Let us create a frequency table for Sample 1:

Table I

| y \ x | 30−3 | 30 | 30+3 |
|---|---|---|---|
| 125+6 | $\frac{13-1}{4}(1-0.7)$ | 0 | $\frac{13-1}{4}(1+0.7)$ |
| 125 | 0 | 1 | 0 |
| 125−6 | $\frac{13-1}{4}(1+0.7)$ | 0 | $\frac{13-1}{4}(1-0.7)$ |

Table II

| y \ x | 27 | 30 | 33 |
|---|---|---|---|
| 131 | 0.9 | 0 | 5.1 |
| 125 | 0 | 1 | 0 |
| 119 | 5.1 | 0 | 0.9 |

To verify that the above method of creation works, you may use REG-Mode on Table II:

ON  MODE COMP  MODE REG  Lin

27 , 131 SHIFT ; 0.9 DT

27 , 119 SHIFT ; 5.1 DT

30 , 125 DT

33 , 131 SHIFT ; 5.1 DT

33 , 119 SHIFT ; 0.9 DT

Then press SHIFT S-SUM and SHIFT S-VAR VAR to check that Table II reproduces the Sample 1 statistics.

Solution continued: Similarly, we may create a frequency table for Sample 2:

Table III

| y \ x | 32−4 | 32 | 32+4 |
|---|---|---|---|
| 120+5 | $\frac{10-1}{4}(1-0.8)$ | 0 | $\frac{10-1}{4}(1+0.8)$ |
| 120 | 0 | 1 | 0 |
| 120−5 | $\frac{10-1}{4}(1+0.8)$ | 0 | $\frac{10-1}{4}(1-0.8)$ |

Table IV

| y \ x | 28 | 32 | 36 |
|---|---|---|---|
| 125 | 0.45 | 0 | 4.05 |
| 120 | 0 | 1 | 0 |
| 115 | 4.05 | 0 | 0.45 |

Note that Tables III and IV take $\bar{y} = 120$ for Sample 2, whereas the data table lists 130; the results that follow correspond to 120.

Then apply REG-Mode to both Tables II and IV taken together.
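The pooling step can also be sketched in Python as a cross-check on the calculator work. This is an illustrative sketch only: the helper names `five_point_table` and `weighted_fit` are assumptions, not part of the notes, and Sample 2 is entered with a y-mean of 120 because that is the value Tables III and IV are built on.

```python
from math import sqrt


def five_point_table(n, mean_x, mean_y, sx, sy, r):
    """(x, y, frequency) triples that reproduce the given sample statistics."""
    hi = (n - 1) / 4 * (1 + r)  # corners where x and y deviate in the same direction
    lo = (n - 1) / 4 * (1 - r)  # corners where they deviate in opposite directions
    return [
        (mean_x - sx, mean_y + sy, lo),
        (mean_x + sx, mean_y + sy, hi),
        (mean_x, mean_y, 1.0),
        (mean_x - sx, mean_y - sy, hi),
        (mean_x + sx, mean_y - sy, lo),
    ]


def weighted_fit(rows):
    """Frequency-weighted least squares of y on x; returns (a, b, r, n)."""
    n = sum(f for _, _, f in rows)
    mean_x = sum(f * x for x, _, f in rows) / n
    mean_y = sum(f * y for _, y, f in rows) / n
    sxx = sum(f * (x - mean_x) ** 2 for x, _, f in rows)
    syy = sum(f * (y - mean_y) ** 2 for _, y, f in rows)
    sxy = sum(f * (x - mean_x) * (y - mean_y) for x, y, f in rows)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b, sxy / sqrt(sxx * syy), n


table_ii = five_point_table(13, 30, 125, 3, 6, 0.7)
table_iv = five_point_table(10, 32, 120, 4, 5, 0.8)  # y-mean 120, as in Tables III and IV

a, b, r, n = weighted_fit(table_ii + table_iv)
print(f"n = {n:.0f}, a = {a:.4f}, b = {b:.4f}, r = {r:.4f}")
```

Calling `weighted_fit(table_ii)` on its own is the analogue of the REG-Mode verification of Table II: it should return a correlation of 0.7 and a total frequency of 13.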
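The results are:

$\therefore \hat{y} = 95.9956 + 0.8692x$ and r = 0.5098

General formula:

| y \ x | $\bar{x} - s_X$ | $\bar{x}$ | $\bar{x} + s_X$ |
|---|---|---|---|
| $\bar{y} + s_Y$ | $\left(\frac{n-1}{4}\right)(1-r)$ | 0 | $\left(\frac{n-1}{4}\right)(1+r)$ |
| $\bar{y}$ | 0 | 1 | 0 |
| $\bar{y} - s_Y$ | $\left(\frac{n-1}{4}\right)(1+r)$ | 0 | $\left(\frac{n-1}{4}\right)(1-r)$ |

Simple Exercises:

Q1. (a) Find $\hat{y} = a + bx$ and r from these data:

| x | 76.5 | 94.3 | 120.7 | 64.3 | 70.2 |
|---|---|---|---|---|---|
| y | 6.8 | 5.1 | 3.2 | 8.4 | 7.5 |
| f | 3 | 1 | 2 | 3 | 1 |

Q1. (b) Find $\hat{y} = a + bx$ and r from these data:

x classes: 0-5, 5-10, 10-15, 15-20, 20-25

y classes: 20-40, 40-60, 60-80, 80-100

Cell frequencies, in the order they appear in the source (their positions in the table are not recoverable): 3, 2, 4, 4, 2, 2, 1, 8, 5, 1, 9, 2

Q2. Two samples from the same experiment yield the results below:

| Statistics | n | $\bar{x}$ | $\bar{y}$ | $s_1$ | $s_2$ | r |
|---|---|---|---|---|---|---|
| Sample 1 | 20 | 15 | 2.7 | 1.5 | 2.2 | 0.78 |
| Sample 2 | 10 | 16 | 2.1 | 1.8 | 2.0 | 0.85 |

(a) Combine the two samples to obtain a R.L. of y on x (i.e. $\hat{y} = a + bx$).

(b) What is the correlation based on the 30 pairs?

...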
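Answers to frequency-table exercises of the Q1 type can be checked with a short script. The sketch below is an assumption-laden illustration rather than part of the notes: it uses the class-mark table of Example 2, whose answers are already known, with the cell positions as reconstructed from the flattened table, and the name `grouped_fit` is hypothetical.

```python
from math import sqrt

# Class marks from Example 2. Each row of `freq` belongs to one x class mark,
# and its columns follow the order of `y_marks`. Empty cells are entered as 0.
# The cell positions are a reconstruction of the flattened source table.
x_marks = [30.5, 50.5, 70.5, 90.5]
y_marks = [625.5, 575.5, 525.5, 475.5, 425.5]
freq = [
    [0, 0, 1, 2, 2],
    [1, 2, 4, 3, 0],
    [2, 5, 8, 2, 0],
    [2, 5, 1, 0, 0],
]


def grouped_fit(x_marks, y_marks, freq):
    """Regression of y on x from a two-way frequency table; returns (a, b, r, n)."""
    n = sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0.0
    for x, row in zip(x_marks, freq):
        for y, f in zip(y_marks, row):
            n += f
            sum_x += f * x
            sum_y += f * y
            sum_xx += f * x * x
            sum_yy += f * y * y
            sum_xy += f * x * y
    mean_x, mean_y = sum_x / n, sum_y / n
    sxy = sum_xy - n * mean_x * mean_y
    sxx = sum_xx - n * mean_x ** 2
    syy = sum_yy - n * mean_y ** 2
    b = sxy / sxx
    return mean_y - b * mean_x, b, sxy / sqrt(sxx * syy), n


a, b, r, n = grouped_fit(x_marks, y_marks, freq)
print(f"n = {n:.0f}, a = {a:.2f}, b = {b:.4f}, r = {r:.4f}")
```

For a list of (x, y, f) triples such as Q1(a), the same accumulation applies with a single loop over the triples instead of the nested loop over the table.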

This note was uploaded on 02/22/2010 for the course FBE STAT0302 taught by Professor Unknown during the Spring '10 term at HKU.
