Stat 350 Solution to Assignment 10

11. (a) SSxy = 5530.92 - (1950)(47.92)/18 = 339.586667, SSxx = 251,970 - (1950)^2/18 = 40,720, and SSyy = 130.6074 - (47.92)^2/18 = 3.033711, so

r = SSxy / sqrt(SSxx * SSyy) = 339.586667 / sqrt((40,720)(3.033711)) = .9662

There is a very strong positive correlation between the two variables.

(b) Because the association between the variables is positive, the specimen with the larger shear force will tend to have a larger percent dry fiber weight.

(c) Changing the units of measurement on either (or both) variables will have no effect on the calculated value of r, because any change in units will affect both the numerator and denominator of r by exactly the same multiplicative constant.

13. Most people acquire a license as soon as they become eligible. If, for example, the minimum age for obtaining a license is 16, then the time since acquiring a license, y, is usually related to age, x, by the equation y ≈ x - 16, which is the equation of a straight line. In other words, the majority of people in a sample will have y values that closely follow the line y = x - 16.

14. Using a correlation coefficient to summarize the relationship between the artist (x) and the sale price (y) is not appropriate. To compute and interpret a correlation coefficient, both the x and y variables must be quantitative. While the y variable, sale price, is quantitative, the x variable, artist, is not.

15. Let d0 denote the (fixed) length of the stretch of highway. Then d0 = distance = (rate)(time) = xy. Dividing both sides by x gives the equation y = d0/x, which means the relationship between x and y is curvilinear (in particular, the curve is a hyperbola). However, for values of x that are fairly close to one another, sections of this hyperbola can be approximated very well by a straight line with a negative slope (to see this, draw a picture of the function d0/x for a particular value of d0). This means that r should be closer to -.9 than to any of the other choices.
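The arithmetic in 11(a) and the hyperbola argument for the highway problem can be checked numerically. The following is a minimal sketch, not part of the original solution: the function names are illustrative, and the speeds and the distance d0 = 10 used for the hyperbola are hypothetical values chosen only to show the shape of the relationship.

```python
import math

def corr_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Sample correlation r computed from summary sums (shortcut formulas)."""
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_xx = sum_x2 - sum_x ** 2 / n
    ss_yy = sum_y2 - sum_y ** 2 / n
    return ss_xy / math.sqrt(ss_xx * ss_yy)

def corr(xs, ys):
    """Sample correlation r computed from raw paired data."""
    return corr_from_sums(
        len(xs),
        sum(xs),
        sum(ys),
        sum(x * y for x, y in zip(xs, ys)),
        sum(x * x for x in xs),
        sum(y * y for y in ys),
    )

# Summary values quoted in 11(a): n = 18, sum x = 1950, sum y = 47.92,
# sum xy = 5530.92, sum x^2 = 251,970, sum y^2 = 130.6074.
r_11a = corr_from_sums(18, 1950, 47.92, 5530.92, 251970, 130.6074)
print(round(r_11a, 4))  # 0.9662

# Hypothetical speeds (assumed, not from the assignment) over a narrow range,
# with travel time y = d0 / x for an assumed d0 = 10.
speeds = list(range(55, 71))
times = [10 / x for x in speeds]
r_hyp = corr(speeds, times)
print(r_hyp)
```

Because the hypothetical travel times lie exactly on the hyperbola, with no measurement noise, the second correlation is strongly negative, which is consistent with choosing the strongly negative option in the highway problem.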
16. The value of the sample correlation coefficient using the squared y-values would not necessarily be approximately 1. If the y-values are greater than 1, then the squared y-values would differ from each other by more than the y-values differ from one another. Hence, the relationship between x and y^2 would be less like a straight line, and the resulting value of the correlation coefficient would decrease. [Note: I have yet to find an example where r is less than about .96 for (x, y^2), however.]

18. (a) To obtain the least squares regression equation, first we will compute the slope and the vertical intercept using the equations provided in Section 3.3. Thus, the equation for the least squares line is:

(b) The corresponding residual is:

(c) Yes, there is a very large residual corresponding to the (37, 9) pair. The predicted value equals 24.76 and since the observed value is 9, the residual equals (9 - 24.76) = -15.76.

(d) We need to compute r^2. Thus, 95.6% of the observed variation in removal can be explained by the approximate linear relationship between removal and loading, a very impressive result.

(e) After deleting these two pairs of observations, the new least squares line becomes: So, you can see the large effect these two pairs of observations had on the analysis. The estimate of the slope was decreased and the fit of the least squares line, as measured by r^2, is not nearly as good. ...
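The steps in problem 18 (slope, intercept, residuals, r^2, and a refit after deleting observations) can be sketched as follows. The data for problem 18 are not reproduced in this preview, so the x and y values below are hypothetical and serve only to illustrate the calculation; the function name `least_squares` is likewise an assumption.

```python
def least_squares(xs, ys):
    """Fit y = a + b*x by least squares; return (a, b, residuals, r_squared)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    b = ss_xy / ss_xx            # slope
    a = y_bar - b * x_bar        # vertical intercept
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]  # observed - predicted
    sse = sum(e * e for e in residuals)
    r_squared = 1 - sse / ss_yy  # proportion of variation explained by the line
    return a, b, residuals, r_squared

# Hypothetical data, NOT the removal/loading data from the assignment.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b, residuals, r2 = least_squares(xs, ys)
print(a, b, r2)

# Refit after deleting an observation (here the last pair), in the spirit of
# part (e), to see how much the slope and r^2 change.
a_sub, b_sub, _, r2_sub = least_squares(xs[:-1], ys[:-1])
print(a_sub, b_sub, r2_sub)
```

Comparing the two fits shows how the same function can be reused to gauge the influence of individual observations on the slope and on r^2.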
This note was uploaded on 02/06/2012 for the course STAT 350 taught by Professor Staff during the Spring '08 term at Purdue University, West Lafayette.
