Chapter 13: The Correlation Coefficient and the Regression Line

We begin with some useful facts about straight lines. Recall the x, y coordinate system, as pictured below.

[Figure: the x, y coordinate system, with tick marks at 1, 2 and 3 in each direction along both axes, showing the horizontal line y = 2.5, the line y = 0.5x, and a third line whose label is garbled in this copy ("y = 1 x").]

We say that y is a linear function of x if y = a + bx for some numbers a and b. If y = a + bx, then the graph of the function is a straight line with y-intercept equal to a and slope equal to b. The line is horizontal if, and only if, b = 0; otherwise it slopes up if b > 0 and slopes down if b < 0.

The only lines not covered by the above are the vertical lines, e.g. x = 6. Vertical lines are not interesting in Statistics.

In math class we learn that lines extend forever. In statistical applications, as we will see, they never extend forever. This distinction is very important. In fact, it would be more accurate to say that statisticians study line segments, not lines, but everybody says lines.

It will be very important for you to understand lines in two ways, which I call visually and analytically.

Here is what I mean. Consider the line y = 5 + 2x. We will want to substitute (plug in) values for x to learn what we get for y. For example, take x = 3. We do this analytically by substituting in the equation: y = 5 + 2(3) = 11. But we can also do this visually, by graphing the function. Walk along the x axis until we get to x = 3 and then climb up a rope (or slide down a pole) until we hit the line. Our height when we hit the line is y = 11. (Draw picture on board.)

The Scatterplot

We are interested in situations in which we obtain two numbers per subject. For example, if the subjects are college students, the numbers could be:
X = height and Y = weight.
X = score on ACT and Y = first year GPA.
X = number of AP credits and Y = first year GPA.
Law schools are interested in:
X = LSAT score and Y = first year law school GPA.
and so on.

In each of these examples, the Y is considered more important by the researcher and is called the response. The X is important because its value might help us understand Y better, and it is called the predictor.

For some studies, reasonable people can disagree on which variable to call Y. Here are two examples:
The subjects are married couples and the variables are: wife's IQ and husband's IQ.
The subjects are identical twins and the variables are: first-born's IQ and second-born's IQ.

We study two big topics in Chapter 13. For the first of these, the correlation coefficient, it does not matter which variable is called Y ....
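As a concrete check on the straight-line facts from the start of the chapter, the following Python sketch evaluates y = a + bx analytically and reports the direction of the line from the sign of b. The function names line_value and slope_direction are hypothetical, chosen only for illustration; the one worked value, y = 5 + 2(3) = 11, comes from the notes.

```python
def line_value(a, b, x):
    """Return y = a + b*x, where a is the y-intercept and b is the slope."""
    return a + b * x


def slope_direction(b):
    """Describe the line's direction from the sign of its slope b."""
    if b == 0:
        return "horizontal"
    return "slopes up" if b > 0 else "slopes down"


# The chapter's example: the line y = 5 + 2x evaluated at x = 3.
y = line_value(5, 2, 3)
print(y)                   # 11
print(slope_direction(2))  # slopes up
```

This is the "analytic" route only; the "visual" route in the notes corresponds to reading the same height off a graph of the line.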
This note was uploaded on 10/23/2009 for the course STAT STATS 301 taught by Professor Wardrop during the Fall '08 term at Wisconsin.
