Let's get started! Here is what you will learn in this lesson.
Learning objectives for this lesson
Upon completion of this lesson, you should be able to:
• Understand the relationship between the slope of the regression line and correlation
• Comprehend the meaning of the Coefficient of Determination, R²
• Know how to determine which variable is the response and which is the explanatory variable in a regression equation
• Understand that correlation measures the strength of a linear relationship between two variables
• Realize how outliers can influence a regression equation
• Reasonably estimate the correlation from a scatterplot
Correlation and Regression
Correlation and regression are concerned with examining the relationship between two
(or more) quantitative variables. Three tools will be used to describe, picture, and
quantify the relationship between quantitative variables:
1. Scatterplot, a two-dimensional graph of data values for two quantitative variables.
2. Correlation, a statistic that measures the strength and direction of a linear relationship between two quantitative variables.
3. Regression equation, an equation that describes the average relationship between a quantitative response variable and a quantitative explanatory variable.
Equations of Straight Lines: Review
The equation of a straight line is given by y = a + bx. When x = 0, y = a, the
intercept of the line; b is the slope of the line: it measures the change in y per
unit change in x.
Two examples:

   Data 1        Data 2
  x     y       x     y
  0     3       0    13
  1     5       1    11
  2     7       2     9
  3     9       3     7
  4    11       4     5
  5    13       5     3
For 'Data 1' the equation is y = 3 + 2x; the intercept is 3 and the slope is 2. The
line slopes upward, indicating a positive relationship between x and y.
For 'Data 2' the equation is y = 13 - 2x; the intercept is 13 and the slope is -2. The
line slopes downward, indicating a negative relationship between x and y.
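As a quick check of the two equations, the following Python sketch recovers the intercept and slope from the tabulated values. The helper name `line_through` is hypothetical, introduced only for this illustration, and it assumes every point lies exactly on one line (true for these two data sets, but not for real data).

```python
# Data 1 and Data 2 as tabulated in the lesson.
x = [0, 1, 2, 3, 4, 5]
y1 = [3, 5, 7, 9, 11, 13]
y2 = [13, 11, 9, 7, 5, 3]

def line_through(xs, ys):
    """Return (a, b) for y = a + b*x using the first two points.

    Only valid when all points are exactly collinear, which is checked
    before returning.
    """
    b = (ys[1] - ys[0]) / (xs[1] - xs[0])  # slope: change in y per unit change in x
    a = ys[0] - b * xs[0]                  # intercept: value of y when x = 0
    if not all(yi == a + b * xi for xi, yi in zip(xs, ys)):
        raise ValueError("points are not collinear")
    return a, b

a1, b1 = line_through(x, y1)
a2, b2 = line_through(x, y2)
print("Data 1: intercept", a1, "slope", b1)
print("Data 2: intercept", a2, "slope", b2)
```

The sign of the slope carries the direction of the relationship: positive for Data 1, negative for Data 2.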
[Figure: Plot for Data 1, y = 3 + 2x; Plot for Data 2, y = 13 - 2x]
The relationship between x and y is 'perfect' for these two examples: the points fall
exactly on a straight line, so the value of y is determined exactly by the value of x.
We will mainly be interested in relationships between two variables that are not
perfect. The 'Correlation' between x and y is r = 1.00 for the values of x and y on the
left (Data 1) and r = -1.00 for the values of x and y on the right (Data 2).
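The two correlations can be reproduced with a short Python sketch. It assumes r is the usual Pearson correlation coefficient, which this excerpt does not define explicitly; the function name `pearson_r` is chosen for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    syy = sum((yi - mean_y) ** 2 for yi in ys)
    return sxy / sqrt(sxx * syy)

x = [0, 1, 2, 3, 4, 5]
y1 = [3, 5, 7, 9, 11, 13]   # Data 1: points on an upward-sloping line
y2 = [13, 11, 9, 7, 5, 3]   # Data 2: points on a downward-sloping line

r1 = pearson_r(x, y1)
r2 = pearson_r(x, y2)
print(f"Data 1: r = {r1:.2f}")
print(f"Data 2: r = {r2:.2f}")
```

A perfect linear relationship gives r = 1 or r = -1, with the sign matching the sign of the slope.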
Regression analysis is concerned with finding the 'best' fitting line for predicting
the average value of a response variable y using a predictor variable x.
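A minimal sketch of this idea, assuming that 'best' means least squares (the line that minimizes the sum of squared differences between observed and predicted y, a criterion this excerpt does not spell out). The x and y values below are hypothetical and are not data from this lesson; the function name `least_squares_line` is likewise illustrative.

```python
def least_squares_line(xs, ys):
    """Return (a, b) minimizing the sum of squared residuals for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through (mean_x, mean_y)
    return a, b

# Hypothetical values for illustration only; not data from this lesson.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares_line(x, y)
fits = [a + b * xi for xi in x]                    # predicted values
residuals = [yi - fi for yi, fi in zip(y, fits)]   # observed minus predicted

print(f"fitted line: y = {a:.2f} + {b:.2f}x")
for xi, yi, fi, ri in zip(x, y, fits, residuals):
    print(f"x={xi}  y={yi}  fit={fi:.2f}  residual={ri:+.2f}")
```

Unlike the perfect examples, the points here do not fall on the line, so each observation has a nonzero residual; for a least-squares line with an intercept, the residuals sum to zero.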
Example (made-up data). Predict the average 'hangover index' score (y) for students
who drink x beers the night before. Obtain x and y for 5 people (all of whom sit in the
back of the room in class the next day). The data, with predicted values (FITS) and the
amounts by which the predictions differ from the observed values (RESI1), are also given.
x
This note was uploaded on 03/30/2008 for the course STAT 200 taught by Professor Barroso,joaor during the Spring '08 term at Pennsylvania State University, University Park.