Chapter 4
Describing the Relation between Two Variables
4.1 Scatter Diagrams and Correlation
1.
Univariate data measures the value of a single variable for each individual in the study.
Bivariate data measures values of two variables for each individual.
2.
A lurking variable is a variable that is related either to the response variable or to the
explanatory variable, or both, but is not measured in the study. Examples will vary. One
possibility: The number of firemen responding to a fire can be used to predict the amount
of damage done. Both variables are related to the seriousness of the fire.
3.
Two variables are positively associated if increases in the value of the explanatory variable
tend to correspond to increases in the value of the response variable.
4.
If r = 1, there is a perfect positive linear relation between the variables, and the points
of the scatter diagram will lie along a straight line with positive slope.
5.
Since r measures only the strength and direction of linear relationships, obtaining r = 0
only means that there is no linear relation between the explanatory and response variables.
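To illustrate that r = 0 rules out only a linear relation, the following minimal sketch computes r for points that lie exactly on the parabola y = x^2. The data set and the helper name pearson_r are assumptions chosen for illustration; they do not come from the text.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample linear correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: y is completely determined by x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_quadratic = pearson_r(xs, ys)
print(r_quadratic)
```

Because the x-values are symmetric about 0, the positive and negative cross-products cancel, so r comes out as zero even though the variables are perfectly related by a curve.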
6.
No, r is not a resistant measure. This is made apparent by considering the formula for the
sample correlation coefficient. From the formula, we see that the value of r depends on the
mean and standard deviation, neither of which is resistant. Therefore, r will also be
sensitive to extreme values or outliers. Supporting examples will vary.
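One possible supporting example, sketched in Python: five points that lie exactly on a line, followed by the same points with a single outlier added. The data values and the helper name pearson_r are assumptions made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample linear correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data lying exactly on the line y = 2x.
clean_x = [1, 2, 3, 4, 5]
clean_y = [2, 4, 6, 8, 10]
r_clean = pearson_r(clean_x, clean_y)

# The same data with one outlying observation (6, 0) appended.
r_with_outlier = pearson_r(clean_x + [6], clean_y + [0])

print("r without outlier:", round(r_clean, 3))
print("r with outlier:   ", round(r_with_outlier, 3))
```

A single extreme point is enough to pull r from a perfect positive correlation down to a weak one, which is what it means for r not to be resistant.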
7.
The linear correlation coefficient can only be calculated from bivariate quantitative data.
The gender of a driver is a qualitative variable.
8.
The correlation coefficient is a numerical measure of the strength and direction of the linear
association between two quantitative variables, which we traditionally designate as x and y.
It is calculated as a sum of products of the z-scores of the x and y components of each
data point, that is,

\[
r = \frac{1}{n-1} \sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right).
\]

The correlation coefficient takes values between -1 and 1. For a positive linear relation,
above average values of one variable (or positive z-scores) tend to be associated with above
average values of the other variable (positive z-scores) and below average values of one
(negative z-scores) with below average values of the other (negative z-scores), and so the
products will mostly be positive, giving a positive value of r. Similarly, a negative value
of r indicates a negative relation where above average values of one (positive z-scores) tend
to be associated with below average values of the other (negative z-scores). The strength of
the linear relation, either positive or negative, is measured by how close the value of r is
to 1 or -1.
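The z-score description of r can be sketched directly in Python. This is a minimal illustration, not a prescribed method: the function name correlation_from_z_scores and the sample data are assumptions, and the standard deviations are the sample versions (divisor n - 1) provided by the standard library's statistics.stdev.

```python
from statistics import mean, stdev

def correlation_from_z_scores(xs, ys):
    """Compute r as the sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    # One product of z-scores per data point (x_i, y_i).
    products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]
    return sum(products) / (n - 1)

# Hypothetical data with a positive linear association.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation_from_z_scores(xs, ys)
print(round(r, 4))
```

For these data most of the nonzero z-score products are positive, so r is positive, matching the verbal description of a positive linear relation.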
 Spring '11
 ahmad
 Least Squares, Regression Analysis
