STAT 509 – Sections 6.1–6.2:
Linear Regression
• Mostly we have studied the behavior of a single random variable.
• Often, however, we gather data on two random variables.
Response Variable (Y): Measures the major outcome of interest in the study (also called the dependent variable).
Independent Variable (X): Another variable whose value explains, predicts, or is associated with the value of the response variable (also called the predictor or the regressor).
• We wish to determine: Is there a relationship between the two r.v.’s?
• Can we use the values of one r.v. to predict the other r.v.?
Observational Studies vs. Designed Experiments
• In observational studies, we simply measure or observe both variables on a set of sampled individuals.
• In a designed experiment, we manipulate the predictors (factors), setting them at specific values of interest. We then observe what values of the response correspond to the fixed predictor values.
Example 1 (Table 6.1): We observe the Rockwell Hardness (X) and Young’s modulus (Y) for seven high-density metals. The resulting data were:
X:   41   41   44   40   43   15   40
Y:  310  340  380  317  413   62  119
Example 2 (Table 6.3): A chemical engineering class studied the effect of the reflux ratio (X) on the ethanol concentration (Y) of an ethanol-water distillation. For a variety of settings of the reflux ratio, the ethanol concentration was measured:
X:     20     30     40     50     60
Y:  0.446  0.601  0.786  0.928  0.950
We assume there is random error in the observed response values, implying a probabilistic relationship between the 2 variables.
• Often we assume a straight-line relationship between two variables.
• This is known as simple linear regression.
$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
$Y_i$ = $i$th response value
$\beta_0$ = intercept of regression line
$x_i$ = $i$th predictor value
$\beta_1$ = slope of regression line
$\varepsilon_i$ = $i$th random error component
• We assume the random errors $\varepsilon_i$ have mean 0 (and variance $\sigma^2$), so that $E(Y) = \beta_0 + \beta_1 x$.
• Typically, in practice, $\beta_0$ and $\beta_1$ are unknown parameters. We estimate them using the sample data.
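To see why mean-zero errors make the average response fall on the line, here is a minimal simulation sketch. The parameter values (intercept 2.0, slope 0.5, error SD 1.0), the helper name simulate_response, and the choice of normally distributed errors are assumptions made only for illustration; the notes themselves require only that the errors have mean 0 and a common variance.

```python
import random
import statistics


def simulate_response(x, beta0, beta1, sigma, rng):
    """Draw one response Y = beta0 + beta1*x + eps.

    Assumption for this sketch: eps is Normal(0, sigma^2). The model in the
    notes only states mean 0 and variance sigma^2.
    """
    eps = rng.gauss(0.0, sigma)
    return beta0 + beta1 * x + eps


# Hypothetical parameter values, chosen only for illustration.
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0

rng = random.Random(509)  # fixed seed so the run is repeatable
x = 10.0                  # one fixed predictor value
draws = [simulate_response(x, BETA0, BETA1, SIGMA, rng) for _ in range(20000)]

mean_line = BETA0 + BETA1 * x          # E(Y) at this x under the model
sample_mean = statistics.fmean(draws)  # average of the simulated responses
sample_sd = statistics.stdev(draws)    # spread of the simulated responses

print(f"E(Y) from the line:       {mean_line:.3f}")
print(f"average of simulated Y:   {sample_mean:.3f}")
print(f"sample SD of simulated Y: {sample_sd:.3f}")
```

With many draws at the same x, the average of the simulated responses should sit close to the value on the line, and their spread should sit close to the error standard deviation.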
Fitting the Model (Least Squares Method)
• If we gather data $(X_i, Y_i)$ for several individuals, we can use these data to estimate $\beta_0$ and $\beta_1$ and thus estimate the linear relationship between $Y$ and $X$.
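As a rough illustration of estimating the intercept and slope from data, the sketch below applies the standard closed-form least-squares estimates (slope = Sxy / Sxx, intercept = ybar - slope * xbar) to the reflux-ratio data from Example 2. The function name least_squares_line is hypothetical, and this is the usual textbook formula rather than necessarily the sequence of steps these notes go on to present.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1), the least-squares intercept and slope.

    These values minimize the sum of squared vertical deviations
    between the observed ys and the fitted line b0 + b1*x.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx           # estimated slope
    b0 = y_bar - b1 * x_bar    # estimated intercept
    return b0, b1


# Data from Example 2 (Table 6.3): reflux ratio vs. ethanol concentration.
reflux_ratio = [20, 30, 40, 50, 60]
ethanol_conc = [0.446, 0.601, 0.786, 0.928, 0.950]

b0, b1 = least_squares_line(reflux_ratio, ethanol_conc)
print(f"estimated intercept: {b0:.4f}")
print(f"estimated slope:     {b1:.5f}")
```

For these five points the estimates come out to roughly 0.208 for the intercept and 0.0134 for the slope, so each unit increase in reflux ratio goes with an estimated increase of about 0.013 in ethanol concentration over the observed range.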
• First step:
This note was uploaded on 12/13/2011 for the course STAT 509 taught by Professor Chalmers during the Fall '08 term at South Carolina.