Using R for Linear Regression
In the following handout, words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional entries (all current as of R version 2.4.1). Sample texts from an R session are highlighted with gray shading.
Suppose we prepare a calibration curve using four external standards and a reference,
obtaining the data shown here:
> conc
[1]  0 10 20 30 40 50
> signal
[1]  4 22 44 60 82
The expected model for the data is

$\text{signal} = \beta_0 + \beta_1 \times \text{conc}$

where $\beta_0$ is the theoretical y-intercept and $\beta_1$ is the theoretical slope. The goal of a linear regression is to find the best estimates for $\beta_0$ and $\beta_1$ by minimizing the residual error between the experimental and predicted signal. The final model is

$\text{signal} = b_0 + b_1 \times \text{conc} + e$

where $b_0$ and $b_1$ are the estimates for $\beta_0$ and $\beta_1$ and $e$ is the residual error.
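As a rough check of this idea, the estimates can be computed directly in R. The sketch below is an illustration under stated assumptions, not part of the original session: the listing above shows six concentrations but only five signals, so the sketch pairs the five signals with concentrations 0 through 40, which is consistent with four standards plus a reference. It computes b0 and b1 from the closed-form least-squares expressions and compares them with lm(), R's standard function for fitting linear models.

```r
# Assumption: the five signals pair with the first five concentrations.
conc   <- c(0, 10, 20, 30, 40)
signal <- c(4, 22, 44, 60, 82)

# Closed-form least-squares estimates for a straight line.
b1 <- sum((conc - mean(conc)) * (signal - mean(signal))) /
      sum((conc - mean(conc))^2)
b0 <- mean(signal) - b1 * mean(conc)

# The same estimates obtained from lm().
fit <- lm(signal ~ conc)
print(coef(fit))

# Residual error e for each standard: observed minus predicted signal.
e <- signal - (b0 + b1 * conc)
print(e)
```

Under this pairing both routes give b0 = 3.6 and b1 = 1.94, and the residuals sum to zero, as expected for a least-squares fit that includes an intercept.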
Defining Models in R
To complete a linear regression using R it is first necessary to understand the syntax for defining models. Let’s assume that the dependent variable being modeled is Y and that A, B and C are independent variables that might affect Y. The general format for a linear¹ model is

response ~ op1 term1 op2 term2 op3 term3…

where term is an object or a sequence of objects and op is an operator, such as a + or a −, that indicates how the term that follows is to be included in the model.

¹ When discussing models, the term ‘linear’ does not mean a straight line. Instead, a linear model contains additive terms, each containing a single multiplicative parameter; thus, the equations

$y = \beta_0 + \beta_1 x$
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
$y = \beta_0 + \beta_{11} x^2$
$y = \beta_0 + \beta_1 x_1 + \beta_2 \log(x_2)$

are linear models. The equation $y = \alpha x$ …
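Two of these points can be checked in R itself: operators in a formula decide which terms are included rather than performing arithmetic, and a model that is curved in x is still 'linear' when each term carries a single multiplicative parameter. The following is a minimal sketch. The variable names and the noise-free quadratic data are hypothetical, chosen only so the fitted coefficients are easy to recognize.

```r
# "+" includes a term and "-" removes it; nothing is added or subtracted.
f <- Y ~ A + B + C - C
labels_kept <- attr(terms(f), "term.labels")
print(labels_kept)   # "A" "B"

# The intercept is implicit; a term of -1 removes it.
has_intercept <- attr(terms(Y ~ A), "intercept")        # 1
no_intercept  <- attr(terms(Y ~ -1 + A), "intercept")   # 0

# A quadratic is linear in its parameters, so lm() can fit it.
# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2.
x <- 0:5
y <- 1 + 2 * x + 3 * x^2
quad_fit <- lm(y ~ x + I(x^2))
print(round(coef(quad_fit), 6))
```

Because the data are generated without noise, the fit should return coefficients of 1, 2 and 3 for the intercept, x and I(x^2) terms.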
The table below provides some useful examples. Note that the mathematical symbols used to define models do not have their normal meanings!
| Syntax | Model | Comments |
|---|---|---|
| Y ~ A | $Y = \beta_0 + \beta_1 A$ | Straight line with an implicit y-intercept. |
| Y ~ -1 + A | $Y = \beta_1 A$ | Straight line with no y-intercept; that is, a fit forced through (0,0). |
| Y ~ A + I(A^2) | $Y = \beta_0 + \beta_1 A + \beta_2 A^2$ | Polynomial model; note that the identity function I( ) allows terms in the model to include normal mathematical symbols. |
| Y ~ A + B | $Y = \beta_0 + \beta_1 A + \beta_2 B$ | A first-order model in A and B without interaction terms. |
| Y ~ A:B | $Y = \beta_0 + \beta_1 AB$ | A model containing only first-order interactions between A and B. |
| Y ~ A*B | $Y = \beta_0 + \beta_1 A + \beta_2 B + \beta_3 AB$ | A full first-order model with an interaction term; an equivalent code is Y ~ A + B + A:B. |
| Y ~ (A + B + C)^2 | $Y = \beta_0 + \beta_1 A + \beta_2 B + \beta_3 C + \beta_4 AB + \beta_5 AC + \beta_6 BC$ | A model including all first-order effects and interactions up to the nth order, where n is given by ( )^n. An equivalent code in this case is |
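One way to see what each formula in the table asks for is to let R expand it. The sketch below uses model.matrix() to list the columns R would build for several of these formulas. The data frame d and the helper model_columns are hypothetical names introduced only for illustration, and the numbers in d are arbitrary.

```r
# Hypothetical numeric predictors, used only to expand the formulas.
d <- data.frame(A = c(1, 2, 3, 4),
                B = c(2, 1, 4, 3),
                C = c(1, 3, 2, 5))

# Hypothetical helper: the columns R builds for a right-hand side.
model_columns <- function(rhs) colnames(model.matrix(rhs, data = d))

print(model_columns(~ A))               # implicit intercept plus A
print(model_columns(~ -1 + A))          # -1 drops the intercept
print(model_columns(~ A + I(A^2)))      # I() keeps ^ as arithmetic
print(model_columns(~ A + A^2))         # without I(), A^2 collapses to A
print(model_columns(~ A:B))             # interaction only
print(model_columns(~ A * B))           # same columns as A + B + A:B
print(model_columns(~ (A + B + C)^2))   # main effects and two-way interactions
```

Each column corresponds to one β in the Model column of the table, so counting columns is a quick way to confirm how many parameters a formula will estimate. Note that writing 1 + A keeps the intercept; it is the -1 (or 0) term that removes it.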