Green // Statistics
The Mechanics of Multiple Regression
One of the most important concepts in statistics is the idea of “controlling” for a variable.
This lecture is designed to give you a feel for what “controls” are and how they are
implemented in the context of multiple regression.
Let’s begin by considering an example.
In the weeks leading up to the November 2003
election, a group called ACORN sought to bolster support for a ballot proposition in
Kansas City.
The measure authorized a rise in sales tax in order to fend off cuts to public
transportation.
ACORN canvassed voters in a predominantly black section of Kansas
City, targeting registered voters who had voted in at least one of the five most recent
elections.
The campaign consisted primarily of door-to-door canvassing conducted
during the final two weeks before Election Day.
I was asked to evaluate the effectiveness of this campaign.
ACORN identified 28
precincts of potential interest to their campaign; I randomly assigned 14 to the treatment
group and 14 to the control group.
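To make the random assignment concrete, here is a minimal sketch of how 28 precincts could be split into two groups of 14. The precinct labels 1 through 28, the function name `assign_precincts`, and the seed are all hypothetical; the text does not say how the actual randomization was carried out.

```python
import random

def assign_precincts(precinct_ids, n_treatment, seed=None):
    """Randomly split precincts into a treatment group and a control group."""
    rng = random.Random(seed)          # private generator so the split is reproducible
    shuffled = list(precinct_ids)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    treatment = sorted(shuffled[:n_treatment])
    control = sorted(shuffled[n_treatment:])
    return treatment, control

# Hypothetical labels; the real precinct identifiers are not given in the text.
precincts = list(range(1, 29))
treatment, control = assign_precincts(precincts, n_treatment=14, seed=2003)
print("treatment:", treatment)
print("control:  ", control)
```

Because every precinct has the same chance of landing in either group, the two groups should differ only by chance before the canvassing begins.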
After the election, voter turnout records were
gathered.
Voting rates among those living in the treatment and control precincts were
calculated.
The data may be found at
Kansas City Dataset
The data may be modeled in a few different ways.
The simplest model describes the
voter turnout rate (Y) as a linear function of the experimental treatment (X) plus a
disturbance term:
Y = a + bX + U.
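When X takes only the values 0 and 1, the least-squares estimate of a is the average of Y in the control group, and the estimate of b is the treatment average minus the control average. The sketch below checks this with a handful of made-up turnout rates (these are NOT the Kansas City data); the helper `ols_simple` is a hypothetical name.

```python
def ols_simple(x, y):
    """Least-squares intercept a and slope b for the model Y = a + bX + U."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Made-up turnout rates, for illustration only.
control_rates = [0.25, 0.30, 0.28, 0.33]
treated_rates = [0.31, 0.35, 0.29, 0.37]

x = [0] * len(control_rates) + [1] * len(treated_rates)
y = control_rates + treated_rates

a, b = ols_simple(x, y)
mean_control = sum(control_rates) / len(control_rates)
mean_treated = sum(treated_rates) / len(treated_rates)

print(f"a = {a:.4f}, control mean = {mean_control:.4f}")
print(f"b = {b:.4f}, difference in means = {mean_treated - mean_control:.4f}")
```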
Here is an “individual value plot” of the data.
Note that all of the X values are either 0
(control) or 1 (treatment), but the plot scatters them a bit in order to make the individual
values easier to see.
[Figure: Individual Value Plot of VOTE03 vs TREATMEN. Horizontal axis: TREATMEN (0.00, 1.00); vertical axis: VOTE03 (0.20 to 0.50).]
Using regression, we obtain the following results:
Regression Analysis: VOTE03 versus TREATMEN
The regression equation is
VOTE03 = 0.289 + 0.0355 TREATMEN
Predictor     Coef  SE Coef      T      P
Constant   0.28884  0.01778  16.24  0.000
TREATMEN   0.03554  0.02515   1.41  0.169
S = 0.0665291   R-Sq = 7.1%   R-Sq(adj) = 3.6%
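The numbers in this output are tied together by simple arithmetic, and it can help to verify that by hand. The sketch below takes the reported values as given and assumes each of the 28 precincts is one observation, 14 per group. It recomputes the T ratio, both standard errors, and the two R-squared figures; the variable names are my own, not Minitab's.

```python
import math

# Values copied from the regression output above.
coef = 0.03554      # TREATMEN coefficient
se_coef = 0.02515   # its standard error
s = 0.0665291       # S, the standard deviation of the residuals

# Assumption: one observation per precinct, 14 treatment and 14 control.
n_per_group = 14
n = 2 * n_per_group
df = n - 2          # two estimated parameters: the constant and the slope

t_ratio = coef / se_coef

# With a 0/1 regressor, the slope is a difference between two group means.
se_slope = s * math.sqrt(1 / n_per_group + 1 / n_per_group)
se_constant = s / math.sqrt(n_per_group)

# In a one-predictor regression, R-squared can be recovered from T.
r_sq = t_ratio ** 2 / (t_ratio ** 2 + df)
r_sq_adj = 1 - (1 - r_sq) * (n - 1) / df

print(f"T = {t_ratio:.2f}")
print(f"SE(slope) = {se_slope:.5f}, SE(constant) = {se_constant:.5f}")
print(f"R-Sq = {100 * r_sq:.1f}%, R-Sq(adj) = {100 * r_sq_adj:.1f}%")
```

The recomputed values should agree with the table up to rounding, which is a useful check that the regression was run on 28 precinct-level observations.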
The critical numbers here are .036, which suggests that the expected rate of turnout
increases by 3.6 percentage points as we move from control to treatment, and .025, the
standard error, which conveys the uncertainty surrounding this experimental effect.
The p-value of .169 tells us
that there is a 16.9% chance of observing a treatment effect at least as large as this in absolute
value even if the true experimental effect were zero.
Ordinarily, we would use a one-tailed
test here, because one would suppose that canvassing would increase turnout; in that
case, the one-tailed p-value is approximately .09.
For what it’s worth, that falls a bit
short of the conventional statistical significance threshold of .05.
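The two-tailed and one-tailed p-values can be recovered from the T ratio and a t distribution with 26 degrees of freedom (28 precincts minus 2 estimated parameters). The sketch below uses only the Python standard library and integrates the t density numerically; the function names are hypothetical, and a statistics package would normally do this for you.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric about zero, so half the probability lies above 0.
    return 0.5 - total * h / 3

t_ratio = 0.03554 / 0.02515   # coefficient divided by its standard error
df = 28 - 2                   # assumes one observation per precinct

p_one_tailed = upper_tail(t_ratio, df)
p_two_tailed = 2 * p_one_tailed

print(f"two-tailed p = {p_two_tailed:.3f}, one-tailed p = {p_one_tailed:.3f}")
```

The one-tailed value is simply half the two-tailed value, because the t distribution is symmetric and the estimated effect lies in the hypothesized direction.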
Fall '05
Jonathan Reuning-Scherer, Donald Green