A Stata do-file that would do everything
you need for this one hypothesis (after
checking coding of missing data to .)
clear
use data
* What are the descriptive statistics of my dependent variable in 1995?
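* Use == to test equality (a single = is assignment) and skip cases where polity2 is missing
summarize redistrib_1 if year == 1995 & polity2 != .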
* What are

[Figure: scatter plot of Incumbent vote (Y axis) against GROWTH (X axis) with the fitted regression line. The mean of Y and the mean of X (= .62) are marked.]
Key property of regression line: goes through mean of X and Y
Interpretation of regression line: What is the average value of Y at different values of X (rec
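The key property noted on this slide, that the fitted line passes through the point of means, can be checked directly. The sketch below is only an illustration: it assumes Stata's bundled `auto` dataset as a stand-in for the VOTE and GROWTH data, and `xbar` and `ybar` are names chosen here, not taken from the slides.

```stata
* Sketch only: the bundled auto data stands in for the VOTE/GROWTH data
sysuse auto, clear
regress price mpg

* Store the mean of X and the mean of Y
quietly summarize mpg
scalar xbar = r(mean)
quietly summarize price
scalar ybar = r(mean)

* Fitted value at the mean of X; the two numbers should agree
* up to floating-point rounding
display "fitted value at mean of X: " (_b[_cons] + _b[mpg]*xbar)
display "mean of Y:                 " (ybar)
```

The same check should work for any bivariate OLS regression that includes an intercept.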

Covariance
Here is the formula for calculating a (sample) covariance
Calculating a Correlation
Take the covariance between two variables and divide it by the product of their standard deviations. Implication: zero covariance means zero correlation.
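The formula image on this slide did not survive extraction. The standard sample definitions are given below as a stand-in. The notation, including the n − 1 denominator, is an assumption and may differ from what the slide showed.

```latex
\[
\operatorname{cov}(X,Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}),
\qquad
r_{XY} = \frac{\operatorname{cov}(X,Y)}{s_X\, s_Y}
\]
```

Here s_X and s_Y are the sample standard deviations of X and Y. Because the covariance sits in the numerator, r is zero whenever the covariance is zero.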

the t-stat for the null that
the coefficient on growth is 0
R2: Variation in growth explains
33% of variation in Vote
. regress VOTE GROWTH

      Source |       SS           df       MS
-------------+----------------------------------
       Model |  406.285326         1  406.285326
    Residual |  736.446506        30  24.5482169
-------------+----------------------------------
       Total |  1142.73183        31  36.
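The fit statistics can be recovered from the sums of squares in this table. The arithmetic below uses only the numbers shown above. Reading the slide's 33% as the adjusted figure is an inference from that arithmetic, not something the slide states.

```latex
\[
R^2 = \frac{SS_{\text{Model}}}{SS_{\text{Total}}}
    = \frac{406.285326}{1142.73183} \approx 0.356
\]
\[
\bar{R}^2 = 1 - \frac{MS_{\text{Residual}}}{SS_{\text{Total}}/df_{\text{Total}}}
          = 1 - \frac{24.5482169}{1142.73183/31} \approx 0.334
\]
\[
F = \frac{MS_{\text{Model}}}{MS_{\text{Residual}}}
  = \frac{406.285326}{24.5482169} \approx 16.55,
\qquad |t_{\text{GROWTH}}| = \sqrt{F} \approx 4.07
\]
```

The last step relies on the fact that with a single regressor the F statistic equals the square of the slope's t statistic.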

The slope of a line: Y=mX+b
Y = Incumbent vote share
X = GDP growth
m, the slope, or how much Y changes when X increases by 1 unit
b, the intercept, the value of Y when X=0
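The sketch below shows how m and b can be read off a fitted line. It assumes Stata's bundled `auto` dataset as a placeholder, since the vote and growth data are not included here; the X values 3000 and 3001 are arbitrary.

```stata
* Sketch only: price plays the role of Y and weight the role of X
sysuse auto, clear
regress price weight

* b, the intercept: predicted Y when X = 0
display "intercept b = " (_b[_cons])

* m, the slope: change in predicted Y when X increases by 1 unit
display "slope m     = " (_b[weight])

* Predictions one unit of X apart differ by the slope
* (up to floating-point rounding)
display "difference  = " ((_b[_cons] + _b[weight]*3001) - (_b[_cons] + _b[weight]*3000))
```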
The relationship between the economy and incumbent
success
How do you find the rig

What to do when there is
heteroscedasticity
Robust standard errors
reg dv iv1 iv2 iv3, robust
Interpret results in exactly the same way as OLS,
but simply state that you calculated robust
standard errors to address heteroscedasticity
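A minimal sketch of that workflow follows, assuming Stata's bundled `auto` dataset in place of `dv`, `iv1`, `iv2`, and `iv3`. The Breusch-Pagan check via `estat hettest` is an addition for illustration and is not mentioned on the slide.

```stata
* Sketch only: price, mpg and weight stand in for dv and the ivs
sysuse auto, clear

* Ordinary OLS, followed by a Breusch-Pagan / Cook-Weisberg test
* of the null hypothesis of constant error variance
regress price mpg weight
estat hettest

* Same model with robust standard errors
* (vce(robust) is the long form of the robust option)
regress price mpg weight, vce(robust)
```

Comparing the two regression tables shows identical coefficients. Only the standard errors, t statistics, p-values, and confidence intervals change.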
Heteroscedasticity an

Summary of last time
We want to infer something (e.g., the mean of X) about a population, but we only
have a sample
We can take advantage of the fact that the mean of the sample we have is related
to the sampling distribution for means of this variable in

Relationships between
variables
Categorical data: Chi-squared test
Continuous (and often ordinal) data:
Correlation (based on covariance)
Scatter plots
Difference in means test
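Each tool in this list has a direct Stata counterpart. The commands below are a sketch on Stata's bundled `auto` dataset, and the variable choices are placeholders rather than anything from the course data.

```stata
* Sketch only: variables from the bundled auto data are placeholders
sysuse auto, clear

* Categorical data: cross-tabulation with a chi-squared test
tabulate rep78 foreign, chi2

* Continuous data: correlation, then the underlying covariance
correlate price mpg
correlate price mpg, covariance

* Scatter plot of the same two variables (Y first, then X)
scatter price mpg

* Difference in means between two groups
ttest mpg, by(foreign)
```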
Chi-squared test
A statistical test commonly used to compare observed data (th

Why do we do OLS regression to estimate
effects, controlling for other variables
OLS assumes certain conditions are satisfied (Gauss-Markov
assumptions)
When these conditions are satisfied, an OLS estimate of a
parameter is BLUE
BEST, LINEAR, UNBIASED ESTIMATOR

Last time
The math of what it means to control for another
variable in a regression
We want to identify the covariation between X
and Y that excludes covariation of X and Z that
is also related to covariation in X and Y
The effect of X, controlling for Z
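One common way to see what "controlling for Z" does is to partial Z out of both X and Y and then regress the leftovers on each other (the Frisch-Waugh-Lovell result). This may not be the derivation used in lecture. The sketch assumes Stata's bundled `auto` dataset, with `price` as Y, `mpg` as X, and `weight` as Z. The names `e_price`, `e_mpg`, and `b_full` are invented for this example.

```stata
* Sketch only: price = Y, mpg = X, weight = Z are placeholders
sysuse auto, clear

* Full model: effect of X on Y, controlling for Z
regress price mpg weight
scalar b_full = _b[mpg]

* Remove the part of Y that is related to Z
regress price weight
predict double e_price, residuals

* Remove the part of X that is related to Z
regress mpg weight
predict double e_mpg, residuals

* Regress what is left of Y on what is left of X
regress e_price e_mpg

* The two slopes should match up to floating-point rounding
display "slope on X in full model:   " (b_full)
display "slope from residual method: " (_b[e_mpg])
```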

Spurious bivariate
relationships
There appears to be a relationship between X and
Y, but when we take into consideration the role of Z,
this relationship goes away
The relationship between height and
conservatism
[Figure: height and conservatism; r = .34; panels: Men only, Women only]
Pu
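A small simulation can show how a spurious relationship of this kind arises. Everything below is invented for illustration: the data are simulated, not the class data, and the parameters are picked so that the pooled correlation comes out at roughly the r = .34 shown on the slide. Whether that figure was the pooled correlation is an assumption.

```stata
* Sketch only: simulated data in which sex (Z) drives both height (X)
* and conservatism (Y), with no direct link between X and Y
clear
set seed 12345
set obs 1000
generate male    = runiform() < 0.5
generate height  = 64 + 6*male + rnormal(0, 2.5)
generate conserv = 3 + 1*male + rnormal(0, 1)

* Pooled: height and conservatism appear positively related
correlate height conserv

* Within each group the relationship should largely disappear
bysort male: correlate height conserv
```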

A team competition
Develop a dependent variable that measures attitudes about anything at all (not necessarily political) on an ordinal 5-point scale. This requires that you define the questions and responses that can be used in a survey of the class.
Develo

Two most important things we want to
know about a continuous variable (and
sometimes an ordinal one)
Location (central tendency)
Dispersion
Draw two distributions with same mean and
different dispersions
Range (lowest and highest value)
Variance/standard de
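The exercise of drawing two distributions with the same mean but different dispersions can also be done in Stata. The sketch below uses simulated data. The variable names `narrow` and `wide`, and the means and standard deviations, are arbitrary choices for illustration.

```stata
* Sketch only: two simulated variables with the same mean (50)
* and different spreads (sd of 5 versus sd of 15)
clear
set seed 2024
set obs 500
generate narrow = rnormal(50, 5)
generate wide   = rnormal(50, 15)

* Location and dispersion side by side
tabstat narrow wide, statistics(mean min max range variance sd)

* Overlay the two estimated densities
twoway (kdensity narrow) (kdensity wide)
```

The means should be close to each other, while the range, variance, and standard deviation of `wide` are clearly larger.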