Assumptions and properties
73261 Econometrics
October 11
Reading: Wooldridge 3.3-3.5, 5

Overview
Done with topic 1 – Mechanics of OLS
  Estimate, test, predict
  Reducing lots of numbers (data)
  to a few ones (coefficients, p-values)

Topic 2: Assumptions and violations
  OLS is mechanical – numbers in, numbers out
  Interpreting the numbers as answers to economic
  questions requires assumptions about the data
  Reality can and does violate these assumptions
  You need to know how to fix violations,
  or at least be aware of the effects
73261
2.1 Assumptions p. 2
© CMU / Y. Kryukov

Plan for today
Assumptions
  A1, A2: Population & data
  A3: Linear independence
  A4, A5: Mean & variance
  A6: Normality of u

Properties – what the assumptions give us
  P1: Unbiasedness
  P2: Efficiency
  P3: Consistency
  P4: Normality of β̂

Assumption A1: Population
The true (population) relationship is: y = xβ + u
  β – constant coefficients
  x – independent variable(s)
  u – residual, a random variable;
    includes any factors of y that are not in x
  y – dependent variable, determined by x, β, u

In reality, things are rarely linear,
but we can deal with nonlinearity (log, x²)
Significance and sign of effects . . .
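A1 can be illustrated by simulating a hypothetical population; the coefficient values and sample size below are illustrative assumptions, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(7)
beta0, beta1 = 1.0, 2.0          # hypothetical population coefficients
n = 10_000

x = rng.normal(size=n)           # independent variable
u = rng.normal(size=n)           # residual: all factors of y not in x
y = beta0 + beta1 * x + u        # A1: the population relationship

# OLS recovers the population coefficients up to sampling noise
X = np.column_stack([np.ones(n), x])
bhat = np.linalg.lstsq(X, y, rcond=None)[0]
print(bhat)
```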

Assumption A2: Data
Data Xi, Yi are randomly drawn
from the population
Mostly a trivial assumption:
"data are what we think they are"

Violations include selection issues:
  Data come from a subgroup of the population
  Selection of a certain subset of candidates
  E.g. . . .
Violations might also violate A4 or A5

A3: Linear independence
No linear dependence in the columns of X
Otherwise X'X is singular, i.e. (X'X)⁻¹ does not exist

Linear dependence = one column is an exact
linear combination of the others:
  Xi,k ≡ γ0 + γ1 Xi,1 + γ2 Xi,2 + … + γk−1 Xi,k−1
Examples:
  A variable with the same value for all observations
  (we have a constant already)
  A rescaled version of another variable:
  log(x²) = 2 log(x)
  A variable that is the sum of others (. . .)
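A minimal numpy sketch of such a violation (the data here are made up): one regressor is built as an exact linear combination of two others, so X'X loses rank.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2.0 * x1 + 3.0 * x2                 # exact linear combination: A3 violated

X = np.column_stack([np.ones(n), x1, x2, x3])
XtX = X.T @ X

rank = np.linalg.matrix_rank(XtX)
print(rank)      # 3, not 4: X'X is singular, so (X'X)^-1 does not exist
```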

A3 and multicollinearity
An A3 violation is a "knife edge" effect:
  Imagine we have Xi,k ≡ Xi,1 – a violation
  Take a very small number ε (ε = 0.0000001)
  Add ε to Xi,k for a single observation i0
  Relationship broken; violation "fixed"

But Xi,k and Xi,1 remain closely correlated
This is called multicollinearity:
  (X'X)⁻¹ exists, but contains huge values
  This makes standard errors high,
  making both xk and x1 insignificant
We will talk more about it in the next class
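The knife-edge point can be sketched numerically (ε and the sample are illustrative; a somewhat larger ε is used here so that the inverse is computable in floating point): perturbing a single observation makes (X'X)⁻¹ exist, but its entries are enormous.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

x1 = rng.normal(size=n)
xk = x1.copy()
xk[0] += 1e-3            # small epsilon on a single observation "fixes" A3

X = np.column_stack([np.ones(n), x1, xk])
XtX_inv = np.linalg.inv(X.T @ X)     # now exists...

biggest = float(np.abs(XtX_inv).max())
print(biggest)           # ...but entries are enormous -> huge standard errors
```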

A4: Expectation of the error
E[u|x] = 0
I.e. the error term has zero mean for any x

Violations:
  Omitting a relevant variable
  that is correlated with an included variable
  Simultaneity – x and u are determined
  simultaneously and depend on each other
  Selection issues – observations were "selected"
  differently for different x's
Solution: Instrumental Variables (topic 2.3)

A4 violation: omitted variable
True model: y = β0 + β1 x1 + β2 x2 + u
  y = wage, x1 = education, x2 = IQ
IQ and education are correlated:
  E[x2 | x1] ≠ E[x2]
If we estimate y = β0 + β1 x1 + w,
we are using w = β2 x2 + u
Now compute E[w | x1]:
  E[w | x1] = β2 E[x2 | x1] + E[u | x1]
(the first term is ≠ 0 by the correlation above,
the second is = 0 by A4),
so E[w | x1] ≠ 0: A4 fails for the model we estimate
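A Monte Carlo sketch of this bias, with made-up coefficients standing in for the wage/education/IQ story: omitting the correlated regressor pushes the estimated slope systematically away from the truth.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 500, 200
b0, b1, b2 = 1.0, 2.0, 0.5       # hypothetical true coefficients

slopes = []
for _ in range(reps):
    x2 = rng.normal(size=n)                  # "IQ"
    x1 = 0.8 * x2 + rng.normal(size=n)       # "education", correlated with IQ
    y = b0 + b1 * x1 + b2 * x2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1])    # x2 omitted -> error is w = b2*x2 + u
    slopes.append(np.linalg.lstsq(X, y, rcond=None)[0][1])

avg_slope = float(np.mean(slopes))
print(avg_slope)     # systematically above b1 = 2.0: upward omitted-variable bias
```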

A5: Variance of the error
V[u|x] = σ²
I.e. the variance of u is the same for any x

Violations (topic 2.4, Heteroskedasticity):
  V[u|x] scales with x
  => OLS gives more weight to . . .

More broadly, we also assume that
errors are independent across observations:
  cov(ui, ul) = 0
Often violated in time series (Autocorrelation),
i.e. the present is correlated with the past
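A quick sketch of the heteroskedasticity violation under made-up numbers: the error's spread grows with x, so V[u|x] is not constant.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 100_000

x = rng.uniform(1.0, 10.0, size=n)
u = x * rng.normal(size=n)       # V[u|x] = x^2 scales with x: A5 violated

sd_small_x = float(np.std(u[x < 3.0]))
sd_large_x = float(np.std(u[x > 8.0]))
print(sd_small_x, sd_large_x)    # error spread grows with x
```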

A6: Normal errors
The Ultimate Assumption:
  u ~ Normal[0, σ²], i.i.d. across observations
Replaces A4 & A5

Very strong:
  There are many distributions
  with zero mean and variance σ²,
  yet we picked only one
We can test it by looking at the distribution
of the ûi's – we have a way to . . .
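One simple way such a check can fail is fat tails; a sketch with simulated errors (the Student-t alternative is an illustrative choice) compares sample kurtosis, which is about 3 for a Normal:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 200_000

u_normal = rng.normal(size=n)
u_fat = rng.standard_t(df=6, size=n)   # zero mean, but fatter tails than Normal

def kurtosis(z):
    # standardized fourth moment; equals 3 for a Normal distribution
    z = (z - z.mean()) / z.std()
    return float(np.mean(z ** 4))

k_norm, k_fat = kurtosis(u_normal), kurtosis(u_fat)
print(k_norm, k_fat)   # ~3 for the Normal, clearly larger for fat tails
```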

Property 1: Unbiasedness
Under assumptions A1–A4, we have:
  E[β̂] = β
I.e. β̂ is an unbiased estimate of β

Unbiased estimate – interpretation:
  Keep the number of observations N fixed
  Repeat the estimation with many different datasets
  The average of the estimates will be β
A "poll of polls" is better than a single poll;
same for "most studies find that …"
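The interpretation above can be sketched directly (true coefficients are illustrative): keep N fixed, redraw the dataset many times, and average the estimates.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 50, 2000               # N stays fixed; the dataset is redrawn
beta0, beta1 = 1.0, 2.0          # hypothetical truth

estimates = []
for _ in range(reps):
    x = rng.normal(size=n)
    y = beta0 + beta1 * x + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    estimates.append(np.linalg.lstsq(X, y, rcond=None)[0])

avg = np.mean(estimates, axis=0)
print(avg)   # averaging over many datasets recovers (beta0, beta1)
```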

Property 2: Efficiency
Under A1–A5, the OLS estimate
  β̂ = (X'X)⁻¹ X'Y
is B.L.U.E.:
  Best: lowest variance among L.U.E.
  Linear: (X'X)⁻¹X'Y = Σi Wi Yi
  Unbiased: E[β̂] = β
  Estimator: a formula (or method)
  for computing an estimate from the data

Efficiency (in statistics) = low variance;
it tells you how far you are from the truth
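The "Linear" part can be sketched on made-up data: the weights W = (X'X)⁻¹X' depend only on X, and β̂ is just those fixed weights applied to Y.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 30
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
W = np.linalg.inv(X.T @ X) @ X.T        # weights depend only on X

bhat_formula = W @ y                     # beta_hat = sum_i W_i * Y_i
bhat_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

same = bool(np.allclose(bhat_formula, bhat_lstsq))
print(same)   # True: OLS is a Linear function of Y
```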

Property 3: Consistency
Under A1–A4, β̂ is a consistent estimate of β,
i.e. β̂ is "converging" to β as N increases:
  plim (N→∞) β̂ = β

β̂ is a random variable with nonzero variance
As N increases, its variance converges to 0,
i.e. the r.v. becomes less and less random
and concentrates around a single number
That number is the true β

Consistency = more data gets us
closer to the truth
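A sketch of the shrinking variance (sample sizes and coefficients are illustrative): the spread of the OLS slope across datasets falls as N grows.

```python
import numpy as np

rng = np.random.default_rng(5)
reps = 500

def slope_spread(n):
    """Std. dev. of the OLS slope across many datasets of size n."""
    slopes = []
    for _ in range(reps):
        x = rng.normal(size=n)
        y = 1.0 + 2.0 * x + rng.normal(size=n)
        X = np.column_stack([np.ones(n), x])
        slopes.append(np.linalg.lstsq(X, y, rcond=None)[0][1])
    return float(np.std(slopes))

sd_small, sd_large = slope_spread(25), slope_spread(400)
print(sd_small, sd_large)   # the estimator's spread shrinks as N grows
```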

Normality
Remember the tests?
They were derived from A6: u ~ Normal[0, σ²]:
  (β̂j − βj) / se(β̂j) ~ t(N−k−1)

We really do not want to depend on A6
We can test whether ûi ~ Normal[0, σ̂²]
Most of the time, it won't be (skewed, fat tails)
But we still want to use the test statistics,
so we need to get normality elsewhere

Asymptotic Normality theory
Recall the Central Limit Theorem:
  Have i.i.d. variables Z1, Z2, …, ZN, …
  Each has mean µ and variance σ²
  Let Z̄N = (1/N) Σ(i=1..N) Zi
  Then √N (Z̄N − µ) / σ ~ Normal[0,1] as N → ∞

The shape of the distribution of Σ(i=1..N) Zi
approaches the Normal distribution;
the rest is just shifting (centering)
and scaling (normalizing) to ensure
mean = 0, variance = s.d. = 1
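The CLT can be sketched with a deliberately non-Normal Zi (uniform on [0,1] is an illustrative choice): the centered and scaled sample mean behaves like Normal[0,1].

```python
import numpy as np

rng = np.random.default_rng(6)
N, reps = 1000, 2000

# Z_i far from Normal: uniform on [0,1], with mean 1/2 and variance 1/12
mu, sigma = 0.5, np.sqrt(1.0 / 12.0)

zbar = rng.uniform(size=(reps, N)).mean(axis=1)
t = np.sqrt(N) * (zbar - mu) / sigma   # centered and scaled sample means

m, s = float(t.mean()), float(t.std())
print(m, s)   # close to 0 and 1: the standardized mean is ~Normal[0,1]
```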

P4: Asymptotic Normality
Under A1–A5, we have
  alim (N→∞) [ (β̂j − βj) / se(β̂j) ] ~ Normal[0,1]
I.e. we can do tests using the Normal distribution
instead of Student's t:
  alim (N→∞) t(N−k−1) ~ Normal[0,1]
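A Monte Carlo sketch of P4 under illustrative choices (skewed exponential errors, made-up coefficients): even without A6, the standardized slope estimate behaves like Normal[0,1] for large N.

```python
import numpy as np

rng = np.random.default_rng(10)
n, reps = 2000, 1000

tstats = []
for _ in range(reps):
    x = rng.normal(size=n)
    u = rng.exponential(size=n) - 1.0    # skewed, clearly non-Normal, mean 0
    y = 1.0 + 2.0 * x + u
    X = np.column_stack([np.ones(n), x])
    bhat = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ bhat
    s2 = resid @ resid / (n - 2)                       # error variance estimate
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])    # se of the slope
    tstats.append((bhat[1] - 2.0) / se)

t = np.asarray(tstats)
m, s = float(t.mean()), float(t.std())
print(m, s)   # close to 0 and 1 even though u is not Normal
```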

Summary
Property \ Assm         A1  A2  A3  A4  A5  A6
P1: Unbiasedness        ✓   ✓   ✓   ✓
P2: Efficiency          ✓   ✓   ✓   ✓   ✓
P3: Consistency         ✓   ✓   ✓   ✓
P4': Normality          ✓   ✓   ✓   ✓   ✓   ✓
P4: Asympt. Normality   ✓   ✓   ✓   ✓   ✓

A1–2 are inherent in OLS
A3 is easy to check; multicollinearity is not
A4 (E[u|x] = 0): a violation ruins everything – fix via IVs
A5 (V[u|x] = σ²) affects efficiency and s.e.(β̂)
A6 is too strong; we avoid using it
This note was uploaded on 01/21/2011 for the course ECON 73261 taught by Professor Kyrkv during the Fall '09 term at Carnegie Mellon.