21_Assumptions_handout

Assumptions and properties
73-261 Econometrics, October 11
Reading: Wooldridge 3.3-3.5, 5

Overview
- Done with topic 1 – Mechanics of OLS
  - Estimate, test, predict
  - Reducing lots of numbers (data) to a few (coefficients, p-values)
- Topic 2: Assumptions and violations
  - OLS is mechanical – numbers in, numbers out
  - Interpreting those numbers as answers to economic questions requires assumptions about the data
  - Reality can and does violate these assumptions
  - You need to know how to fix violations, or at least be aware of their effects

Plan for today
- Assumptions
  - A1, A2: Population & data
  - A3: Linear independence
  - A4, A5: Mean & variance of the error
  - A6: Normality of u
- Properties – what the assumptions give us
  - P1: Unbiasedness
  - P2: Efficiency
  - P3: Consistency
  - P4: Normality of β̂

Assumption A1: Population
- The true (population) relationship is: y = xβ + u
  - β – constant coefficients
  - x – independent variable(s)
  - u – error term, a random variable; includes any factors of y that are not in x
  - y – dependent variable, determined by x, β, u
- In reality, things are rarely linear
  - But we can deal with non-linearity (log x, x²)
  - Significance and sign of effects . . .

Assumption A2: Data
- Data Xi, Yi are randomly drawn from the population described in A1
- Mostly a trivial assumption: "data are what we think they are"
- Violations include selection issues
  - Data come from a sub-group of the population
  - Selection of a certain sub-set of candidates
  - E.g. . . .
- Violations might also violate A4 or A5

A3: Linear independence
- No linear dependence among the columns of X
  - Otherwise X'X is singular, i.e. (X'X)⁻¹ does not exist
- Linear dependence = one column is a linear combination of the others:
  X_{i,k} ≡ γ0 + γ1 X_{i,1} + γ2 X_{i,2} + … + γ_{k−1} X_{i,k−1}
  - A variable with the same value for all observations (we have a constant already)
  - A re-scaled version of another variable: log(x²) = 2 log(x)
  - A variable that is the sum of others (. . .)

A3 and multicollinearity
- An A3 violation is a "knife edge" effect
  - Imagine we have X_{i,k} ≡ X_{i,1} – a violation
  - Take a very small number ε (ε = 0.0000001)
  - Add ε to X_{i,k} for a single observation i0
  - The exact relationship is broken; the violation is "fixed"
- But X_{i,k} and X_{i,1} remain closely correlated
  - This is called multicollinearity
  - (X'X)⁻¹ exists, but contains very large entries
  - This makes standard errors high, making both xk and x1 look insignificant
- We will talk more about it in the next class

A4: Expectation of the error
- E[u|x] = 0
  - I.e. the error term has zero mean for every value of x
- Violations:
  - Omitting a relevant variable that is correlated with an included variable
  - Simultaneity – x and u are determined simultaneously and depend on each other
  - Selection issues – observations were "selected" differently for different x's
- Solution: Instrumental Variables (topic 2.3)

A4 violation: omitted variable
- True model: y = β0 + β1 x1 + β2 x2 + u
  - y = wage, x1 = education, x2 = IQ
  - IQ and education are correlated: E[x2 | x1] ≠ E[x2]
- If we estimate y = β0 + β1 x1 + w, we are using w = β2 x2 + u
- Now compute E[w | x1]:
  E[w | x1] = β2 E[x2 | x1] + E[u | x1], where E[x2 | x1] ≠ 0 while E[u | x1] = 0,
  so E[w | x1] ≠ 0 and A4 fails in the short regression (see the simulation sketch below)
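The omitted-variable argument above can be checked numerically. The following Python sketch is not part of the handout; the data-generating process, variable names, and parameter values (β2 = 0.3, a 0.8 loading of x2 on x1, etc.) are made up for illustration. It regresses a simulated wage on "education" with and without the correlated "IQ" variable, and the short regression's slope picks up the bias β2·cov(x1, x2)/var(x1).

```python
# Illustrative sketch of omitted-variable bias (hypothetical numbers, not from the handout).
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Assumed true model: y = b0 + b1*x1 + b2*x2 + u, with x1 and x2 correlated.
b0, b1, b2 = 1.0, 0.5, 0.3
x1 = rng.normal(size=N)                    # stands in for education
x2 = 0.8 * x1 + rng.normal(size=N)         # stands in for IQ, correlated with education
u = rng.normal(size=N)
y = b0 + b1 * x1 + b2 * x2 + u

# Long regression: include both x1 and x2, so E[u|x] = 0 holds and the estimate is unbiased.
X_long = np.column_stack([np.ones(N), x1, x2])
b_long = np.linalg.lstsq(X_long, y, rcond=None)[0]

# Short regression: omit x2, so w = b2*x2 + u ends up in the error term and E[w|x1] != 0.
X_short = np.column_stack([np.ones(N), x1])
b_short = np.linalg.lstsq(X_short, y, rcond=None)[0]

print("b1 with x2 included:", round(b_long[1], 3))   # close to 0.5
print("b1 with x2 omitted: ", round(b_short[1], 3))  # close to 0.5 + 0.3*0.8 = 0.74
```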
A5: Variance of the error
- V[u|x] = σ²
  - The same variance of u for any x
- Violations (topic 2.4, Heteroskedasticity):
  - V[u|x] scales with x ⇒ OLS gives more weight to . . .
- More broadly, we also assume that errors are independent across observations: cov(Ui, Ul) = 0
  - Often violated in time series (autocorrelation), i.e. the present is correlated with the past

A6: Normal errors
- The ultimate assumption: u ~ Normal[0, σ²], i.i.d. across observations
  - Replaces A4 and A5
- Very strong: there are many distributions with zero mean and variance σ², yet we picked only one
- We can test it by looking at the distribution of the Ûi's
  - So we have a way to check whether the assumption holds

Property 1: Unbiasedness
- Under assumptions A1-A4, we have: E[β̂] = β
  - I.e. β̂ is an unbiased estimate of β
- Unbiased estimate – interpretation:
  - Keep the number of observations N fixed
  - Repeat the estimation with many different datasets
  - The average of the estimates will be the true β
  - A "poll of polls" is better than a single poll; same for "most studies find that …"

Property 2: Efficiency
- Under A1-A5, the OLS estimate β̂ = (X'X)⁻¹ X'Y is B.L.U.E.:
  - Best: lowest variance among linear unbiased estimators
  - Linear: (X'X)⁻¹ X'Y = Σi Wi Yi
  - Unbiased: E[β̂] = β
  - Estimator: a formula (or method) for computing an estimate from the data
- Efficiency (in statistics) = low variance
  - Tells you how far you are from the truth

Property 3: Consistency
- Under A1-A4, β̂ is a consistent estimate of β
  - I.e. β̂ is "converging" to β as N increases: plim_{N→∞} β̂ = β
- β̂ is a random variable, with nonzero variance
  - As N increases, its variance converges to 0
  - I.e. the r.v. becomes less and less random, and concentrates around a single number
  - That number is the true β
- Consistency = more data gets us closer to the truth

Normality
- Remember the tests? They were derived from A6: u ~ Normal[0, σ²]
  - (β̂j − βj) / se(β̂j) ~ t_{N−k−1}
- We really do not want to depend on A6
  - We can test whether Ûi ~ Normal[0, σ̂²]
  - Most of the time, it won't be (skewed, fat tails)
- But we want to use test statistics
  - Need to get normality elsewhere

Asymptotic Normality theory
- Recall the Central Limit Theorem
  - Have i.i.d. variables Z1, Z2, …, ZN, …, each with mean µ and variance σ²
  - Let Z̄_N = (1/N) Σ_{i=1}^N Zi
  - Then √N (Z̄_N − µ) / σ ~ Normal[0,1] as N → ∞
- The shape of the distribution of Σ_{i=1}^N Zi approaches the normal
  - The rest is just shifting (centering) and scaling (normalizing) to ensure mean = 0 and variance = s.d. = 1

P4: Asymptotic Normality
- Under A1-A5, we have:
  alim_{N→∞} (β̂j − βj) / se(β̂j) ~ Normal[0,1]
  - I.e. we can do tests using the standard normal instead of Student's t
  - alim_{N→∞} t_{N−k−1} ~ Normal[0,1]

Summary
- Which assumptions each property needs:

  Property \ Assm         | A1 | A2 | A3 | A4 | A5 | A6
  P1: Unbiasedness        |  x |  x |  x |  x |    |
  P2: Efficiency          |  x |  x |  x |  x |  x |
  P3: Consistency         |  x |  x |  x |  x |    |
  P4': Normality          |  x |  x |  x |  x |  x |  x
  P4: Asympt. Normality   |  x |  x |  x |  x |  x |

- A1-A2 are inherent in OLS
- A3 is easy to check; multicollinearity is not
- A4 (E[u|x] = 0) ruins everything – fix via IVs
- A5 (V[u|x] = σ²) affects efficiency and s.e.(β̂)
- A6 is too strong; we avoid using it (see the simulation sketch below)
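A quick way to see why consistency (P3) and asymptotic normality (P4) let us drop A6: simulate OLS with deliberately non-normal errors and watch the slope estimates concentrate on the truth and take on a bell shape as N grows. This Python sketch is not from the handout; the data-generating process and all names and values are invented for illustration.

```python
# Illustrative sketch: consistency and asymptotic normality of the OLS slope
# when the errors are skewed (A6 fails) but A1-A5 hold. Hypothetical setup.
import numpy as np

rng = np.random.default_rng(1)
beta_true = 2.0

def slope_estimates(N, reps=2000):
    """Return `reps` OLS slope estimates, each computed from a fresh sample of size N."""
    slopes = np.empty(reps)
    for r in range(reps):
        x = rng.uniform(0.0, 1.0, size=N)
        u = rng.exponential(1.0, size=N) - 1.0   # skewed, non-normal error with mean zero
        y = 1.0 + beta_true * x + u              # A1-A5 hold by construction
        X = np.column_stack([np.ones(N), x])
        slopes[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]
    return slopes

for N in (10, 100, 1000):
    s = slope_estimates(N)
    # P1/P3: the mean stays near beta_true and the variance shrinks toward 0 as N grows.
    # P4: for large N, a histogram of `s` looks normal even though u is skewed.
    print(f"N={N:5d}  mean={s.mean():.3f}  var={s.var():.4f}")
```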