

(34) The extreme case is where $k = n$. Then $SSR = 0$, hence $R^2 = 1$. To penalize this, the $R^2$ is adjusted as:
$$\bar{R}^2 = 1 - \frac{SSR/(n-k)}{TSS/(n-1)}. \qquad (35)$$
The reason for this particular adjustment is that under the conditions of Proposition 1, $SSR \sim \chi^2_{n-k}$, whereas under the null hypothesis $\beta_1 = \cdots = \beta_{k-1} = 0$, $TSS \sim \chi^2_{n-1}$.
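Formula (35) can be checked numerically. A minimal NumPy sketch, where the simulated data set and seed are illustrative assumptions (any regression data would do):

```python
import numpy as np

# Simulated regression data (illustrative assumption, not from the text).
rng = np.random.default_rng(0)
n, k = 100, 4                        # n observations, k regressors incl. intercept
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
y = X @ np.ones(k) + rng.standard_normal(n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b
SSR = resid @ resid                  # sum of squared residuals
TSS = ((y - y.mean()) ** 2).sum()    # total sum of squares

R2 = 1 - SSR / TSS
R2_adj = 1 - (SSR / (n - k)) / (TSS / (n - 1))   # formula (35)
```

Since $(n-1)/(n-k) > 1$ for $k > 1$, the adjusted $\bar{R}^2$ is always below the unadjusted $R^2$ whenever $R^2 < 1$, which is exactly the penalty for adding regressors.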

5. Multicollinearity

Multicollinearity is the phenomenon that (some of) the explanatory variables are highly correlated. The effect of multicollinearity is that the t-values are deflated. To demonstrate this, I have generated artificial data $Y_j, X_{1,j}, X_{2,j}$ for $j = 1, \ldots, n = 500$ as follows. The explanatory variables $X_{1,j}$ have been drawn independently from the N(0,1) distribution. Next, I have drawn random variables $V_j$ independently from the N(0,1) distribution, and have set $X_{2,j} = X_{1,j} + 0.01 V_j$. Due to this construction the explanatory variables $X_{1,j}$ and $X_{2,j}$ are highly correlated. In particular, the $R^2$ of the regression of $X_{2,j}$ on $X_{1,j}$ is 0.999892. Finally, I have drawn the errors $U_j$ independently from the N(0,1) distribution, and have generated the dependent variables by
$$Y_j = X_{1,j} + X_{2,j} + 1 + U_j, \qquad j = 1, \ldots, n = 500. \qquad (36)$$
This is model (7) with $k = 3$ and $\beta_1 = \beta_2 = \beta_3 = 1$. The EasyReg output involved is below.

OLS estimation results

Parameters  Estimate   S.E.       t-value  p-value   H.C. S.E.  H.C. t-value  H.C. p-value
b(1)        0.45258    (4.48811)  0.101    [0.91968] (4.35703)  0.104         [0.91727]
b(2)        1.57385    (4.48634)  0.351    [0.72573] (4.35350)  0.362         [0.71772]
b(3)        0.95879    (0.04483)  21.386   [0.00000] (0.04478)  21.410        [0.00000]

Notes:
1: S.E. = Standard error
2: H.C. = Heteroskedasticity Consistent. These t-values and standard errors are based on White's heteroskedasticity consistent variance matrix.
3: The two-sided p-values are based on the normal approximation.
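This simulation is easy to replicate. The following NumPy sketch (the seed is an illustrative assumption, and the standard errors are the conventional OLS ones rather than EasyReg's exact output) reproduces the qualitative pattern: huge standard errors on b(1) and b(2), a precise intercept estimate:

```python
import numpy as np

rng = np.random.default_rng(42)           # illustrative seed (assumption)
n = 500
x1 = rng.standard_normal(n)
x2 = x1 + 0.01 * rng.standard_normal(n)   # X2 = X1 + 0.01*V: nearly collinear
y = x1 + x2 + 1 + rng.standard_normal(n)  # model (36), beta1 = beta2 = beta3 = 1

X = np.column_stack([x1, x2, np.ones(n)])  # regressors: X1, X2, intercept
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                      # OLS estimates
resid = y - X @ b
s2 = resid @ resid / (n - 3)               # residual variance estimate
se = np.sqrt(s2 * np.diag(XtX_inv))        # conventional OLS standard errors
t = b / se                                 # t-values
```

The standard errors of the two collinear slopes blow up to around 4.5 (instead of roughly 0.06 for uncorrelated regressors), deflating the t-values, while the sum b[0] + b[1] is still estimated precisely near its true value 2: only the joint effect is well identified.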
Effective sample size (n): 500
Variance of the residuals: 1.00464
Standard error of the residuals (SER): 1.002317
Residual sum of squares (RSS): 499.305951
(Also called SSR = Sum of Squared Residuals)
Total sum of squares (TSS): 2396.64478
R-square: 0.791665
Adjusted R-square: 0.790826

Overall F test: F(2,497) = 944.29, p-value = 0.00000
Significance levels: 10%    5%
Critical values:     2.31   3.01
Conclusions:         reject reject

Breusch-Pagan test = 3.286972
Null hypothesis: The errors are homoskedastic
Null distribution: Chi-square(2)
p-value = 0.19331
Significance levels: 10%    5%
Critical values:     4.61   5.99
Conclusions:         accept accept

Note that b(1), b(2) and b(3) are the OLS estimators of $\beta_1$, $\beta_2$ and $\beta_3$, respectively. Observe that not only are the OLS estimators of $\beta_1$ and $\beta_2$ way off from the true value 1, but also the corresponding t-values are deflated to levels where the null hypotheses $\beta_1 = 0$ and $\beta_2 = 0$ cannot be rejected by separate t tests at any reasonable significance level. On the other hand, the overall F test strongly rejects the joint null hypothesis that $\beta_1 = \beta_2 = 0$. These contradictory results are due to multicollinearity.

There is no cure for multicollinearity. The only thing you can do is be aware of it, and always test joint hypotheses using the F or Wald test rather than separate t tests.

Assumption 1 does not rule out multicollinearity, but only its extreme form, where one explanatory variable is an exact linear function of the other explanatory variables. For example, consider model (2), and suppose that $Z_j = \eta + \delta X_j$ without error. Then model (2) becomes
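The contrast between the insignificant separate t tests and the strongly significant overall F test can be verified directly. A minimal sketch of the F test of $\beta_1 = \beta_2 = 0$ via restricted versus unrestricted residual sums of squares, using the same simulated design (seed again an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(7)            # illustrative seed (assumption)
n = 500
x1 = rng.standard_normal(n)
x2 = x1 + 0.01 * rng.standard_normal(n)
y = x1 + x2 + 1 + rng.standard_normal(n)

# Unrestricted model: regress y on X1, X2 and an intercept.
X = np.column_stack([x1, x2, np.ones(n)])
b = np.linalg.lstsq(X, y, rcond=None)[0]
rss_u = ((y - X @ b) ** 2).sum()

# Restricted model under H0: beta1 = beta2 = 0, i.e. intercept only,
# so the restricted RSS equals the total sum of squares.
rss_r = ((y - y.mean()) ** 2).sum()

# Overall F statistic with q = 2 restrictions and n - 3 residual df.
F = ((rss_r - rss_u) / 2) / (rss_u / (n - 3))
```

The resulting F statistic is far above the 5% critical value of about 3.01, in line with the reported F(2,497) = 944.29, even though neither separate t test comes close to rejecting.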
