day5lm - 2/11/12 PADP 8130: Linear Models ...


PADP 8130: Linear Models
Hypothesis Testing
Angela Fertig, Ph.D.

• Up until now, we've been getting estimates for parameters.
  – We're convinced that OLS is a good way of getting an estimate.
  – We think multiple regression is a great way to control for confounding factors.
• Now, we want to be able to say how confident we are that our specific OLS estimates are the same as, or different from, hypothesized parameter values (mostly 0).

Plan
• Standard error of the OLS estimator
• Confidence intervals
• Hypothesis testing procedure
  – t-test
  – F-test
  – Chi-square test (Wald)

Measures of dispersion of the population distribution
• Variance: $\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$
• Standard deviation (square root of the variance): $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2}$

Population vs. sampling distribution
• If we took lots of samples of a population and calculated some statistic (a mean or an OLS estimate) from each sample, we would have a distribution of sample statistics, or the sampling distribution.
• Due to averaging, the sample statistic does not vary as widely as the individual observations.
• Moreover, if we took lots of samples, the distribution of the sample means would be centered around the population mean (unbiased).
• As the sample size n increases, the sampling distribution looks more and more like a normal distribution (central limit theorem).

[Figure: the population distribution next to the narrower sampling distribution; the mean of all sample means coincides with the population mean.]

Dispersion of the sampling distribution
• Sampling distributions that are tightly clustered will give us a more accurate estimate on average than those that are more dispersed.
• We need to estimate the dispersion of our sampling distribution so that we know how good our statistic is.
→ The standard error is the standard deviation of the sampling distribution.
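The claim that the standard error is the standard deviation of the sampling distribution can be checked with a quick simulation. All numbers below (population mean 10, SD 2, sample size 25) are made up for illustration; the point is that the SD of many simulated sample means matches $\sigma/\sqrt{n}$.

```python
# Sketch: draw many samples, compute each sample's mean, and compare the
# standard deviation of those means (the empirical standard error) with the
# theoretical value sigma / sqrt(n). All parameter values are illustrative.
import math
import random
import statistics

random.seed(42)
mu, sigma = 10.0, 2.0   # hypothetical population parameters
n = 25                  # size of each sample
reps = 20_000           # number of samples drawn

sample_means = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(reps)
]

empirical_se = statistics.stdev(sample_means)  # SD of the sampling distribution
theoretical_se = sigma / math.sqrt(n)          # = 0.4 with these numbers

print(f"empirical SE  : {empirical_se:.3f}")
print(f"theoretical SE: {theoretical_se:.3f}")
```

The sample means also cluster around the population mean, illustrating the unbiasedness point above.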
Standard error of b
• If we took lots of separate samples and then calculated lots of separate regression lines, we would get a distribution of slope coefficients b.
• The sampling distribution of b is normal if the sample size is large, and the mean of all the possible b's is β.
• The formula for the standard error of b is:

  $SE(b) = \frac{s}{\sqrt{\sum_i (X_i - \bar{X})^2}}$, or in matrix form $\sqrt{s^2 (X'X)^{-1}}$,

  where $s^2 = \frac{\sum_i (Y_i - \hat{Y}_i)^2}{n-k} = \frac{\sum_i e_i^2}{n-k} = \frac{e'e}{n-k}$ and $\hat{Y} = Xb$.

• s is the square root of the estimated variance s². We don't know the parameter σ, so we must estimate s instead.

Interpretation of SE
The standard error of a point estimate gives you the variation in the sampling distribution around the point, so that you can:
– Give a confidence interval
– Conduct tests of hypotheses

Confidence interval
• A confidence interval for an estimate is a range of numbers within which the parameter is likely to fall.
• We can use the standard error to produce such a range: estimate ± (z × standard error).
• z is the confidence coefficient, chosen to pin down what "likely" means, i.e., the probability that the interval contains the actual parameter value (usually close to 1, like 0.95 or 0.99).
  – Since the sampling distribution is normal, we know the values of z that correspond to any desired probability.

Normal distributions
– Have a bell shape
– Are symmetrical
– Follow the empirical rule. The probability of falling within z standard deviations of the mean is:

  Confidence   z
  68%          1.00
  95%          1.96
  99%          2.58
  99.9%        3.29

Example
With 95% confidence: β = b ± 1.96 × SE(b).
Example: if b = 0.5 and SE(b) = 0.2, then β = 0.5 ± 1.96 × 0.20 = 0.5 ± 0.39.
So, with 95% confidence, the slope of our line (the parameter) lies between 0.11 and 0.89.

[Figure: a single 95% confidence interval (z = 1.96) around the estimate b = 8.5, with the true β = 6.]

More samples
That's just one sample. Let's imagine that we took many samples, and then calculated 95% confidence intervals for all of the sample means.
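The "many samples" thought experiment can be sketched as a simulation: repeatedly draw a sample, form estimate ± 1.96 × SE, and count how often the interval covers the true value. The population values below are made up; the takeaway is that coverage lands near the 95% confidence coefficient.

```python
# Sketch: coverage of 95% confidence intervals across many samples.
# Population values are hypothetical; with z = 1.96, roughly 95% of the
# intervals should cover the true mean.
import math
import random
import statistics

random.seed(7)
mu, sigma = 6.0, 1.5      # hypothetical "true" parameter and population SD
n, reps, z = 50, 5_000, 1.96

covered = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # estimated standard error
    if xbar - z * se <= mu <= xbar + z * se:       # does the CI cover mu?
        covered += 1

print(f"coverage: {covered / reps:.3f}")
```

Because s estimates σ here, coverage runs slightly under 95% at modest n, which is exactly the small-sample issue the t-distribution (next slide) corrects.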
[Figure: 95% confidence intervals from seven different samples, plotted around the true β = 6.]

Interpretation
• Of the 7 samples, all of the confidence intervals around the estimated coefficient included the actual true beta except for one.
• If we took more samples, we would expect 95% of the confidence intervals to include the actual beta.
  – 95% because that's the confidence coefficient we picked.

Exact confidence coefficient for small sample sizes
• Because we don't know the population standard deviation and must use the sample standard deviation to get the estimated standard error, there is error, especially when the sample size is small.
• To account for this error, for small n we should use the t-distribution, not the normal distribution, to estimate the confidence interval. The t-distribution has fatter tails than the normal distribution.
• There are tables that give these scores for different confidence levels and different degrees of freedom (df = n - 1). The t-distribution looks almost exactly like the normal distribution for large df.

[Figure: the t-distribution compared with the normal distribution; the t has fatter tails at small df.]

t-distribution table

  Confidence   t(df=1)   t(df=10)   t(df=30)   t(df=100)   z
  90%          6.31      1.81       1.70       1.66        1.65
  95%          12.71     2.23       2.04       1.98        1.96
  99%          63.66     3.17       2.75       2.63        2.58

Controlling the confidence interval
• Choose a different confidence level.
  – If we picked 99% confidence instead, the interval would be larger.
  – If we picked 90% confidence, the interval would be narrower, but we would be wrong more often.
• Change the sample size.
  – The bigger the sample size, the lower the standard error, and therefore the smaller the confidence interval for a given probability.

Now: Hypothesis Testing

What is a hypothesis?
• A hypothesis is a testable statement about the world, usually a prediction that some parameter takes a particular numerical value.
  – We test hypotheses by attempting to see if they could be false, rather than "proving" them to be true.
  – E.g.
You cannot prove that all swans are white by counting white swans, but you can prove that not all swans are white by counting one black swan.
• We generate hypotheses from a combination of theory, past empirical work, qualitative research, common sense, and anecdotal observations about the world.

Null and alternative hypotheses
• When we're testing hypotheses, we want to choose between 2 conflicting statements:
  – The null hypothesis (H0) is directly tested.
    • This is a statement that the parameter we are interested in has a value consistent with no effect; e.g., rich and poor people are equally likely to have a regular place for medical care.
  – The alternative hypothesis (Ha) contradicts the null hypothesis.
    • This is a statement that the parameter falls into a different set of values than those predicted by H0; e.g., rich and poor people have different probabilities of having a regular place for care.

Two-sided vs. one-sided tests
• One-sided test: H0: μ = 16; Ha: μ < 16
• Two-sided test: H0: μ = 16; Ha: μ ≠ 16
• Two-sided tests are the convention because:
  – They make it even more difficult to find results due to chance.
  – We normally don't have strong prior information about the direction of the difference.
  – Two-tailed tests appear more objective (not influenced by your beliefs about the direction).

t-score
• We often use the t-score instead of the z-score as our test statistic, because using s to estimate σ in the standard error introduces additional error.
• This is why significance tests are often called t-tests.
• This is especially important if n < 30 or 40. To use it, we need to assume that the population distribution is normal.

  $t = \frac{b - \beta}{se_b}$

  or, more generally, if H0: Rb = q,

  $t = \frac{Rb - q}{\sqrt{s^2 R(X'X)^{-1}R'}}$

Interpreting hypothesis tests
• We never accept the null hypothesis.
• We either reject or fail to reject based on our p-value:
  – Make the judgment that p-values of, say, 5% and below are probably good evidence that the null hypothesis can be rejected.
• We may fail to reject the null hypothesis because the null hypothesis is true, or because of:
  – Small sample size
  – Inappropriate research design
  – Biased sample
  – Etc.

Example
• We usually want to test whether β = 0 in the population.
• So, we calculate the t-statistic (how many SEs from zero is the b?), then get the p-value.

  $t = \frac{\text{Estimate} - \text{Null hypothesis}}{\text{Standard error}} = \frac{0.5 - 0}{0.2} = 2.5$

Rule of thumb: |t| > 2 is significantly different from 0.
Pr(estimate being more than 2.5 SEs higher than the null) = 0.007
  – Thus, for a 2-sided test, there is only a 1.4% chance that we would get this estimate if the null hypothesis is true.
  – So, we can reject the null hypothesis that β = 0. That is, the effect is significantly different from zero.

Steps for a hypothesis test
1. Check assumptions (i.e., normality, sample size).
2. State hypotheses: null and alternative, one-sided or two-sided.
3. Calculate the appropriate test statistic (a summary of how far the estimate falls from the parameter value in H0, e.g., the t-score).
4. Calculate the associated p-value (the probability of an estimate at least this far from the parameter value in H0, if H0 is true).
5. Interpret the result.

Type I and Type II errors
• A Type I error occurs when we reject H0 even though it is true.
  – This will happen 5% of the time if we choose to reject H0 when the p-value is less than 5%.
• A Type II error occurs when we do not reject H0 even though it is false.
  – Sometimes there is a real difference, but we don't detect it.

Trade-offs
• There is a trade-off between the two types of error. The more stringent the significance level:
  – The more difficult it is to detect a real effect (more Type II error),
  – But the more confident we can be that when we find an effect it is real (less Type I error).
• Depending on what we are doing, we may be more willing to accept one sort of error or the other.
  – Analogous to a legal trial.
We don't want the guilty to go free (a Type II error), but we'd be even unhappier if we executed an innocent person (a Type I error).

F-test
• Purpose: An F-test is used to test a joint hypothesis. A joint hypothesis tests hypotheses on more than one coefficient at the same time (e.g., β1 = 0 & β2 = 2, or β2 = β3).
• How the F-test works: Instead of comparing b to β as in a t-test, the F-test approach estimates
  – An unconstrained model, and
  – A constrained model with the hypotheses imposed,
  – And then compares the sums of squared errors from the 2 models.
  – The errors will be smaller in the unconstrained model, but if the difference is large, then the constraints are unlikely to be true, and we reject the null hypothesis that the constraints are true.

F-statistic (expressed 2 ways)

  $F = \frac{[e_C'e_C - e_{UC}'e_{UC}]/J}{e_{UC}'e_{UC}/(N-K)}$

  or

  $F = (Rb - q)'\left(s^2 R(X'X)^{-1}R'\right)^{-1}(Rb - q)/J \sim F(J, N-K)$

  where
  – $e_C$ is the error term in the constrained model,
  – $e_{UC}$ is the error term in the unconstrained model,
  – J is the number of linear restrictions,
  – N is the number of observations,
  – K is the number of parameters (including the intercept) in the unconstrained model,
  – and the null is H0: Rb = q.

How to use the F-statistic
• F is distributed as an F random variable with (J, N-K) degrees of freedom.
• We will reject the null if F is sufficiently large, where "large" is determined by the chosen significance level (using an F table).
• If the null was that a set of coefficients are 0, and it is rejected, then we say that x1, x3, x4, and x7 (whichever were part of the hypothesis) are jointly statistically significant.

Example
H0: The slopes of the race dummy variables will be 0.
Ha: The slopes will not be 0; race matters.
Unconstrained model: y = Za + Xb + e, where Z includes the race variables.
Constrained model: y = Xb + e.
e'_{UC} e_{UC} = 5473.198
e'_C e_C = 5573.575
J = 2, N = 8187, K = 9

  $F_{2,8178} = \frac{(5573.575 - 5473.198)/2}{5473.198/(8187 - 9)} = \frac{50.1885}{0.6693} = 74.99$

There is basically zero probability of an F this large if the null is true. We reject the null: race matters.

Chi-square tests
• Purpose: Chi-square tests are also used to test joint hypotheses, but they can also test hypotheses involving non-linear restrictions.
  – Example of a non-linear restriction: H0: β_female / β_private = 1.
• Three main types of chi-square tests:
  – Wald test (we will focus on this test)
  – Likelihood ratio
  – Lagrange multiplier (or score) test

Wald test
How the Wald test works: The Wald test estimates
  – An unconstrained model (but not a constrained model).
  – The hypothesis is H0: g(b) = 0, where g(b) can be a non-linear set of restrictions; in the linear case, g(b) = Rb - q, where
    • b includes the estimated coefficients,
    • R is a matrix indicating the coefficients in the hypotheses,
    • q is a vector of their hypothesized values.
  – The Wald test asks whether g(b) is close to 0; if g(b) is very different from 0, then the constraints are unlikely to be true, and we reject the null hypothesis that the constraints are true.

Wald statistic (expressed 2 ways)

  $W = \frac{N\,[e_C'e_C - e_{UC}'e_{UC}]}{e_{UC}'e_{UC}}$

  or

  $W = g(b)'\left(s^2 \frac{\partial g(b)}{\partial b'}(X'X)^{-1}\frac{\partial g(b)'}{\partial b}\right)^{-1} g(b) \sim \chi^2(J)$

  where
  – $e_C$ is the error term in the constrained model,
  – $e_{UC}$ is the error term in the unconstrained model,
  – J is the number of restrictions,
  – N is the number of observations,
  – K is the number of parameters (including the intercept) in the unconstrained model,
  – and the null is H0: g(b) = 0 (general case) or Rb - q = 0 (linear case).

Wald and F
The Wald and F statistics are very similar:

  $F = \frac{[e_C'e_C - e_{UC}'e_{UC}]/J}{e_{UC}'e_{UC}/(N-K)}$      $W = \frac{N\,[e_C'e_C - e_{UC}'e_{UC}]}{e_{UC}'e_{UC}}$

  As $N \to \infty$, $F \to W/J$, so $W \approx J \cdot F$ in large samples.

Note: Stata reports an F statistic after regress even if the restriction is non-linear; Stata will run a Wald test if the model itself is non-linear (glm or logit).
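The F-test arithmetic in the example, and the large-sample link between W and J·F, can be reproduced directly from the reported sums of squared errors. The numbers below are the ones from the race-dummies example; only the variable names are mine.

```python
# Recompute the example's F statistic from the constrained and unconstrained
# sums of squared errors, and the corresponding large-sample Wald statistic.
sse_c, sse_uc = 5573.575, 5473.198   # constrained / unconstrained e'e
J, N, K = 2, 8187, 9                 # restrictions, observations, parameters

# F = [(e_C'e_C - e_UC'e_UC) / J] / [e_UC'e_UC / (N - K)]
F = ((sse_c - sse_uc) / J) / (sse_uc / (N - K))

# W = N (e_C'e_C - e_UC'e_UC) / e_UC'e_UC, approximately chi-square(J)
W = N * (sse_c - sse_uc) / sse_uc

print(f"F = {F:.2f}")   # ≈ 74.99, matching the slide
print(f"W = {W:.2f}")   # ≈ J * F, as expected for large N
```

With N = 8187, W ≈ 150.1 while J·F ≈ 150.0, illustrating the "Wald and F" relationship above.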

This note was uploaded on 03/28/2012 for the course PADP 8130 taught by Professor Fertig during the Spring '12 term at LSU.
