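If $R\beta = q$, that is, if the null hypothesis is true, then
$Rb - q = Rb - R\beta = R(b - \beta) = R(X'X)^{-1}X'\varepsilon$. [See (4-4).] Let
$C = [R(X'X)^{-1}R']$. Since $R(b - \beta)/\sigma = R(X'X)^{-1}X'(\varepsilon/\sigma) = D(\varepsilon/\sigma)$,
the numerator of F equals $[(\varepsilon/\sigma)'T(\varepsilon/\sigma)]/J$,
where $T = D'C^{-1}D$. The numerator is W/J from (6-5) and is distributed as 1/J times a c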

regression having a different dependent variable from the unrestricted one. In the preceding regression,
the dependent variable in the unrestricted regression is $\ln Y$, whereas in the restricted regression, it is
$\ln Y - \ln L$. The $R^2$ from the restricted regressi

where $R^2_{Xz}$ is the new $R^2$ after z is added, $R^2_X$ is the original $R^2$, and $r^*_{yz}$ is the partial correlation
between y and z, controlling for X. So, as we knew, the fit improves (or, at the least, does not
deteriorate). In deriving the partial correlation coefficie
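
The relationship described here is commonly written as $R^2_{Xz} = R^2_X + (1 - R^2_X)\,r^{*2}_{yz}$. The following is a minimal numerical check of that identity on simulated data; the sample size, coefficients, and helper names (`residuals`, `r_squared`) are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant + 2 regressors
z = 0.5 * X[:, 1] + rng.normal(size=n)                      # candidate extra regressor
y = X @ np.array([1.0, 2.0, -1.0]) + 0.3 * z + rng.normal(size=n)

def residuals(A, v):
    """Residuals from the least squares regression of v on the columns of A."""
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

def r_squared(A, v):
    """Centered R-squared of the regression of v on A (A contains a constant)."""
    e = residuals(A, v)
    d = v - v.mean()
    return 1.0 - (e @ e) / (d @ d)

r2_x = r_squared(X, y)                           # original fit
r2_xz = r_squared(np.column_stack([X, z]), y)    # fit after z is added

# Partial correlation of y and z controlling for X:
# correlate the residuals of y on X with the residuals of z on X.
ey, ez = residuals(X, y), residuals(X, z)
r_star = (ey @ ez) / np.sqrt((ey @ ey) * (ez @ ez))

print(r2_xz, r2_x + (1.0 - r2_x) * r_star**2)    # the two values should agree
```

Because the added term $(1 - R^2_X)\,r^{*2}_{yz}$ cannot be negative, the check also illustrates why the fit cannot deteriorate when z is added.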

$Rb - q = m$. It is unlikely that m will be exactly 0. The statistical question is whether
the deviation of m from 0 can be attributed to sampling error or whether it is significant. Since b is normally
distributed [see (4-8)] and m is a linear function of b, m is also normally

lead to a loss of fit. We can then ascertain whether this loss of fit results merely from
sampling error or whether it is so large as to cast doubt on the validity of the restrictions.
We will consider these two approaches in turn, then show that (as one might hope) within the framework of
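
The agreement between the two approaches can be checked numerically. The sketch below uses simulated data and a single hypothetical restriction, $\beta_2 + \beta_3 = 1$. It computes an F statistic once from the discrepancy vector $Rb - q$ and once from the loss of fit, using the standard restricted least squares formula $b^* = b - (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(Rb - q)$. All numbers and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, K = 80, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.7, 0.3]) + rng.normal(size=n)

R = np.array([[0.0, 1.0, 1.0]])   # hypothetical restriction: beta_2 + beta_3 = 1
q = np.array([1.0])
J = R.shape[0]

A = np.linalg.inv(X.T @ X)
b = A @ X.T @ y                                           # unrestricted least squares
m = R @ b - q                                             # discrepancy vector
b_star = b - A @ R.T @ np.linalg.solve(R @ A @ R.T, m)    # restricted least squares

e = y - X @ b                     # unrestricted residuals
e_star = y - X @ b_star           # restricted residuals
s2 = (e @ e) / (n - K)

# Approach 1: based on the discrepancy vector m.
F_wald = (m @ np.linalg.solve(R @ A @ R.T, m)) / (J * s2)
# Approach 2: based on the loss of fit from imposing the restriction.
F_fit = ((e_star @ e_star - e @ e) / J) / s2

print(F_wald, F_fit)              # the two values should agree up to rounding
```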

One common approach to testing a hypothesis is to formulate a statistical model that contains the
hypothesis as a restriction on its parameters. A theory is said to have
testable implications if it implies some testable restrictions on the model. Consider, for exampl

CHAPTER 6 Inference and Prediction 107
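THEOREM 6.1 Limiting Distribution of the Wald Statistic
If $\sqrt{n}(b - \beta) \xrightarrow{d} N[0, \sigma^2 Q^{-1}]$ and if $H_0: R\beta - q = 0$ is true, then
$W = (Rb - q)'\{R s^2 (X'X)^{-1} R'\}^{-1}(Rb - q) = JF \xrightarrow{d} \chi^2[J]$.
Proof: Since R is a matrix of constants and $R\beta = q$, $\sqrt{n}R(b - \beta) = \sqrt{n}(Rb - q) \xrightarrow{d}$ N[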
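
As a rough illustration of how the statistic in Theorem 6.1 might be computed, the sketch below evaluates W and F = W/J on simulated data. The design matrix, the true coefficients, and the two restrictions (J = 2) are hypothetical choices, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 100, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 0.5, 0.5, 0.0])      # assumed true coefficients (satisfy H0)
y = X @ beta + rng.normal(size=n)

# Hypothetical restrictions: beta_2 + beta_3 = 1 and beta_4 = 0, so J = 2.
R = np.array([[0.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
q = np.array([1.0, 0.0])
J = R.shape[0]

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                      # least squares estimator
e = y - X @ b
s2 = (e @ e) / (n - K)                     # estimator of sigma^2

m = R @ b - q                              # discrepancy vector Rb - q
W = m @ np.linalg.solve(R @ (s2 * XtX_inv) @ R.T, m)   # Wald statistic
F = W / J                                  # W = J * F

print(W, F)
```

In large samples W would be compared with a critical value from the chi-squared distribution with J degrees of freedom.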

curve reaches its minimum. That is the point at which $(\partial \ln C/\partial \ln Q)|_{Q=Q^*} = 1$ or $Q^* = \exp[(1 - \beta)/\gamma]$.
The estimated regression model using the Christensen
and Greene 1970 data are as follows,

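The authors also used a different estimation approach. Recall the issue of selection bias caused by unmeasured
effects. The authors reformulated their model as
$y_{ij} = \beta_1 + \beta_2\,\mathrm{age}_i + \beta_3\,\mathrm{age}_i^2 + \beta_4 S_{ij}(j) + \beta_6\,\mathrm{sex}_i + \beta_7\,\mathrm{race}_i + \mu_i + \varepsilon_{ij}$.
Unmeasured latent effects, such as ability, are c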

an explicit formulation may be obtained:
$$\mathrm{Var}[b^*|X] = \sigma^2(X'X)^{-1} - \sigma^2(X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}R(X'X)^{-1}. \tag{6-15}$$
Thus, $\mathrm{Var}[b^*|X] = \mathrm{Var}[b|X] -$ a nonnegative definite matrix.
³ Since $\lambda$ is not restricted, we can formulate
the constraints in terms of $2\lambda$. Why this scaling is convenient will be
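
The variance result in (6-15) can be examined numerically. The sketch below, under assumed values for $\sigma^2$, X, and a single hypothetical restriction, builds both covariance matrices and checks that the matrix subtracted from $\mathrm{Var}[b|X]$ has no negative eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
sigma2 = 1.5                           # assumed disturbance variance
R = np.array([[0.0, 1.0, 1.0]])        # hypothetical restriction on beta_2 + beta_3

A = np.linalg.inv(X.T @ X)
var_b = sigma2 * A                     # Var[b | X], unrestricted estimator
reduction = sigma2 * A @ R.T @ np.linalg.inv(R @ A @ R.T) @ R @ A
var_b_star = var_b - reduction         # Var[b* | X] as in (6-15)

# The reduction should be nonnegative definite: no eigenvalue below zero
# (apart from floating-point noise).
eigs = np.linalg.eigvalsh(reduction)
print(eigs)

# The restricted combination R b* is fixed at q, so its variance is zero.
print(R @ var_b_star @ R.T)
```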

which is substantially less than the critical value given earlier. We would not reject the hypothesis; the data are
consistent with the hypothesis of constant returns to scale. The equivalent test for the translog model
would be $\beta_2 + \beta_3 = 1$ and $\beta_4 + \beta_5 + 2\beta_6 = 0$. The F statistic

the exact distributions of these statistics depend on the data and the parameters and are not F, t, and
chi-squared. At least at first blush, it would seem that we need either a new set of critical values for the
tests

are consistent with the theory, that is, those for which $\beta_3 = -\beta_2$. This subset of values is contained within
the unrestricted set. In this way, the models are said to be nested.
Consider a different hypothesis: investors do not care about inflation. In this case, the small

4. In the discussion of the instrumental variables estimator we showed that the least squares estimator b
is biased and inconsistent. Nonetheless, b does estimate something: $\operatorname{plim} b = \theta = \beta + Q^{-1}\gamma$. Derive the
asymptotic covariance matrix of b, and show that b is a

To form the appropriate test statistic, we require the standard error of $\hat{q} = b_2 + b_3$, which is
$\mathrm{se}(\hat{q}) = [0.00319^2 + 0.00234^2 + 2(-3.718 \times 10^{-6})]^{1/2} = 0.002866$. The t ratio for the test is therefore
$t = (-0.00860 + 0.00331)/0.002866$
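
The arithmetic above can be reproduced in a few lines. The sketch assumes that the covariance term and the coefficient $b_2$ are negative, since minus signs appear to have been lost in the extracted text. With a negative covariance, the standard error matches the quoted 0.002866.

```python
import math

se_b2, se_b3 = 0.00319, 0.00234   # standard errors of b2 and b3, as quoted
cov_b2_b3 = -3.718e-6             # assumed negative (minus sign lost in extraction)
b2, b3 = -0.00860, 0.00331        # assumed sign on b2 (minus sign lost in extraction)

# Var[b2 + b3] = Var[b2] + Var[b3] + 2 Cov[b2, b3]
se_q = math.sqrt(se_b2**2 + se_b3**2 + 2.0 * cov_b2_b3)

# t ratio for the hypothesis b2 + b3 = 0
t_ratio = (b2 + b3) / se_q

print(se_q, t_ratio)
```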

the true distribution of the test statistic $t_k$ and use the critical values from the standard normal
distribution for testing hypotheses.
The result in the preceding paragraph is valid only in large samples. For moderately sized s

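3. A set of the coefficients sum to one, $\beta_2 + \beta_3 + \beta_4 = 1$: $R = [0\ 1\ 1\ 1\ 0\ \cdots]$ and $q = 1$.
4. A subset of the coefficients are all zero, $\beta_1 = 0$, $\beta_2 = 0$, and $\beta_3 = 0$:
$$R = \begin{bmatrix} 1 & 0 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \end{bmatrix} = [I : 0] \quad\text{and}\quad q = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
5. Several linear restrictions, $\beta_2 + \beta_3 = 1$, $\beta_4 + \beta_6 = 0$ and $\beta_5 + \beta_6 = 0$: $R =$ 0 1 1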
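
To make the construction of R and q concrete, the sketch below builds the matrices for cases 3 and 4 as NumPy arrays. The number of coefficients (K = 6) and the coefficient vector `beta` are hypothetical values chosen only so that the products can be evaluated.

```python
import numpy as np

K = 6  # assumed number of coefficients, for illustration only

# Case 3: beta_2 + beta_3 + beta_4 = 1 (one restriction, J = 1).
R3 = np.zeros((1, K))
R3[0, 1:4] = 1.0                  # ones in the columns for beta_2, beta_3, beta_4
q3 = np.array([1.0])

# Case 4: beta_1 = beta_2 = beta_3 = 0 (J = 3), so R = [I : 0].
R4 = np.hstack([np.eye(3), np.zeros((3, K - 3))])
q4 = np.zeros(3)

# Hypothetical coefficient vector that satisfies case 3 but not case 4.
beta = np.array([0.0, 0.2, 0.3, 0.5, 1.0, -1.0])

print(R3 @ beta - q3)             # zero when the restriction holds
print(R4 @ beta - q4)             # nonzero entries flag violated restrictions
```

Each row of R holds the coefficients of one restriction, so the number of rows equals the number of restrictions J and the number of columns equals K.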

firms in the sample?
7. The consumption function used in Example 5.3 is a very simple specification. One
might wonder if the meager specification of the model could help explain the finding
in the Hausman test. The data set used for the example are given in Table F5.1. Use these data to carr

One way to interpret this reduction in variance is as the value of the information contained in the
restrictions. Note that the explicit solution for $\lambda$ involves the discrepancy vector $Rb - q$. If the
unrestricted least squares estimator satisfies the restriction, the Lagrangean mult

Inserting these values in F yields F = 109.84. The 5 percent critical value for F[3, 199] from the table is 2.60. We
conclude, therefore, that these data are not consistent with the hypothesis. The result gives no
indication as to which of the restrictions is most influe

(6-2). Then an equivalent way to test $H_0$ would be to fit the investment equation with both the
real interest rate and the rate of inflation as regressors and to test our theory by simply testing the hypothesis that
$\beta_3$ equals zero, using the standard t statistic that is routinely computed. When

Exercises
1. For the classical normal regression model $y = X\beta + \varepsilon$ with no constant term and K regressors,
what is $\operatorname{plim} F[K, n-K] = \operatorname{plim} \dfrac{R^2/K}{(1-R^2)/(n-K)}$, assuming that the true value of $\beta$ is zero?
2. Let $e_i$ be the ith residual in the ordinary least squares regress

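we consider a set of linear restrictions of the form
$$\begin{aligned}
r_{11}\beta_1 + r_{12}\beta_2 + \cdots + r_{1K}\beta_K &= q_1 \\
r_{21}\beta_1 + r_{22}\beta_2 + \cdots + r_{2K}\beta_K &= q_2 \\
&\ \,\vdots \\
r_{J1}\beta_1 + r_{J2}\beta_2 + \cdots + r_{JK}\beta_K &= q_J.
\end{aligned}$$
These can be combined into the single equation $R\beta = q$.
Each row of R is the coefficients in one of the restrictions. The matrix R has K columns to
be co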