EC3062 ECONOMETRICS
ELEMENTARY REGRESSION ANALYSIS
We shall consider three methods for estimating statistical parameters.
These are the method of moments, the method of least squares and the
principle of maximum likelihood.
In the case of the regression model, the three methods generate estimating equations that are identical, but the assumptions differ.
Conditional Expectations
If y ∼ f(y), then, in the absence of further information, the minimum-mean-square-error predictor of y is its expected value

    E(y) = ∫ y f(y) dy.

Proof. If π is the value of a prediction, then the mean-square error is

(1)    M = ∫ (y − π)² f(y) dy = E{(y − π)²} = E(y²) − 2πE(y) + π²;

and, by calculus, it can be shown that M is minimised by taking π = E(y).
If x is related to y, then the m.m.s.e. prediction of y is the conditional expectation

(2)    E(y|x) = ∫ y {f(x, y)/f(x)} dy.

Proof. Let ŷ = E(y|x) and let π = π(x) be any other predictor. Then,

(5)    E{(y − π)²} = E{[(y − ŷ) + (ŷ − π)]²}
                   = E{(y − ŷ)²} + 2E{(y − ŷ)(ŷ − π)} + E{(ŷ − π)²}.

In the second term, there is

(6)    E{(y − ŷ)(ŷ − π)} = ∫x ∫y (y − ŷ)(ŷ − π) f(x, y) ∂y∂x
                         = ∫x {∫y (y − ŷ) f(y|x) ∂y} (ŷ − π) f(x) ∂x = 0.

Therefore, E{(y − π)²} = E{(y − ŷ)²} + E{(ŷ − π)²} ≥ E{(y − ŷ)²}, and
the assertion is proved.
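The minimisation in (1) can also be checked by simulation. Below is a minimal sketch: the population (a gamma distribution with mean 6) is an arbitrary choice of this example, not of the text, and the grid search over constant predictors π confirms that the mean-square error is smallest at the sample analogue of E(y).

```python
import numpy as np

# Monte Carlo check of result (1): among constant predictors pi,
# the mean-square error E[(y - pi)^2] is smallest at pi = E(y).
# The gamma(3, 2) population, with E(y) = 6, is an illustrative choice.
rng = np.random.default_rng(0)
y = rng.gamma(shape=3.0, scale=2.0, size=100_000)

def mse(pi):
    return np.mean((y - pi) ** 2)

pis = np.linspace(0.0, 12.0, 121)          # candidate predictions
best_pi = pis[np.argmin([mse(p) for p in pis])]
print(best_pi, y.mean())                    # minimiser sits at the sample mean
```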
The definition of the conditional expectation implies that

    E(xy) = ∫x ∫y xy f(x, y) ∂y∂x = ∫x x {∫y y f(y|x) ∂y} f(x) ∂x = E(xŷ).

When E(xy) = E(xŷ) is rewritten as E{x(y − ŷ)} = 0, it may be described
as an orthogonality condition. This indicates that the prediction error
y − ŷ is uncorrelated with x. If it were correlated with x, then we should
not be using the information of x efficiently in forming ŷ.
Linear Regression
Assume that x and y have a joint normal distribution, which implies that there is a linear regression relationship:

(9)    E(y|x) = α + βx.

The object is to express α and β in terms of the expectations E(x), E(y),
the variances V(x), V(y) and the covariance C(x, y).
First, multiply (9) by f(x), and integrate with respect to x to give

(10)   E(y) = α + βE(x),

whence the equation for the intercept is

(11)   α = E(y) − βE(x).

Equation (10) shows that the regression line passes through the expected
value of the joint distribution, E(x, y) = {E(x), E(y)}.

By putting (11) into E(y|x) = α + βx from (9), we find that

(12)   E(y|x) = E(y) + β{x − E(x)}.

Now multiply (9) by x and f(x) and integrate with respect to x to give

(13)   E(xy) = αE(x) + βE(x²).

Multiplying (10) by E(x) gives

(14)   E(x)E(y) = αE(x) + β{E(x)}²,

whence, on taking (14) from (13), we get

(15)   E(xy) − E(x)E(y) = β[E(x²) − {E(x)}²],

which implies that

(16)   β = [E(xy) − E(x)E(y)] / [E(x²) − {E(x)}²]
         = E{[x − E(x)][y − E(y)]} / E{[x − E(x)]²}
         = C(x, y)/V(x).

Thus, we have expressed α and β in terms of the moments E(x), E(y),
V(x) and C(x, y) of the joint distribution of x and y.
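As a worked instance of (11) and (16), suppose the population moments are given; the numerical values below are purely illustrative.

```python
# Illustration of (11) and (16): recover alpha and beta from the
# moments of a joint distribution. The moment values are made up.
mu_x, mu_y = 2.0, 5.0      # E(x), E(y)
v_x, c_xy = 4.0, 3.0       # V(x), C(x, y)

beta = c_xy / v_x           # (16): beta = C(x, y)/V(x)
alpha = mu_y - beta * mu_x  # (11): alpha = E(y) - beta*E(x)
print(alpha, beta)          # 3.5 0.75
```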
Estimation by the Method of Moments
Let (x1, y1), (x2, y2), . . . , (xT, yT) be a sample of T observations.
Then, we can calculate the following empirical or sample moments:

(21)   x̄ = (1/T)Σ xt,        ȳ = (1/T)Σ yt,

       s²x = (1/T)Σ(xt − x̄)² = (1/T)Σ xt² − x̄²,

       sxy = (1/T)Σ(xt − x̄)(yt − ȳ) = (1/T)Σ xtyt − x̄ȳ,

where the sums run over t = 1, . . . , T.

To estimate α and β, we replace the population moments in the formulae
of (11) and (16) by the sample moments. Then, the estimates are

(22)   α̂ = ȳ − β̂x̄,    β̂ = Σ(xt − x̄)(yt − ȳ) / Σ(xt − x̄)².
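The sample-moment formulae (21)–(22) translate directly into code. In this sketch the data are simulated with true values α = 1 and β = 0.5, which are assumptions of the example:

```python
import numpy as np

# Method-of-moments estimation as in (21)-(22): replace population
# moments by sample moments. Simulated data with alpha=1, beta=0.5.
rng = np.random.default_rng(42)
T = 10_000
x = rng.normal(3.0, 2.0, T)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, T)

xbar, ybar = x.mean(), y.mean()
s_xx = np.mean((x - xbar) ** 2)            # sample variance of x
s_xy = np.mean((x - xbar) * (y - ybar))    # sample covariance
beta_hat = s_xy / s_xx                     # (22)
alpha_hat = ybar - beta_hat * xbar         # (22)
print(alpha_hat, beta_hat)                 # close to 1.0 and 0.5
```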
Convergence
We can expect the sample moments to converge to the true moments
of the bivariate distribution, thereby causing the estimates of the
parameters to converge likewise to the true values.

(23)   A sequence of numbers {an} is said to converge to a limit a if,
       for any arbitrarily small real number ε, there exists a
       corresponding integer N such that |an − a| < ε for all n ≥ N.

This is not appropriate to a stochastic sequence, such as a sequence of
estimates; for it is always possible for an to break the bounds of a ± ε
when n > N. The following is a more appropriate definition:

(24)   A sequence of random variables {an} is said to converge weakly
       in probability to a limit a if, for any ε > 0, lim P(|an − a| > ε) = 0
       as n → ∞ or, equivalently, lim P(|an − a| ≤ ε) = 1.

With the increasing size of the sample, it becomes virtually certain that
an will 'fall within an epsilon of a'. We describe a as the probability limit
of an and we write plim(an) = a.
This definition does not presuppose that an has a finite variance or even
a finite mean. However, if an does have finite moments, then we may talk
of mean-square convergence:

(25)   A sequence of random variables {an} is said to converge in mean
       square to a limit a if lim(n → ∞) E{(an − a)²} = 0.

We should note that

(26)   E{(an − a)²} = E{[an − E(an)] − [a − E(an)]}²
                    = V(an) + {a − E(an)}².

Thus, the mean-square error of an estimator an is the sum of its variance
and the square of its bias. If an is to converge in mean square to a, then
both of these quantities must vanish.

Convergence in mean square implies convergence in probability.
When an estimator converges in probability to the parameter which it
purports to represent, then we say that it is a consistent estimator.
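A small simulation can illustrate definition (25). For the sample mean of i.i.d. draws with variance σ², the mean-square error equals the variance σ²/n, which vanishes as n grows; the normal population below, with σ² = 25, is an illustrative choice of this example:

```python
import numpy as np

# Sketch of convergence in mean square, definition (25): for the
# sample mean a_n of i.i.d. N(a, 25) draws, E[(a_n - a)^2] = 25/n.
rng = np.random.default_rng(1)
a = 10.0  # the limit: the population mean

def mse_of_mean(n, reps=2000):
    draws = rng.normal(a, 5.0, size=(reps, n))
    return np.mean((draws.mean(axis=1) - a) ** 2)

mses = [mse_of_mean(n) for n in (10, 100, 1000)]
print(mses)  # each roughly 25/n, shrinking tenfold at each step
```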
Figure 1. Pearson's data comprising 1078 measurements of the heights
of fathers (the abscissae) and of their sons (the ordinates), together with
the two regression lines. The correlation coefficient is 0.5013.
The Bivariate Normal Distribution
Most of the results in the theory of regression can be obtained by
examining the functional form of the bivariate normal distribution. Let
x and y be the two variables. Let us denote their means by E(x) = µx,
E(y) = µy, their variances by V(x) = σ²x, V(y) = σ²y and their covariance
by C(x, y) = ρσxσy. Here, the correlation coefficient

(30)   ρ = C(x, y) / √{V(x)V(y)}

provides a measure of the relatedness of these variables.

The bivariate distribution is specified by

(31)   f(x, y) = [1 / {2πσxσy√(1 − ρ²)}] exp Q(x, y),

where

(32)   Q = [−1/{2(1 − ρ²)}] { [(x − µx)/σx]² − 2ρ[(x − µx)/σx][(y − µy)/σy] + [(y − µy)/σy]² }.

The quadratic function can also be written as

(33)   Q = [−1/{2(1 − ρ²)}] { [(y − µy)/σy − ρ(x − µx)/σx]² + (1 − ρ²)[(x − µx)/σx]² }.

Thus, we have

(34)   f(x, y) = f(y|x)f(x),

where

(35)   f(x) = [1/{σx√(2π)}] exp{ −(x − µx)²/(2σ²x) },

and

(36)   f(y|x) = [1/{σy√(2π(1 − ρ²))}] exp{ −(y − µy|x)²/[2σ²y(1 − ρ²)] },

with

(37)   µy|x = µy + (ρσy/σx)(x − µx).
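The factorisation (34) can be verified numerically at a point, using (31)–(32) for the joint density and (35)–(37) for the two factors. The parameter values and the evaluation point are arbitrary choices of this sketch:

```python
import math

# Numerical check of (34): the bivariate normal density (31)-(32)
# equals f(y|x)*f(x) built from (35)-(37). Parameters are arbitrary.
mu_x, mu_y, s_x, s_y, rho = 1.0, 2.0, 1.5, 2.5, 0.6

def f_joint(x, y):                              # (31)-(32)
    zx, zy = (x - mu_x) / s_x, (y - mu_y) / s_y
    q = -(zx**2 - 2*rho*zx*zy + zy**2) / (2*(1 - rho**2))
    return math.exp(q) / (2*math.pi*s_x*s_y*math.sqrt(1 - rho**2))

def f_x(x):                                     # (35)
    return math.exp(-(x - mu_x)**2 / (2*s_x**2)) / (s_x*math.sqrt(2*math.pi))

def f_y_given_x(y, x):                          # (36)-(37)
    mu_yx = mu_y + rho*(s_y/s_x)*(x - mu_x)     # conditional mean
    v = s_y**2 * (1 - rho**2)                   # conditional variance
    return math.exp(-(y - mu_yx)**2 / (2*v)) / math.sqrt(2*math.pi*v)

lhs = f_joint(0.3, 1.1)
rhs = f_y_given_x(1.1, 0.3) * f_x(0.3)
print(lhs, rhs)  # agree to rounding error
```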
Least-Squares Regression Analysis

The regression equation E(y|x) = α + βx can be written as

(39)   y = α + xβ + ε,

where ε = y − E(y|x) is a random variable, with E(ε) = 0 and V(ε) = σ²,
that is independent of x.

Given observations (x1, y1), . . . , (xT, yT), the estimates of α and β are
the values that minimise the sum of squares of the distances, measured
parallel to the y-axis, of the data points from an interpolated regression
line:

(40)   S = Σ εt² = Σ (yt − α − xtβ)².

Differentiating S with respect to α and setting the result to zero gives

(41)   −2Σ(yt − α − βxt) = 0,   or   ȳ − α − βx̄ = 0.

This generates the following estimating equation for α:

(42)   α(β) = ȳ − βx̄.

By differentiating with respect to β and setting the result to zero, we get

(43)   −2Σ xt(yt − α − βxt) = 0.

On substituting for α from (42) and eliminating the factor −2, this becomes

(44)   Σ xtyt − Σ xt(ȳ − βx̄) − β Σ xt² = 0,

whence we get

(45)   β̂ = (Σ xtyt − T x̄ȳ) / (Σ xt² − T x̄²)
         = Σ(xt − x̄)(yt − ȳ) / Σ(xt − x̄)².

This is identical to the estimate under (22) derived via the method of
moments. Putting β̂ into the equation α(β) = ȳ − βx̄ of (42) gives the
estimate of α found under (22).
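The two expressions for β̂ in (45), one in raw moments and one in deviations from the means, can be checked against each other on simulated data; the data-generating values below are illustrative:

```python
import numpy as np

# Both forms of beta-hat in (45) computed on the same simulated data.
rng = np.random.default_rng(7)
x = rng.uniform(0.0, 10.0, 50)
y = 2.0 + 0.8 * x + rng.normal(0.0, 1.0, 50)
T, xbar, ybar = len(x), x.mean(), y.mean()

beta_raw = (np.sum(x*y) - T*xbar*ybar) / (np.sum(x**2) - T*xbar**2)
beta_dev = np.sum((x - xbar)*(y - ybar)) / np.sum((x - xbar)**2)
alpha_hat = ybar - beta_raw * xbar    # (42)
print(beta_raw, beta_dev)             # identical up to rounding
```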
The method of least squares does not automatically provide an
estimate of σ² = E(εt²). To obtain an estimate, we may apply the method
of moments to the regression residuals et = yt − α̂ − β̂xt to give

(46)   σ̃² = (1/T)Σ et².

In fact, this is a biased estimator with

(47)   E(σ̃²) = {(T − 2)/T}σ²;

so it is common to adopt the unbiased estimator

(48)   σ̂² = Σ et²/(T − 2).
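The bias stated in (47) shows up clearly in a Monte Carlo sketch; the design below (T = 10, σ² = 4, a fixed regressor grid) is an arbitrary choice for illustration:

```python
import numpy as np

# Monte Carlo check of (47): the 1/T residual-variance estimator (46)
# has expectation sigma^2*(T-2)/T. Here sigma^2 = 4 and T = 10.
rng = np.random.default_rng(3)
T, sigma2, reps = 10, 4.0, 20_000
x = np.linspace(0.0, 9.0, T)
xbar = x.mean()

vals = np.empty(reps)
for r in range(reps):
    y = 1.0 + 2.0 * x + rng.normal(0.0, np.sqrt(sigma2), T)
    b = np.sum((x - xbar) * (y - y.mean())) / np.sum((x - xbar) ** 2)
    a = y.mean() - b * xbar
    e = y - a - b * x                  # regression residuals
    vals[r] = np.mean(e ** 2)          # (46): divide by T, not T-2

print(vals.mean(), sigma2 * (T - 2) / T)  # both near 3.2, below 4
```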
Properties of the Least-Squares Estimator
The disturbance term ε is assumed to be a random variable with

(49)   E(εt) = 0 and V(εt) = σ² for all t.

We might assume that x is a random variable uncorrelated with ε, such
that C(x, ε) = 0. However, if we are prepared to regard the xt as
predetermined values which have no effect on the εt, then we can say that

(50)   E(xtεt) = xtE(εt) = 0 for all t.

In place of an assumption attributing a finite variance to x, we may
assert that

(51)   lim(T → ∞) (1/T)Σ xt² = mxx < ∞.

For the random sequence {xtεt}, we assert that

(52)   plim(T → ∞) (1/T)Σ xtεt = 0.

To see the effect of these assumptions, let us substitute the expression

(53)   yt − ȳ = β(xt − x̄) + εt − ε̄

in the expression for β̂ found under (45). By rearranging the result, we
have

(54)   β̂ = β + Σ(xt − x̄)εt / Σ(xt − x̄)².

The numerator of the second term on the RHS is obtained with the help
of the identity

(55)   Σ(xt − x̄)(εt − ε̄) = Σ(xtεt − x̄εt − xtε̄ + x̄ε̄) = Σ(xt − x̄)εt.

From the assumption under (50), it follows that

(56)   E{(xt − x̄)εt} = (xt − x̄)E(εt) = 0 for all t.

Therefore,

(57)   E(β̂) = β + Σ(xt − x̄)E(εt) / Σ(xt − x̄)² = β;

and β̂ is seen to be an unbiased estimator of β.

The consistency of the estimator follows, likewise, from the assumptions
under (51) and (52). Thus

(58)   plim(β̂) = β + plim{T⁻¹Σ(xt − x̄)εt} / plim{T⁻¹Σ(xt − x̄)²} = β;

and β̂ is seen to be a consistent estimator of β.

The consistency of β̂ depends crucially upon the assumption that the
disturbance term is independent of, or uncorrelated with, the explanatory
variable or regressor x.
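The decomposition (54) holds exactly, sample by sample, when the disturbances are observable, as a quick numerical check confirms; the simulated design below is an assumption of this example:

```python
import numpy as np

# Exact check of (54): with known beta and observable disturbances,
# beta-hat minus beta equals sum((x - xbar)*eps) / sum((x - xbar)^2).
rng = np.random.default_rng(11)
T, alpha, beta = 30, 1.0, 0.5
x = rng.uniform(0.0, 5.0, T)
eps = rng.normal(0.0, 1.0, T)
y = alpha + beta * x + eps

xbar = x.mean()
beta_hat = np.sum((x - xbar)*(y - y.mean())) / np.sum((x - xbar)**2)
error_term = np.sum((x - xbar)*eps) / np.sum((x - xbar)**2)
print(beta_hat - beta, error_term)  # identical up to rounding
```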
Example. A simple model of the economy is postulated that comprises
two equations in income y, consumption c and investment i:

(59)   y = c + i,

(60)   c = α + βy + ε.

Also, s = y − c = i, where s is savings. The disturbance ε is assumed
to be independent of investment i. Substituting (60) into (59) gives

(61)   y = (α + i + ε)/(1 − β),

from which

(62)   yt − ȳ = {(it − ī) + (εt − ε̄)}/(1 − β).

The least-squares estimator of the parameter β, the marginal propensity
to consume, is such that

(63)   β̂ = β + Σ(yt − ȳ)εt / Σ(yt − ȳ)².

Since y is dependent on ε, according to (61), β̂ cannot be a consistent
estimator of β.
Figure 2. If the only source of variation in y is the variation in i, then
the observations on y and c will delineate the consumption function.

Figure 3. If the only sources of variation in y are the disturbances
to c, then the observations on y and c will lie along a 45° line.
To determine the probability limit of the estimator, we must assess the
separate probability limits of the numerator and the denominator of the
term on the RHS of (63). The following results are available:

(64)   lim (1/T)Σ (it − ī)² = mii = V(i),

       plim (1/T)Σ (yt − ȳ)² = (mii + σ²)/(1 − β)² = V(y),

       plim (1/T)Σ (yt − ȳ)εt = σ²/(1 − β) = C(y, ε).

The results indicate that

(65)   plim β̂ = β + σ²(1 − β)/(mii + σ²) = (βmii + σ²)/(mii + σ²).
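A simulation of the two-equation model illustrates (65): with a large sample, β̂ settles near (βmii + σ²)/(mii + σ²) rather than near β. The parameter values are illustrative choices of this sketch:

```python
import numpy as np

# Simultaneous-equations bias, (59)-(65): with beta = 0.75, m_ii = 4
# and sigma^2 = 1, the plim of beta-hat is (0.75*4 + 1)/5 = 0.8.
rng = np.random.default_rng(5)
T, alpha, beta = 200_000, 2.0, 0.75
sigma2, m_ii = 1.0, 4.0

i = rng.normal(10.0, np.sqrt(m_ii), T)      # investment, V(i) = m_ii
eps = rng.normal(0.0, np.sqrt(sigma2), T)
y = (alpha + i + eps) / (1.0 - beta)        # (61)
c = alpha + beta * y + eps                  # (60)

beta_hat = np.sum((y - y.mean())*(c - c.mean())) / np.sum((y - y.mean())**2)
plim_formula = (beta * m_ii + sigma2) / (m_ii + sigma2)
print(beta_hat, plim_formula)  # both near 0.8, well above beta = 0.75
```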
The Method of Maximum Likelihood
The disturbances εt; t = 1, . . . , T in the regression model are assumed to
be independently and identically distributed with a normal density:

(66)   N(εt; 0, σ²) = {1/√(2πσ²)} exp{−εt²/(2σ²)}.

Since they are assumed to be independently distributed, their joint
probability density function (p.d.f.) is

(67)   Π N(εt; 0, σ²) = (2πσ²)^(−T/2) exp{ −(1/2σ²)Σ εt² }.

If we regard the elements x1, . . . , xT as a given set of numbers, then it
follows that the conditional p.d.f. of the sample y1, . . . , yT is

(68)   f(y1, . . . , yT | x1, . . . , xT) = (2πσ²)^(−T/2) exp{ −(1/2σ²)Σ (yt − α − βxt)² }.

The maximum-likelihood estimates of α, β and σ² are the values that
maximise the probability measure that is attributed to the sample
y1, . . . , yT. The log-likelihood function, which is maximised by these
values, is

(69)   log L = −(T/2)log(2π) − (T/2)log(σ²) − (1/2σ²)Σ (yt − α − βxt)².

Given the value of σ², this is maximised by the values α̂ and β̂ under (42)
and (45) respectively, which minimise the error sum of squares.

The estimate of σ² comes from the following first-order condition:

(70)   ∂log L/∂σ² = −T/(2σ²) + (1/2σ⁴)Σ (yt − α − βxt)² = 0.

Multiplying throughout by 2σ⁴/T and rearranging the result gives

(71)   σ²(α, β) = (1/T)Σ (yt − α − βxt)² = (1/T)Σ et².
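The first-order condition (70) can be checked numerically: at the least-squares values of α and β, the score in σ² vanishes exactly at σ̃² = (1/T)Σet² of (71). The simulated data are an illustrative assumption of this sketch:

```python
import numpy as np

# Check of (70)-(71): at the least-squares alpha-hat and beta-hat,
# the score d(log L)/d(sigma^2) is zero at sigma^2 = RSS/T.
rng = np.random.default_rng(9)
T = 200
x = rng.uniform(0.0, 10.0, T)
y = 1.0 + 0.5 * x + rng.normal(0.0, 2.0, T)

xbar = x.mean()
b = np.sum((x - xbar)*(y - y.mean())) / np.sum((x - xbar)**2)  # (45)
a = y.mean() - b * xbar                                        # (42)
rss = np.sum((y - a - b * x) ** 2)     # residual sum of squares
s2_ml = rss / T                        # (71): the ML estimate

def score(s2):                         # the derivative in (70)
    return -T / (2*s2) + rss / (2*s2**2)

print(score(s2_ml))  # zero up to rounding
```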
Spring '12, D.S.G. Pollock, Econometrics.