PL503 // Green
Lecture 3
The basic linear model consists of a dependent variable Y (an n x 1 vector), independent variables X (an n x k matrix, the first column of which is a vector of ones), a k x 1 vector of coefficients B, and an n x 1 vector of disturbances U.
Y=XB+U
The least squares estimator provides an estimate of B, called b.
The “residual” is e = Y − Xb.
Don't confuse residuals and disturbances. They are the same only when B = b.
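To make the distinction concrete, here is a small simulation sketch (Python with numpy; the sample size, seed, and "true" B are invented for illustration). The disturbances U are visible here only because we generated them ourselves; the residuals e are what we can actually compute:

import numpy as np

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # first column is ones
B = np.array([1.0, 2.0, -0.5])  # "true" coefficients, unknown in real data
U = rng.normal(size=n)          # disturbances
Y = X @ B + U

b, *_ = np.linalg.lstsq(X, Y, rcond=None)  # least squares estimate of B
e = Y - X @ b                              # residuals
print(np.max(np.abs(e - U)) > 0)           # True: e differs from U because b != B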
How are the coefficients selected? The criterion considered here is least squares; the estimator selects the b that minimizes the sum of squared residuals. This process may be visualized at:
http://www.duxbury.com/authors/mcclellandg/tiein/johnson/reg.htm
or at
http://www.stattucino.com/berrie/dsl/regression/regression.html
Formally, least squares may be written:

Minimize e'e = (y − Xb)'(y − Xb) = y'y − b'X'y − y'Xb + b'X'Xb

Notice that b'X'y is equal to y'Xb (why? Each is a 1 x 1 matrix, i.e., a scalar, and a scalar equals its own transpose). So we can simplify the objective function:

Minimize e'e = (y − Xb)'(y − Xb) = y'y − 2y'Xb + b'X'Xb
The necessary condition for a minimum is that the derivative of the objective function with respect to b is equal to zero. Intuitively, the slope at the pit of a valley is zero. To take a derivative with respect to b, move the exponent out front in the form of a coefficient and reduce the exponent by one. So the derivative of b is 1; the derivative of b'b is 2b; the derivative of b'X'Xb is 2X'Xb. If there are no b terms, the derivative is zero (e.g., the derivative of y'y is zero, since b appears nowhere in it).
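These rules can be checked numerically. Here is a minimal sketch (numpy; the matrix sizes, seed, and step size are arbitrary choices) comparing the analytic derivative 2X'Xb of the quadratic form b'X'Xb against a central finite-difference approximation:

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 3))
b = rng.normal(size=3)

f = lambda v: v @ X.T @ X @ v  # the quadratic form b'X'Xb
analytic = 2 * X.T @ X @ b     # the claimed derivative

h = 1e-6
numeric = np.empty(3)
for j in range(3):
    bp, bm = b.copy(), b.copy()
    bp[j] += h
    bm[j] -= h
    numeric[j] = (f(bp) - f(bm)) / (2 * h)  # central difference in coordinate j

print(np.allclose(analytic, numeric, atol=1e-4))  # True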
Using these rules,

∂(e'e)/∂b = −2X'y + 2X'Xb = 0
We need to pick a b so that this equation is satisfied. The equation above implies the following equality, which is called the “least squares normal equations”:

X'Xb = X'y
So long as the inverse of X'X exists, we can solve easily for b by premultiplying both sides by the inverse of X'X:

b = (X'X)^(-1) X'y.
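As a quick numerical sketch (numpy; the simulated data, seed, and coefficients below are invented for illustration), the formula and a direct solve of the normal equations give the same b:

import numpy as np

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])  # first column is ones
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=50)      # invented data

b_formula = np.linalg.inv(X.T @ X) @ X.T @ y   # b = (X'X)^(-1) X'y, as in the text
b_solve = np.linalg.solve(X.T @ X, X.T @ y)    # solve X'Xb = X'y directly
print(np.allclose(b_formula, b_solve))         # True

In practice one solves the normal equations (or uses a QR decomposition) rather than forming the inverse explicitly; the two are the same mathematically, but the direct solve is numerically more stable.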
That was pretty easy.
It's important to reflect on the question of when this inverse exists. Answer: the inverse will exist when the X matrix has full rank. So the least squares estimator cannot be computed when two columns of X are collinear or when there are more columns than rows.
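A quick illustration of the failure (numpy; the duplicated column is contrived): when one column is an exact multiple of another, X'X loses rank and has no inverse.

import numpy as np

rng = np.random.default_rng(3)
n = 20
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x, 2 * x])  # third column collinear with second

print(np.linalg.matrix_rank(X.T @ X))        # 2, not 3: X'X is singular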
Technical note: to show that you've minimized the sum of squares, not only must the derivative be zero, but the second derivative must be positive definite. Here the second derivative is 2X'X, which is positive definite whenever X has full rank. See Greene p. 21.
Further Implications of the Normal Equations
X'Xb − X'y = −X'(y − Xb) = −X'e = 0, so X'e = 0.
(1) LS residuals sum to zero. This follows from the fact that the first column of X is a vector of ones: since X'e = 0, this vector of ones times e is zero, and that product is exactly the sum of the residuals.
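A numerical check of implication (1) (numpy; the simulated data are invented for illustration):

import numpy as np

rng = np.random.default_rng(4)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])  # first column is ones
y = X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=30)

b, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares estimate
e = y - X @ b                              # residuals
print(np.isclose(e.sum(), 0.0))            # True: residuals sum to zero
print(np.allclose(X.T @ e, 0.0))           # True: X'e = 0 column by column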