# lec2 - Least-Squares Estimation


**Least-Squares Estimation**

Recall that the projection of $y$ onto $C(X)$, the set of all vectors of the form $Xb$ for $b \in \mathbb{R}^{k+1}$, yields the closest point in $C(X)$ to $y$. That is, $p(y \mid C(X))$ yields the minimizer of

$$Q(\beta) = \|y - X\beta\|^2 \qquad \text{(the least squares criterion)}.$$

This leads to the estimator $\hat\beta$ given by the solution of

$$X^T X \beta = X^T y \qquad \text{(the normal equations)}$$

or

$$\hat\beta = (X^T X)^{-1} X^T y.$$

All of this has already been established back when we studied projections (see pp. 30–31).

Alternatively, we could use calculus. To find a stationary point (maximum, minimum, or saddle point) of $Q(\beta)$, we set the partial derivative of $Q(\beta)$ equal to zero and solve:

$$\frac{\partial}{\partial\beta} Q(\beta) = \frac{\partial}{\partial\beta} (y - X\beta)^T (y - X\beta) = \frac{\partial}{\partial\beta} \left( y^T y - 2 y^T X \beta + \beta^T (X^T X) \beta \right) = -2 X^T y + 2 X^T X \beta.$$

Here we've used the vector differentiation formulas $\frac{\partial}{\partial z} c^T z = c$ and, for symmetric $A$, $\frac{\partial}{\partial z} z^T A z = 2Az$ (see § 2.14 of our text). Setting this result equal to zero, we obtain the normal equations, which have solution $\hat\beta = (X^T X)^{-1} X^T y$. That this is a minimum rather than a maximum or saddle point can be verified by checking the second derivative matrix of $Q(\beta)$:

$$\frac{\partial^2 Q(\beta)}{\partial\beta\,\partial\beta^T} = 2 X^T X,$$

which is positive definite (result 7, p. 54); therefore $\hat\beta$ is a minimum.

**Example: Simple Linear Regression**

Consider the case $k = 1$:

$$y_i = \beta_0 + \beta_1 x_i + e_i, \qquad i = 1, \ldots, n,$$

where $e_1, \ldots, e_n$ are i.i.d., each with mean 0 and variance $\sigma^2$. Then the model equation becomes

$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \underbrace{\begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}}_{=X} \underbrace{\begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}}_{=\beta} + \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}.$$

It follows that

$$X^T X = \begin{pmatrix} n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{pmatrix}, \qquad X^T y = \begin{pmatrix} \sum_i y_i \\ \sum_i x_i y_i \end{pmatrix},$$

$$(X^T X)^{-1} = \frac{1}{n \sum_i x_i^2 - \left(\sum_i x_i\right)^2} \begin{pmatrix} \sum_i x_i^2 & -\sum_i x_i \\ -\sum_i x_i & n \end{pmatrix}.$$

Therefore, $\hat\beta = (X^T X)^{-1} X^T y$ yields

$$\hat\beta = \begin{pmatrix} \hat\beta_0 \\ \hat\beta_1 \end{pmatrix} = \frac{1}{n \sum_i x_i^2 - \left(\sum_i x_i\right)^2} \begin{pmatrix} \left(\sum_i x_i^2\right)\left(\sum_i y_i\right) - \left(\sum_i x_i\right)\left(\sum_i x_i y_i\right) \\ -\left(\sum_i x_i\right)\left(\sum_i y_i\right) + n \sum_i x_i y_i \end{pmatrix}.$$
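As a rough numerical sketch of the matrix formulas above, the plain-Python snippet below evaluates the explicit $2 \times 2$ solution for simple linear regression and then checks that the gradient $-2X^Ty + 2X^TX\beta$ vanishes at the result. The function names `fit_simple_ols` and `gradient_Q` and the toy data are illustrative assumptions, not part of the notes.

```python
def fit_simple_ols(x, y):
    """Solve the normal equations for y_i = b0 + b1*x_i + e_i (k = 1)."""
    n = len(x)
    sx = sum(x)                                   # sum_i x_i
    sy = sum(y)                                   # sum_i y_i
    sxx = sum(xi * xi for xi in x)                # sum_i x_i^2
    sxy = sum(xi * yi for xi, yi in zip(x, y))    # sum_i x_i y_i
    det = n * sxx - sx * sx                       # determinant of X^T X
    if det == 0:
        raise ValueError("X is not of full rank (all x values are equal)")
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (-sx * sy + n * sxy) / det
    return b0, b1


def gradient_Q(b0, b1, x, y):
    """Return the two components of -2 X^T y + 2 X^T X beta."""
    n = len(x)
    sx = sum(x)
    sy = sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    g0 = -2.0 * sy + 2.0 * (n * b0 + sx * b1)
    g1 = -2.0 * sxy + 2.0 * (sx * b0 + sxx * b1)
    return g0, g1


# Hypothetical toy data lying exactly on the line y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
b0_hat, b1_hat = fit_simple_ols(x, y)
g0, g1 = gradient_Q(b0_hat, b1_hat, x, y)
print(b0_hat, b1_hat, g0, g1)
```

Because the toy data have no noise, the fitted coefficients should recover the intercept and slope used to generate them, and both gradient components should be zero up to rounding. For larger $k$, forming the explicit inverse is usually avoided in practice in favor of a linear solver; the explicit form is used here only because it mirrors the formula in the notes.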
After a bit of algebra, these estimators simplify to

$$\hat\beta_1 = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2} = \frac{S_{xy}}{S_{xx}} \qquad \text{and} \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x.$$

In the case that $X$ is of full rank, $\hat\beta$ and $\hat\mu$ are given by

$$\hat\beta = (X^T X)^{-1} X^T y, \qquad \hat\mu = X \hat\beta = X (X^T X)^{-1} X^T y = P_{C(X)}\, y.$$

- Notice that both $\hat\beta$ and $\hat\mu$ are linear functions of $y$. That is, in each case the estimator is given by some matrix times $y$.

Note also that

$$\hat\beta = (X^T X)^{-1} X^T y = (X^T X)^{-1} X^T (X\beta + e) = \beta + (X^T X)^{-1} X^T e.$$

From this representation several important properties of the least squares estimator $\hat\beta$ follow easily:

1. (unbiasedness):

$$E(\hat\beta) = E\left(\beta + (X^T X)^{-1} X^T e\right) = \beta + (X^T X)^{-1} X^T \underbrace{E(e)}_{=0} = \beta.$$

...
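The closed-form expressions $S_{xy}/S_{xx}$ and $\bar y - \hat\beta_1 \bar x$, together with the unbiasedness property, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the errors are drawn as Gaussian (the notes require only mean 0 and variance $\sigma^2$), and the names `slope_intercept` and `average_estimates`, the true coefficients, and the design points are all hypothetical choices made for illustration.

```python
import random


def slope_intercept(x, y):
    """Closed-form simple linear regression: b1 = S_xy / S_xx, b0 = ybar - b1 * xbar."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    s_xx = sum((xi - xbar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = ybar - b1 * xbar
    return b0, b1


def average_estimates(beta0, beta1, x, sigma, n_reps, seed=0):
    """Average the estimates over n_reps simulated data sets with a fixed design x."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    sum_b0 = 0.0
    sum_b1 = 0.0
    for _ in range(n_reps):
        # Assumption: Gaussian errors with mean 0 and standard deviation sigma
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        b0, b1 = slope_intercept(x, y)
        sum_b0 += b0
        sum_b1 += b1
    return sum_b0 / n_reps, sum_b1 / n_reps


x = [0.0, 1.0, 2.0, 3.0, 4.0]
mean_b0, mean_b1 = average_estimates(1.0, 2.0, x, sigma=1.0, n_reps=20000)
print(mean_b0, mean_b1)
```

If the estimator is unbiased, the averaged estimates should land close to the true values (here 1 and 2), with the gap shrinking as `n_reps` grows. A simulation like this does not prove unbiasedness; it is only a sanity check consistent with the algebraic argument.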

## This note was uploaded on 11/13/2011 for the course STAT 8260 taught by Professor Hall during the Summer '10 term at UGA.
