

Copyright © 2012 Dan Nettleton (Iowa State University), Statistics 511

Why is $P_X$ called an orthogonal projection matrix?

Suppose $X = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $y = \begin{bmatrix} 2 \\ 3/4 \end{bmatrix}$.

[Figure: the line $C(X)$ through the point $X$, the point $y$, its projection $\hat{y}$ onto $C(X)$, and the difference vector $y - \hat{y}$.]

The angle between $\hat{y}$ and $y - \hat{y}$ is $90^\circ$. The vectors $\hat{y}$ and $y - \hat{y}$ are orthogonal:

$$
\begin{aligned}
\hat{y}'(y - \hat{y}) &= \hat{y}'(y - P_X y) = \hat{y}'(I - P_X)y \\
&= (P_X y)'(I - P_X)y = y' P_X' (I - P_X) y \\
&= y' P_X (I - P_X) y = y'(P_X - P_X P_X) y \\
&= y'(P_X - P_X) y = 0.
\end{aligned}
$$

The third line uses the symmetry of $P_X$ ($P_X' = P_X$) and the last line uses its idempotency ($P_X P_X = P_X$).

Optimality of $\hat{y}$ as an Estimator of $E(y)$

$\hat{y}$ is an unbiased estimator of $E(y)$:

$$E(\hat{y}) = E(P_X y) = P_X E(y) = P_X X\beta = X\beta = E(y).$$

It can be shown that $\hat{y} = P_X y$ is the best estimator of $E(y)$ in the class of linear unbiased estimators, i.e., estimators of the form $My$ for $M$ satisfying

$$E(My) = E(y) \;\; \forall\, \beta \in \mathbb{R}^p \iff MX\beta = X\beta \;\; \forall\, \beta \in \mathbb{R}^p \iff MX = X.$$

Under the normal-theory Gauss-Markov Linear Model, $\hat{y} = P_X y$ is best among all unbiased estimators of $E(y)$.

Ordinary Least Squares (OLS) Estimation of $E(y)$

OLS: Find a vector $b^* \in \mathbb{R}^p$ such that $Q(b^*) \le Q(b) \;\; \forall\, b \in \mathbb{R}^p$, where

$$Q(b) \equiv \sum_{i=1}^{n} (y_i - x_{(i)}' b)^2.$$

Note that

$$Q(b) = \sum_{i=1}^{n} (y_i - x_{(i)}' b)^2 = (y - Xb)'(y - Xb) = \|y - Xb\|^2.$$

To minimize this sum of squares, we need to choose $b^* \in \mathbb{R}^p$ such that $Xb^*$ is the point in $C(X)$ that is closest to $y$. In other words, we need to choose $b^*$ such that

$$Xb^* = P_X y = X(X'X)^{-} X'y.$$

Clearly, choosing $b^* = (X'X)^{-} X'y$ will work.
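The orthogonality argument and the OLS solution can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original slides. It assumes the example reads $X = (1, 2)'$ and $y = (2, 3/4)'$, and it uses the Moore-Penrose inverse as one convenient choice of generalized inverse $(X'X)^{-}$. The names `projection_matrix` and `Q` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def projection_matrix(X):
    """Return P_X = X (X'X)^- X', the orthogonal projection onto C(X).

    Assumption: the Moore-Penrose inverse stands in for the generalized
    inverse; P_X is the same for any choice of generalized inverse.
    """
    X = np.asarray(X, dtype=float)
    return X @ np.linalg.pinv(X.T @ X) @ X.T

# Assumed reading of the slide example: X is 2 x 1, y has two entries.
X = np.array([[1.0], [2.0]])
y = np.array([2.0, 0.75])

P = projection_matrix(X)
y_hat = P @ y            # projection of y onto C(X)
residual = y - y_hat     # y - y_hat

# One solution of X b* = P_X y, namely b* = (X'X)^- X'y.
b_star = np.linalg.pinv(X.T @ X) @ X.T @ y

def Q(b):
    """Sum of squared errors Q(b) = ||y - Xb||^2."""
    r = y - X @ b
    return float(r @ r)

print("y_hat           =", y_hat)
print("y_hat'(y-y_hat) =", y_hat @ residual)
print("b*              =", b_star)
print("Q(b*)           =", Q(b_star))
```

Under these assumptions, the inner product of `y_hat` and `residual` should be zero up to floating-point error, `P` should satisfy `P @ P == P` and `P @ X == X`, and perturbing `b_star` in either direction should not decrease `Q`.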
Ordinary Least Squares and the Normal Equations

It can be shown that $Q(b^*) \le Q(b) \;\; \forall\, b \in \mathbb{R}^p$ if...

This document was uploaded on 03/27/2014.
