Stat 315A Homework 2 Solutions

November 13, 2008

1 Problem 1

(a) The classification rule for LDA is to classify as class 2 if $\delta_2(x) > \delta_1(x)$. This gives
\[
x^T \hat\Sigma^{-1} \hat\mu_2 - \tfrac{1}{2} \hat\mu_2^T \hat\Sigma^{-1} \hat\mu_2 + \log\tfrac{N_2}{N}
> x^T \hat\Sigma^{-1} \hat\mu_1 - \tfrac{1}{2} \hat\mu_1^T \hat\Sigma^{-1} \hat\mu_1 + \log\tfrac{N_1}{N},
\]
or equivalently
\[
x^T \hat\Sigma^{-1} (\hat\mu_2 - \hat\mu_1)
> \tfrac{1}{2} \hat\mu_2^T \hat\Sigma^{-1} \hat\mu_2 - \tfrac{1}{2} \hat\mu_1^T \hat\Sigma^{-1} \hat\mu_1 - \log\tfrac{N_2}{N} + \log\tfrac{N_1}{N},
\]
where $\hat\mu_k = \sum_{g_i = k} x_i / N_k$ and $\hat\Sigma = \sum_{k=1}^{K} \sum_{g_i = k} (x_i - \hat\mu_k)(x_i - \hat\mu_k)^T / (N - K)$.

(b) Recall from linear regression that $\hat\beta = (X^T X)^{-1} X^T Y$. Note here that both $X$ and $Y$ are centered; with the coding $y_i = -N/N_1$ for class 1 and $y_i = N/N_2$ for class 2 we have $\bar y = 0$, so
\[
\big((X - \bar x)^T Y\big)_j = \sum_{i=1}^{N} (x_{ij} - \bar x_j)\, y_i
= \sum_{g_i = 1} (x_{ij} - \bar x_j)\Big(-\frac{N}{N_1}\Big) + \sum_{g_i = 2} (x_{ij} - \bar x_j)\Big(\frac{N}{N_2}\Big)
= N(\hat\mu_2 - \hat\mu_1)_j + N \bar x_j - N \bar x_j
= N(\hat\mu_2 - \hat\mu_1)_j,
\]
that is, $X^T Y = N(\hat\mu_2 - \hat\mu_1)$.
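The identity in part (b) can be sanity-checked numerically; the sketch below uses made-up two-dimensional data (the point values are illustrative, not from the problem):

```python
# Numerical check of part (b): with the coding y_i = -N/N1 (class 1) and
# y_i = +N/N2 (class 2), the centered cross-product X^T Y equals
# N * (mu2_hat - mu1_hat).  Data values below are made up for illustration.

class1 = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]              # N1 = 3 points
class2 = [[6.0, 5.0], [7.0, 7.0], [5.0, 6.0], [6.0, 8.0]]  # N2 = 4 points
X = class1 + class2
N1, N2 = len(class1), len(class2)
N = N1 + N2

# Class means and overall mean
mu1 = [sum(x[j] for x in class1) / N1 for j in range(2)]
mu2 = [sum(x[j] for x in class2) / N2 for j in range(2)]
xbar = [sum(x[j] for x in X) / N for j in range(2)]

# The coding is already centered: N1 * (-N/N1) + N2 * (N/N2) = 0
y = [-N / N1] * N1 + [N / N2] * N2

# Centered cross-product ((X - xbar)^T Y)_j
xty = [sum((X[i][j] - xbar[j]) * y[i] for i in range(N)) for j in range(2)]

# Claimed identity: X^T Y = N (mu2_hat - mu1_hat)
claim = [N * (mu2[j] - mu1[j]) for j in range(2)]
print(xty, claim)  # the two vectors agree up to floating-point error
```

Because $\sum_i y_i = 0$, centering $X$ does not change $X^T Y$, which is why the derivation can work with either form.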

Continuing part (b), write $\bar x = \frac{1}{N}(N_1 \hat\mu_1 + N_2 \hat\mu_2)$, so that $\hat\mu_1 - \bar x = \frac{N_2}{N}(\hat\mu_1 - \hat\mu_2)$ and $\hat\mu_2 - \bar x = \frac{N_1}{N}(\hat\mu_2 - \hat\mu_1)$. Since $\sum_{g_i = k} (x_i - \hat\mu_k) = 0$, the cross terms vanish when we expand, and
\[
(X - \bar x)^T (X - \bar x)
= \sum_{i=1}^{N} (x_i - \bar x)(x_i - \bar x)^T
= \sum_{g_i = 1} (x_i - \hat\mu_1)(x_i - \hat\mu_1)^T + \sum_{g_i = 2} (x_i - \hat\mu_2)(x_i - \hat\mu_2)^T
+ \Big( N_1 \tfrac{N_2^2}{N^2} + N_2 \tfrac{N_1^2}{N^2} \Big) (\hat\mu_2 - \hat\mu_1)(\hat\mu_2 - \hat\mu_1)^T
= (N - 2)\hat\Sigma + \frac{N_1 N_2}{N} \hat\Sigma_B,
\]
where $\hat\Sigma_B = (\hat\mu_2 - \hat\mu_1)(\hat\mu_2 - \hat\mu_1)^T$.

(c) From part (b), the normal equations $X^T X \hat\beta = X^T Y$ read
\[
\Big[ (N - 2)\hat\Sigma + \frac{N_1 N_2}{N} \hat\Sigma_B \Big] \hat\beta = N(\hat\mu_2 - \hat\mu_1).
\]
Notice that $(\hat\mu_2 - \hat\mu_1)^T \hat\beta$ is a scalar, which we denote by $\alpha$, so $\hat\Sigma_B \hat\beta = \alpha (\hat\mu_2 - \hat\mu_1)$. Then
\[
(N - 2)\hat\Sigma \hat\beta = N(\hat\mu_2 - \hat\mu_1) - \frac{N_1 N_2}{N} \hat\Sigma_B \hat\beta
= \Big( N - \frac{N_1 N_2}{N} \alpha \Big) (\hat\mu_2 - \hat\mu_1),
\]
so $\hat\beta \propto \hat\Sigma^{-1} (\hat\mu_2 - \hat\mu_1)$. Thus, the regression coefficient is proportional to the LDA coefficient.

(d) To show that this holds for any distinct coding of $Y$, we must show that $X^T Y \propto (\hat\mu_2 - \hat\mu_1)$. WLOG we can consider $Y$ and $X$ to be centered; any coding with two distinct values, once centered, takes the form $-a/N_1$ on class 1 and $a/N_2$ on class 2 for some $a \ne 0$. Then
\[
X^T Y = \sum_{i=1}^{N} x_i y_i
= -\frac{a}{N_1} \sum_{g_i = 1} x_i + \frac{a}{N_2} \sum_{g_i = 2} x_i
= a(\hat\mu_2 - \hat\mu_1).
\]
Thus $\hat\beta \propto \hat\Sigma^{-1} (\hat\mu_2 - \hat\mu_1)$.

(e) From part (d), $\hat\beta = \gamma \hat\Sigma^{-1} (\hat\mu_2 - \hat\mu_1)$ for some $\gamma > 0$, and $\hat\beta_0 = -\bar x^T \hat\beta$, so
\[
\hat f(x) = \hat\beta_0 + \hat\beta^T x
= -\frac{\gamma}{N} (N_1 \hat\mu_1 + N_2 \hat\mu_2)^T \hat\Sigma^{-1} (\hat\mu_2 - \hat\mu_1)
+ \gamma (\hat\mu_2 - \hat\mu_1)^T \hat\Sigma^{-1} x.
\]
Thus, we classify to class 2 if $\hat f(x) > 0$, i.e. if
\[
(\hat\mu_2 - \hat\mu_1)^T \hat\Sigma^{-1} x > \frac{1}{N} (N_1 \hat\mu_1 + N_2 \hat\mu_2)^T \hat\Sigma^{-1} (\hat\mu_2 - \hat\mu_1).
\]
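The proportionality in part (c) can also be checked numerically. The sketch below reuses the same made-up data as before and solves both the $2 \times 2$ least-squares system and the LDA direction by Cramer's rule:

```python
# Numerical check of part (c): the least-squares coefficient beta_hat is
# proportional to the LDA direction Sigma_hat^{-1} (mu2_hat - mu1_hat).
# Data values are made up for illustration.

class1 = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
class2 = [[6.0, 5.0], [7.0, 7.0], [5.0, 6.0], [6.0, 8.0]]
X = class1 + class2
N1, N2 = len(class1), len(class2)
N = N1 + N2

mu1 = [sum(x[j] for x in class1) / N1 for j in range(2)]
mu2 = [sum(x[j] for x in class2) / N2 for j in range(2)]
xbar = [sum(x[j] for x in X) / N for j in range(2)]

# Pooled within-class covariance Sigma_hat (divisor N - K with K = 2)
S = [[0.0, 0.0], [0.0, 0.0]]
for group, mu in ((class1, mu1), (class2, mu2)):
    for x in group:
        for a in range(2):
            for b in range(2):
                S[a][b] += (x[a] - mu[a]) * (x[b] - mu[b])
S = [[S[a][b] / (N - 2) for b in range(2)] for a in range(2)]

def solve2(A, v):
    """Solve the 2x2 linear system A u = v by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(v[0] * A[1][1] - v[1] * A[0][1]) / det,
            (A[0][0] * v[1] - A[1][0] * v[0]) / det]

# LDA direction: Sigma_hat^{-1} (mu2_hat - mu1_hat)
d = solve2(S, [mu2[j] - mu1[j] for j in range(2)])

# Least-squares beta_hat: solve (Xc^T Xc) beta = Xc^T Y with centered X, Y
y = [-N / N1] * N1 + [N / N2] * N2
Xc = [[x[j] - xbar[j] for j in range(2)] for x in X]
XtX = [[sum(Xc[i][a] * Xc[i][b] for i in range(N)) for b in range(2)]
       for a in range(2)]
XtY = [sum(Xc[i][a] * y[i] for i in range(N)) for a in range(2)]
beta = solve2(XtX, XtY)

# beta and d should be parallel: their 2x2 cross product is (near) zero
print(beta, d)
```

The scalar $\gamma$ relating the two vectors depends on the coding and on $\alpha$, but the direction does not, which is the content of parts (c) and (d).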
But this is not the same as the LDA rule (below) unless $N_1 = N_2$:
\[
(\hat\mu_2 - \hat\mu_1)^T \hat\Sigma^{-1} x
> \tfrac{1}{2} \Big[ \hat\mu_2^T \hat\Sigma^{-1} \hat\mu_2 - \hat\mu_1^T \hat\Sigma^{-1} \hat\mu_1 \Big] + \log\tfrac{N_1}{N_2}.
\]
When $N_1 = N_2$, the log term vanishes and the right-hand side equals $\tfrac{1}{2}(\hat\mu_1 + \hat\mu_2)^T \hat\Sigma^{-1} (\hat\mu_2 - \hat\mu_1)$, which matches the regression threshold from part (e).

2 Problem 2

(a) Note that we have an optimization problem of the form
\[
\underset{\beta^+,\,\beta^-}{\text{minimize}} \ -L(\beta)
\quad \text{subject to} \quad
\sum_j |\beta_j| - t \le 0, \quad -\beta_j^+ \le 0, \quad -\beta_j^- \le 0, \quad j = 1, \dots, p,
\tag{1}
\]
where $\beta = \beta^+ - \beta^-$. Thus, we have $2p + 1$ inequality constraints. The Lagrangian for (1) is
\[
-L(\beta) + \lambda \Big( \sum_j (\beta_j^+ + \beta_j^-) - t \Big) - \sum_j \lambda_j^+ \beta_j^+ - \sum_j \lambda_j^- \beta_j^-,
\]
or equivalently, because $\lambda t$ does not depend on $\beta$,
\[
-L(\beta) + \lambda \sum_j (\beta_j^+ + \beta_j^-) - \sum_j \lambda_j^+ \beta_j^+ - \sum_j \lambda_j^- \beta_j^-.
\tag{2}
\]
The four Karush-Kuhn-Tucker optimality conditions are
\[
f_i(x) \le 0: \quad \sum_j |\beta_j| - t \le 0, \quad -\beta_j^+ \le 0, \ -\beta_j^- \le 0, \quad j = 1, \dots, p,
\]
\[
\lambda_i \ge 0: \quad \lambda \ge 0, \quad \lambda_j^+ \ge 0, \ \lambda_j^- \ge 0, \quad j = 1, \dots, p,
\]
\[
\lambda_i f_i(x) = 0: \quad \lambda \Big( \sum_j |\beta_j| - t \Big) = 0, \quad \lambda_j^+ \beta_j^+ = 0, \ \lambda_j^- \beta_j^- = 0, \quad j = 1, \dots, p,
\]
and stationarity of (2) in $(\beta^+, \beta^-)$:
\[
-\frac{\partial L}{\partial \beta_j} + \lambda - \lambda_j^+ = 0, \qquad
\frac{\partial L}{\partial \beta_j} + \lambda - \lambda_j^- = 0, \quad j = 1, \dots, p.
\]
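The KKT conditions of part 2(a) can be exhibited concretely on a toy problem. The sketch below uses a quadratic loss $-L(\beta) = \tfrac{1}{2}\|y - \beta\|^2$ (an identity design, made up here for illustration, not part of the problem), for which the penalized solution is soft-thresholding; the multipliers $\lambda_j^+ = \lambda + g_j$ and $\lambda_j^- = \lambda - g_j$ then follow from stationarity, where $g$ is the gradient of the loss:

```python
# Sketch of the KKT conditions in part 2(a) on a toy lasso problem.
# Loss 0.5 * ||y - beta||^2 and the data below are illustrative assumptions.

lam = 1.0
y = [3.0, 0.5, -2.0]

def soft(z, t):
    """Soft-threshold: the lasso solution for an identity design."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

beta = [soft(z, lam) for z in y]            # solution [2.0, 0.0, -1.0]
bplus = [max(b, 0.0) for b in beta]         # positive parts beta_j^+
bminus = [max(-b, 0.0) for b in beta]       # negative parts beta_j^-
g = [b - z for b, z in zip(beta, y)]        # gradient of the loss at beta

lam_plus = [lam + gj for gj in g]           # from stationarity in beta^+
lam_minus = [lam - gj for gj in g]          # from stationarity in beta^-

t = sum(abs(b) for b in beta)               # budget making the constraint active

# Dual feasibility: all multipliers nonnegative (equivalently |g_j| <= lambda)
print(all(lp >= 0 and lm >= 0 for lp, lm in zip(lam_plus, lam_minus)))
# Complementary slackness: lambda_j^+ beta_j^+ = 0 and lambda_j^- beta_j^- = 0
print(all(lp * bp == 0 and lm * bm == 0
          for lp, bp, lm, bm in zip(lam_plus, bplus, lam_minus, bminus)))
```

Note how the conditions recover the familiar subgradient picture: wherever $\beta_j \ne 0$ the gradient is pinned at $\mp\lambda$, and wherever $\beta_j = 0$ it merely satisfies $|g_j| \le \lambda$.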

