Stat 315A Homework 2 Solutions

November 13, 2008

1 Problem 1

(a) The classification rule for LDA is to classify as class 2 if $\delta_2(x) > \delta_1(x)$. This gives

$$x^T \hat{\Sigma}^{-1} \hat{\mu}_2 - \tfrac{1}{2} \hat{\mu}_2^T \hat{\Sigma}^{-1} \hat{\mu}_2 + \log\!\left(\tfrac{N_2}{N}\right) > x^T \hat{\Sigma}^{-1} \hat{\mu}_1 - \tfrac{1}{2} \hat{\mu}_1^T \hat{\Sigma}^{-1} \hat{\mu}_1 + \log\!\left(\tfrac{N_1}{N}\right),$$

or, rearranging,

$$x^T \hat{\Sigma}^{-1} (\hat{\mu}_2 - \hat{\mu}_1) > \tfrac{1}{2} \hat{\mu}_2^T \hat{\Sigma}^{-1} \hat{\mu}_2 - \tfrac{1}{2} \hat{\mu}_1^T \hat{\Sigma}^{-1} \hat{\mu}_1 - \log\!\left(\tfrac{N_2}{N}\right) + \log\!\left(\tfrac{N_1}{N}\right),$$

where $\hat{\mu}_k = \sum_{g_i = k} x_i / N_k$ and $\hat{\Sigma} = \sum_{k=1}^{K} \sum_{g_i = k} (x_i - \hat{\mu}_k)(x_i - \hat{\mu}_k)^T / (N - K)$.
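As a numerical sanity check of this equivalence (not part of the original solution), the sketch below simulates two Gaussian classes and verifies that the rearranged rule agrees with comparing the discriminant functions directly; the sample sizes, dimension, and means are arbitrary illustrative choices, and numpy is assumed available.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two simulated Gaussian classes (sizes, dimension, means are arbitrary).
N1, N2, p = 40, 60, 3
X1 = rng.normal(0.0, 1.0, size=(N1, p))
X2 = rng.normal(1.0, 1.0, size=(N2, p))
N = N1 + N2

mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
# Pooled within-class covariance with divisor N - K, here K = 2.
S = ((X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)) / (N - 2)
Si = np.linalg.inv(S)

def delta(x, mu, Nk):
    """LDA discriminant function delta_k(x)."""
    return x @ Si @ mu - 0.5 * mu @ Si @ mu + np.log(Nk / N)

# The rearranged inequality should agree with delta_2(x) > delta_1(x).
for _ in range(100):
    x = rng.normal(size=p)
    lhs = x @ Si @ (mu2 - mu1)
    rhs = (0.5 * mu2 @ Si @ mu2 - 0.5 * mu1 @ Si @ mu1
           - np.log(N2 / N) + np.log(N1 / N))
    assert (lhs > rhs) == (delta(x, mu2, N2) > delta(x, mu1, N1))
```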
(b) Recall from linear regression that $\hat{\beta} = (X^T X)^{-1} X^T Y$. Note here that both $X$ and $Y$ are centered, so $\bar{y} = 0$; with class 1 coded as $-N/N_1$ and class 2 as $N/N_2$,

$$(X^T Y)_j = \sum_{i=1}^{N} (x_{ij} - \bar{x}_j)\, y_i = \sum_{g_i = 1} (x_{ij} - \bar{x}_j)\left(-\frac{N}{N_1}\right) + \sum_{g_i = 2} (x_{ij} - \bar{x}_j)\left(\frac{N}{N_2}\right) = N(\hat{\mu}_2 - \hat{\mu}_1)_j + N\bar{x}_j - N\bar{x}_j = N(\hat{\mu}_2 - \hat{\mu}_1)_j,$$

so $X^T Y = N(\hat{\mu}_2 - \hat{\mu}_1)$.
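This identity is easy to confirm numerically (an illustrative setup, mirroring the one above; note that this coding makes $y$ centered automatically):

```python
import numpy as np

rng = np.random.default_rng(1)
N1, N2, p = 40, 60, 3
N = N1 + N2
X = np.vstack([rng.normal(0.0, 1.0, size=(N1, p)),
               rng.normal(1.0, 1.0, size=(N2, p))])
# Coding -N/N1 for class 1 and N/N2 for class 2; y sums to zero.
y = np.concatenate([np.full(N1, -N / N1), np.full(N2, N / N2)])

Xc = X - X.mean(axis=0)                      # center the predictors
mu1, mu2 = X[:N1].mean(axis=0), X[N1:].mean(axis=0)
assert np.isclose(y.sum(), 0.0)
assert np.allclose(Xc.T @ y, N * (mu2 - mu1))
```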
Similarly, for $X^T X$ write $\bar{x} = \frac{1}{N}(N_1 \hat{\mu}_1 + N_2 \hat{\mu}_2)$, so that $\hat{\mu}_1 - \bar{x} = \frac{N_2}{N}(\hat{\mu}_1 - \hat{\mu}_2)$ and $\hat{\mu}_2 - \bar{x} = \frac{N_1}{N}(\hat{\mu}_2 - \hat{\mu}_1)$. Then, with $\hat{\Sigma}_B = (\hat{\mu}_2 - \hat{\mu}_1)(\hat{\mu}_2 - \hat{\mu}_1)^T$ and using $\sum_{g_i = k}(x_i - \hat{\mu}_k) = 0$ within each class,

$$(N-2)\hat{\Sigma} + \frac{N_1 N_2}{N} \hat{\Sigma}_B = \sum_{g_i=1}\left[(x_i - \hat{\mu}_1) + \frac{N_2}{N}(\hat{\mu}_1 - \hat{\mu}_2)\right] x_i^T + \sum_{g_i=2}\left[(x_i - \hat{\mu}_2) + \frac{N_1}{N}(\hat{\mu}_2 - \hat{\mu}_1)\right] x_i^T = \sum_{g_i=1}(x_i - \bar{x})\, x_i^T + \sum_{g_i=2}(x_i - \bar{x})\, x_i^T = \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})^T,$$

which is exactly $X^T X$ for the centered $X$.

(c) Notice that $(\hat{\mu}_2 - \hat{\mu}_1)^T \hat{\beta}$ is a scalar; denote it by $\alpha$. Then from part (b), the normal equations $X^T X \hat{\beta} = X^T Y$ give

$$(N-2)\hat{\Sigma}\hat{\beta} = N(\hat{\mu}_2 - \hat{\mu}_1) - \frac{N_1 N_2}{N}\hat{\Sigma}_B\hat{\beta} = (\hat{\mu}_2 - \hat{\mu}_1)\left(N - \frac{N_1 N_2}{N}\alpha\right) \;\implies\; \hat{\beta} \propto \hat{\Sigma}^{-1}(\hat{\mu}_2 - \hat{\mu}_1).$$

Thus, the regression coefficient is proportional to the LDA coefficient.

(d) To show that this holds for any distinct coding of $Y$, we must show that $X^T Y \propto (\hat{\mu}_2 - \hat{\mu}_1)$. WLOG we can consider $Y$ and $X$ to be centered; after centering, any distinct coding takes the form $-a/N_1$ for class 1 and $a/N_2$ for class 2, for some $a \neq 0$. Then

$$X^T Y = \sum_{i=1}^{N} x_i y_i = -\frac{a}{N_1}\sum_{g_i=1} x_i + \frac{a}{N_2}\sum_{g_i=2} x_i = a(\hat{\mu}_2 - \hat{\mu}_1).$$

Thus $\hat{\beta} \propto \hat{\Sigma}^{-1}(\hat{\mu}_2 - \hat{\mu}_1)$.
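Both the decomposition in (b) and the proportionality in (c)/(d) can be checked on simulated data (again an illustrative setup, with the $-N/N_1$, $N/N_2$ coding):

```python
import numpy as np

rng = np.random.default_rng(1)
N1, N2, p = 40, 60, 3
N = N1 + N2
X = np.vstack([rng.normal(0.0, 1.0, size=(N1, p)),
               rng.normal(1.0, 1.0, size=(N2, p))])
y = np.concatenate([np.full(N1, -N / N1), np.full(N2, N / N2)])
Xc = X - X.mean(axis=0)

mu1, mu2 = X[:N1].mean(axis=0), X[N1:].mean(axis=0)
S = ((X[:N1] - mu1).T @ (X[:N1] - mu1)
     + (X[N1:] - mu2).T @ (X[N1:] - mu2)) / (N - 2)
SB = np.outer(mu2 - mu1, mu2 - mu1)

# Part (b): X^T X = (N - 2) * Sigma_hat + (N1 N2 / N) * Sigma_B.
assert np.allclose(Xc.T @ Xc, (N - 2) * S + N1 * N2 / N * SB)

# Parts (c)/(d): beta_hat is proportional to Sigma_hat^{-1} (mu2 - mu1).
beta = np.linalg.solve(Xc.T @ Xc, Xc.T @ y)
direction = np.linalg.solve(S, mu2 - mu1)
ratio = beta / direction
assert np.allclose(ratio, ratio[0])   # constant ratio => proportional
```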
(e) From part (d), $\hat{\beta} = \gamma\, \hat{\Sigma}^{-1}(\hat{\mu}_2 - \hat{\mu}_1)$ for some $\gamma$, and $\hat{\beta}_0 = -\bar{x}^T \hat{\beta}$, so

$$\hat{f}(x) = \hat{\beta}_0 + \hat{\beta}^T x = -\frac{1}{N}(N_1\hat{\mu}_1 + N_2\hat{\mu}_2)^T \gamma\, \hat{\Sigma}^{-1}(\hat{\mu}_2 - \hat{\mu}_1) + \gamma\,(\hat{\mu}_2 - \hat{\mu}_1)^T \hat{\Sigma}^{-1} x.$$

Thus, if we classify to class 2 when $\hat{f}(x) > 0$ (taking $\gamma > 0$), the rule is

$$(\hat{\mu}_2 - \hat{\mu}_1)^T \hat{\Sigma}^{-1} x > \frac{1}{N}(N_1\hat{\mu}_1 + N_2\hat{\mu}_2)^T \hat{\Sigma}^{-1}(\hat{\mu}_2 - \hat{\mu}_1).$$

But this is not the same as the LDA rule from part (a),

$$(\hat{\mu}_2 - \hat{\mu}_1)^T \hat{\Sigma}^{-1} x > \frac{1}{2}\left[\hat{\mu}_2^T \hat{\Sigma}^{-1} \hat{\mu}_2 - \hat{\mu}_1^T \hat{\Sigma}^{-1} \hat{\mu}_1\right] + \log(N_1/N_2),$$

unless $N_1 = N_2$, in which case the log term vanishes and both cutoffs equal $\frac{1}{2}(\hat{\mu}_1 + \hat{\mu}_2)^T \hat{\Sigma}^{-1}(\hat{\mu}_2 - \hat{\mu}_1)$.
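The two cutoffs can be compared directly in simulation; they coincide for balanced classes and differ otherwise (a minimal sketch, with arbitrary illustrative data):

```python
import numpy as np

def cutoffs(N1, N2, seed):
    """Return (regression cutoff, LDA cutoff) on simulated two-class data."""
    rng = np.random.default_rng(seed)
    p = 3
    X1 = rng.normal(0.0, 1.0, size=(N1, p))
    X2 = rng.normal(1.0, 1.0, size=(N2, p))
    N = N1 + N2
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    S = ((X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)) / (N - 2)
    Si = np.linalg.inv(S)
    t_reg = (N1 * mu1 + N2 * mu2) @ Si @ (mu2 - mu1) / N
    t_lda = 0.5 * (mu2 @ Si @ mu2 - mu1 @ Si @ mu1) + np.log(N1 / N2)
    return t_reg, t_lda

t_reg, t_lda = cutoffs(50, 50, seed=3)
assert np.isclose(t_reg, t_lda)          # identical when N1 == N2
t_reg, t_lda = cutoffs(30, 70, seed=3)
assert not np.isclose(t_reg, t_lda)      # differ in general
```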
2 Problem 2

(a) Note that we have an optimization problem of the form

$$\underset{\beta^+,\,\beta^-}{\text{minimize}} \;\; -L(\beta) \quad \text{subject to} \quad \sum_j |\beta_j| - t \le 0, \quad -\beta_j^+,\; -\beta_j^- \le 0, \quad j = 1, \dots, p, \tag{1}$$

where $\beta = \beta^+ - \beta^-$ with $\beta_j^+, \beta_j^- \ge 0$, so that $|\beta_j| = \beta_j^+ + \beta_j^-$. Thus we have $2p + 1$ inequality constraints. The Lagrangian for (1) is

$$-L(\beta) + \lambda\Big(\sum_j (\beta_j^+ + \beta_j^-) - t\Big) - \sum_j \lambda_j^+ \beta_j^+ - \sum_j \lambda_j^- \beta_j^-,$$

or equivalently, because the term $\lambda t$ does not depend on $\beta$,

$$-L(\beta) + \lambda \sum_j (\beta_j^+ + \beta_j^-) - \sum_j \lambda_j^+ \beta_j^+ - \sum_j \lambda_j^- \beta_j^-, \tag{2}$$

where $\lambda$, $\lambda_j^+$, $\lambda_j^-$ are the multipliers for the $2p+1$ constraints. The four Karush-Kuhn-Tucker optimality conditions are primal feasibility, dual feasibility, complementary slackness, and stationarity:

$$f_i(x) \le 0: \quad \sum_j |\beta_j| - t \le 0, \quad -\beta_j^+,\; -\beta_j^- \le 0, \quad j = 1, \dots, p,$$

$$\lambda_i \ge 0: \quad \lambda \ge 0, \quad \lambda_j^+,\; \lambda_j^- \ge 0, \quad j = 1, \dots, p,$$

$$\lambda_i f_i(x) = 0: \quad \lambda\Big(\sum_j (\beta_j^+ + \beta_j^-) - t\Big) = 0, \quad \lambda_j^+ \beta_j^+ = 0, \quad \lambda_j^- \beta_j^- = 0,$$

and the gradient of (2) with respect to $(\beta^+, \beta^-)$ vanishes:

$$-\frac{\partial L}{\partial \beta_j} + \lambda - \lambda_j^+ = 0, \qquad \frac{\partial L}{\partial \beta_j} + \lambda - \lambda_j^- = 0, \quad j = 1, \dots, p.$$
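For the special case of squared-error loss, $L(\beta) = -\frac{1}{2}\|y - X\beta\|^2$, these conditions collapse to the familiar subgradient equation $X^T(y - X\beta)/n = \lambda s$, with $s_j = \operatorname{sign}(\beta_j)$ on the active set and $|s_j| \le 1$ otherwise. The sketch below checks this numerically with scikit-learn's lasso solver in the equivalent Lagrange form, where the penalty level plays the role of the multiplier $\lambda$ above (the data, penalty value, and use of sklearn are illustrative assumptions, not part of the original problem):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0, 0.0, 0.0, 0.5])   # illustrative truth
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Lagrange form: minimize (1/2n) ||y - X b||^2 + alpha * ||b||_1.
alpha = 0.1
b = Lasso(alpha=alpha, fit_intercept=False, tol=1e-10).fit(X, y).coef_

# KKT stationarity: X_j^T (y - Xb) / n equals alpha * sign(b_j) on the
# active set; on the inactive set it is bounded by alpha in magnitude.
grad = X.T @ (y - X @ b) / n
active = b != 0
assert np.allclose(grad[active], alpha * np.sign(b[active]), atol=1e-6)
assert np.all(np.abs(grad[~active]) <= alpha + 1e-8)
```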