EE 649 Pattern Recognition, Spring 2008
Homework 2 - Solutions

1. (a) The optimal threshold x^* satisfies P(Y=1 | X=x^*) = P(Y=2 | X=x^*). Since the classes are equally likely, this is equivalent to p(x^* | Y=1) = p(x^* | Y=2). By inspection of the Cauchy class-conditional densities

    p(x | Y=i) = \frac{1}{\pi b} \cdot \frac{1}{1 + \left(\frac{x - a_i}{b}\right)^2}, \quad i = 1, 2,

it is clear that this will happen if and only if

    (x^* - a_2)^2 = (x^* - a_1)^2 \iff x^* = \frac{a_1 + a_2}{2}.

Therefore, assuming without loss of generality that a_1 < a_2,

    \epsilon^* = \int_{-\infty}^{x^*} p(x | Y=2) P(Y=2)\,dx + \int_{x^*}^{\infty} p(x | Y=1) P(Y=1)\,dx
               = 2 \int_{x^*}^{\infty} p(x | Y=1) P(Y=1)\,dx \quad \text{(by symmetry)}
               = \int_{\frac{a_1 + a_2}{2}}^{\infty} \frac{1}{\pi b} \cdot \frac{1}{1 + \left(\frac{x - a_1}{b}\right)^2}\,dx.

By making the substitution u = (x - a_1)/b, we obtain

    \epsilon^* = \frac{1}{\pi} \int_{\frac{a_2 - a_1}{2b}}^{\infty} \frac{1}{1 + u^2}\,du
               = \frac{1}{\pi} \Big[ \arctan u \Big]_{u = \frac{a_2 - a_1}{2b}}^{u \to \infty}
               = \frac{1}{2} - \frac{1}{\pi} \arctan\left(\frac{a_2 - a_1}{2b}\right),

which is the required result.

(b) We have

    \epsilon^*(w) = \frac{1}{2} - \frac{1}{\pi} \arctan\left(\frac{w}{2}\right),

where, by definition, w = (a_2 - a_1)/b > 0. The plot of this function can be seen in Figure 1. We can see that the Bayes error decays monotonically with increasing standard separation between the classes, i.e., with larger values of (a_2 - a_1)/b. For example, the Bayes error is halved (equal to 0.25) when w = 2, that is, when |a_2 - a_1| is equal to 2b units (b plays here a role similar to the standard deviation of Gaussian densities).

(c) From Figure 1, we can see that the maximum value of \epsilon^* is 0.5, which occurs for w = 0, that is, a_1 = a_2. This corresponds to the case where the class-conditional densities are equal, so that there is maximal confusion between the classes. A Bayes error of 0.5 means that the best one can do is equivalent to flipping a fair coin.

Figure 1: Bayes error as a function of standard separation between classes in the Cauchy case. [figure omitted]

2. From equation (65) in DHS, we have that

    x_0 = \frac{1}{2}(\mu_1 + \mu_0) - t\,(\mu_1 - \mu_0), \quad \text{where} \quad t = \frac{1}{\Delta^2} \ln \frac{P(Y=1)}{P(Y=0)}.    (1)

Here, \Delta^2 = (\mu_1 - \mu_0)^T \Sigma^{-1} (\mu_1 - \mu_0) is the squared Mahalanobis distance between the class means. If P(Y=1) = P(Y=0), then t = 0 and the decision hyperplane passes through the midpoint between \mu_1 and \mu_0.

On the other hand, if P(Y=1) \neq P(Y=0) (without loss of generality, let us assume P(Y=1) > P(Y=0), so that t > 0), then we can see that x_0 moves along the line defined by \mu_1 and \mu_0, towards \mu_0, according to the bias given by t. The critical point x_0 = \mu_0 corresponds to

    t\,(\mu_1 - \mu_0) = \frac{1}{2}(\mu_1 - \mu_0) \iff t = \frac{1}{2}.

For t > 1/2, the decision hyperplane will not pass between the means. From (1), we can see that this is equivalent to

    \ln \frac{P(Y=1)}{P(Y=0)} > \frac{1}{2}\Delta^2 \iff \frac{P(Y=1)}{P(Y=0)} > e^{\Delta^2 / 2}.

The skewness of the situation is due to a large difference (large ratio) between the a priori probabilities.
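As a quick sanity check, not part of the original solution, the closed-form Bayes error from problem 1 can be verified by Monte Carlo simulation: sample from each Cauchy class, classify with the threshold x^* = (a_1 + a_2)/2, and compare the empirical error rate to the formula. The parameter values a1, a2, b and the sample size below are arbitrary illustrative choices.

```python
# Monte Carlo check of the Bayes error formula
#   eps* = 1/2 - (1/pi) * arctan((a2 - a1) / (2b))
# for two equally likely Cauchy classes. Parameter values are arbitrary.
import math
import random

a1, a2, b = 0.0, 3.0, 1.0
x_star = (a1 + a2) / 2.0          # optimal threshold from part (a)
n = 200_000
random.seed(0)

errors = 0
for _ in range(n):
    y = random.choice((1, 2))     # equally likely classes
    a = a1 if y == 1 else a2
    # Inverse-CDF sampling from a Cauchy(a, b) density
    x = a + b * math.tan(math.pi * (random.random() - 0.5))
    y_hat = 1 if x < x_star else 2    # threshold rule (assumes a1 < a2)
    errors += (y_hat != y)

empirical = errors / n
closed_form = 0.5 - math.atan((a2 - a1) / (2 * b)) / math.pi
print(empirical, closed_form)     # the two should agree to within ~0.01
```

With these values the formula gives roughly 0.187, and the empirical rate should land within Monte Carlo noise of it.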
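The behavior of x_0 in problem 2 can be illustrated numerically in one dimension, where the squared Mahalanobis distance reduces to \Delta^2 = (\mu_1 - \mu_0)^2 / \sigma^2. This is a sketch only; the means, variance, and priors below are arbitrary choices, not values from DHS.

```python
# One-dimensional illustration of x0 = (mu0 + mu1)/2 - t*(mu1 - mu0),
# with t = ln(P(Y=1)/P(Y=0)) / Delta^2. All numbers are arbitrary.
import math

mu0, mu1, sigma = 0.0, 2.0, 1.0
delta2 = (mu1 - mu0) ** 2 / sigma ** 2    # squared Mahalanobis distance

def decision_point(p1, p0):
    """Return (x0, t) for priors p1 = P(Y=1), p0 = P(Y=0)."""
    t = math.log(p1 / p0) / delta2
    return 0.5 * (mu0 + mu1) - t * (mu1 - mu0), t

# Equal priors: t = 0, so x0 is the midpoint of the means.
x0_equal, t_equal = decision_point(0.5, 0.5)

# Skewed priors below the critical ratio e^{Delta^2/2}: t < 1/2,
# so x0 still lies between mu0 and mu1 (shifted towards mu0).
x0_mild, t_mild = decision_point(0.7, 0.3)

# Priors beyond the critical ratio: t > 1/2, and x0 falls outside
# the segment between the means.
p1 = 0.9999
x0_extreme, t_extreme = decision_point(p1, 1 - p1)
print(t_equal, t_mild, t_extreme)
```

Here \Delta^2 = 4, so the critical prior ratio is e^2 \approx 7.39; the ratio 0.7/0.3 \approx 2.33 stays below it (hyperplane between the means), while 0.9999/0.0001 far exceeds it (hyperplane past \mu_0).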
Instructor: Braga-Neto, Spring '08
