Pattern Classification and Quadratic Problems
Robert M. Freund
March 30, 2004
© 2004 Massachusetts Institute of Technology
1  Overview

• Pattern Classification, Linear Classifiers, and Quadratic Optimization
• Constructing the Dual of CQP
• The Karush-Kuhn-Tucker Conditions for CQP
• Insights from Duality and the KKT Conditions
• Pattern Classification without Strict Linear Separation

2  Pattern Classification, Linear Classifiers, and Quadratic Optimization

2.1  The Pattern Classification Problem

We are given:

• points $a_1, \ldots, a_k \in \mathbb{R}^n$ that have property "P"
• points $b_1, \ldots, b_m \in \mathbb{R}^n$ that do not have property "P"

We would like to use these $k + m$ points to develop a linear rule that can be used to predict whether or not other points $x$ have property P. In particular, we seek a vector $v$ and a scalar $\beta$ for which:

$$v^T a_i > \beta \quad \text{for all } i = 1, \ldots, k$$
$$v^T b_i < \beta \quad \text{for all } i = 1, \ldots, m$$

We will then use $v, \beta$ to predict whether or not another point $c$ has property P, using the rule:

• If $v^T c > \beta$, then we declare that $c$ has property P.
• If $v^T c < \beta$, then we declare that $c$ does not have property P.
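As a concrete illustration, the linear rule above can be sketched in a few lines of Python. The function name `classify` and the example numbers are illustrative choices, not from the notes:

```python
import numpy as np

def classify(v, beta, c):
    """Linear classification rule: declare that c has property P when
    v^T c > beta, and that it does not when v^T c < beta.  Points with
    v^T c == beta lie exactly on the hyperplane and are left undecided."""
    score = float(np.dot(v, c))
    if score > beta:
        return "P"
    elif score < beta:
        return "not P"
    return "undecided"

# A hypothetical rule in R^2: v = (1, 0), beta = 0.5 classifies points
# by their first coordinate.
v, beta = np.array([1.0, 0.0]), 0.5
print(classify(v, beta, np.array([2.0, 3.0])))  # v^T c = 2.0 > 0.5
print(classify(v, beta, np.array([0.0, 1.0])))  # v^T c = 0.0 < 0.5
```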
We therefore seek $v, \beta$ that define the hyperplane $H_{v,\beta} := \{ x \mid v^T x = \beta \}$ for which:

$$v^T a_i > \beta \quad \text{for all } i = 1, \ldots, k$$
$$v^T b_i < \beta \quad \text{for all } i = 1, \ldots, m$$

This is illustrated in Figure 1.

Figure 1: Illustration of the pattern classification problem.
2.2  The Maximal Separation Model

We seek $v, \beta$ that define the hyperplane $H_{v,\beta} := \{ x \mid v^T x = \beta \}$ for which:

$$v^T a_i > \beta \quad \text{for all } i = 1, \ldots, k$$
$$v^T b_i < \beta \quad \text{for all } i = 1, \ldots, m$$

We would like the hyperplane $H_{v,\beta}$ not only to separate the points with the two different properties, but to be as far away from the points $a_1, \ldots, a_k, b_1, \ldots, b_m$ as possible.

It is easy to derive via elementary analysis that the distance from the hyperplane $H_{v,\beta}$ to any point $a_i$ is equal to

$$\frac{v^T a_i - \beta}{\|v\|}.$$

Similarly, the distance from the hyperplane $H_{v,\beta}$ to any point $b_i$ is equal to

$$\frac{\beta - v^T b_i}{\|v\|}.$$

If we normalize the vector $v$ so that $\|v\| = 1$, then the minimum distance from the hyperplane $H_{v,\beta}$ to any of the points $a_1, \ldots, a_k, b_1, \ldots, b_m$ is:

$$\min \left\{ v^T a_1 - \beta, \ldots, v^T a_k - \beta, \; \beta - v^T b_1, \ldots, \beta - v^T b_m \right\}.$$

We therefore would like $v$ and $\beta$ to satisfy:

• $\|v\| = 1$, and
• $\min \{ v^T a_1 - \beta, \ldots, v^T a_k - \beta, \beta - v^T b_1, \ldots, \beta - v^T b_m \}$ is maximized.
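The distance formulas above can be sketched as follows. `min_separation` is a hypothetical helper name; the matrices `A` and `B` stack the points $a_i$ and $b_i$ as rows:

```python
import numpy as np

def min_separation(v, beta, A, B):
    """Minimum distance from the hyperplane H_{v,beta} = {x : v^T x = beta}
    to the points stacked as rows of A (the a_i's) and B (the b_i's),
    using dist(a_i) = (v^T a_i - beta)/||v|| and
          dist(b_i) = (beta - v^T b_i)/||v||.
    (Dividing by ||v|| makes the result independent of the scaling of v,
    which is what the normalization ||v|| = 1 achieves in the text.)"""
    norm_v = np.linalg.norm(v)
    dist_a = (A @ v - beta) / norm_v
    dist_b = (beta - B @ v) / norm_v
    return min(dist_a.min(), dist_b.min())

# Toy data: the hyperplane x_1 = 1 separates these two sets of points.
A = np.array([[3.0, 0.0], [2.0, 5.0]])
B = np.array([[0.0, 0.0], [-1.0, 2.0]])
print(min_separation(np.array([1.0, 0.0]), 1.0, A, B))
```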
This yields the following optimization model:

$$\begin{array}{llll}
\text{PCP:} & \underset{v, \beta, \delta}{\text{maximize}} & \delta & \\
& \text{s.t.} & v^T a_i - \beta \ge \delta, & i = 1, \ldots, k \\
& & \beta - v^T b_i \ge \delta, & i = 1, \ldots, m \\
& & \|v\| = 1, \quad v \in \mathbb{R}^n &
\end{array}$$

Now notice that PCP is not a convex optimization problem, due to the presence of the constraint "$\|v\| = 1$".

2.3  Convex Reformulation of PCP

To obtain a convex optimization problem equivalent to PCP, we perform the following transformation of variables:

$$x = \frac{v}{\delta}, \qquad \alpha = \frac{\beta}{\delta}.$$

Then notice that $\|x\| = \frac{\|v\|}{\delta} = \frac{1}{\delta}$, and so maximizing $\delta$ is equivalent to maximizing $\frac{1}{\|x\|}$, which is equivalent to minimizing $\|x\|$. Dividing each constraint of PCP by $\delta$ yields the following reformulation of PCP:

$$\begin{array}{lll}
\underset{x, \alpha}{\text{minimize}} & \|x\| & \\
\text{s.t.} & x^T a_i - \alpha \ge 1, & i = 1, \ldots, k \\
& \alpha - x^T b_i \ge 1, & i = 1, \ldots, m
\end{array}$$
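The reformulation can be solved numerically; here is a minimal sketch assuming SciPy is available. The notes do not prescribe a solver, so SLSQP stands in for any QP method; the function name `solve_pcp` and the toy data are illustrative. We minimize $\|x\|^2$ rather than $\|x\|$, which has the same minimizer but is smooth:

```python
import numpy as np
from scipy.optimize import minimize

def solve_pcp(A, B):
    """Solve the convex reformulation of PCP on small instances:
        minimize ||x||^2  s.t.  x^T a_i - alpha >= 1,  alpha - x^T b_i >= 1,
    where A and B stack the a_i's and b_i's as rows.  Then recover the
    PCP variables via v = x/||x||, beta = alpha/||x||, delta = 1/||x||."""
    n = A.shape[1]
    # decision vector z = (x, alpha)
    objective = lambda z: float(np.dot(z[:n], z[:n]))
    cons = [{"type": "ineq", "fun": (lambda z, a=a: z[:n] @ a - z[n] - 1.0)}
            for a in A]
    cons += [{"type": "ineq", "fun": (lambda z, b=b: z[n] - z[:n] @ b - 1.0)}
             for b in B]
    res = minimize(objective, np.zeros(n + 1), constraints=cons, method="SLSQP")
    x, alpha = res.x[:n], res.x[n]
    norm_x = np.linalg.norm(x)
    return x / norm_x, alpha / norm_x, 1.0 / norm_x

# Toy data, symmetric about the origin: the optimal hyperplane is
# x_1 = 0 with maximal separation delta = 2.
A = np.array([[2.0, 0.0], [3.0, 1.0]])
B = np.array([[-2.0, 0.0], [-3.0, -1.0]])
v, beta, delta = solve_pcp(A, B)
print(v, beta, delta)
```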

This note was uploaded on 12/04/2011 for the course ESD 15.094, taught by Professor Jie Sun during the Spring 2004 term at MIT.
