CSE 6740 Lecture 8
How Do I Predict a Discrete Variable? II (Classification)
Alexander Gray
agray@cc.gatech.edu
Georgia Institute of Technology
Today

1. More classification methods (How can I predict a discrete variable?)
Support Vector Machine

Now let's choose a different criterion: find the hyperplane which maximizes the distance from the hyperplane to the closest point of either class. We call this distance the margin. Points on the margin are called support vectors. Let's begin by assuming the classes are linearly separable.
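As an illustration of this criterion (a sketch, not part of the original slides), a maximum-margin linear classifier can be fit to separable toy data with scikit-learn; the data, the random seed, and the large value of C used to approximate the hard-margin case are all assumptions made here for demonstration.

# Minimal sketch: fitting a (near) hard-margin linear SVM on separable toy data.
# A very large C approximates the hard-margin (separable) case described above.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 0.5, (20, 2)),    # class +1 cluster
               rng.normal(-2.0, 0.5, (20, 2))])  # class -1 cluster
y = np.array([1] * 20 + [-1] * 20)

clf = SVC(kernel="linear", C=1e6)  # large C ~ hard margin
clf.fit(X, y)

print("support vectors (points on the margin):\n", clf.support_vectors_)
print("beta =", clf.coef_.ravel(), " beta_0 =", clf.intercept_[0])
print("margin (distance to closest point) =", 1.0 / np.linalg.norm(clf.coef_))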
Support Vector Machine

The hyperplane which maximizes the margin is given by finding

    \max_{\beta_0, \beta} \; m \quad \text{subject to} \quad \frac{1}{\|\beta\|} \, y_i(\beta_0 + \beta^T x_i) \ge m, \;\; \forall i.        (1)

Equivalently, the constraints can be written as $y_i(\beta_0 + \beta^T x_i) \ge m \|\beta\|$. Since for any $\beta_0$ and $\beta$ satisfying these inequalities, any positively scaled multiple satisfies them too, we can arbitrarily set $\|\beta\| = 1/m$.
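A short worked step, added here only to bridge to the next slide (it is not in the original), showing how that choice of scaling simplifies the problem:

% Substituting the choice \|\beta\| = 1/m into the rescaled constraints gives
%   y_i(\beta_0 + \beta^T x_i) \ge m\|\beta\| = m \cdot (1/m) = 1,
% and maximizing the margin m = 1/\|\beta\| is the same as minimizing \|\beta\|
% (equivalently \tfrac{1}{2}\|\beta\|^2, a smooth quadratic objective):
\max_{\beta_0,\beta} \; m \;\; \text{s.t.} \;\; y_i(\beta_0 + \beta^T x_i) \ge m\|\beta\|
\quad \Longleftrightarrow \quad
\min_{\beta_0,\beta} \; \tfrac{1}{2}\|\beta\|^2 \;\; \text{s.t.} \;\; y_i(\beta_0 + \beta^T x_i) \ge 1, \;\; \forall i.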
Support Vector Machine

Thus the optimization problem is equivalent to

    \min_{\beta_0, \beta} \; \frac{1}{2} \|\beta\|^2 \quad \text{subject to} \quad y_i(\beta_0 + \beta^T x_i) \ge 1, \;\; \forall i.        (2)

It turns out this optimization problem is a quadratic programming problem (quadratic objective function with linear constraints), a standard type of optimization problem for which methods exist for finding the global optimum. The theory of convex optimization tells us there is an equivalent way to write this optimization problem (its dual formulation).
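As a sketch of how (2) can be handed to a generic QP/convex solver (an illustration, not the lecture's code; the cvxpy library and the toy data are assumptions):

# Sketch: solving the hard-margin primal QP (2) with a generic convex solver.
# Assumes X (n x d) with labels y in {-1, +1} and that the classes are separable.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 0.5, (20, 2)), rng.normal(-2.0, 0.5, (20, 2))])
y = np.array([1.0] * 20 + [-1.0] * 20)
n, d = X.shape

beta = cp.Variable(d)
beta0 = cp.Variable()

# Constraints of (2): y_i (beta_0 + beta^T x_i) >= 1 for all i.
constraints = [cp.multiply(y, X @ beta + beta0) >= 1]
objective = cp.Minimize(0.5 * cp.sum_squares(beta))  # (1/2) ||beta||^2
cp.Problem(objective, constraints).solve()

print("beta =", beta.value, " beta_0 =", beta0.value)
print("margin =", 1.0 / np.linalg.norm(beta.value))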
Support Vector Machine

Let g(x) denote the optimal (maximum margin) hyperplane. Let $\langle x_i, x_{i'} \rangle$ denote the inner product of $x_i$ and $x_{i'}$. Then

    \beta_j = \sum_{i=1}^{N} \alpha_i y_i x_{ij}        (3)

where $\alpha$ is the vector of weights that maximizes

    \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{i'=1}^{N} \alpha_i \alpha_{i'} y_i y_{i'} \langle x_i, x_{i'} \rangle        (4)

subject to

    \alpha_i \ge 0 \quad \text{and} \quad \sum_i \alpha_i y_i = 0.        (5)
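A sketch of solving the dual (4)-(5) directly and recovering beta via (3); as before, cvxpy and the toy data are assumptions made for illustration, not the lecture's own code:

# Sketch: solving the dual QP (4)-(5), then recovering beta from (3).
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 0.5, (20, 2)), rng.normal(-2.0, 0.5, (20, 2))])
y = np.array([1.0] * 20 + [-1.0] * 20)
n = len(y)

# G[i, i'] = y_i y_{i'} <x_i, x_{i'}>; positive semidefinite by construction (G = M M^T).
M = y[:, None] * X
G = M @ M.T

alpha = cp.Variable(n)
objective = cp.Maximize(cp.sum(alpha) - 0.5 * cp.quad_form(alpha, cp.psd_wrap(G)))
constraints = [alpha >= 0, y @ alpha == 0]   # the constraints in (5)
cp.Problem(objective, constraints).solve()

a = alpha.value
beta = (a * y) @ X                           # equation (3)
support = np.where(a > 1e-6)[0]              # support vectors have alpha_i > 0
k = support[0]                               # any support vector satisfies y_k(beta_0 + beta^T x_k) = 1
beta0 = y[k] - X[k] @ beta
print("beta =", beta, " beta_0 =", beta0, " support vectors:", support)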
Support Vector Machine

However, for realistic problems we must relax the assumption that the classes are linearly separable. In the primal formulation, instead of minimizing $\frac{1}{2}\|\beta\|^2$ ...
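The preview cuts the slide off at this point. As a hedged sketch only, the standard soft-margin relaxation adds slack variables xi_i with a penalty C; this is the usual way this step proceeds, but it is offered here as an assumption rather than the lecture's own continuation, and the data and value of C below are likewise assumptions:

# Sketch of the standard soft-margin primal (assumed; the slide's own text is cut off):
#   minimize    (1/2) ||beta||^2 + C * sum_i xi_i
#   subject to  y_i (beta_0 + beta^T x_i) >= 1 - xi_i,   xi_i >= 0.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(1.0, 1.0, (30, 2)), rng.normal(-1.0, 1.0, (30, 2))])  # overlapping classes
y = np.array([1.0] * 30 + [-1.0] * 30)
n, d = X.shape
C = 1.0  # assumed penalty on margin violations

beta, beta0, xi = cp.Variable(d), cp.Variable(), cp.Variable(n)
constraints = [cp.multiply(y, X @ beta + beta0) >= 1 - xi, xi >= 0]
objective = cp.Minimize(0.5 * cp.sum_squares(beta) + C * cp.sum(xi))
cp.Problem(objective, constraints).solve()

print("beta =", beta.value, " beta_0 =", beta0.value)
print("points violating the margin:", int(np.sum(xi.value > 1e-6)))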