Support Vector Machines
EE219: Large Scale Data Mining
Professor Roychowdhury

Summary
- Review
  - SVM basics
  - Calculate the margin
- Hard-margin SVM
  - Dual problem and optimal solution
- Soft-margin SVM
  - Hinge loss
  - Dual problem and optimal solution
- Nonlinear
  - Lifting a vector
  - Gram matrix
  - Kernel

Review SVM: basics
A Support Vector Machine (SVM) is a supervised learning model trained for classification or regression tasks. As a binary classifier, it is trained to find a hyperplane such that the distance from the hyperplane to the nearest data point on either side is maximized.
[Figure: plot with axes x and y]
- The distance between $w^T x - b = 1$ and $w^T x - b = -1$ is $\frac{2}{\sqrt{w^T w}}$.
- Maximizing the margin is equivalent to minimizing $\frac{1}{2} w^T w$.
- When slack variables $\xi_i$ are introduced, the objective function to minimize becomes $\frac{1}{2} w^T w + \lambda \sum_{i=1}^{n} \xi_i$.

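As a rough numerical illustration of the soft-margin objective on this slide, the sketch below evaluates half of $w^T w$ plus $\lambda$ times the summed slacks for one fixed hyperplane. The toy points, the values of `w`, `b`, and `lam`, and the function name `soft_margin_objective` are hypothetical. It also assumes each slack takes its smallest feasible value, $\xi_i = \max(0, 1 - y_i (w^T x_i - b))$, which the slide itself does not state.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def soft_margin_objective(w, b, X, y, lam):
    """Return (objective, slacks) for 0.5 * w.w + lam * sum(slacks).

    Assumption: slack_i = max(0, 1 - y_i * (w.x_i - b)), i.e. the smallest
    slack that satisfies y_i * (w.x_i - b) >= 1 - slack_i.
    """
    slacks = [max(0.0, 1.0 - yi * (dot(w, xi) - b)) for xi, yi in zip(X, y)]
    return 0.5 * dot(w, w) + lam * sum(slacks), slacks


# Hypothetical toy data: two points per class in 2-D.
X = [(2.0, 2.0), (3.0, 1.0), (0.0, 0.0), (1.5, 1.0)]
y = [1, 1, -1, -1]
w, b, lam = (1.0, 1.0), 3.0, 0.5

objective, slacks = soft_margin_objective(w, b, X, y, lam)
margin = 2.0 / math.sqrt(dot(w, w))  # distance between the two margin lines

print("slacks:", slacks)
print("objective:", objective)
print("margin:", margin)
```

Only the last point falls inside the margin, so it is the only one with a nonzero slack. Increasing `lam` penalizes such violations more heavily relative to the margin term.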
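Review SVM: calculate the margin
[Figure: lines $l_1: w^T x + b = -1$ and $l_2: w^T x + b = 1$ in the x-y plane, with point $A = c_1 w$ on $l_1$ and point $B = c_2 w$ on $l_2$]
- Point A on $l_1$ and point B on $l_2$ satisfy:
  \[ w^T (c_1 w) + b = -1 \quad (1) \]
  \[ w^T (c_2 w) + b = 1 \quad (2) \]
- The distance $D(A, B)$ between points A and B is also the distance between lines $l_1$ and $l_2$:
  \[ D(A, B) = \|c_2 w - c_1 w\|_2 = (c_2 - c_1) \|w\|_2 = \frac{2}{w^T w} \|w\|_2 = \frac{2}{\|w\|_2} \]
  The third equality uses $(c_2 - c_1) w^T w = 2$, obtained by subtracting (1) from (2).
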
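The margin derivation on this slide can be checked numerically. The following is a minimal sketch with hypothetical values for `w` and `b`: it solves equations (1) and (2) for $c_1$ and $c_2$, builds the points $A = c_1 w$ and $B = c_2 w$, and compares their Euclidean distance with the closed form $2 / \|w\|_2$.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Hypothetical hyperplane parameters, chosen only for illustration.
w, b = (3.0, 4.0), 2.0
ww = dot(w, w)

# Equations (1) and (2): c1 * (w.w) + b = -1 and c2 * (w.w) + b = 1.
c1 = (-1.0 - b) / ww
c2 = (1.0 - b) / ww

A = tuple(c1 * wi for wi in w)  # point on l1
B = tuple(c2 * wi for wi in w)  # point on l2

distance = math.dist(A, B)           # ||c2 w - c1 w||_2
closed_form = 2.0 / math.sqrt(ww)    # 2 / ||w||_2

print("distance between A and B:", distance)
print("2 / ||w||:", closed_form)
```

Because A and B both lie along the direction of `w`, which is normal to both lines, the segment AB is perpendicular to them, so its length is the distance between the lines.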
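Hard-margin SVM: Dual problem
As stated in the previous lecture, for a binary classification problem in which the $N$ samples are linearly separable, the separability requirement can be written as $N$ constraints in an optimization problem, with labels
\[ y_i = \begin{cases} 1 & \text{if } x_i \in C_1 \\ -1 & \text{if } x_i \in C_2 \end{cases} \]
The max-margin classifier is then obtained from a minimization problem with cost function $\frac{1}{2} w^T w$, and the whole problem can be solved through its dual.
Primal problem
\[ \text{minimize: } \frac{1}{2} w^T w \quad \text{s.t. } y_i (w^T x_i + b) \ge 1, \; i = 1, \dots, N \]
Dual problem
\[ \text{maximize: } -\frac{1}{2} \alpha^T Q \alpha + \mathbf{1}^T \alpha \quad \text{s.t. } \alpha \ge 0 \text{ and } y^T \alpha = 0 \]
Here $Q$ is the $N \times N$ matrix with entries $Q_{ij} = y_i y_j x_i^T x_j$, as in the standard derivation of the SVM dual.
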
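To make the primal and dual problems concrete, here is a small sketch on a hypothetical two-point dataset, where the dual can be solved by hand. It assumes the standard definition $Q_{ij} = y_i y_j x_i^T x_j$ and the standard stationarity result $w = \sum_i \alpha_i y_i x_i$, neither of which is derived on this slide. With one point per class, $y^T \alpha = 0$ forces $\alpha_1 = \alpha_2 = a$, the dual objective reduces to $2a - \frac{1}{2} a^2 \|x_1 - x_2\|^2$, and its maximizer is $a = 2 / \|x_1 - x_2\|^2$.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def dual_objective(alpha, X, y):
    """-0.5 * alpha^T Q alpha + 1^T alpha, assuming Q_ij = y_i y_j x_i.x_j."""
    n = len(X)
    quad = sum(
        alpha[i] * alpha[j] * y[i] * y[j] * dot(X[i], X[j])
        for i in range(n)
        for j in range(n)
    )
    return -0.5 * quad + sum(alpha)


# Hypothetical data: one point in C1 (y = 1), one in C2 (y = -1).
X = [(2.0, 2.0), (0.0, 1.0)]
y = [1, -1]

# Analytic maximizer of the two-point dual: alpha_1 = alpha_2 = a.
diff = [p - q for p, q in zip(X[0], X[1])]
a = 2.0 / dot(diff, diff)
alpha = [a, a]

# Recover the primal solution (assumed stationarity: w = sum_i alpha_i y_i x_i).
w = [sum(alpha[i] * y[i] * X[i][k] for i in range(len(X))) for k in range(2)]
# Both points are support vectors, so y_1 * (w.x_1 + b) = 1 gives b.
b = y[0] - dot(w, X[0])

primal_value = 0.5 * dot(w, w)
dual_value = dual_objective(alpha, X, y)

print("alpha:", alpha)
print("w:", w, "b:", b)
print("primal objective:", primal_value)
print("dual objective:", dual_value)
```

The two printed objective values agree up to floating-point rounding, which is the behavior expected when the dual is used to solve the primal. For larger datasets the dual would normally be handed to a quadratic programming solver rather than solved by hand.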
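Hard-margin SVM: maximizing the margin
- The Lagrange function for the primal problem can be written as
  \[ L(w, b, \alpha) = \frac{1}{2} w^T w + \sum_{i=1}^{N} \alpha_i \left(1 - y_i (w^T x_i + b)\right) \]
- $\alpha \in \mathbb{R}^N$ is the vector of Lagrange multipliers ($\alpha_i \ge 0$); we hope to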