
Support Vector Machines and Kernels for Computational Biology

Asa Ben-Hur (1), Cheng Soon Ong (2,3), Sören Sonnenburg (4), Bernhard Schölkopf (3), Gunnar Rätsch (2)

1 Department of Computer Science, Colorado State University, Fort Collins, Colorado, United States of America; 2 Friedrich Miescher Laboratory, Max Planck Society, Tübingen, Germany; 3 Max Planck Institute for Biological Cybernetics, Tübingen, Germany; 4 Fraunhofer Institute FIRST, Berlin, Germany

Introduction

The increasing wealth of biological data coming from a large variety of platforms and the continued development of new high-throughput methods for probing biological systems require increasingly more sophisticated computational approaches. Putting all these data in simple-to-use databases is a first step; but realizing the full potential of the data requires algorithms that automatically extract regularities from the data, which can then lead to biological insight.

Many of the problems in computational biology are in the form of prediction: starting from prediction of a gene's structure, prediction of its function, interactions, and role in disease. Support vector machines (SVMs) and related kernel methods are extremely good at solving such problems [1–3]. SVMs are widely used in computational biology due to their high accuracy, their ability to deal with high-dimensional and large datasets, and their flexibility in modeling diverse sources of data [2,4–6].

The simplest form of a prediction problem is binary classification: trying to discriminate between objects that belong to one of two categories, positive (+1) or negative (−1). SVMs use two key concepts to solve this problem: large margin separation and kernel functions. The idea of large margin separation can be motivated by classification of points in two dimensions (see Figure 1).
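The binary classification setting described above can be sketched in a few lines. This is a minimal illustration, not taken from the article: it assumes scikit-learn's `SVC` with a linear kernel and an artificial two-cluster dataset, standing in for the two classes of points in Figure 1.

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated clusters of 2-D points, labelled +1 and -1
# (a hypothetical dataset, not the article's).
rng = np.random.default_rng(0)
pos = rng.normal(loc=[2.0, 2.0], scale=0.3, size=(20, 2))
neg = rng.normal(loc=[-2.0, -2.0], scale=0.3, size=(20, 2))
X = np.vstack([pos, neg])
y = np.array([+1] * 20 + [-1] * 20)

# A linear SVM draws the separating line as far as possible from
# the points of both classes -- the large-margin principle.
clf = SVC(kernel="linear", C=1.0).fit(X, y)

print(clf.predict([[1.5, 1.8], [-1.7, -2.2]]))  # one query point near each cluster
```

Because the clusters are well separated, the learned line classifies nearby query points correctly; the margin idea is what singles out this particular line among the many that separate the training data.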
A simple way to classify the points is to draw a straight line and call points lying on one side positive and on the other side negative. If the two sets are well separated, one would intuitively draw the separating line such that it is as far as possible away from the points in both sets (see Figures 2 and 3). This intuitive choice captures the idea of large margin separation, which is mathematically formulated in the section Classification with Large Margin.

Instead of the abstract idea of points in space, one can think of our data points as representing objects using a set of features derived from measurements performed on each object. For instance, in the case of Figures 1–5, there are two measurements for each object, depicted as points in a two-dimensional space. For large margin separation, it turns out that not the exact location but only the relative position or similarity of the points to each other is important. In the simplest case of linear classification, the similarity of two objects is computed by the dot-product (a.k.a. scalar or inner product).
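The dot-product similarity mentioned above is easy to see concretely. The short sketch below (an illustration added here, using NumPy) computes all pairwise dot products for three 2-D points, i.e., the linear kernel (Gram) matrix:

```python
import numpy as np

# Three 2-D data points; the linear kernel measures their pairwise
# similarity as dot products: k(x, z) = x . z.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

K = X @ X.T  # 3x3 matrix of all pairwise dot products
print(K)
```

Orthogonal points (the first two rows) get similarity 0, while each point's similarity to itself is its squared length; for large-margin classification only this matrix of relative similarities matters, not the absolute coordinates.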

