10/14/13
Kernels
Definition: A function k(x, z) that can be expressed as a dot
product in some feature space is called a kernel.
More on kernel functions
In other words, k(x, z) is a kernel if there exists φ: X → F
such that

    k(x, z) = φ(x)ᵀ φ(z)
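The definition can be checked concretely for the quadratic kernel k(x, z) = (xᵀz)² on 2-d inputs, whose feature map can be written out explicitly. A minimal sketch (the kernel choice and the feature map below are illustrative, not from the slides):

```python
import numpy as np

def quadratic_kernel(x, z):
    """k(x, z) = (x . z)^2, computed without building the feature space."""
    return np.dot(x, z) ** 2

def phi(x):
    """Explicit feature map for the quadratic kernel on 2-d inputs:
    phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2), so phi(x) . phi(z) = (x . z)^2."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])
print(quadratic_kernel(x, z))   # (1*3 + 2*0.5)^2 = 16.0
print(np.dot(phi(x), phi(z)))   # same value, via the feature space
```

The point of the kernel trick is the left-hand computation: it costs O(d) regardless of how large the feature space F is.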
9/30/13
Loss and regularization
SVMs for unbalanced data
Chapter 7
Recall that ridge regression can be expressed in the form

    minimize over w, b:   Σ_{i=1}^n L(yᵢ, f(xᵢ)) + λ‖w‖²

L(y, f) is the loss function: it assigns a penalty for the
discrepancy between label and prediction.
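With the squared loss L(y, f) = (y − f)² and f(x) = wᵀx, the objective above has a closed-form minimizer, w = (XᵀX + λI)⁻¹Xᵀy. A minimal sketch, assuming no bias term b and a made-up dataset (both are simplifications for illustration):

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Minimize sum_i (y_i - w . x_i)^2 + lam * ||w||^2 in closed form:
    w = (X^T X + lam * I)^(-1) X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Noiseless data generated by w_true = (2, -1), for illustration.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
w_true = np.array([2.0, -1.0])
y = X @ w_true

w = ridge_fit(X, y, lam=1e-6)   # small lam -> close to plain least squares
print(w)                        # approximately [2, -1]
```

Larger values of λ shrink w toward zero, trading training accuracy for a smoother, more regularized model.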
10/2/13
A 1-dimensional problem
SVMs and kernel machines: from linear
to non-linear classifiers
Chapter 7
What would an SVM do with this data?
[Figure: 1-d data around the decision boundary x = 0, with the margin marked]
A harder dataset
What now?
[Figure: a harder 1-d dataset around x = 0, with the positive and negative planes marked]
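One way to see what "from linear to non-linear" means in one dimension: when one class surrounds the other, no single threshold on x separates them, but mapping x to (x, x²) makes the classes linearly separable. The dataset and the threshold below are made up for illustration:

```python
# A 1-d dataset where one class surrounds the other, so no threshold
# on x alone separates them (illustrative, not the slides' data).
xs     = [-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0]
labels = [ 1,    1,   -1,  -1,  -1,   1,   1 ]   # +1 far from 0, -1 near 0

# After mapping x -> (x, x^2), the classes are separated by the
# horizontal line x^2 = 2, i.e. classify as +1 when x^2 > 2.
def predict(x, threshold=2.0):
    return 1 if x ** 2 > threshold else -1

print(all(predict(x) == y for x, y in zip(xs, labels)))  # True
```

Kernels automate exactly this idea: they let an SVM work in such a feature space without constructing it explicitly.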
9/24/13
Large margin classifiers
Large margin classifiers: Support Vector
Machines
Perceptron: find a hyperplane that separates the two classes.
Support Vector Machine (SVM): a separating hyperplane with a
large margin.
[Figure: separating hyperplane with the margin marked]
An intuitive concept that is backed by theory.
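The margin in the figure can be computed directly: it is the smallest distance from any training point to the hyperplane, min over i of yᵢ(wᵀxᵢ + b)/‖w‖, which is positive exactly when the hyperplane separates the classes. A sketch with an illustrative hyperplane and dataset (not from the slides):

```python
import numpy as np

def geometric_margin(w, b, X, y):
    """Smallest signed distance y_i * (w . x_i + b) / ||w|| over the data;
    positive iff the hyperplane separates the two classes."""
    return np.min(y * (X @ w + b) / np.linalg.norm(w))

X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, 0.0]])
y = np.array([1, 1, -1, -1])
w = np.array([1.0, 1.0])   # the hyperplane x1 + x2 = 0

print(geometric_margin(w, 0.0, X, y))   # sqrt(2), set by the closest points
```

The perceptron stops at any separating hyperplane; the SVM chooses the one that maximizes this quantity.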
9/20/13
Linear regression
Linear models: Linear regression
We would like a more informed way of choosing the weight
vector than the perceptron algorithm.
Goal: the predicted values should be as close as possible to the labels.
We express this using the following cost function.
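A standard cost function for this goal is the sum of squared errors, Σᵢ (yᵢ − wᵀxᵢ)². A minimal sketch (the bias-as-a-constant-feature trick and the tiny dataset are illustrative choices, not from the slides):

```python
import numpy as np

def squared_error(w, X, y):
    """Sum of squared differences between predictions X @ w and labels y."""
    r = X @ w - y
    return float(r @ r)

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # first column = bias feature
y = np.array([2.0, 3.0, 4.0])                       # exactly y = 1 + x

# Least squares picks the w minimizing squared_error(w, X, y).
w_best, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_best)                       # approximately [1, 1]
print(squared_error(w_best, X, y))  # approximately 0 on this exact-fit data
```

Unlike the perceptron's update rule, this chooses w by explicitly optimizing a criterion on the training data.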
9/3/13
Preliminaries
Linear models: the perceptron and
closest centroid algorithms
Definition: The Euclidean dot product between two vectors is
the expression

    wᵀx = Σ_{i=1}^d wᵢ xᵢ

The dot product is also referred to as the inner product or scalar
product.
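In code, the definition is a single sum over coordinates (the vectors below are made up for illustration):

```python
import numpy as np

def dot(w, x):
    """Euclidean dot product: sum over i of w_i * x_i."""
    return sum(wi * xi for wi, xi in zip(w, x))

w = [1.0, -2.0, 3.0]
x = [4.0, 0.5, 2.0]
print(dot(w, x))      # 1*4 - 2*0.5 + 3*2 = 9.0
print(np.dot(w, x))   # the same, using numpy
```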
9/4/13
Using classifiers
Evaluating and using ML classifiers
Chapter 2
At this point we have learned two classification algorithms:
- The closest centroid classifier
- The perceptron algorithm
How do we measure how well they perform?
[Figure: positive and negative points with class centroids p and n; the classifier uses the weight vector w = p − n and the midpoint (p + n)/2]
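The decision rule sketched in the figure can be written out directly: compute the class centroids p and n, take w = p − n, and classify a point by which side of the hyperplane through (p + n)/2 it falls on. A sketch with made-up data:

```python
import numpy as np

def fit_closest_centroid(X, y):
    """Centroids p and n of the two classes; the decision boundary is the
    hyperplane through (p + n)/2 with normal w = p - n."""
    p = X[y == 1].mean(axis=0)
    n = X[y == -1].mean(axis=0)
    return p - n, (p + n) / 2

def predict(w, midpoint, x):
    # Positive side of the hyperplane -> class +1, otherwise -1.
    return 1 if w @ (x - midpoint) > 0 else -1

X = np.array([[2.0, 2.0], [3.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, mid = fit_closest_centroid(X, y)
print(predict(w, mid, np.array([2.0, 1.5])))    # +1: closer to p
print(predict(w, mid, np.array([-1.5, -1.0])))  # -1: closer to n
```

This rule is equivalent to assigning each point to its nearest centroid in Euclidean distance, which is why it yields a linear decision boundary.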
10/7/13
Kernels
Kernel-based learning algorithms
Chapter 7
Definition: A function k(x, z) that can be expressed as a dot
product in some feature space is called a kernel.
In other words, there exists φ: X → F such that

    k(x, z) = φ(x)ᵀ φ(z)
10/7/13
Handling more than two classes
Beyond Binary Classification
Chapter 3
- Some classifier methods can only be used for binary
classification.
- Can we somehow use a binary classifier to do multi-class
classification?
- How to evaluate multi-class classifiers?
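One standard answer to the second question (the slides do not say which scheme they use) is one-vs-rest: train one binary scorer per class, each distinguishing that class from all the others, and predict the class whose scorer is most confident. A sketch using the closest-centroid score as the binary learner, on made-up data:

```python
import numpy as np

def train_one_vs_rest(X, y, train_binary):
    """One binary scorer per class, each trained on 'class c vs the rest'."""
    scorers = {}
    for c in sorted(set(y)):
        y_binary = np.where(y == c, 1, -1)
        scorers[c] = train_binary(X, y_binary)
    return scorers

def predict(scorers, x):
    # The class whose scorer gives x the highest score.
    return max(scorers, key=lambda c: scorers[c](x))

def centroid_scorer(X, y):
    """Closest-centroid score w . (x - midpoint), used as the binary learner."""
    p, n = X[y == 1].mean(axis=0), X[y == -1].mean(axis=0)
    w, mid = p - n, (p + n) / 2
    return lambda x: w @ (x - mid)

X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 0.0],
              [5.2, 0.1], [0.0, 5.0], [0.2, 5.1]])
y = np.array([0, 0, 1, 1, 2, 2])
scorers = train_one_vs_rest(X, y, centroid_scorer)
print([predict(scorers, x) for x in X])   # recovers the training labels
```

With k classes this trains k binary classifiers; the alternative one-vs-one scheme trains k(k−1)/2 of them, one per pair of classes.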
10/29/13
Probability theory
A crash course in probability and Naïve
Bayes classification
Chapter 9
Random variable: a variable whose possible values are numerical
outcomes of a random phenomenon.
Examples: a person's height, the outcome of a coin toss.
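The coin-toss example can be simulated: the random variable maps heads to 1 and tails to 0 (the numeric encoding and the fair coin are illustrative choices), and the sample mean of many tosses approaches the expected value 0.5:

```python
import random

def coin_toss():
    """A random variable for a fair coin: heads -> 1, tails -> 0."""
    return 1 if random.random() < 0.5 else 0

random.seed(0)   # fixed seed so the run is reproducible
samples = [coin_toss() for _ in range(10000)]
print(sum(samples) / len(samples))   # close to the expected value 0.5
```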
Kernel methods for predicting
protein-protein interactions
Asa Ben-Hur
Department of Computer Science
Colorado State University
Background: from DNA to Protein
10/21/13
Clustering
Distance based clustering
Chapter 8
Clustering is the art of finding groups in data (Kaufman and
Rousseeuw, 1990).
What is a cluster? A group of objects separated from other clusters.
[Figure: scatter plot of points forming well-separated groups]
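One minimal distance-based rule that matches the definition above (an illustrative sketch; the deck's actual algorithms are not shown in this excerpt): put two points in the same cluster whenever their Euclidean distance is below a threshold, then take the connected components of that relation.

```python
import numpy as np

def threshold_clusters(X, threshold):
    """Union-find over pairs closer than `threshold`; returns a cluster
    label (the component's root index) for each point."""
    n = len(X)
    labels = list(range(n))   # start with every point in its own cluster

    def find(i):
        while labels[i] != i:
            labels[i] = labels[labels[i]]   # path compression
            i = labels[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(X[i] - X[j]) < threshold:
                labels[find(i)] = find(j)   # merge the two clusters
    return [find(i) for i in range(n)]

X = np.array([[0.0, 0.0], [0.5, 0.1], [10.0, 10.0], [10.2, 9.9]])
clusters = threshold_clusters(X, threshold=1.0)
print(clusters)   # first two points share a label, last two share another
```

The threshold makes the notion of "separated from other clusters" explicit: groups merge only when some pair of their points is close.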
10/21/13
Measuring distance
Distance based models
Chapter 8
How to measure closeness?
Distance measures for continuous data:
The Euclidean distance (based on the 2-norm):

    Dis₂(x, y) = ‖x − y‖₂ = √( Σ_{i=1}^d (xᵢ − yᵢ)² ) = √( (x − y)ᵀ(x − y) )
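The three equivalent forms of the definition can be checked against each other numerically (the vectors are made up for illustration):

```python
import numpy as np

def euclidean(x, y):
    """Dis_2(x, y) = sqrt(sum over i of (x_i - y_i)^2)."""
    return np.sqrt(np.sum((x - y) ** 2))

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 6.0, 3.0])
d = x - y

print(euclidean(x, y))        # sqrt(9 + 16 + 0) = 5.0
print(np.sqrt(d @ d))         # same, via (x - y)^T (x - y)
print(np.linalg.norm(x - y))  # same, via the 2-norm
```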
10/10/13
Evaluating classifier performance
Evaluating and using ML classifiers:
model selection
Chapter 2
The simplest evaluation protocol:
- Divide your labeled data into a training set and a test set.
- Train a classifier on the training set.
- Classify the test set.
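The protocol above can be sketched end to end with a toy 1-d classifier (the threshold learner, the Gaussian data, and the 75/25 split are illustrative stand-ins, not from the slides):

```python
import random

def train_threshold(data):
    """Toy 1-d learner: threshold halfway between the two class means."""
    pos = [x for x, y in data if y == 1]
    neg = [x for x, y in data if y == -1]
    t = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > t else -1

# Made-up labeled data: two well-separated 1-d Gaussian classes.
random.seed(1)
data = [(random.gauss(2, 0.5), 1) for _ in range(100)] + \
       [(random.gauss(-2, 0.5), -1) for _ in range(100)]
random.shuffle(data)

train, test = data[:150], data[50:]   # divide into training and test set
train, test = data[:150], data[150:]

classifier = train_threshold(train)                 # train on the training set
accuracy = sum(classifier(x) == y                   # classify the test set
               for x, y in test) / len(test)
print(accuracy)   # near 1.0 on this easy, well-separated data
```

Measuring accuracy on held-out data, rather than on the training set itself, is what makes the estimate an honest measure of generalization.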
8/29/13
Machine learning and related fields
CS545 Machine Learning
Course Introduction
Machine learning: the construction and study of
systems that learn from data.
Pattern recognition: the same field, different
practitioners.
Data mining: using algorithms