Assignments for Linear Classifiers and SVM

Note: Exercises without "*" are required for everyone. You are welcome to do those marked with "*" if you are interested in them; however, they will not be taken into account when giving scores.

1. In the multicategory case, a…
1. Give an example to show that the decision boundary resulting from the nearest-neighbor rule is piecewise linear.

2. (6.5 in the textbook) Consider the set of two-dimensional vectors from two categories:
$x_1 = (1, 0)^T$, $x_2 = (0, 1)^T$, $x_3 = (0, -1)^T$, …
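Exercise 1 can be visualized numerically: with two prototypes the nearest-neighbor boundary is the perpendicular bisector of the segment joining them, and with more prototypes it is a union of such bisector pieces. A minimal sketch (the prototype coordinates are illustrative, not taken from the exercise):

```python
import numpy as np

def nn_label(x, protos, labels):
    """Assign x the label of its nearest prototype (Euclidean distance)."""
    d = np.linalg.norm(protos - x, axis=1)
    return labels[int(np.argmin(d))]

# Two prototypes: the decision boundary is their perpendicular bisector,
# i.e. a single straight line (here x = 1); with more prototypes the
# boundary becomes a union of bisector segments -- piecewise linear.
protos = np.array([[0.0, 0.0], [2.0, 0.0]])
labels = np.array([1, 2])

print(nn_label(np.array([0.4, 0.7]), protos, labels))   # 1 (left of x = 1)
print(nn_label(np.array([1.6, -0.3]), protos, labels))  # 2 (right of x = 1)
```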
March 13, 2010

3.–6. (Statements garbled in extraction.) These exercises concern maximum-likelihood estimation from i.i.d. samples $x_1, x_2, \ldots, x_n$; the recoverable fragments include a Gaussian density $f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2 / (2\sigma^2)}$, a density of the form $f(x) = \frac{1}{2} e^{-|x|}$, and a two-parameter density $f(x; \theta_1, \theta_2)$.
A linear discriminant function has the form
$g(x) = w^T x + w_0$,
where $x = (x_1, x_2, \ldots, x_d)^T$ is the feature vector and $w = (w_1, w_2, \ldots, w_d)^T$ is the weight vector.

For two categories, let $g(x) = g_1(x) - g_2(x)$ and decide:
$g(x) > 0 \Rightarrow x \in \omega_1$; $g(x) < 0 \Rightarrow x \in \omega_2$; $g(x) = 0$: $x$ lies on the decision boundary.

For any $x_1, x_2$ on the separating hyperplane $H$, $w^T x_1 + w_0 = w^T x_2 + w_0$, hence $w^T (x_1 - x_2) = 0$: $w$ is normal to $H$.

Any $x$ can be written as $x = x_p + r \frac{w}{\|w\|}$, where $x_p$ is the projection of $x$ onto $H$ and $r$ is the signed distance from $x$ to $H$. Then $g(x) = w^T \left( x_p + r \frac{w}{\|w\|} \right) + w_0 = r \|w\|$, so $r = g(x) / \|w\|$.
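The distance relation $r = g(x)/\|w\|$ can be checked numerically; a minimal sketch with illustrative values of $w$ and $w_0$:

```python
import numpy as np

def g(x, w, w0):
    """Linear discriminant g(x) = w^T x + w_0."""
    return float(w @ x + w0)

def signed_distance(x, w, w0):
    """Signed distance from x to the hyperplane g(x) = 0: r = g(x) / ||w||."""
    return g(x, w, w0) / np.linalg.norm(w)

w = np.array([3.0, 4.0])   # illustrative weight vector, ||w|| = 5
w0 = -5.0
x = np.array([1.0, 2.0])   # g(x) = 3 + 8 - 5 = 6, so r = 6/5
print(signed_distance(x, w, w0))  # 1.2
```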
The Estimation of Density Functions
In pattern recognition applications we rarely have complete knowledge of the class-conditional densities $p(x \mid \omega_i)$ or the priors $P(\omega_i)$. How can we estimate the density functions from samples (training data)?
Three problems:
How to est…
Minimum-distance classifier with multiple prototypes per class:
$g_i(x) = \min_{l = 1, 2, \ldots, l_i} \| x - m_i^l \|$; decide $x \in \omega_j$ if $g_j(x) = \min_{i = 1, 2, \ldots, c} g_i(x)$.

Nearest-neighbor classifier:
$g_i(x) = \min_{k = 1, 2, \ldots, N_i} \| x - x_k^i \|$; decide $x \in \omega_j$ if $g_j(x) = \min_{i = 1, 2, \ldots, c} g_i(x)$.
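Both rules reduce to "smallest distance wins". A minimal sketch of the multiple-prototype minimum-distance rule (class labels and prototype values are illustrative):

```python
import numpy as np

def nearest_mean_classify(x, prototypes):
    """prototypes: dict mapping class label -> array of prototype vectors.
    g_i(x) = min_l ||x - m_i^l||; assign x to the class with smallest g_i(x)."""
    scores = {c: float(np.min(np.linalg.norm(m - x, axis=1)))
              for c, m in prototypes.items()}
    return min(scores, key=scores.get)

# Class 1 has two prototypes, class 2 has one (all values illustrative).
prototypes = {
    1: np.array([[0.0, 0.0], [0.0, 1.0]]),
    2: np.array([[3.0, 0.0]]),
}
print(nearest_mean_classify(np.array([0.2, 0.8]), prototypes))  # 1
print(nearest_mean_classify(np.array([2.5, 0.1]), prototypes))  # 2
```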
Let $P(\omega_m \mid x) = \max_i P(\omega_i \mid x)$. Deciding $x \in \omega_m$ gives the conditional error
$P(e \mid x) = 1 - P(\omega_m \mid x)$,
and the overall error rate
$P(e) = \int P(e \mid x)\, p(x)\, dx$.
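For example, with three posteriors the rule keeps the largest, and the conditional error is one minus it; a one-line numerical check (the posterior values are illustrative):

```python
# Illustrative posteriors P(w_1|x), P(w_2|x), P(w_3|x) for a single x.
posteriors = [0.7, 0.2, 0.1]
p_correct = max(posteriors)         # P(w_m | x): probability of deciding correctly
p_error_given_x = 1 - p_correct     # P(e | x) = 1 - P(w_m | x)
print(p_error_given_x)              # about 0.3
```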
K-L (Karhunen-Loève) transform. Let $x = [x(1), \ldots, x(n)]^T$ and let $u_i$, $i = 1, 2, \ldots, n$ be an orthonormal basis:
$u_i^T u_j = 1$ if $i = j$, and $0$ if $i \neq j$.
Expand $x = \sum_{i=1}^{n} c_i u_i$. Keeping only the first $d$ terms gives the approximation $\hat{x} = \sum_{i=1}^{d} c_i u_i$, with mean-squared error
$\varepsilon = E[(x - \hat{x})^T (x - \hat{x})] = E\Big[\Big(\sum_{i=d+1}^{n} c_i u_i\Big)^T \Big(\sum_{j=d+1}^{n} c_j u_j\Big)\Big] = E\Big[\sum_{i=d+1}^{n} c_i^2\Big]$.
The coefficients are obtained by projection: $u_i^T x = u_i^T \sum_k c_k u_k = c_i$, i.e. $c_i = u_i^T x$.
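In practice the basis $u_i$ minimizing the truncation error consists of eigenvectors of the sample covariance matrix (the PCA view of the K-L transform). A minimal sketch on random data, checking only the expansion identities above (the data, the dimension $n = 3$, and the cutoff $d = 2$ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))          # 200 samples, n = 3 (illustrative)

# Orthonormal basis u_i: eigenvectors of the sample covariance matrix.
cov = np.cov(X, rowvar=False)
eigvals, U = np.linalg.eigh(cov)       # columns of U are orthonormal
order = np.argsort(eigvals)[::-1]      # sort by decreasing eigenvalue
U = U[:, order]

x = X[0]
c = U.T @ x                            # coefficients c_i = u_i^T x
x_hat = U[:, :2] @ c[:2]               # keep d = 2 terms
err = x - x_hat                        # lies along the discarded u_3

assert np.allclose(U @ c, x)           # full expansion reconstructs x exactly
print(np.linalg.norm(err))             # truncation error for this sample
```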
Bayes Decision Theory

Bayes formula (two classes):
$P(\omega_i \mid x) = \dfrac{p(x \mid \omega_i) P(\omega_i)}{\sum_{j=1}^{2} p(x \mid \omega_j) P(\omega_j)}$,

which follows from the product rule $P(B \mid A) P(A) = P(B, A)$.
Decision rule: if $P(\omega_1 \mid x) > P(\omega_2 \mid x)$, decide $x \in \omega_1$; if $P(\omega_1 \mid x) < P(\omega_2 \mid x)$, decide $x \in \omega_2$. Equivalently:
(1) $P(\omega_i \mid x) = \max_j P(\omega_j \mid x) \Rightarrow x \in \omega_i$;
(2) $p(x \mid \omega_i) P(\omega_i) = \max_j p(x \mid \omega_j) P(\omega_j) \Rightarrow x \in \omega_i$.
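Because the denominator of Bayes formula is common to all classes, form (2) compares $p(x \mid \omega_i) P(\omega_i)$ directly. A minimal two-class sketch with illustrative univariate Gaussian likelihoods and equal priors:

```python
import math

def gauss(x, mu, sigma):
    """Univariate Gaussian density N(mu, sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def bayes_decide(x, priors, params):
    """Pick the class maximizing p(x|w_i) P(w_i); priors and params are illustrative."""
    scores = {c: gauss(x, mu, s) * priors[c] for c, (mu, s) in params.items()}
    return max(scores, key=scores.get)

priors = {1: 0.5, 2: 0.5}
params = {1: (0.0, 1.0), 2: (3.0, 1.0)}  # class-conditional (mean, std)

# Equal priors and variances put the boundary midway, at x = 1.5.
print(bayes_decide(0.5, priors, params))  # 1
print(bayes_decide(2.9, priors, params))  # 2
```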
3.1 The rule compares $p(x \mid \omega_i) P(\omega_i)$ across classes. When the class-conditional densities are unknown, they can be estimated nonparametrically, e.g. by Parzen windows or the $k_N$-nearest-neighbor method.
Support Vector Machine
Outline
Linearly separable patterns
Linearly non-separable patterns
Nonlinear case
Some examples
Linearly separable case
Optimal separating hyperplane
Optimal Hyperplane
Linear classification
Training sample set $T = \{(x_i, y_i),\ i = 1, \ldots, N\}$
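The slides develop the optimal (maximum-margin) hyperplane. As a rough numerical companion, subgradient descent on the soft-margin hinge loss $\frac{1}{2}\|w\|^2 + C \sum_i \max(0,\, 1 - y_i(w^T x_i + b))$ also finds a separating hyperplane on a tiny linearly separable set. This is a sketch, not the dual QP formulation of the slides; the data, $C$, step size, and iteration count are all illustrative:

```python
import numpy as np

# Tiny linearly separable training set T = {(x_i, y_i)}, y_i in {-1, +1}
# (all values illustrative).
X = np.array([[2.0, 2.0], [2.5, 1.5], [-2.0, -1.0], [-1.5, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = np.zeros(2)
b = 0.0
C, lr = 1.0, 0.05   # illustrative regularization constant and step size

for _ in range(2000):
    # Subgradient of 0.5*||w||^2 + C * sum(max(0, 1 - y_i (w.x_i + b)))
    margins = y * (X @ w + b)
    viol = margins < 1                      # points violating the margin
    grad_w = w - C * (y[viol][:, None] * X[viol]).sum(axis=0)
    grad_b = -C * y[viol].sum()
    w -= lr * grad_w
    b -= lr * grad_b

pred = np.sign(X @ w + b)   # should reproduce the training labels
print(pred)
```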
Gaussian Mixture Model and EM (Expectation Maximization) Algorithm
Changshui Zhang Dept. of Automation Tsinghua University [email protected]
Reference

Jeff A. Bilmes, "A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models."
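The EM iteration for a Gaussian mixture alternates an E-step (posterior responsibilities of each component for each point) and an M-step (responsibility-weighted ML re-estimates of weights, means, and variances). A minimal one-dimensional, two-component sketch on synthetic data (the true means 0 and 5, the sample sizes, and the initialization are all illustrative):

```python
import math
import random

random.seed(0)
# Synthetic 1-D data: two Gaussian clusters (true means 0 and 5, unit variance).
data = [random.gauss(0, 1) for _ in range(200)] + [random.gauss(5, 1) for _ in range(200)]

def pdf(x, mu, var):
    """Univariate Gaussian density N(mu, var)."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

pi = [0.5, 0.5]     # mixing weights (illustrative initialization)
mu = [-1.0, 6.0]    # component means
var = [1.0, 1.0]    # component variances

for _ in range(50):
    # E-step: responsibility r_k(x) = pi_k N(x; mu_k, var_k) / sum_j pi_j N(x; mu_j, var_j)
    resp = []
    for x in data:
        p = [pi[k] * pdf(x, mu[k], var[k]) for k in range(2)]
        s = sum(p)
        resp.append([pk / s for pk in p])
    # M-step: responsibility-weighted ML re-estimates.
    for k in range(2):
        nk = sum(r[k] for r in resp)
        pi[k] = nk / len(data)
        mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
        var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk

print(sorted(round(m, 1) for m in mu))   # means should land near 0 and 5
```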
1. (3.10 in the textbook) Suppose $X = \{x_1, x_2, \ldots, x_N\}$ is sampled i.i.d. from $N(\mu, \sigma^2)$: tell whether the maximum-likelihood estimators of $\mu$ and $\sigma^2$ are biased.

2. (3.3 in the textbook) Suppose $X = \{x_1, x_2, \ldots, x_N\}$ is sampled i.i.d. from:
f (x, …
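For exercise 1, the ML variance estimator $\hat{\sigma}^2 = \frac{1}{N} \sum_i (x_i - \bar{x})^2$ has expectation $\frac{N-1}{N} \sigma^2$, so it is biased (the ML mean estimator is unbiased). A quick Monte-Carlo check of the variance bias (sample size, trial count, and seed are illustrative):

```python
import random

random.seed(1)
N, trials = 5, 20000      # illustrative sample size and trial count
# Data drawn from N(0, 1), so the true variance is 1.0.

acc = 0.0
for _ in range(trials):
    xs = [random.gauss(0, 1) for _ in range(N)]
    mean = sum(xs) / N
    acc += sum((x - mean) ** 2 for x in xs) / N   # ML estimator: divide by N
avg = acc / trials        # Monte-Carlo estimate of E[sigma_hat^2]

# E[sigma_hat^2] = (N-1)/N * sigma^2 = 0.8 here, not 1.0: biased.
print(avg)
```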
THU-70250043, Pattern Recognition (Spring 2015)
Homework: 1
Bayesian Methods
Lecturer: Changshui Zhang
Student: XXX
[email protected][email protected]
Probability
1. Conditional probability
1.1. Prove that $P(A, B, C) = P(A \mid B, C)\, P(B \mid C)\, P(C)$.
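The identity is two applications of the definition of conditional probability; a sketch of the expansion:

```latex
\begin{align*}
P(A, B, C) &= P(A \mid B, C)\, P(B, C)
  && \text{definition of conditional probability} \\
           &= P(A \mid B, C)\, P(B \mid C)\, P(C)
  && \text{apply it again to } P(B, C)
\end{align*}
```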