Homework 4
Problem 1. By the Laplace approximation, the model evidence is

p(D|M) = \int p(D|w) p(w) \, dw \approx p(D|w_{MAP}) \frac{\sigma_{posterior}}{\sigma_{prior}}.

By

p(t|x, w) = \mathcal{N}(t \mid w^T \tilde{\phi}(x), \beta^{-1}),

we have

p(t_i|x_i, w) = \mathcal{N}(t_i \mid w^T \tilde{\phi}(x_i), \beta^{-1}).

Then,

p(\mathbf{t}|w) = \prod_{i=1}^{N} p(t_i|x_i, w) = \prod_{i=1}^{N} \mathcal{N}(t_i \mid w^T \tilde{\phi}(x_i), \beta^{-1}) = \mathcal{N}(\mathbf{t} \mid \tilde{\Phi} w, \beta^{-1} I)

where \tilde{\Phi} = [\tilde{\phi}(x_1), \ldots, \tilde{\phi}(x_N)]^T is the design matrix.
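The evidence approximation above can be sanity-checked numerically on a tiny one-parameter model. The sketch below (Python rather than the course's MATLAB; all data and hyperparameters are made-up illustrations) compares brute-force quadrature of \int p(D|w)p(w)dw against the Laplace formula; for a Gaussian prior and likelihood the two agree exactly.

```python
import numpy as np

# Hypothetical one-parameter model: t = w*x + Gaussian noise,
# prior w ~ N(0, alpha^{-1}); alpha, beta, and the data are made up.
rng = np.random.default_rng(0)
x = rng.normal(size=20)
t = 0.5 * x + rng.normal(size=20)
alpha, beta = 1.0, 1.0      # prior precision and noise precision

def log_lik(w):
    # log p(D|w) under Gaussian noise with precision beta
    return np.sum(-0.5 * beta * (t - w * x) ** 2 + 0.5 * np.log(beta / (2 * np.pi)))

def log_prior(w):
    # log p(w) for w ~ N(0, alpha^{-1})
    return -0.5 * alpha * w ** 2 + 0.5 * np.log(alpha / (2 * np.pi))

# "exact" evidence p(D|M) = int p(D|w) p(w) dw by brute-force quadrature
ws = np.linspace(-5.0, 5.0, 20001)
dw = ws[1] - ws[0]
evidence = np.sum(np.exp([log_lik(w) + log_prior(w) for w in ws])) * dw

# Laplace approximation: p(D|M) ~= p(D|w_MAP) p(w_MAP) sqrt(2*pi) sigma_post,
# which for a Gaussian prior is the sigma_posterior / sigma_prior rule
prec_post = alpha + beta * np.sum(x ** 2)     # posterior precision
w_map = beta * np.sum(x * t) / prec_post
laplace = np.exp(log_lik(w_map) + log_prior(w_map)) * np.sqrt(2 * np.pi / prec_post)
```

Because this toy model is linear-Gaussian, `evidence` and `laplace` match up to quadrature error.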
Homework 2
Problem 1. Since

t = y(x, w) + \varepsilon

and

p(\varepsilon|\beta) = \frac{\beta}{2} \exp(-\beta |\varepsilon|),

then we have

p(t|x, w, \beta) = \frac{\beta}{2} \exp(-\beta |t - y(x, w)|).

Given observed inputs, X = \{x_1, \ldots, x_N\}, and targets, \mathbf{t} = [t_1, \ldots, t_N]^T, we obtain
the likelihood function

p(\mathbf{t}|X, w, \beta) = \prod_{n=1}^{N} \frac{\beta}{2} \exp(-\beta |t_n - w^T \phi(x_n)|).
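Taking the negative log of this Laplace-noise likelihood shows that maximum likelihood reduces to least absolute deviations. A minimal Python sketch (the helper name `neg_log_lik` is mine, not from the homework):

```python
import numpy as np

def neg_log_lik(w, Phi, t, beta=1.0):
    # Negative log-likelihood under Laplace noise:
    # -sum_n log[(beta/2) * exp(-beta * |t_n - w^T phi(x_n)|)]
    # = -N*log(beta/2) + beta * sum_n |t_n - w^T phi(x_n)|
    resid = t - Phi @ w
    return -len(t) * np.log(beta / 2.0) + beta * np.sum(np.abs(resid))
```

Minimizing this over w is equivalent to minimizing the sum of absolute residuals, which is more robust to outliers than the squared-error loss of the Gaussian noise model.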
Homework 1
Problem 1. Let D be the event of having the disease, N be the event of not
having the disease, + be the event that the test result is positive, - be the event that
the test result is negative, and P be the probability of each event.
From the problem w
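With these events, the posterior P(D|+) follows from Bayes rule. A Python sketch with made-up numbers (the actual values come from the problem statement, which is truncated here):

```python
# Hypothetical numbers -- the real ones are in the problem statement:
p_D = 0.01            # P(D): prior probability of having the disease
p_pos_given_D = 0.95  # P(+|D): probability of a positive test given disease
p_pos_given_N = 0.05  # P(+|N): probability of a positive test given no disease

# Bayes rule: P(D|+) = P(+|D)P(D) / [P(+|D)P(D) + P(+|N)P(N)]
p_pos = p_pos_given_D * p_D + p_pos_given_N * (1 - p_D)
p_D_given_pos = p_pos_given_D * p_D / p_pos
```

Even with a sensitive test, a rare disease keeps P(D|+) far below P(+|D), which is the usual point of this exercise.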
Homework 6
Due: 03/03/2015 (before class)
February 23, 2015
Problem 1 (20 pt) Regularized Logistic Regression
Let D = \{(x_1, y_1), \ldots, (x_n, y_n)\} be the training examples, where x_i \in R^d and y_i \in \{-1, +1\}. The negative
log-likelihood function of the re
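For y_i in {-1, +1}, the regularized negative log-likelihood is sum_i log(1 + exp(-y_i w^T x_i)) + (lambda/2)||w||^2. A minimal Python sketch of the objective and its gradient (function names are mine):

```python
import numpy as np

def reg_nll(w, X, y, lam):
    # sum_i log(1 + exp(-y_i * w^T x_i)) + (lam/2) * ||w||^2
    margins = y * (X @ w)
    return np.sum(np.logaddexp(0.0, -margins)) + 0.5 * lam * (w @ w)

def reg_nll_grad(w, X, y, lam):
    # gradient: -sum_i y_i * x_i * sigmoid(-y_i w^T x_i) + lam * w
    margins = y * (X @ w)
    s = 1.0 / (1.0 + np.exp(margins))   # sigmoid of the negated margin
    return -(X.T @ (y * s)) + lam * w
```

`np.logaddexp(0, -m)` computes log(1 + exp(-m)) without overflow for large negative margins, which matters in practice.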
CSE847: Midterm Exam
February 27, 2015
In all the problems of this exam, we denote by D = {(x_1, t_1), (x_2, t_2), . . . , (x_N, t_N)} the training examples,
where x_i = (x_{i,1}, . . . , x_{i,d}) \in R^d is a vector of d dimensions. t_i is a continuous variable when it
Homework 3
Problem 1. Using the lasso regularization, the weights w for each \lambda are given in
Figure 1 and the test error is given in Figure 2.
Figure 1: Weights for each \lambda using lasso regularization
From Figure 1, we can see that as \lambda increases, the weights becom
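The weight paths behind such a figure can be reproduced with coordinate descent for the lasso objective (1/2)||y - Xw||^2 + lambda*||w||_1. A Python sketch, not the course's provided code, assuming no all-zero feature columns:

```python
import numpy as np

def soft_threshold(z, g):
    # shrink z toward zero by g; this is what creates exact zeros in w
    return np.sign(z) * max(abs(z) - g, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # coordinate descent for (1/2)||y - Xw||^2 + lam*||w||_1
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)   # assumes every column is nonzero
    for _ in range(n_iter):
        for j in range(d):
            # partial residual excluding feature j's current contribution
            r = y - X @ w + X[:, j] * w[j]
            z = X[:, j] @ r
            w[j] = soft_threshold(z, lam) / col_sq[j]
    return w
```

Sweeping `lam` over a grid and stacking the returned `w` vectors gives the regularization path: larger lambda drives more weights exactly to zero.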
Homework 3
Due: 02/10/2015 (before class)
February 1, 2015
Problem 1 (20pt): Experiment with Lasso Regularization
Data set A data set is provided in the file diabetes.mat that can be downloaded from http://www.cse.
msu.edu/~cse847/assignments/diabetes.mat.
Online Learning
Rong Jin
Batch Learning
Given a collection of training examples D
Learning a classification model from D
What if training examples are received one
at a time?
Online Learning
For t = 1, 2, . . . , T
Receive an instance
Predict its class label
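One classic instance of this receive-predict-update loop is the perceptron (the slide does not name a specific algorithm; this is a minimal sketch of the protocol):

```python
import numpy as np

def perceptron_online(stream, d, lr=1.0):
    # Online learning: for t = 1..T receive x_t, predict, see the label, update.
    w = np.zeros(d)
    mistakes = 0
    for x_t, y_t in stream:                   # y_t in {-1, +1}
        y_hat = 1 if w @ x_t >= 0 else -1     # predict the class label
        if y_hat != y_t:                      # update only on a mistake
            w += lr * y_t * x_t
            mistakes += 1
    return w, mistakes
```

The model is updated one example at a time and never revisits past data, which is exactly what distinguishes this setting from batch learning over a full collection D.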
Machine Learning
Spring 2013
Rong Jin
CSE847 Machine Learning
Instructor: Rong Jin
Office Hour:
Textbook
Tuesday 4:00pm-5:00pm
TA, Qiaozi Gao, Thursday 4:00pm-5:00pm
Machine Learning
The Elements of Statistical Learning
Pattern Recognition and Machine L
Midterm Exam
03/04, in class
Project
It is a team project
No more than 2 people for each team
Define a project of your own
Otherwise, I will assign you to a tough project
Important date
03/23: project proposal
04/27 and 04/29: presentation
05/02: f
Overview of Clustering
Rong Jin
Outline
K means for clustering
Expectation Maximization algorithm for clustering
Spectral clustering (if time permits)
Clustering
Find out the underlying structure for given data
points
[Figure: scatter plot of data points with axes $ and age]
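The structure-finding idea can be made concrete with k-means, the first algorithm on the outline. A minimal Python sketch (the slides give no code; this is the textbook alternation of assignment and update steps):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    # Plain k-means: alternate nearest-center assignment and mean update.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # recompute each center as the mean of its assigned points
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels
```

Each iteration can only decrease the within-cluster squared distance, so the loop converges (to a local optimum that depends on the initialization).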
Application (I): Search Re
Homework 7
Due 03/31/2015 (before class)
March 23, 2015
Problem 1 (20pt): Train and Test Support Vector Machine
Download the SVM software from the website http://svmlight.joachims.org/. Read the documentation
and the example that is provided in the webpage
Data Classification
Rong Jin
Classification Problems
Given input:
Predict the output (class label)
Binary classification:
Multi-class classification:
Learn a classification function:
Regression:
Examples of Classification Problem
Text categorization
Homework 5
Problem 1. The classification accuracy over the test documents that I got is
0.8068 when \lambda = 0.1. Here is the code:
function [accuracy] = hw5_1()
train = dlmread('data/train.data');
trainLabel = dlmread('data/train.label');
test = dlmread('data/test.d
Homework 9
Due: April 23, 2015 (before class)
April 16, 2015
Problem 1: Hidden Markov Model (20pt)
We denote by \lambda = \langle N, M, \pi, a, b \rangle the Hidden Markov Model, where
N : the number of states
M : the number of possible observations (or tokens)
\pi = (\pi_1, \pi_2, . . .
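One standard computation on an HMM \lambda = \langle N, M, \pi, a, b \rangle is the observation likelihood via the forward algorithm. A Python sketch (the problem text is truncated here, so this illustrates the model's use rather than the specific question):

```python
import numpy as np

def forward(pi, a, b, obs):
    # Forward algorithm: p(o_1..o_T | lambda) for lambda = <N, M, pi, a, b>.
    #   pi:  (N,)   initial state distribution
    #   a:   (N, N) transition probabilities, a[i, j] = p(state j | state i)
    #   b:   (N, M) emission probabilities, b[i, o] = p(obs o | state i)
    #   obs: sequence of observation indices in {0, ..., M-1}
    alpha = pi * b[:, obs[0]]          # alpha_1(i) = pi_i * b_i(o_1)
    for o in obs[1:]:
        alpha = (alpha @ a) * b[:, o]  # recursion over time steps
    return alpha.sum()                 # marginalize over the final state
```

The recursion costs O(T N^2), versus the exponential cost of summing over all state paths directly.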
Bayesian Learning
Rong Jin
Outline
MAP learning vs. ML learning
Minimum description length principle
Bayes optimal classifier
Bagging
Maximum Likelihood Learning (ML)
Find the best model by maximizing the log-likelihood of the training data
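For a Gaussian model, maximizing the log-likelihood has a closed form: setting the gradient to zero yields the sample mean and the (biased) sample variance. A tiny Python illustration with made-up data:

```python
import numpy as np

# ML for a Gaussian: maximize sum_i log N(x_i | mu, sigma^2) over mu, sigma^2.
x = np.array([2.0, 4.0, 6.0])        # hypothetical observations
mu_ml = x.mean()                     # d/dmu = 0  ->  sample mean
var_ml = ((x - mu_ml) ** 2).mean()   # d/dsigma^2 = 0 -> ML (biased) variance
```

Note the ML variance divides by N, not N-1; the ML estimator trades a small bias for maximizing the training-data likelihood.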
Expectation Maximization Algorithm
Rong Jin
A Mixture Model Problem
[Figure: histogram of a one-dimensional dataset over the range 0-25, showing two clear modes]
Apparently, the dataset consists of two modes
How can we automatically identify the two modes?
Gaussian Mixture Model (GMM)
Assume that the dat
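The two modes can be identified automatically with EM on a two-component 1-D Gaussian mixture. A minimal Python sketch (the slide text is cut off; initialization and data below are my own illustrative choices):

```python
import numpy as np

def em_gmm_1d(x, n_iter=100):
    # EM for a two-component 1-D Gaussian mixture (a minimal sketch).
    mu = np.array([x.min(), x.max()], dtype=float)   # crude initialization
    var = np.array([x.var(), x.var()])
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted updates
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

# usage on synthetic bimodal data (made-up means 0 and 10)
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(10.0, 1.0, 200)])
w, mu, var = em_gmm_1d(x)
```

Each EM iteration is guaranteed not to decrease the data log-likelihood, which is why the alternation converges to a (local) maximum.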
Introduction to Probability
Theory
Rong Jin
Outline
Basic concepts in probability theory
Bayes rule
Random variable and distributions
Definition of Probability
Experiment: toss a coin twice
Sample space: possible outcomes of an experiment
Event: a subset
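The coin-toss definitions can be made concrete by enumerating the sample space and measuring an event as a subset of it:

```python
from itertools import product

# Experiment: toss a coin twice.
sample_space = list(product("HT", repeat=2))   # all outcomes: HH, HT, TH, TT
# Event: "at least one head" is a subset of the sample space.
event = [o for o in sample_space if "H" in o]
p_event = len(event) / len(sample_space)       # counting measure for a fair coin
```

The ratio-of-counts probability only applies because the four outcomes are equally likely; a biased coin would require weighting each outcome.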
Boosting
Rong Jin
Inefficiency with Bagging
Bagging
D
Inefficient bootstrap sampling:
Every example has equal chance to be
sampled
No distinction between easy
examples and difficult examples
Bootstrap Sampling
D1
D2
Dk
Inefficient model combination:
A co
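Boosting fixes the equal-chance sampling problem by reweighting examples: hard examples gain weight, easy ones lose it. A sketch of one AdaBoost-style update (the slide is cut off, so the exact scheme it presents is an assumption on my part):

```python
import numpy as np

def adaboost_reweight(weights, errors):
    # One AdaBoost-style round: up-weight examples the current learner gets
    # wrong (errors[i] = True), down-weight the ones it gets right.
    #   weights: current distribution over examples
    #   errors:  boolean array of per-example mistakes
    eps = weights[errors].sum()                 # weighted training error
    alpha = 0.5 * np.log((1 - eps) / eps)       # this learner's vote weight
    new_w = weights * np.exp(np.where(errors, alpha, -alpha))
    return new_w / new_w.sum(), alpha
```

A known consequence of this update: after renormalization, the misclassified examples collectively hold exactly half the weight, forcing the next learner to focus on them.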
Logistic Regression
Rong Jin
Logistic Regression
Generative models often lead to a linear
decision boundary
Linear discriminatory model
Directly model the linear decision boundary
w is the parameter to be decided
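The discriminative model in question maps the linear score w^T x through the logistic sigmoid:

```python
import numpy as np

def predict_proba(w, x):
    # Logistic model: p(y = +1 | x) = 1 / (1 + exp(-w^T x)).
    # The decision boundary w^T x = 0 is linear in x.
    return 1.0 / (1.0 + np.exp(-(w @ x)))
```

Points exactly on the boundary get probability 0.5, and the probability saturates toward 0 or 1 as the margin |w^T x| grows.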
Homework II
Due: 01/29/2015 (before class)
January 22, 2015
Problem 1 (10pt): Noise Model
In the class, we assume the following data generative model
t = y(x, w) + \varepsilon

where \varepsilon \sim \mathcal{N}(\varepsilon | 0, \beta^{-1}) follows a Gaussian
distribution, i.e., with zero mean and variance \beta^{-1}. We now modify the data generative model by assuming t
Information Filtering
Rong Jin
Outline
Brief introduction to information filtering
Collaborative filtering
Adaptive filtering
Short vs. Long Term Info. Need
Short-term information need (Ad hoc retrieval)
Temporary need, e.g., info about used cars
Info
Semi-supervised Learning
Rong Jin
Spectrum of Learning Problems
What is Semi-supervised Learning
Learning from a mixture of labeled and unlabeled examples
Labeled Data
Unlabeled Data
L = \{(x_1, y_1), \ldots, (x_{n_l}, y_{n_l})\}
Total number of examples:
N =
Homework 4
Due: 02/17/2015 (before class)
February 9, 2015
Problem 1 (20pt): Bayesian model selection
In this homework, you are asked to compute the result of Bayesian model selection for linear regression model.
Let M be a family of linear regression mod
Homework 8
Due: April 9, 2015 (before class)
March 31, 2015
Problem 1 (15pt) Hedge Algorithm
In class, we discussed the Hedge algorithm, which learns positive weights to combine the predictions from multiple
experts/classifiers. In this problem, you are a
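The Hedge update named above is multiplicative: each expert's weight is scaled down by beta raised to its loss in that round. A minimal Python sketch (parameter beta and the loss range [0, 1] follow the standard formulation; the problem's specifics are truncated here):

```python
import numpy as np

def hedge(losses, beta=0.5):
    # Hedge: maintain positive weights over experts; after each round,
    # multiply each expert's weight by beta**loss, losses in [0, 1].
    n_rounds, n_experts = losses.shape
    w = np.ones(n_experts)
    for t in range(n_rounds):
        p = w / w.sum()            # distribution used to combine experts in round t
        w = w * beta ** losses[t]  # exponential down-weighting of lossy experts
    return w / w.sum()
```

Because weights decay exponentially in cumulative loss, the combined predictor's regret against the single best expert grows only logarithmically in the number of experts.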