CSC2515 Machine Learning
Sam Roweis

Applications
Machine Learning is most useful when the structure of the task is not well understood but can be characterized by a dataset with strong statistical regularity. ML is also useful in adaptive or dynamic situations.
Lecture 11: Overfitting and Capacity Control
November 21, 2006

Generalization, Overfitting, Underfitting

Typical Behaviour
[Figure: training set and test set error versus model complexity. Training error decreases steadily as complexity grows, while test error is U-shaped: low-complexity models have high bias and low variance; high-complexity models have low bias and high variance.]
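A minimal sketch (not from the lecture) of this behaviour: fit polynomials of increasing degree to noisy data and compare training and test error. All data and degrees here are illustrative choices.

```python
# Sketch: illustrating underfitting vs. overfitting by fitting
# polynomials of increasing degree to noisy samples of sin(3x).
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    x = rng.uniform(-1, 1, n)
    return x, np.sin(3 * x) + rng.normal(0, 0.2, n)  # true function + noise

x_train, y_train = make_data(30)
x_test, y_test = make_data(200)

results = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)      # least-squares fit
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    results[degree] = (train_mse, test_mse)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Because the polynomial families are nested, the training error can only go down as the degree grows; the test error need not.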
Dimensionality Reduction vs. Clustering
Training such "factor models" is called dimensionality reduction (examples: Factor Analysis, Principal/Independent Components). You can think of this as (non)linear regression.
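As a concrete instance of linear dimensionality reduction, here is a minimal PCA sketch (my example, not the notes'): center the data and project onto the top-k right singular vectors.

```python
# Sketch: PCA via the SVD, projecting data to k-dimensional "factor" codes
# and reconstructing from them.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))  # correlated features

def pca(X, k):
    Xc = X - X.mean(axis=0)               # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T                     # k-dimensional codes (the "factors")
    X_hat = Z @ Vt[:k] + X.mean(axis=0)   # reconstruction from the codes
    return Z, X_hat

Z, X_hat = pca(X, k=2)
print(Z.shape)  # (100, 2)
```

With k equal to the full dimensionality the reconstruction is exact; smaller k trades reconstruction error for a more compact representation.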
Feature Spaces
The extended representation is called a feature space. An algorithm that is linear in the feature space may be highly nonlinear in the original space if the features contain nonlinear mappings of the raw inputs.
Lecture 10: Markov and Hidden Markov Models

Markov Models
Use the past as state. The next output depends on the previous output(s):
    yt = f[yt-1, yt-2, ...]
The order is the number of previous outputs used.
[Figure: directed chain y0 → y1 → y2 → y3 → y4 → y5 → ... → yk.]
Add noise to
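A first-order Markov model can be sampled directly from its transition table. The two-state "weather" chain below is a hypothetical example of mine, not from the lecture.

```python
# Sketch: sampling from a first-order Markov chain, where each y_t
# depends only on y_{t-1} through a transition matrix.
import numpy as np

rng = np.random.default_rng(3)
states = ["sunny", "rainy"]                    # assumed toy state space
T = np.array([[0.9, 0.1],                      # P(next | current = sunny)
              [0.5, 0.5]])                     # P(next | current = rainy)

def sample_chain(n, start=0):
    ys = [start]
    for _ in range(n - 1):
        ys.append(rng.choice(2, p=T[ys[-1]]))  # next state given previous only
    return ys

chain = sample_chain(1000)
print(states[chain[0]], "...", states[chain[-1]])
```

Higher-order models are handled the same way by making the "state" a tuple of the last few outputs.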
Lecture 12: No Free Lunch
David Wolpert and others have proven a series of theorems, known as the no free lunch theorems, which, roughly speaking, say that unless you make some assumptions about the nature of the function being learned, no learning algorithm can be expected to outperform any other when averaged over all possible problems.
Partially Unobserved Variables
Certain variables q in our models may be unobserved, either at training time, at test time, or both. If they are occasionally unobserved they are missing data, e.g. undefined inputs, …
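A minimal sketch (my example) of the simplest treatment of occasionally-missing data: compute statistics over only the observed entries, here using NaN to mark a missing value.

```python
# Sketch: per-feature means over observed values, then mean imputation.
import numpy as np

X = np.array([[1.0, 2.0],
              [np.nan, 4.0],    # first input undefined for this case
              [3.0, np.nan]])

col_means = np.nanmean(X, axis=0)               # ignore missing entries
X_filled = np.where(np.isnan(X), col_means, X)  # simple mean imputation
print(col_means)  # [2. 3.]
```

More principled treatments (e.g. EM) integrate over the missing values under the model rather than plugging in a single guess.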
Lecture 3: Classification II
September 26, 2006

Probabilistic Classification: Bayes Classifiers
Generative model: p(x, y) = p(y)p(x|y). The p(y) are called class priors; the p(x|y) are called class-conditional feature distributions.
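A minimal generative Bayes classifier, sketched with Gaussian class-conditionals p(x|y) and empirical class priors p(y) (the one-dimensional toy data is my assumption):

```python
# Sketch: fit p(y) and p(x|y) by maximum likelihood, classify by
# picking the class with the largest p(y) p(x|y).
import numpy as np

rng = np.random.default_rng(4)
x0 = rng.normal(-2, 1, 100)   # class 0 samples
x1 = rng.normal(+2, 1, 100)   # class 1 samples

def gauss_logpdf(x, mu, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

params = [(x0.mean(), x0.var()), (x1.mean(), x1.var())]       # p(x|y)
log_prior = np.log([len(x0), len(x1)]) - np.log(len(x0) + len(x1))  # p(y)

def predict(x):
    scores = [log_prior[c] + gauss_logpdf(x, *params[c]) for c in (0, 1)]
    return int(np.argmax(scores))

print(predict(-1.5), predict(1.5))  # → 0 1
```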
Lecture 5: Artificial Neural Networks
We saw neural nets for classification. The same idea works for regression. ANNs are just adaptive basis regression machines of the form:
    yk = Σj wkj hj
where hj = σ(bj · x) are known as the hidden unit activations.
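The forward pass of such a machine is just two matrix products with a nonlinearity in between. The sizes below (3 inputs, 4 hidden units, 2 outputs) are arbitrary choices for illustration.

```python
# Sketch: forward pass of a one-hidden-layer network, i.e. adaptive basis
# regression y_k = sum_j w_kj * h_j with h_j = sigmoid(b_j . x).
import numpy as np

rng = np.random.default_rng(5)
B = rng.normal(size=(4, 3))   # hidden-unit weights b_j (4 units, 3 inputs)
W = rng.normal(size=(2, 4))   # output weights w_kj (2 outputs)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x):
    h = sigmoid(B @ x)        # basis activations h_j = sigma(b_j . x)
    return W @ h              # outputs y_k = sum_j w_kj h_j

y = forward(rng.normal(size=3))
print(y.shape)  # (2,)
```

The "adaptive" part is that the basis weights B are learned jointly with W, unlike fixed-basis regression.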
Review: Maximum Likelihood
Maximum likelihood asks the question: for which setting of the parameters is the data we saw most likely? To answer this, it assumes that the training data are iid, computes the likelihood of the dataset as a product of per-example probabilities, and maximizes it (usually via its logarithm) with respect to the parameters.
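For a 1-D Gaussian this maximization has a closed form: the sample mean and (1/N) sample variance. A small numerical check, with data generated from assumed parameters:

```python
# Sketch: ML estimates for a Gaussian, and the iid log-likelihood they maximize.
import numpy as np

rng = np.random.default_rng(6)
data = rng.normal(loc=3.0, scale=2.0, size=10_000)

mu_hat = data.mean()
var_hat = data.var()          # note: ML uses 1/N, not the unbiased 1/(N-1)

def log_likelihood(mu, var):
    # iid assumption: sum of per-example log-probabilities
    return np.sum(-0.5 * (np.log(2 * np.pi * var) + (data - mu) ** 2 / var))

# The ML estimate scores at least as well as a perturbed setting:
print(log_likelihood(mu_hat, var_hat) >= log_likelihood(mu_hat + 0.5, var_hat))
```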
Error Function is Crucial (e.g. the Constant Model)
The constant model says y = a, independent of x. (What is the constant model in classification?) Q: What should we use for a? The mean? The median? The mode (for quantized data)?
Lecture 7: Clustering and Tree Models

Three Unsupervised Models
The three canonical problems in unsupervised learning are clustering, dimensionality reduction, and density modeling. Clustering: grouping similar training examples …
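As a sketch of the clustering problem, here is a bare-bones k-means (my implementation, not the notes'): alternate between assigning points to the nearest center and re-estimating each center as the mean of its points.

```python
# Sketch: k-means clustering on two well-separated toy blobs.
import numpy as np

rng = np.random.default_rng(7)
X = np.vstack([rng.normal(0, 0.5, (50, 2)),   # blob A
               rng.normal(5, 0.5, (50, 2))])  # blob B

def kmeans(X, k, iters=20):
    centers = X[rng.choice(len(X), k, replace=False)]  # init at data points
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        assign = d.argmin(axis=1)                      # nearest center
        centers = np.array([X[assign == j].mean(axis=0) if np.any(assign == j)
                            else centers[j]            # keep empty clusters put
                            for j in range(k)])
    return assign, centers

assign, centers = kmeans(X, k=2)
print(np.bincount(assign))
```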
Homework 2
Special Topics in Advanced Machine Learning
Spring 2017
Instructor: Anna Choromanska
Homework is due 02/28/2017.
Problem 1 (20 points): Perceptron
Implement the linear perceptron using stochastic gradient descent (SGD) or
gradient descent (GD).
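A sketch of what the problem asks for (an illustrative implementation, not the official solution; the toy data and margin filter are my assumptions): a linear perceptron trained with SGD, updating only on mistakes.

```python
# Sketch: linear perceptron with stochastic (per-example) updates.
import numpy as np

rng = np.random.default_rng(8)
w_true = np.array([2.0, -1.0])            # hypothetical separating weights
X = rng.normal(size=(500, 2))
keep = np.abs(X @ w_true) > 0.5           # enforce a margin for fast convergence
X = X[keep]
y = np.sign(X @ w_true)                   # labels in {-1, +1}

def perceptron_sgd(X, y, epochs=20, lr=1.0):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)): # stochastic: one example at a time
            if y[i] * (X[i] @ w) <= 0:    # update only on mistakes
                w += lr * y[i] * X[i]
    return w

w = perceptron_sgd(X, y)
acc = np.mean(np.sign(X @ w) == y)
print(f"training accuracy: {acc:.3f}")
```

The batch (GD) variant instead sums the updates over all misclassified examples before changing w.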