CMPS 142 Syllabus, Winter 10
Here is the syllabus for the Data Mining and Machine Learning class. We will be learning
about and experimenting with methods for learning from data. The text is Machine Learning:
An Algorithmic Perspective by Marsland. You mig

Dimensions of a Supervised Learner
(modified from Yahoo class)
1. Model (hypothesis class): g(x | θ)
2. Loss function: E(θ | X) = Σ_t L(r^t, g(x^t | θ))
3. Optimization procedure: θ* = argmin_θ E(θ | X)

Bayesian Learning
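The three dimensions above fit in a few lines of code. A minimal sketch with squared loss, a made-up one-parameter model class, and brute-force search standing in for the optimization procedure (all names and numbers here are illustrative, not from the slides):

```python
# Hypothetical sketch of the three dimensions: a model class g(x | theta),
# a loss summed over the sample, and a brute-force optimization procedure.
def g(x, theta):
    return theta * x  # model: hypothesis class of lines through the origin

def empirical_loss(theta, sample):
    # E(theta | X) = sum over t of L(r_t, g(x_t | theta)), squared loss here
    return sum((r - g(x, theta)) ** 2 for x, r in sample)

def fit(sample, candidates):
    # optimization procedure: theta* = argmin_theta E(theta | X)
    return min(candidates, key=lambda th: empirical_loss(th, sample))

sample = [(1, 2), (2, 4), (3, 6)]
best = fit(sample, [th / 10 for th in range(0, 51)])
print(best)  # 2.0
```

Real learners replace the brute-force search with something smarter (gradient descent, closed-form solutions), but the three-part decomposition is the same.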
Probability Review

Neural Networks
Feed-forward networks of (usually) sigmoid functions (continuous approximations of LTUs).
Powerful models: can represent any Boolean function (with exponentially many hidden nodes).
Learn by gradient descent; many local minima.
Not like
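The "continuous approximation of LTUs" point can be made concrete. A small sketch comparing a hard threshold unit with a sigmoid; the steepness parameter beta is illustrative (larger beta pushes the sigmoid toward the step):

```python
import math

def ltu(z):
    # linear threshold unit: hard 0/1 step
    return 1.0 if z > 0 else 0.0

def sigmoid(z, beta=1.0):
    # smooth, differentiable approximation; large beta -> closer to the step
    return 1.0 / (1.0 + math.exp(-beta * z))

for z in (-2.0, -0.1, 0.1, 2.0):
    print(z, ltu(z), round(sigmoid(z, beta=20.0), 3))
```

The smoothness is the point: gradient descent needs derivatives, and the step function has none (its derivative is zero almost everywhere).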

Decision Trees - play golf?

Outlook?
  sunny  -> Temp.
              <95 -> Yes
              >95 -> No
  cloudy -> Yes
  rain   -> No
Finding a good tree
The popular C4.5 is an excellent off-the-shelf algorithm.
Efficient hypothesis space.
Variable-sized: bigger trees for more complicated
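As an illustration of how such algorithms choose splits, here is an information-gain computation (C4.5 builds on this entropy-based criterion; the 9+/5- counts below are the classic play-golf numbers, used only as an example):

```python
import math

def entropy(pos, neg):
    # entropy of a pos/neg count pair, in bits
    total = pos + neg
    h = 0.0
    for c in (pos, neg):
        if c:
            p = c / total
            h -= p * math.log2(p)
    return h

def info_gain(parent, children):
    # parent and each child are (pos, neg) counts at a node
    n = sum(p + q for p, q in children)
    remainder = sum((p + q) / n * entropy(p, q) for p, q in children)
    return entropy(*parent) - remainder

# splitting 9+/5- on a three-valued attribute (the classic play-golf data:
# sunny 2+/3-, overcast 4+/0-, rain 3+/2-)
gain = info_gain((9, 5), [(2, 3), (4, 0), (3, 2)])
print(round(gain, 3))  # 0.247
```

The tree builder evaluates this gain for every candidate attribute and greedily splits on the best one.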

Bias and Variance
Unknown parameter θ
Estimate d = d(X) on sample X
Bias: b_θ(d) = E[d] − θ
Variance: E[(d − E[d])²]
Mean square error:
SqErr(d, θ) = E[(d − θ)²]
            = (E[d] − θ)² + E[(d − E[d])²]
            = Bias² + Variance
Copyright 2005 by David Helmbold

Boosting

In regress
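The decomposition above can be checked by simulation. A toy sketch using the sample mean of Uniform(0, 1) draws as the estimator (all numbers illustrative; the population-style variance, dividing by the number of trials, makes the identity exact):

```python
import random

random.seed(0)
theta = 0.5            # true mean of Uniform(0, 1)
n, trials = 10, 20000  # sample size and number of simulated samples

def estimate(sample):
    return sum(sample) / len(sample)   # estimator d(X): the sample mean

ds = [estimate([random.random() for _ in range(n)]) for _ in range(trials)]
mean_d = sum(ds) / trials
bias = mean_d - theta                            # b(d) = E[d] - theta
variance = sum((d - mean_d) ** 2 for d in ds) / trials
mse = sum((d - theta) ** 2 for d in ds) / trials
print(round(bias, 4), round(variance, 4), round(mse, 4))
# mse == bias**2 + variance; here variance is about 1/(12 n) = 0.0083
```

The sample mean is unbiased, so its MSE is essentially pure variance; a biased estimator would trade some bias for lower variance.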

On-line Algorithms
A Santa Cruz specialty
For more info, see Avrim Blum's "On-line Algorithms in Machine Learning"
and Manfred Warmuth's web page.
Rich and interesting theory.
Learn as you go - lifelong learning
No training examples, all testing
On-line Al
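As a taste of the "no training examples, all testing" protocol, here is a minimal Weighted Majority sketch in the style of Littlestone and Warmuth: predict by weighted vote over experts, then multiplicatively penalize the experts that erred. The experts and the example stream are made-up toy data:

```python
def weighted_majority(expert_preds, labels, beta=0.5):
    # expert_preds[t] is the list of the experts' 0/1 predictions at trial t
    n_experts = len(expert_preds[0])
    w = [1.0] * n_experts
    mistakes = 0
    for preds, y in zip(expert_preds, labels):
        vote_1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        vote_0 = sum(wi for wi, p in zip(w, preds) if p == 0)
        guess = 1 if vote_1 >= vote_0 else 0
        if guess != y:
            mistakes += 1
        # multiplicative update: shrink the weight of every wrong expert
        w = [wi * (beta if p != y else 1.0) for wi, p in zip(w, preds)]
    return mistakes

# three experts; the first one happens to be perfect on this toy stream
stream = [([1, 0, 1], 1), ([0, 0, 1], 0), ([1, 1, 0], 1), ([0, 1, 0], 0)]
preds = [s[0] for s in stream]
labels = [s[1] for s in stream]
print(weighted_majority(preds, labels))  # 0
```

The theory gives mistake bounds of the form O(log N + m*), where N is the number of experts and m* the best expert's mistake count, with no statistical assumptions on the stream.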

Linear Threshold Classification:
Perceptron and Logistic Regression

Assume a hyperplane divides the + and - points.
The hyperplane has the formula w · x - b = 0.
Predict + if w · x - b > 0, otherwise - (1 if w · x - b > 0, otherwise 0 in the book).

Linear Threshold Algorithms
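The prediction rule above drives the classic perceptron update: on each mistake, move w (and b) toward the misclassified point. A minimal sketch with made-up, linearly separable toy data:

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def perceptron(data, epochs=20, eta=1.0):
    # mistake-driven training of the rule: predict +1 if w.x - b > 0 else -1
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, r in data:                  # r in {+1, -1}
            pred = 1 if dot(w, x) - b > 0 else -1
            if pred != r:                  # update only on mistakes
                w = [wi + eta * r * xi for wi, xi in zip(w, x)]
                b -= eta * r
    return w, b

data = [([2.0, 1.0], 1), ([1.0, 3.0], 1),
        ([-1.0, -1.0], -1), ([-2.0, 0.5], -1)]
w, b = perceptron(data)
print(all((1 if dot(w, x) - b > 0 else -1) == r for x, r in data))  # True
```

On separable data the perceptron convergence theorem bounds the total number of mistakes by (R/γ)², where R bounds the example norms and γ is the margin; logistic regression instead fits the same linear form with a smooth probabilistic loss.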

Clustering
Clustering is unsupervised learning:
there are no class labels.
Want to find groups of similar instances.
Often use a distance measure (usually Euclidean distance) for dissimilarity.
Can use cluster membership/distances as additional (created
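A minimal sketch of one such method, Lloyd's k-means with Euclidean distance. The points and the naive first-k initialization are toy choices for illustration, not a robust implementation:

```python
def kmeans(points, k, iters=20):
    # Lloyd's algorithm: alternate assigning points to the nearest center
    # and recomputing each center as the mean of its assigned points.
    centers = [points[i] for i in range(k)]   # naive init: first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        for j, cl in enumerate(clusters):
            if cl:
                centers[j] = tuple(sum(xs) / len(cl) for xs in zip(*cl))
    return centers

# two obvious blobs, interleaved so the first two points hit both blobs
pts = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0),
       (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
centers = sorted(kmeans(pts, 2))
print(centers)
```

Real uses need better initialization (e.g. k-means++) and a rule for empty clusters; the sketch keeps the skipped center unchanged.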

First Name:
Last Name:
Sample Machine Learning Questions, 2010
Note: This sample exam is to indicate the style of questions you should expect. Check the
topic list for the particular topics to be covered on your exam. The actual exam will have
space for y

Support Vector Machines (SVMs)
Combine learning and optimization theory.
Exactly solve the optimization problem (as opposed to ANNs or decision trees).
Allow use of the kernel trick to get more features almost for free.
References: Cristianini & Shawe-Taylor book; Va
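The "features almost for free" point is exactly the kernel trick. A sketch for the polynomial kernel K(x, z) = (1 + x·z)², whose value equals the inner product under an explicit degree-2 feature map that we never need to construct (the input vectors are made up):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def kernel(x, z):
    # evaluated in the original 2-D space: O(d) work
    return (1 + dot(x, z)) ** 2

def phi(x):
    # the explicit 6-D feature map this kernel corresponds to (2-D input)
    x1, x2 = x
    return [1.0, math.sqrt(2) * x1, math.sqrt(2) * x2,
            x1 * x1, x2 * x2, math.sqrt(2) * x1 * x2]

x, z = (1.0, 2.0), (3.0, 0.5)
print(kernel(x, z), dot(phi(x), phi(z)))  # both ~ 25.0
```

Since the SVM's dual optimization touches examples only through inner products, swapping the inner product for K trains a linear separator in the 6-D (or far higher-dimensional) feature space at the cost of the 2-D one.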

CMPS 142 Midterm Topics (Winter 2010)
The midterm will be in class Thursday Feb. 25. The exam will be closed-book, although
students may have one 3x5 card of handwritten notes (both sides). The exam will have
space for your answers, so no blue books will

(Binary) Classification: Learning a Class from Labeled Examples
Lecture Slides for INTRODUCTION TO Machine Learning, CHAPTER 2: Supervised Learning
Things are represented by a feature vector x and a label r (also called y), often r in {1, 0} or {+, -}.
Domain

CMPS 142 Project Report Guidelines, Winter 2010
This document describes the project proposal, the progress report, and the final project writeup.
I am also thinking about having students give a short (10-15 minute) presentation to the class,
perhaps during

VC dimension, Winter 2010
Recall that a hypothesis is a mapping from the domain X to two values (like +,- or 0,1). Each
hypothesis in the class can be viewed as a subset of X (the points that it maps to + or 1). A
hypothesis class is a set of hypotheses.
A
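Shattering can be checked by enumeration. A sketch for threshold hypotheses h_a(x) = 1 iff x >= a on the real line, with two points: only the threshold's position relative to the points matters, so three representative thresholds realize every achievable labeling (the specific numbers are illustrative):

```python
def realizable_labelings(points, thresholds):
    # which 0/1 labelings of `points` can h_a(x) = [x >= a] produce?
    seen = set()
    for a in thresholds:
        seen.add(tuple(1 if x >= a else 0 for x in points))
    return seen

pts = [1.0, 2.0]
thresholds = [0.0, 1.5, 3.0]   # below both, between, above both
labelings = realizable_labelings(pts, thresholds)
print(sorted(labelings))       # (1, 0) is missing: thresholds are monotone
print(len(labelings) == 4)     # False: no 2-point set is shattered
```

Any single point is shattered (put the threshold below or above it), but no two points are, since a threshold can never label the smaller point 1 and the larger point 0. So the VC dimension of thresholds on the line is 1.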

CMPS 142 Second Homework, Winter 2010
5+ Problems, 10 pts, due start of class Thursday, Jan. 28
Each student should submit a homework and carefully acknowledge all sources of inspiration, techniques, and/or helpful ideas (web, people, books, etc.) other t

CMPS 142 Third Homework, Winter 2010
3 Problems, 12 pts, due start of class Thursday, Feb. 4
Each student should submit a homework and carefully acknowledge all sources of inspiration, techniques, and/or helpful ideas (web, people, books, etc.) other than

Backpropagation
Winter 2007
This is a document describing the backpropagation algorithm for Neural Networks.
1 Notation
A neural network is an acyclic directed graph of nodes. Each node is numbered and produces an
associated value, the value produced by n
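A numeric sketch of the backward pass for a tiny one-hidden-layer sigmoid network with squared error, checked against a finite-difference gradient. The weights, input, and target below are made-up numbers just to exercise the algorithm:

```python
import math

def sig(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    # w1: hidden-layer weight rows; w2: output weights; no bias terms here
    h = [sig(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    y = sig(sum(w * hi for w, hi in zip(w2, h)))
    return h, y

def backprop(x, r, w1, w2):
    h, y = forward(x, w1, w2)
    # output delta for loss 0.5*(y - r)^2 through the sigmoid output
    dy = (y - r) * y * (1 - y)
    grad_w2 = [dy * hi for hi in h]
    # hidden deltas: push dy back through w2, times sigmoid derivative
    dh = [dy * w2[j] * h[j] * (1 - h[j]) for j in range(len(h))]
    grad_w1 = [[dh[j] * xi for xi in x] for j in range(len(h))]
    return grad_w1, grad_w2

x, r = [1.0, 0.5], 1.0
w1 = [[0.1, -0.2], [0.4, 0.3]]
w2 = [0.2, -0.5]
g1, g2 = backprop(x, r, w1, w2)

# central-difference check of one weight against the analytic gradient
eps = 1e-6
_, yp = forward(x, w1, [w2[0] + eps, w2[1]])
_, ym = forward(x, w1, [w2[0] - eps, w2[1]])
num = (0.5 * (yp - r) ** 2 - 0.5 * (ym - r) ** 2) / (2 * eps)
print(abs(num - g2[0]) < 1e-8)  # True
```

Such a finite-difference check is a standard way to debug a backpropagation implementation before trusting it for training.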

Bayes for Regression
Winter 2010
This is a document about the bias-variance tradeoff and a maximum likelihood explanation for
least-squares regression.
1 Least Squares as maximum likelihood
Assume we have a set of n training points X = {(x1, y1), . . . , (xn,
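The maximum-likelihood claim cashes out concretely: if y_i = w·x_i + b plus Gaussian noise, maximizing the likelihood is the same as minimizing squared error, so ordinary least squares gives the ML fit. A closed-form 1-D sketch (the data points are made up, roughly y = 2x + 1):

```python
def least_squares(xs, ys):
    # closed-form 1-D least squares: w = cov(x, y) / var(x), b = mean residual
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    w = sxy / sxx
    b = my - w * mx
    return w, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]   # noisy samples of roughly y = 2x + 1
w, b = least_squares(xs, ys)
print(round(w, 3), round(b, 3))  # roughly 1.94 1.09
```

Under the Gaussian-noise assumption, the negative log-likelihood is (up to constants) exactly the sum of squared residuals, which is why this fit is also the maximum-likelihood estimate.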

CMPS 142 First Homework, Winter 2010
3 Problems, 10 pts, due start of class Thursday, Jan. 21
Each student should submit a homework and carefully acknowledge all sources of inspiration, techniques, and/or helpful ideas (web, people, books, etc.) other tha

CMPS 142 Fourth Homework, Winter 2010
3 Problems, 12 pts, due start of class Tuesday, Feb. 15
Each student should submit a homework and carefully acknowledge all sources of inspiration, techniques, and/or helpful ideas (web, people, books, etc.) other tha