CS189: Introduction to Machine Learning
Homework 6
Due: 11:59 pm on Tuesday, November 24, 2015
Homework Party: 16:00-18:00 November 19, 2015, Wozniak Lounge
Submission: bCourses and Kaggle
No use of external neural network libraries is allowed, e.g. Theano.

CS189/CS289A
Introduction to Machine Learning
Lecture 11: Logistic Regression
Peter Bartlett
February 24, 2015
Outline
First: A Bayesian view of linear regression
Logistic Regression:
Gaussian generative to logistic discriminative models.
Parameter

Reducing Computational Cost
Nearest-neighbor search has O(N) complexity per query
Infeasible for large datasets
Can we speed it up?
Think of guessing a number between 1 and 10
K-d tree
A k-d tree is a binary tree data structure for organizing a set of points in a K-dimensional space.
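Guessing a number between 1 and 10 by repeated halving is binary search; a k-d tree applies the same idea to points, splitting on one coordinate at a time. A minimal sketch (an illustration, not the course's reference implementation):

```python
import math

def build_kdtree(points, depth=0):
    """Recursively split points on alternating coordinates."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, depth=0, best=None):
    """Descend toward the query, then backtrack only when the splitting
    plane could hide a closer point than the current best."""
    if node is None:
        return best
    axis = depth % len(query)
    point = node["point"]
    if best is None or math.dist(point, query) < math.dist(best, query):
        best = point
    near, far = ((node["left"], node["right"]) if query[axis] < point[axis]
                 else (node["right"], node["left"]))
    best = nearest(near, query, depth + 1, best)
    if abs(query[axis] - point[axis]) < math.dist(best, query):
        best = nearest(far, query, depth + 1, best)
    return best

pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(pts)
print(nearest(tree, (9, 2)))  # (8, 1) is the closest stored point
```

On average the backtracking prunes most of the tree, which is where the speedup over the O(N) linear scan comes from.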

Entropy: Average Surprise
Entropy: H = -\sum_i p_i \log_2 p_i
pi is the probability of class i
Compute it as the proportion of class i in the set.
In the 2-class case:
What is the entropy of a group in which
all examples belong to the same
class?
Minimum entropy.
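Computing entropy exactly as described, with class proportions as the probabilities:

```python
import math
from collections import Counter

def entropy(labels):
    """H = -sum_i p_i * log2(p_i), where p_i is the proportion of class i."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

print(entropy(["a"] * 8))              # 0.0: a pure group has minimum entropy
print(entropy(["a"] * 4 + ["b"] * 4))  # 1.0: an even 2-class split has maximum entropy
```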

Andrew Ng
Training a neural network
Learning by perturbing weights
(this idea occurs to everyone who knows about evolution)
Randomly perturb one weight and see if
it improves performance. If so, save the
change.
This is a f
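A sketch of this perturb-and-keep idea on a toy quadratic loss (the loss function, step count, and perturbation scale are all assumed for illustration):

```python
import random

def hill_climb(weights, loss, steps=200, scale=0.1, seed=0):
    """Randomly perturb one weight at a time; keep the change only if the
    loss improves. A sketch of the evolutionary idea, not backprop."""
    rng = random.Random(seed)
    best = loss(weights)
    for _ in range(steps):
        i = rng.randrange(len(weights))
        old = weights[i]
        weights[i] += rng.gauss(0, scale)
        new = loss(weights)
        if new < best:
            best = new          # improvement: save the change
        else:
            weights[i] = old    # no improvement: revert

    return weights, best

# toy quadratic loss with minimum at w = (1, -2); starting loss is 5.0
loss = lambda w: (w[0] - 1) ** 2 + (w[1] + 2) ** 2
w, l = hill_climb([0.0, 0.0], loss)
print(l)  # much smaller than the starting loss of 5.0
```

This works but wastes a full forward evaluation per single-weight perturbation, which is the inefficiency that motivates gradient-based training.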

Unsupervised Learning
Where are we?
Parametric vs. non-parametric
Generative vs. Discriminative
Supervised vs Unsupervised
Supervised:
Given X, predict Y
Unsupervised:
Given X, do something interesting
assumes that X has some structure
Discovering S

Non-Parametric Methods
CS 189
Alexei Efros
Nearest-Neighbor Rule
Also known as:
instance-based learning
memory-based learning
exemplar-based learning
lazy learning
What do you do at training time?
Nothing! The only O(0) algorithm in this class!
A type
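The "do nothing at training time" point can be made concrete; this minimal 1-NN sketch memorizes the data in `fit` and does all the work at query time:

```python
import math

class NearestNeighbor:
    """Lazy / instance-based learner: training just memorizes the data."""
    def fit(self, X, y):
        self.X, self.y = X, y   # the O(0) "training": store and return
        return self
    def predict(self, x):
        # linear scan over all stored examples: O(N) per query
        i = min(range(len(self.X)), key=lambda j: math.dist(self.X[j], x))
        return self.y[i]

clf = NearestNeighbor().fit([(0, 0), (0, 1), (5, 5)], ["blue", "blue", "red"])
print(clf.predict((4, 4)))  # "red": (5, 5) is the closest stored point
```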

Wisdom of Crowds (Francis Galton)
Many idiots (weak learners) are often better
than one expert
Combination of several decision stumps
Ensemble Methods
Instead of learning one model, learn several and
combine, e.g.
Averaging
Bagging
Random Forests
Boosting
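A minimal sketch of the bagging idea with decision stumps as the weak learners (toy 1-D data; the stump search and vote counts are illustrative assumptions):

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Fit one decision stump: the (feature, threshold, sign) rule with
    the fewest training errors -- a single weak learner."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for sign in (1, -1):
                pred = [sign if x[f] >= t else -sign for x in X]
                err = sum(p != yi for p, yi in zip(pred, y))
                if best is None or err < best[0]:
                    best = (err, f, t, sign)
    _, f, t, sign = best
    return lambda x, f=f, t=t, sign=sign: sign if x[f] >= t else -sign

def bagged(X, y, n_stumps=25, seed=0):
    """Bagging: each stump sees a bootstrap resample of the training set;
    predictions are combined by majority vote."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_stumps):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda x: Counter(s(x) for s in stumps).most_common(1)[0][0]

X = [(0,), (1,), (2,), (3,)]
y = [-1, -1, 1, 1]
model = bagged(X, y)
print(model((0,)), model((3,)))  # -1 1 on the training extremes
```

Each stump alone is a weak "idiot"; the vote over many bootstrap-trained stumps is more stable than any single one.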

Neural Networks
(aka Deep Learning)
Ties together many ideas from the course
logistic regression
template matching/nearest neighbors
decision forests / ensembles of weak learners
stochastic gradient descent
What's new:
feature learning (end-to-end training)
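One way to see the tie to logistic regression: a two-layer network is logistic regression stacked on learned features. With hand-set illustrative weights (not trained; chosen only to make the point) it represents XOR, which no single logistic regression can:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, w2, b2):
    """Two-layer net: a feature layer followed by logistic regression
    on the hidden activations."""
    h = [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + bi)
         for row, bi in zip(W1, b1)]           # hidden "learned features"
    z = sum(wi * hi for wi, hi in zip(w2, h)) + b2
    return sigmoid(z)                           # logistic output layer

# hand-set weights: hidden units compute (roughly) OR and NAND,
# and the output unit ANDs them together, giving XOR
W1, b1 = [[20, 20], [-20, -20]], [-10, 30]
w2, b2 = [20, 20], -30
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, round(forward(x, W1, b1, w2, b2)))  # 0, 1, 1, 0
```

In a real network the weights in `W1` are learned end-to-end rather than hand-set, which is exactly the "feature learning" point above.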

Dimensionality Reduction
by PCA
Thomas Hofmann, Department of Computer Science, Brown University
5th Max-Planck Advanced Course on the Foundations of Computer Science, September 6-10, 2004, Saarbrücken
Pattern Matrix
Statistics and machine learning typicall
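A minimal PCA sketch on a synthetic N × d pattern matrix (the SVD-based route and the synthetic data are assumptions for illustration):

```python
import numpy as np

# Pattern matrix: N instances (rows) x d measurements (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 0.3]])

# PCA: center the columns, then take the top right singular vectors.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:1].T          # project onto the first principal component

var_total = Xc.var(axis=0).sum()
var_kept = Z.var(axis=0).sum()
print(var_kept / var_total)  # most of the variance survives the 1-D projection
```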

Convolutional Neural Networks
(aka ConvNets, CNNs)
[LeNet-5, LeCun 1998]
Baby-sitting your network: monitoring accuracy
big gap = overfitting
=> increase regularization strength
no gap
=> increase model capacity
Fei-Fei Li & Andrej Karpathy
Lecture 6 - 2
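The monitoring rule on the slide can be written as a tiny helper; the numeric gap threshold here is an assumed placeholder, not a value from the lecture:

```python
def babysit(train_acc, val_acc, gap_threshold=0.1):
    """Slide's rule of thumb for monitoring train/validation accuracy.
    The 0.1 threshold is an assumption for illustration."""
    gap = train_acc - val_acc
    if gap > gap_threshold:
        return "increase regularization strength"  # big gap => overfitting
    return "increase model capacity"               # no gap => underfitting

print(babysit(0.99, 0.75))  # big train/val gap: overfitting
print(babysit(0.62, 0.60))  # no gap: add capacity
```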

CS189/CS289A
Introduction to Machine Learning
Lecture 7: The Multivariate Normal Distribution
Peter Bartlett
February 10, 2015
Outline
Probability density function.
2-D Gaussian
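The probability density function from the outline, evaluated directly (a numerical sketch of the standard formula):

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Multivariate normal density:
    f(x) = exp(-0.5 (x-mu)^T Sigma^{-1} (x-mu)) / sqrt((2 pi)^d |Sigma|)."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

# 2-D standard Gaussian: the density at the mean is 1 / (2 pi)
print(mvn_pdf(np.zeros(2), np.zeros(2), np.eye(2)))  # ~0.159155
```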

UCB - CS189
Introduction to Machine Learning
Fall 2015
Lecture 11: Embedded methods for model and
feature selection
Isabelle Guyon
ChaLearn
Come to my office hours
Wed 2:30-4:30, Soda 329
Last time: model search

CS189: Introduction to Machine Learning
Homework 3
Due: October 6, 2015 @ 11:59PM
Homework party: October 1, 8-10pm (Wozniak Lounge).
Submission: bCourses (no Kaggle, no Gradescope)
Submission Instructions
In your submission, include two separate files:
1

Name:
Student ID:
CS189: Introduction to Machine Learning
Homework 2
Due: September 24, 2015 at 11:59pm
Instructions:
Homework 2 is entirely a written assignment; no coding is involved.
Please write (legibly!) or typeset your answers in the space provided.

CS 189 : Introductory Lecture
Jitendra Malik
Examples of learning problems
Recognizing digits
Classifying email as spam or not
Predicting the price of a stock 6 months from now
Netflix problem: predict the rating of movie j by customer i
Determine cre

CS189: Introduction to Machine Learning
Homework 3
Due: October 11th, 2016, 12:00 noon, NOT MIDNIGHT
Problem 1: Maximum Likelihood Estimation of Multivariate Gaussian Distribution
Suppose that n samples X_1, …, X_n ∈ ℝ^d are random vectors drawn independently
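A numerical check of the closed-form Gaussian MLE (sample mean and 1/n covariance), assuming that is where the problem is headed; the synthetic data is for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3)) + np.array([1.0, -2.0, 0.5])

# Closed-form MLE for a multivariate Gaussian:
mu_hat = X.mean(axis=0)              # mu_hat = (1/n) sum_i X_i
D = X - mu_hat
Sigma_hat = (D.T @ D) / len(X)       # Sigma_hat = (1/n) sum_i (X_i - mu)(X_i - mu)^T

# note the 1/n (MLE) rather than 1/(n-1) (unbiased) normalization
print(np.allclose(Sigma_hat, np.cov(X, rowvar=False, bias=True)))  # True
```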

CS189/CS289A
Introduction to Machine Learning
Lecture 2: Linear classifiers
Peter Bartlett
January 22, 2015
Linear classifiers:
x ∈ ℝ^d, y ∈ {−1, 1}
1. Training
Collect labeled data.
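The training step (collect labeled data, fit a linear decision rule ŷ = sign(w·x + b)) can be illustrated with the classic perceptron update; the excerpt cuts off before naming an algorithm, so this particular choice is an assumption:

```python
def perceptron(X, y, epochs=10):
    """Train a linear classifier y_hat = sign(w . x + b) with the
    mistake-driven perceptron update (a sketch)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, yi in zip(X, y):
            if yi * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + yi * xi for wi, xi in zip(w, x)]  # move toward x
                b += yi
    return w, b

X = [(2, 1), (3, 3), (-1, -1), (-2, -3)]   # x in R^d
y = [1, 1, -1, -1]                          # y in {-1, +1}
w, b = perceptron(X, y)
pred = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1 for x in X]
print(pred)  # matches y on this linearly separable toy set
```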

CS189/CS289A
Introduction to Machine Learning
Lecture 5:
Peter Bartlett
February 3, 2015
Outline
Two facts from probability theory
Generative and discriminative models: Gaussian class conditionals

CS189/CS289A
Introduction to Machine Learning
Lecture 4: Decision Theory
Peter Bartlett
January 29, 2015
Outline
Decision theory
Loss functions
Probabilisti
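Decision theory with a loss function can be illustrated with a small expected-risk computation, risk(a | x) = Σ_y L(y, a) P(y | x); the loss table and posterior below are made-up numbers:

```python
def expected_loss(action, posterior, loss):
    """Expected loss of an action under the posterior over true classes:
    risk(a | x) = sum_y L(y, a) * P(y | x)."""
    return sum(loss[y][action] * p for y, p in enumerate(posterior))

# asymmetric loss: a false negative (true y=1, action 0) costs 10x
loss = [[0, 1],    # true y=0: predicting 1 costs 1
        [10, 0]]   # true y=1: predicting 0 costs 10
posterior = [0.8, 0.2]   # P(y=0|x), P(y=1|x)
risks = [expected_loss(a, posterior, loss) for a in (0, 1)]
print(risks)  # predict class 1 despite P(y=0|x) = 0.8
```

The minimum-risk action differs from the most probable class precisely because the loss is asymmetric; with 0-1 loss the two coincide.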

CS189/CS289A
Introduction to Machine Learning
Lecture 6:
Peter Bartlett
February 5, 2015
Outline
Recall: Gaussian class conditionals lead to a logistic posterior.
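The recalled fact can be checked numerically: with shared-variance Gaussian class conditionals and equal priors, Bayes' rule gives exactly a logistic (sigmoid) posterior, because the log-odds are linear in x:

```python
import numpy as np

# Two 1-D Gaussian class conditionals with shared variance s^2:
# p(x|y=1) = N(mu1, s^2), p(x|y=0) = N(mu0, s^2), equal priors.
mu0, mu1, s = -1.0, 1.0, 1.0

def gauss(x, mu):
    return np.exp(-(x - mu) ** 2 / (2 * s * s)) / np.sqrt(2 * np.pi * s * s)

def posterior(x):
    """Bayes' rule: P(y=1|x) = p(x|1) / (p(x|1) + p(x|0))."""
    return gauss(x, mu1) / (gauss(x, mu1) + gauss(x, mu0))

def logistic(x):
    # the log-odds are linear: w = (mu1 - mu0)/s^2, b = (mu0^2 - mu1^2)/(2 s^2)
    w = (mu1 - mu0) / (s * s)
    b = (mu0 ** 2 - mu1 ** 2) / (2 * s * s)
    return 1 / (1 + np.exp(-(w * x + b)))

xs = np.linspace(-4, 4, 9)
print(np.allclose(posterior(xs), logistic(xs)))  # True: the posterior is logistic
```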

CS189/CS289A
Introduction to Machine Learning
Lecture 8: More on the Multivariate Normal Distribution
Peter Bartlett
February 12, 2015
Outline
Review: Diagonal covariance matrices.

Discussion 1
Math Review
1. Probability Review
There are n archers all shooting at the same target (bullseye) of radius 1. Let the score
for a particular archer be defined to be the distance away from the center (the lower
the score the better, and 0 is t
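The problem statement is cut off, but the setup can be explored by simulation. Assuming each arrow lands uniformly at random on the unit disk (an assumption, since the excerpt does not state the distribution), the score R has CDF P(R ≤ r) = r², the fraction of the disk's area within radius r:

```python
import random

random.seed(0)

def score():
    """One archer's score: distance from center of a point sampled
    uniformly on the unit disk (rejection sampling from the square)."""
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return (x * x + y * y) ** 0.5

samples = [score() for _ in range(100_000)]
frac = sum(s <= 0.5 for s in samples) / len(samples)
print(frac)  # close to 0.5^2 = 0.25
```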

CS 189: Introduction to Machine Learning
Homework 1: Support Vector Machines
Due: 11:59 pm on February 10th, 2016
Introduction
Datasets
In the first part of this assignment (Problems 1 through 3), we will do
digit recognition using our own variant of the
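As a hedged sketch of the surrounding workflow only: the homework asks for its own SVM variant on its own data, but the train/evaluate loop looks like this on scikit-learn's small built-in digits dataset:

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

# Illustrative stand-in: sklearn's 8x8 digits, not the homework's dataset.
digits = datasets.load_digits()
Xtr, Xte, ytr, yte = train_test_split(digits.data, digits.target,
                                      test_size=0.25, random_state=0)
clf = svm.SVC(kernel="linear", C=1.0).fit(Xtr, ytr)
print(round(clf.score(Xte, yte), 3))  # typically well above 0.9
```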

CS189: Introduction to Machine Learning
Homework 7
Due: 11:59 pm, Friday Dec 4th, 2015
HW Party: 7-9 pm, Monday Nov 30th, 2015
This assignment contains some optional questions; completing them earns bonus marks. Thes

Dimensionality Reduction
by PCA
Pattern Matrix
[Figure: the N × d pattern matrix X = {x_i}, with rows indexed by instance and columns by measurement.]
Measurement vectors
i: instance number, e.g. a house
j: measurement, e.g. the area of a house
Digital images as gray-scale vectors
i: image
