CS229 Lecture notes
Andrew Ng
Part IV
Generative Learning algorithms
So far, we've mainly been talking about learning algorithms that model
p(y|x; θ), the conditional distribution of y given x. For instance, logistic
regression modeled p(y|x; θ) as hθ(x) = g(θᵀx), where g(z) = 1/(1 + e^(−z)) is the sigmoid function.
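As a quick numerical illustration of this hypothesis, here is a minimal NumPy sketch of the logistic regression hypothesis hθ(x); the θ and x values are made up for illustration:

```python
import numpy as np

def h_theta(theta, x):
    """Logistic regression hypothesis: the sigmoid of the linear score theta^T x."""
    return 1.0 / (1.0 + np.exp(-np.dot(theta, x)))

# Illustrative values only: with theta = 0 the score is 0, so the sigmoid gives 0.5,
# i.e. the model is maximally uncertain about the label.
theta = np.zeros(2)
x = np.array([1.0, 2.0])
p = h_theta(theta, x)
```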
Supervised learning
Let's start by talking about a few examples of supervised learning problems.
Suppose we have a dataset giving the living areas and prices of 47 houses
from Portland, Oregon:
Living area (feet2 )
2104
1600
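Given such a dataset, the usual first step is to fit a linear model, price ≈ θ0 + θ1 · (living area), by least squares. A minimal sketch with made-up numbers (not the Portland data above, whose price column is truncated here):

```python
import numpy as np

# Hypothetical (living area, price) pairs -- illustrative only.
# These lie exactly on the line price = 40 + 0.16 * area.
areas = np.array([1000.0, 1500.0, 2000.0, 2500.0])
prices = np.array([200.0, 280.0, 360.0, 440.0])

# Design matrix with an intercept column; solve the least-squares problem.
X = np.column_stack([np.ones_like(areas), areas])
theta, *_ = np.linalg.lstsq(X, prices, rcond=None)
# theta[0] is the intercept, theta[1] the price per square foot.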
Part V
Support Vector Machines
This set of notes presents the Support Vector Machine (SVM) learning algorithm. SVMs are among the best (and many believe are indeed the best)
off-the-shelf supervised learning algorithms. To tell
Part VI
Learning Theory
1  Bias/variance tradeoff
When talking about linear regression, we discussed the problem of whether
to fit a simple model such as the linear y = θ0 + θ1 x, or a more complex
model such as the polynomial y = θ0
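The tradeoff can be seen numerically: on the same small training set, a straight line underfits (high bias), while a very high-degree polynomial drives the training error to nearly zero by fitting the noise (high variance). A sketch on synthetic data, using NumPy's polyfit:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 8)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)

# Simple model: y = theta0 + theta1 * x (likely underfits this sine wave).
linear = np.polyfit(x, y, deg=1)
# Complex model: a degree-7 polynomial through 8 points interpolates the
# training data, noise included (likely overfits).
poly7 = np.polyfit(x, y, deg=7)

train_err_linear = np.mean((np.polyval(linear, x) - y) ** 2)
train_err_poly7 = np.mean((np.polyval(poly7, x) - y) ** 2)
```

The low training error of the degree-7 fit says nothing about its generalization error, which is exactly the point of the bias/variance discussion.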
Part XIII
Reinforcement Learning and
Control
We now begin our study of reinforcement learning and adaptive control.
In supervised learning, we saw algorithms that tried to make their outputs
mimic the labels y given in the training set.
Part XII
Independent Components
Analysis
Our next topic is Independent Components Analysis (ICA). Similar to PCA,
this will find a new basis in which to represent our data. However, the goal
is very different.
As a motivating example
The k -means clustering algorithm
In the clustering problem, we are given a training set {x(1), . . . , x(m)}, and want to group the data into a few cohesive clusters. Here, x(i) ∈ ℝⁿ as usual; but no labels y(i) are given
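A minimal sketch of the k-means iteration described here (an assignment step, labeling each point with its nearest centroid, followed by an update step, moving each centroid to the mean of its points), on toy 2-D data with hand-picked initial centroids:

```python
import numpy as np

def kmeans(X, init, iters=100):
    """Minimal k-means sketch: alternate nearest-centroid assignment and mean update."""
    centroids = init.astype(float).copy()
    k = len(centroids)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Assignment step: label each point with its closest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points
        # (keep a centroid in place if its cluster happens to be empty).
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

# Two well-separated groups; initialize with one point from each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, labels = kmeans(X, init=X[[0, 3]])
```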
Part XI
Principal components analysis
In our discussion of factor analysis, we gave a way to model data x ∈ ℝⁿ as
approximately lying in some k-dimensional subspace, where k ≪ n. Specifically, we imagined that each point x(i) was
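A numerically convenient way to compute the PCA basis is the SVD of the centered data matrix, whose top-k right singular vectors span the same subspace as the top-k eigenvectors of the empirical covariance matrix. A minimal sketch:

```python
import numpy as np

def pca(X, k):
    """PCA sketch: center the data, then take the top-k right singular vectors."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]            # principal directions, one per row
    projected = Xc @ components.T  # coordinates in the k-dimensional subspace
    return components, projected

# Points lying (noiselessly) along the direction (1, 1): one component suffices,
# and projecting then reconstructing recovers the data exactly.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
components, Z = pca(X, k=1)
```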
Part X
Factor analysis
When we have data x(i) ∈ ℝⁿ that comes from a mixture of several Gaussians,
the EM algorithm can be applied to fit a mixture model. In this setting, we
usually imagine problems where we have sufficient data to
Part IX
The EM algorithm
In the previous set of notes, we talked about the EM algorithm as applied to
fitting a mixture of Gaussians. In this set of notes, we give a broader view
of the EM algorithm, and show how it can be applied
Mixtures of Gaussians and the EM algorithm
In this set of notes, we discuss the EM (Expectation-Maximization) algorithm for density estimation.
Suppose that we are given a training set {x(1), . . . , x(m)} as usual. Since
we are in
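The EM iteration for a Gaussian mixture alternates an E-step, which computes posterior "responsibilities" w(i) = p(z(i) = 1 | x(i)) under the current parameters, and an M-step, which re-estimates the mixing weight, means, and variances using those responsibilities as soft labels. A minimal 1-D, two-component sketch on synthetic data:

```python
import numpy as np

def normal_pdf(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def em_gmm_1d(x, iters=50):
    """EM sketch for a two-component 1-D Gaussian mixture."""
    # Crude initialization: extreme points as means, overall variance for both.
    phi = 0.5
    mu = np.array([x.min(), x.max()], dtype=float)
    var = np.array([x.var(), x.var()], dtype=float)
    for _ in range(iters):
        # E-step: responsibility w[i] = p(z=1 | x[i]) under current parameters.
        p0 = (1.0 - phi) * normal_pdf(x, mu[0], var[0])
        p1 = phi * normal_pdf(x, mu[1], var[1])
        w = p1 / (p0 + p1)
        # M-step: weighted re-estimates of mixing weight, means, variances.
        phi = w.mean()
        mu = np.array([np.sum((1 - w) * x) / np.sum(1 - w),
                       np.sum(w * x) / np.sum(w)])
        var = np.array([np.sum((1 - w) * (x - mu[0]) ** 2) / np.sum(1 - w),
                        np.sum(w * (x - mu[1]) ** 2) / np.sum(w)])
    return phi, mu, var

# Synthetic data: two well-separated Gaussians at -5 and +5.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-5.0, 1.0, 200), rng.normal(5.0, 1.0, 200)])
phi, mu, var = em_gmm_1d(x)
```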
Part VII
Regularization and model
selection
Suppose we are trying to select among several different models for a learning
problem. For instance, we might be using a polynomial regression model
hθ(x) = g(θ0 + θ1 x + θ2 x^2 + · · · + θk x^k
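One standard way to choose the degree k is hold-out cross validation: fit each candidate degree on a training split, then pick the degree with the lowest error on a held-out validation split. A sketch on synthetic data whose true relationship is quadratic (ordinary least-squares polynomials here, i.e. g is the identity):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 60)
y = 1.0 - 2.0 * x + 3.0 * x ** 2 + rng.normal(scale=0.1, size=x.size)

# Hold-out split: train on 70% of the data, validate on the remaining 30%.
n_train = 42
x_tr, y_tr = x[:n_train], y[:n_train]
x_va, y_va = x[n_train:], y[n_train:]

# Fit each candidate degree on the training split; score on the validation split.
val_err = {}
for k in range(0, 9):
    coeffs = np.polyfit(x_tr, y_tr, deg=k)
    val_err[k] = np.mean((np.polyval(coeffs, x_va) - y_va) ** 2)

best_k = min(val_err, key=val_err.get)
```

Degrees 0 and 1 underfit the quadratic signal, so validation error drops sharply at k = 2 and stays roughly flat after, which is the typical shape of a model-selection curve.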
1  The perceptron and large margin classifiers
In this final set of notes on learning theory, we will introduce a different
model of machine learning. Specifically, we have so far been considering
batch learning settings in which we are
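In the online setting, the perceptron processes one example at a time: predict sign(θᵀx), and after each mistake update θ ← θ + y·x (labels in {−1, +1}). A minimal sketch on a hand-made separable stream:

```python
import numpy as np

def perceptron_online(stream, n_features):
    """Online perceptron sketch: predict on each example, update only on a mistake."""
    theta = np.zeros(n_features)
    mistakes = 0
    for x, y in stream:
        pred = 1 if np.dot(theta, x) >= 0 else -1
        if pred != y:
            theta += y * x  # mistake-driven update: theta <- theta + y * x
            mistakes += 1
    return theta, mistakes

# A linearly separable stream: the label is the sign of the first coordinate.
stream = [(np.array([2.0, 1.0]), 1), (np.array([-2.0, 1.0]), -1),
          (np.array([3.0, -1.0]), 1), (np.array([-1.0, -0.5]), -1)]
theta, mistakes = perceptron_online(stream, n_features=2)
```

Mistake bounds of the kind proved in these notes bound `mistakes` in terms of the margin of the stream, independently of its length.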