CHAPTER 1
GENERATIVE AND DISCRIMINATIVE CLASSIFIERS:
NAIVE BAYES AND LOGISTIC REGRESSION
Machine Learning
Copyright © 2005, 2010. Tom M. Mitchell. All rights reserved.
*DRAFT OF January 19, 2010*
*PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR'S PERMISSION*
This
Support Vector and Kernel Machines
Nello Cristianini BIOwulf Technologies nello@support-vector.net http://www.support-vector.net/tutorial.html
ICML 2001
A Little History
SVMs introduced in COLT-92 by Boser, Guyon, Vapnik. Greatly developed ever sin
© 2016 Robert Nowak
Note on Cross-Validation
Let f be the minimizer of the regularized problem
\[
\min_{f \in F} \left\{ \frac{1}{N} \sum_{i=1}^{N} L(y_i, f(x_i)) + c(f) \right\}, \tag{1}
\]
where F is a class of predictors, L is a loss function (e.g., squared error, logistic loss, hinge loss, etc
© 2016 Robert Nowak
Note on Lasso
This short note is based on the analysis framework developed in [1]. Let $w^\star$ be an $s$-sparse vector and
suppose that we observe
\[
y = Xw^\star + \epsilon,
\]
where $X$ is a known $n \times p$ matrix with entries $X_{ij} \overset{\text{iid}}{\sim} N(0, 1)$ and $\epsilon$ is an unknown e
© 2016 Robert Nowak
Note on Proximal Gradient Algorithms
These notes consider optimization problems of the following form
\[
\min_{w \in \mathbb{R}^p} f(w) + c(w),
\]
where the functions f and c are convex, and f is also differentiable. Special cases include ridge regression
a
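A minimal sketch of a proximal gradient iteration for a problem of this form: take a gradient step on the differentiable term f, then apply the proximal operator of c. The concrete choices below are assumptions made for illustration only and are not taken from the notes: f(w) = 0.5 * ||w - b||^2, c(w) = lam * ||w||_1 (whose proximal operator is soft thresholding), a fixed step size, and the helper names `soft_threshold` and `proximal_gradient`.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient(grad_f, prox_c, w0, step, n_iter=200):
    """Iterate w <- prox_c(w - step * grad_f(w), step) for n_iter steps."""
    w = list(w0)
    for _ in range(n_iter):
        g = grad_f(w)
        # Gradient step on the smooth term f.
        z = [wj - step * gj for wj, gj in zip(w, g)]
        # Proximal step on the (possibly non-smooth) term c.
        w = prox_c(z, step)
    return w


# Illustrative (assumed) problem: f(w) = 0.5 * ||w - b||^2, c(w) = lam * ||w||_1.
b = [3.0, 0.5, -2.0]
lam = 1.0


def grad_f(w):
    return [wj - bj for wj, bj in zip(w, b)]


def prox_c(z, step):
    return [soft_threshold(zj, step * lam) for zj in z]


w_hat = proximal_gradient(grad_f, prox_c, [0.0, 0.0, 0.0], step=0.5)
```

For this toy problem the minimizer is known in closed form (soft thresholding of b at level lam), which makes it a convenient check that the iteration behaves as intended.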
© 2016 Rebecca Willett
Backpropagation in Neural Networks
Artificial neural networks can be used to learn predictors in a wide variety of machine learning settings.
The basic idea is to take a feature vector $x \in \mathbb{R}^p$, compute different weighted combinations of t
© 2016 Rebecca Willett
Stochastic Gradient Descent
In many machine learning and signal processing settings, we wish to solve an optimization problem of
the form
\[
\underset{w}{\text{minimize}} \; f(w)
\]
where the objective function can be decomposed as
\[
f(w) = \sum_{i=1}^{n} f_i(w).
\]
For
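A minimal sketch of stochastic gradient descent for an objective of the form f(w) = sum_i f_i(w): at each step, pick one index i at random and move against the gradient of f_i alone. The specifics here are assumptions for illustration and do not come from the notes: the squared-error terms f_i(w) = 0.5 * (x_i . w - y_i)^2, the constant step size, the tiny noiseless data set, and the function names `sgd` and `grad_fi`.

```python
import random


def sgd(grad_fi, n, w0, step, n_epochs, seed=0):
    """Run SGD on f(w) = sum_{i<n} f_i(w), sampling one index per update."""
    rng = random.Random(seed)
    w = list(w0)
    for _ in range(n_epochs):
        for _ in range(n):
            i = rng.randrange(n)      # pick one term f_i uniformly at random
            g = grad_fi(i, w)         # gradient of that single term
            w = [wj - step * gj for wj, gj in zip(w, g)]
    return w


# Illustrative (assumed) data: noiseless linear measurements of w_true.
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
w_true = [2.0, -1.0]
ys = [sum(xj * wj for xj, wj in zip(x, w_true)) for x in xs]


def grad_fi(i, w):
    """Gradient of f_i(w) = 0.5 * (x_i . w - y_i)^2."""
    residual = sum(xj * wj for xj, wj in zip(xs[i], w)) - ys[i]
    return [residual * xj for xj in xs[i]]


w_hat = sgd(grad_fi, len(xs), [0.0, 0.0], step=0.1, n_epochs=200)
```

Because the data here are noiseless, a constant step size is enough for the iterates to approach `w_true`; with noisy data a decreasing step size is the more common choice.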
Solution to CS760 HW 4 (Spring 2010)
1. Initial values: all Q=3.
[Figure: state-transition diagram with states start, a, b, d, and end; arcs labeled L, R, or C, each initialized to Q=3]
i) For the first episode: start->a->b->d->end, we have the following Q values:
Step 1: start->a. We have Q(start, L)
CS 760 - Homework 4
Out: 4/12/10
Due: 4/19/10
50 points
Consider the deterministic reinforcement environment drawn below. The numbers on the arcs
are the immediate rewards. Let the discount rate equal 0.8 and the probability of taking an
exploration step
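A minimal sketch of the kind of update such a problem exercises, assuming the standard Q-learning rule for deterministic environments, Q(s, a) <- r + gamma * max_a' Q(s', a'), with the discount rate 0.8 stated above. Everything else is a hypothetical placeholder, since the figure with the actual arcs and rewards is not reproduced here: the state and action names, the rewards, the initial Q values, the value 0 assigned to a terminal state, and the helper name `q_update`.

```python
GAMMA = 0.8  # discount rate from the problem statement


def q_update(Q, state, action, reward, next_state, gamma=GAMMA):
    """Deterministic Q-learning update: Q(s, a) <- r + gamma * max_a' Q(s', a')."""
    next_values = Q.get(next_state, {})
    # Assumption: a state with no outgoing actions (terminal) has value 0.
    best_next = max(next_values.values()) if next_values else 0.0
    Q[state][action] = reward + gamma * best_next
    return Q[state][action]


# Hypothetical fragment of an environment; rewards are made up.
Q = {
    "start": {"L": 3.0, "R": 3.0},
    "a": {"L": 3.0, "R": 3.0},
    "end": {},
}

v1 = q_update(Q, "start", "L", reward=0.0, next_state="a")  # 0 + 0.8 * 3
v2 = q_update(Q, "a", "R", reward=1.0, next_state="end")    # 1 + 0.8 * 0
```

Entries that are not visited keep their previous values, which is why the order of steps within an episode affects the resulting table.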
Journal of Machine Learning Research 11 (2010) 61-87
Submitted 11/09; Published 1/10
Model Selection: Beyond the Bayesian/Frequentist Divide
Isabelle Guyon
GUYON@CLOPINET.COM
ClopiNet
955 Creston Road
Berkeley, CA 94708, USA
Amir Saffari
SAFFARI@ICG
The Case Against Accuracy Estimation for Comparing Induction Algorithms
Foster Provost
Bell Atlantic Science and Tech 400 Westchester Avenue White Plains, NY 10604
foster@basit.com
Bell Atlantic Science and Tech 400 Westchester Avenue White Plains, NY 106
© 2016 Robert Nowak
Rademacher Complexity and Learning with Convex Loss Functions
1 Convex Losses
Suppose we have training data $\{x_i, y_i\}_{i=1}^n$, a set of prediction rules $\mathcal{F}$, and a loss function $L$. Empirical risk
minimization is the optimization
n
1X
min