CSE515: Statistical Methods in Computer Science
Winter, 2016
Homework 2
Due at noon on February 10, 2016
GUIDELINES: You can brainstorm with others, but please solve the problems and write up
the answers by yourself. You may use textbooks (Koller & Friedman
CSE515: Statistical Methods in Computer Science
Winter, 2016
Homework 1
Due at noon on January 27, 2016
CSE446: Point Estimation
Winter 2016
Ali Farhadi
Slides adapted from Carlos Guestrin, Dan Klein, and Luke Zettlemoyer
Your first consulting job
A billionaire from the suburbs of Seattle asks you a question:
He says:
CSE446: Naïve Bayes
Winter 2016
Ali Farhadi
Slides adapted from Carlos Guestrin, Dan Klein, Luke Zettlemoyer
Supervised Learning: find f
Given: Training set {(x_i, y_i) | i = 1 … n}
Find: A good approximation to f : X → Y
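The supervised-learning setup above is what the naïve Bayes lecture builds on; here is a minimal Bernoulli naïve Bayes sketch (my own illustration, not code from the course — the data and function names are assumptions):

```python
# Minimal Bernoulli naive Bayes sketch: learns P(Y) and P(X_j | Y) from
# binary features with add-one smoothing, predicts via argmax of the
# joint log-probability. Toy data below is illustrative only.
import math

def train_nb(X, y):
    n = len(y)
    classes = sorted(set(y))
    prior = {c: sum(1 for t in y if t == c) / n for c in classes}
    cond = {}
    d = len(X[0])
    for c in classes:
        rows = [x for x, t in zip(X, y) if t == c]
        # P(X_j = 1 | Y = c) with add-one (Laplace) smoothing
        cond[c] = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2) for j in range(d)]
    return prior, cond

def predict_nb(model, x):
    prior, cond = model
    def score(c):
        s = math.log(prior[c])
        for j, v in enumerate(x):
            p = cond[c][j]
            s += math.log(p if v else 1 - p)
        return s
    return max(prior, key=score)

X = [[1, 1], [1, 0], [0, 1], [0, 0]]
y = [1, 1, 0, 0]
model = train_nb(X, y)
print(predict_nb(model, [1, 1]))  # -> 1
```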
CSE446: SVMs
Winter 2016
Ali Farhadi
Slides adapted from Carlos Guestrin, and Luke Zettlemoyer
Linear classifiers: Which line is better?
[Figure: three candidate separating lines; each splits the plane into the regions w·x + w0 < 0, w·x + w0 = 0, and w·x + w0 > 0]
Pick the one with the largest margin
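The decision rule behind the figure can be sketched directly; a minimal version (the weights w and w0 below are my own illustrative values, not from the slides):

```python
# Decision rule for a linear classifier: the sign of w . x + w0
# determines which side of the line a point falls on.
def classify(w, w0, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + w0
    return 1 if score > 0 else -1

w, w0 = [2.0, -1.0], 0.5
print(classify(w, w0, [1.0, 1.0]))   # 2 - 1 + 0.5 = 1.5 > 0 -> 1
print(classify(w, w0, [-1.0, 2.0]))  # -2 - 2 + 0.5 = -3.5 < 0 -> -1
```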
CSE446: Logistic Regression
Winter 2016
Ali Farhadi
Slides adapted from Carlos Guestrin and Luke Zettlemoyer
Let's take a(nother) probabilistic approach!
Previously: directly estimate the data distribution P(X,Y)!
chal
CSE 446
Machine Learning
Instructor: Ali Farhadi
ali@cs.washington.edu
Slides adapted from Pedro Domingos, Carlos Guestrin, and Luke Zettlemoyer
Logistics
Instructor: Ali Farhadi
Email: ali@cs
Office: CSE 652
TAs:
Naozumi Hiranuma (hiranumn@cs)
Willia
CSE515: Statistical Methods in Computer Science
Winter, 2016
Homework 4
Due at noon on March 9, 2016
CSE515: Statistical Methods in Computer Science
Winter, 2016
Homework 3
Due at noon on February 24, 2016
CSE446: Kernels
Winter 2016
Ali Farhadi
Slides adapted from Carlos Guestrin, and Luke Zettlemoyer
Top 3:
#3 Akash Gupta
#2 Karanbir Singh
#1 Pascale Wallace Patterson
What if the data is not linearly separable?
Use featur
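Non-separable data can be handled by mapping it into a richer feature space, which is the idea the kernels lecture develops. A minimal sketch of a degree-2 polynomial map and its kernel (my own illustration; the values are assumptions):

```python
# Degree-2 polynomial feature map: lifts 2-D points so that, e.g., data
# separable only by a circle becomes linearly separable in the lifted space.
import math

def phi(x):
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]

def poly2_kernel(x, z):
    # The kernel trick: <phi(x), phi(z)> equals (x . z)^2, so the lifted
    # space never has to be built explicitly.
    return (x[0] * z[0] + x[1] * z[1]) ** 2

x, z = [1.0, 2.0], [3.0, 1.0]
lifted = sum(a * b for a, b in zip(phi(x), phi(z)))
print(lifted, poly2_kernel(x, z))  # both equal (1*3 + 2*1)^2 = 25, up to rounding
```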
CSE446: Perceptron
Winter 2016
Ali Farhadi
Slides adapted from Dan Klein, Luke Zettlemoyer
Who needs probabilities?
Previously: model data with distributions
- Joint: P(X,Y), e.g. Naïve Bayes
- Conditional: P(Y|X), e.
Boosting
Machine Learning CSE446
Carlos Guestrin
University of Washington
April 24, 2013
Carlos Guestrin 2005-2013
Fighting the bias-variance tradeoff
- Simple (a.k.a. weak) learners are good, e.g., naïve Bayes, logistic regression, decision stumps (or shallow decision trees)
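The point of boosting is that these weak learners are combined by reweighting the data. A minimal AdaBoost-style reweighting sketch (my own illustration; the weights and error rate are assumptions, not from the slides):

```python
# AdaBoost-style reweighting: after a weak learner errs, misclassified
# examples get more weight so the next weak learner focuses on them.
import math

def reweight(weights, correct, error_rate):
    # alpha is the weak learner's vote; weight rises where it was wrong
    alpha = 0.5 * math.log((1 - error_rate) / error_rate)
    new = [w * math.exp(-alpha if c else alpha) for w, c in zip(weights, correct)]
    z = sum(new)  # renormalize to a distribution
    return [w / z for w in new]

w = reweight([0.25] * 4, [True, True, True, False], error_rate=0.25)
print(w)  # the one misclassified example now carries half the total weight
```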
LASSO: Big Picture
Machine Learning CSE446
Carlos Guestrin
University of Washington
April 10, 2013
Sparsity
- Vector w is sparse, if many entries are zero
- Very useful for many tasks, e.g., Efficiency: If size(w) = 100B, each
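The mechanism by which LASSO produces such sparse w is soft-thresholding; a minimal sketch (my own illustration, with an assumed lambda):

```python
# Soft-thresholding, the coordinate-wise operation behind LASSO's
# sparsity: entries with magnitude below lambda are zeroed exactly,
# which is why the L1 penalty yields sparse weight vectors.
def soft_threshold(v, lam):
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

w = [3.0, -0.2, 0.05, -1.5]
w_sparse = [soft_threshold(v, lam=0.5) for v in w]
print(w_sparse)  # [2.5, 0.0, 0.0, -1.0]
```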
http://www.cs.washington.edu/education/courses/cse446/13sp/
What's learning?
Point Estimation
Machine Learning CSE446
Carlos Guestrin
University of Washington
April 1, 2013
What is Machine Learning?
Kernels
Machine Learning CSE446
Carlos Guestrin
University of Washington
May 3, 2013
Linear Separability: More Formally, Using Margin
- Data linearly separable, if there exists a vector w and a margin γ
- Such that y_j (w · x_j + w0) ≥ γ for all examples j
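The separability-with-margin condition (in its standard form, y_j (w·x_j + w0) ≥ γ for all j) can be checked directly; a minimal sketch with my own illustrative w, w0, γ, and data:

```python
# Check linear separability with margin gamma:
# every example must satisfy y * (w . x + w0) >= gamma.
def separates_with_margin(w, w0, data, gamma):
    return all(y * (sum(wi * xi for wi, xi in zip(w, x)) + w0) >= gamma
               for x, y in data)

data = [([2.0, 0.0], 1), ([0.0, 2.0], 1), ([-2.0, 0.0], -1), ([0.0, -2.0], -1)]
print(separates_with_margin([1.0, 1.0], 0.0, data, gamma=1.0))  # True
print(separates_with_margin([1.0, 1.0], 0.0, data, gamma=3.0))  # False
```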
Point Estimation
Machine Learning CSE446
Carlos Guestrin
University of Washington
April 3, 2013
Your first consulting job
- A billionaire from the suburbs of Seattle asks you a question:
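This point-estimation lecture builds on the classic coin-flip example; a minimal maximum-likelihood sketch under that assumption (the data below is my own illustration):

```python
# Point estimation warm-up: the maximum-likelihood estimate of a coin's
# bias theta is simply the empirical fraction of heads.
def mle_theta(flips):
    # flips: list of 0/1 outcomes; MLE for the Bernoulli parameter
    return sum(flips) / len(flips)

print(mle_theta([1, 1, 0, 1, 0]))  # 0.6
```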
Regularization
Machine Learning CSE446
Carlos Guestrin
University of Washington
April 10, 2013
Regularization in Linear Regression
- Overfitting usually leads to very large parameter choices, e.g.:
  −2.2 + 3.1 X − 0.30 X²
  −1.1 +
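Regularization counteracts those huge parameters by penalizing their magnitude; a minimal 1-D ridge sketch (my own illustration with assumed data; no intercept, for brevity):

```python
# Ridge regularization in 1-D linear regression: the penalty lam * w^2
# shrinks the fitted coefficient toward zero.
def ridge_1d(xs, ys, lam):
    # closed form for min_w  sum (y - w x)^2 + lam * w^2
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(ridge_1d(xs, ys, lam=0.0))   # 2.0 -- the unregularized fit
print(ridge_1d(xs, ys, lam=14.0))  # 1.0 -- shrunk toward zero
```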
What's the Perceptron Optimizing?
Machine Learning CSE446
Carlos Guestrin
University of Washington
May 1, 2013
The Perceptron Algorithm [Rosenblatt 58, 62]
- Classification setting: y in {−1, +1}
- Linear model
- Prediction:
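The algorithm itself fits in a few lines; a minimal sketch, with my own toy data and a constant feature standing in for the bias:

```python
# The perceptron update (Rosenblatt): on a mistake (y * w . x <= 0),
# add y * x to w. Repeats over the data until no mistakes remain
# (guaranteed to converge on linearly separable data).
def perceptron(data, epochs=10):
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        for x, y in data:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w

# y in {-1, +1}; the leading constant 1.0 feature plays the role of the bias
data = [([1.0, 2.0, 1.0], 1), ([1.0, -1.0, 0.5], -1)]
w = perceptron(data)
print(all(y * sum(wi * xi for wi, xi in zip(w, x)) > 0 for x, y in data))  # True
```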
CSE446: Decision Trees
Winter 2016
Ali Farhadi
Slides adapted from Carlos Guestrin, Andrew Moore, and Luke Zettlemoyer
Administrative stuff
Office hours
Discussion board
Anonymous feedback form
Contact: cse446-s
CSE446: Decision Trees, Part 2
Winter 2016
Ali Farhadi
Slides adapted from Carlos Guestrin, Andrew Moore, and Luke Zettlemoyer
So far
Decision trees
They will overfit
How to split?
When to stop?
What den
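The "how to split?" question is usually answered with information gain; a minimal sketch (my own toy labels, not course code):

```python
# Entropy and information gain: pick the split whose children have the
# lowest weighted entropy, i.e. the highest information gain.
import math

def entropy(labels):
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def info_gain(parent, children):
    n = len(parent)
    return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

parent = [1, 1, 0, 0]
# a perfect split separates the classes completely
print(info_gain(parent, [[1, 1], [0, 0]]))  # 1.0
# a useless split leaves the class mix unchanged
print(info_gain(parent, [[1, 0], [1, 0]]))  # 0.0
```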
Stochastic Gradient Descent
Machine Learning CSE446
Carlos Guestrin
University of Washington
April 19, 2013
Logistic Regression
- Logistic function (or Sigmoid): 1 / (1 + e^−z)
- Learn P(Y|X) directly
- Assume a particular functional form for lin
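The sigmoid and the conditional model it induces can be sketched directly; a minimal version (the weights are my own illustrative values, not from the slides):

```python
# The logistic (sigmoid) function, and the conditional model it induces:
# P(Y = 1 | x) = sigmoid(w . x + w0).
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def p_y_given_x(w, w0, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w0)

print(sigmoid(0.0))                      # 0.5 -- the decision boundary
print(p_y_given_x([2.0], -1.0, [2.0]))   # sigmoid(3), close to 1
```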