Text Mining and Classification
What is Data Mining?
the study of collecting, cleaning, processing,
analyzing, and gaining useful insights from data
(Data Mining: The Textbook, Charu C. Aggarwal, Springer 2015)
collection - sensors, surveys, software too

Clustering
As usual, some slides or portions of
slides taken from:
R. Mooney, T. Mitchell, Intro to IR
What if we do not have labeled examples?
Examples:
Calls to a help desk? We don't know the areas that are going to
have problems!
Articles what are t

Mehryar Mohri
Foundations of Machine Learning 2015
Courant Institute of Mathematical Sciences
Homework assignment 3
November 24, 2015
Due: December 07, 2015
A. Boosting-type Algorithm
1. Show that for all $u \in \mathbb{R}$ and integer $p > 1$, $1_{u \le 0} \le \Phi_p(u)$, where
$\Phi_p(u) = \max$

Mehryar Mohri
Foundations of Machine Learning 2015
Courant Institute of Mathematical Sciences
Homework assignment 2
October 23, 2015
Due: November 09, 2015
A. VC-dimension of convex combinations
1. Let H be a family of functions mapping from an input spac

Some slides taken from (see Week 2) and R.
Mooney and T. Mitchell
K-Nearest Neighbor
(William Cohen)
Training method:
Save the training examples
At prediction time:
Find the k training examples (x1,y1), ..., (xk,yk) that are closest to
the test example x
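
The two steps above (store the training examples, then find the k closest at prediction time) can be sketched in a few lines of Python. This is only an illustrative sketch: the slide text breaks off before stating how the k neighbors are combined into a prediction, so the majority vote used here is an assumption, as are the Euclidean distance, the function name `knn_predict`, and the toy data.

```python
import math
from collections import Counter


def knn_predict(train, x, k=3):
    """Predict a label for x from a list of (features, label) training pairs.

    "Training" is just keeping `train` around; all the work happens here.
    """
    # Find the k training examples closest to x.
    # Assumption: Euclidean distance (the slide does not name a metric).
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], x))[:k]
    # Assumption: combine the k neighbors by majority vote over their labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups in the plane.
train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.2), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.1), "b"),
    ((1.2, 0.8), "b"),
]

print(knn_predict(train, (0.2, 0.1), k=3))  # two "a" neighbors outvote one "b"
```

Sorting the whole training set costs O(n log n) per query, which is fine for a sketch; larger data sets would typically use a spatial index instead.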

Support Vector Machines: Linear Separators
Binary classification can be viewed as the task of
separating classes in feature space:
$w^T x + b = 0$
$w^T x + b > 0$
$w^T x + b < 0$
$f(x) = \mathrm{sign}(w^T x + b)$
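
As a small illustration of the decision rule f(x) = sign(w^T x + b), the sketch below classifies points by which side of the hyperplane they fall on. The function name `linear_classify`, the weights, and the sample points are hypothetical, and assigning points exactly on the hyperplane to the positive class is an assumption the slide does not make.

```python
def linear_classify(w, b, x):
    """Return +1 or -1 according to the sign of w^T x + b."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Assumption: a point lying exactly on the separator (score == 0) gets +1.
    return 1 if score >= 0 else -1


# Hypothetical separator in two dimensions: x1 - x2 - 0.5 = 0.
w = [1.0, -1.0]
b = -0.5

print(linear_classify(w, b, [2.0, 0.0]))  # score  1.5, positive side
print(linear_classify(w, b, [0.0, 2.0]))  # score -2.5, negative side
```

Any (w, b) that puts the two classes on opposite sides is a valid separator under this rule, which is what makes the choice of the best one a meaningful question.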
Linear Separators
Which of the linear separators is optimal?
Clas

Mehryar Mohri
Foundations of Machine Learning
Courant Institute of Mathematical Sciences
Homework assignment 3
October 31, 2016
Due: A. November 11, 2016; B. November 22, 2016
A. Boosting
1. Implement AdaBoost with boosting stumps and apply the algorithm

Mehryar Mohri
Foundations of Machine Learning
Courant Institute of Mathematical Sciences
Homework assignment 1
September 17, 2016
Due: October 04, 2016
A. Probability tools
1. Let $f : (0, +\infty) \to \mathbb{R}_+$ be a function admitting an inverse $f^{-1}$ and let X be
a random

Mehryar Mohri
Foundations of Machine Learning
Courant Institute of Mathematical Sciences
Homework assignment 1
September 16, 2016
Due: October 04, 2016
A. Probability tools
1. Let $f : (0, +\infty) \to \mathbb{R}$ be a function admitting an inverse $f^{-1}$ and let X be
a random v

Mehryar Mohri
Foundations of Machine Learning
Courant Institute of Mathematical Sciences
Homework assignment 2
October 04, 2016
Due: October 18, 2016
A. Rademacher complexity
The definitions and notation are those introduced in the lecture slides.
1. Wha