ℓ1 and ℓ2 Regularization
David Rosenberg
New York University
DS-GA 1003
February 4, 2016

Tikhonov and Ivanov Regularization

Kernel Methods
David Rosenberg
New York University
DS-GA 1003
February 24, 2016

Setup and Motivation

Classification and Regression Trees
David Rosenberg
New York University
DS-GA 1003
February 22, 2016

Regression Trees

Excess Risk Decomposition
David Rosenberg
New York University
DS-GA 1003
February 3, 2016

Review: Statistical Learning Theory

ℓ1 and ℓ2 Regularization
David Rosenberg
New York University
DS-GA 1003
February 3, 2016

Tikhonov and Ivanov Regularization

Excess Risk Decomposition
David Rosenberg
New York University
DS-GA 1003
February 7, 2016

Review: Statistical Learning Theory

Linear Support Vector Machines
David S. Rosenberg

1  The Support Vector Machine

For a linear support vector machine (SVM), we use the hypothesis space of affine functions
    F = { f(x) = wᵀx + b | w ∈ ℝᵈ, b ∈ ℝ }
and evaluate them with respect to the SVM loss function.
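The excerpt cuts off before stating the loss. As a sketch, assuming the standard hinge loss max(0, 1 − y·f(x)) with labels in {−1, +1} (the usual SVM formulation, not quoted from the text), evaluating an affine hypothesis in NumPy might look like:

```python
import numpy as np

def affine_predict(w, b, X):
    """Evaluate f(x) = w^T x + b for each row of X."""
    return X @ w + b

def hinge_loss(w, b, X, y):
    """Average hinge loss: mean of max(0, 1 - y * f(x)).
    Labels y are assumed to lie in {-1, +1}."""
    margins = y * affine_predict(w, b, X)
    return float(np.mean(np.maximum(0.0, 1.0 - margins)))

# Tiny illustrative dataset with d = 2 (made up for this sketch)
X = np.array([[1.0, 2.0], [-1.0, -1.0]])
y = np.array([1.0, -1.0])
w = np.array([0.5, 0.5])
b = 0.0

loss = hinge_loss(w, b, X, y)  # both points have margin >= 1, so loss is 0
```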

Extreme Abridgement of Boyd and Vandenberghe's Convex Optimization
Compiled by David Rosenberg

Abstract
Boyd and Vandenberghe's Convex Optimization book is very well-written and a pleasure to read. The only potential problem is that, if you read it sequentially,

Directional Derivatives and First Order Approximations
David S. Rosenberg

1  Directional Derivative and First Order Approximations

Let f : ℝᵈ → ℝ be a differentiable function. We define the directional derivative of f at the point x ∈ ℝᵈ in the direction v
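For a differentiable f, the directional derivative at x in direction v equals ⟨∇f(x), v⟩, and a finite-difference quotient approximates it. A small numerical sketch (the example function and points are hypothetical, not from the notes):

```python
import numpy as np

def f(x):
    # Example differentiable function f : R^d -> R
    return float(np.sum(x ** 2))

def grad_f(x):
    # Gradient of f, computed by hand: grad f(x) = 2x
    return 2.0 * x

def directional_derivative_fd(f, x, v, h=1e-6):
    """Finite-difference estimate of the directional derivative
    f'(x; v) = lim_{h -> 0} (f(x + h v) - f(x)) / h."""
    return (f(x + h * v) - f(x)) / h

x = np.array([1.0, -2.0, 3.0])
v = np.array([0.0, 1.0, 1.0])

fd = directional_derivative_fd(f, x, v)
exact = grad_f(x) @ v  # for differentiable f, f'(x; v) = <grad f(x), v>
```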

A Bit About Hilbert Spaces
David Rosenberg
New York University
DS-GA 1003
February 24, 2016

Inner Product Space (or Pre-Hilbert Space)
An inner product space (over the reals) is a vector space V together with an inner product ⟨·, ·⟩ : V × V → ℝ.
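As an illustration of the definition, the inner product axioms (symmetry, linearity in the first slot, positive definiteness) can be checked numerically for the standard dot product on ℝᵈ; the vectors and scalars below are arbitrary choices for this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
u, v, w = rng.standard_normal((3, d))
a, b = 2.0, -3.0

def inner(x, y):
    """The standard inner product on R^d."""
    return float(np.dot(x, y))

# Symmetry: <u, v> = <v, u>
assert np.isclose(inner(u, v), inner(v, u))
# Linearity in the first slot: <a u + b v, w> = a <u, w> + b <v, w>
assert np.isclose(inner(a * u + b * v, w), a * inner(u, w) + b * inner(v, w))
# Positive definiteness: <u, u> > 0 whenever u != 0
assert inner(u, u) > 0
```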

Lagrangian Duality and Convex Optimization
David Rosenberg
New York University
DS-GA 1003
February 10, 2016

Introduction

CitySense and CabSense
David Rosenberg
New York University
DS-GA 1003
April 6, 2016

The CitySense and CabSense Problems

Multiclass Classification
David Rosenberg
New York University
DS-GA 1003
March 30, 2016

Introduction

Bayesian Regression
David Rosenberg
New York University
DS-GA 1003
April 27, 2016

Bayesian Statistics: Recap

Gradient Boosting, Continued
David Rosenberg
New York University
DS-GA 1003
March 24, 2016

Review: Gradient Boosting

Gradient Boosting
David Rosenberg
New York University
DS-GA 1003
March 24, 2016

Review: AdaBoost and FSAM

Boosting
David Rosenberg
New York University
DS-GA 1003
March 23, 2016

Boosting Introduction

K-Means and Gaussian Mixture Models
David Rosenberg
New York University
DS-GA 1003
April 27, 2016

K-Means Clustering

Conditional Probability Models
David Rosenberg
New York University
DS-GA 1003
April 6, 2016

Maximum Likelihood Estimation

Boosting
David Rosenberg
New York University
DS-GA 1003
March 9, 2016

Boosting Introduction

Completing the Square
David S. Rosenberg

1  Completing the Square (Univariate)

You may remember from elementary algebra the notion of completing the square. Given a variable x ∈ ℝ, consider the second-order polynomial in x:
    x² + bx + c.    (1.1)
We would like to rewrite this polynomial as a perfect square plus a constant.
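The step the text is leading up to adds and subtracts (b/2)² so that the first three terms form a perfect square:

```latex
x^2 + bx + c
  = \underbrace{x^2 + bx + \left(\tfrac{b}{2}\right)^2}_{\left(x + b/2\right)^2}
    + c - \left(\tfrac{b}{2}\right)^2
  = \left(x + \tfrac{b}{2}\right)^2 + \left(c - \tfrac{b^2}{4}\right).
```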

Bootstrap, Bagging, and Random Forests
David Rosenberg
New York University
DS-GA 1003
March 9, 2016

Bias and Variance

Support Vector Machines
David Rosenberg
New York University
DS-GA 1003
February 10, 2016

The SVM as a Quadratic Program

Differentiation and Its Applications
Levent Sagun
New York University
January 28, 2016

Example: Least Squares
Suppose we observe the input x ∈ ℝⁿ, take the action A ∈ ℝᵐˣⁿ, observe the output b ∈ ℝᵐ, and evaluate via the mean squared error.
Loss function: L(
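The loss is cut off above. Assuming it is the usual mean squared error L = (1/m)‖Ax − b‖² (an assumption, since the text breaks off mid-definition), a minimal NumPy version with made-up data:

```python
import numpy as np

def mean_square_loss(A, x, b):
    """L = (1/m) * ||A x - b||^2 -- an assumed form of the loss,
    since the source text is cut off."""
    r = A @ x - b
    return float(r @ r) / len(b)

# Made-up sizes: A in R^{3x2}, x in R^2, b in R^3
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
x = np.array([1.0, 1.0])
b = np.array([1.0, 2.0, 3.0])

loss = mean_square_loss(A, x, b)  # residual is (0, 0, -1), so loss = 1/3
```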

Introduction to Statistical Learning Theory
David Rosenberg
New York University
DS-GA 1003
January 31, 2016

Decision Theory: High Level View
What types of problems are we solving?
In data science

Gradient and Stochastic Gradient Descent
David Rosenberg
New York University
DS-GA 1003
January 27, 2016

Linear Least Squares Regression
Setup
Input space X = ℝᵈ
Output space Y = ℝ
Action space A = ℝ
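The setup above can be paired with the lecture's topic: running gradient descent on the empirical squared loss. A minimal sketch on synthetic data (the data, step size, and iteration count are illustrative assumptions, not taken from the slides):

```python
import numpy as np

# Synthetic data (illustrative, not from the slides)
rng = np.random.default_rng(0)
n, d = 100, 3
X = rng.standard_normal((n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true  # noiseless responses, so the minimizer is w_true

# Gradient descent on the empirical risk (1/n) * ||X w - y||^2
w = np.zeros(d)
step = 0.1  # assumed fixed step size
for _ in range(500):
    grad = (2.0 / n) * X.T @ (X @ w - y)
    w -= step * grad
```

With noiseless data and a small enough step size, the iterates converge to the least squares solution.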

Machine Learning and Computational Statistics (DS-GA 1003)
David Rosenberg
New York University
January 27, 2016

Logistics
Class webpage: https://davidrosenberg.github.io/ml2016
Syllabus

Lasso, Ridge, and Elastic Net
David Rosenberg
New York University
DS-GA 1003
February 4, 2016

The Grouping Issue
A Very Simple Model
Suppose we have one feature x₁ ∈ ℝ and a response variable y ∈ ℝ.
Got
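The slide breaks off, but the "grouping issue" it introduces is commonly illustrated by duplicating a feature: ℓ2 (ridge) regularization splits the weight equally across identical copies, whereas lasso may pick one arbitrarily. A sketch of the ridge half of that story, using the closed-form solution on hypothetical data (not the slide's own example):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.standard_normal(n)
X = np.column_stack([x1, x1])  # two identical copies of the same feature
y = 3.0 * x1                   # response driven by that feature

lam = 1.0
# Ridge closed form: w = (X^T X + lam * I)^{-1} X^T y
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
# By symmetry, ridge gives the two duplicated features equal weight
```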

Directional Derivatives and Optimality
David Rosenberg
New York University
DS-GA 1003
February 4, 2016

Convex Sets and Functions