Name:
10702 Statistical Machine Learning: Midterm Exam
March 4, 2010
Submit solutions to any three of the following five problems. Clearly indicate
which problems you are submitting solutions for. Write your answers in the space provided;
additional sheets
10/36702 Midterm Exam
There are three questions:

Question 1: 35 points
Question 2: 30 points
Question 3: 35 points
Total: 100 points
Name:
(1) Let (X_1, Y_1), . . . , (X_n, Y_n) be iid. Suppose that X_1, . . . , X_n ∼ P, where P has a density p on [0, 1] with 0 < c ≤ p(x) ≤ C < ∞ for all x ∈ [0, 1]. As
Clustering Part II:
k-means and Related Methods
Let's begin with a few examples.
Example 1 Figures 1 and 2 show some synthetic examples where the clusters are meant to be intuitively clear. In Figure 1 there are two blob-like clusters. Identifying cluster
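As a concrete sketch of how k-means recovers two blob-like clusters of this kind, here is a minimal pure-Python implementation of Lloyd's algorithm (this example is our own illustration, not from the notes; the blob locations and seeds are arbitrary choices):

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm: alternate nearest-center assignment and centroid update."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[j].append(p)
        # Update step: each center moves to the mean of its cluster.
        for j, cl in enumerate(clusters):
            if cl:  # keep the old center if a cluster goes empty
                centers[j] = (sum(p[0] for p in cl) / len(cl),
                              sum(p[1] for p in cl) / len(cl))
    return centers

# Two well-separated synthetic blobs, in the spirit of Figure 1.
rng = random.Random(1)
blob1 = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(100)]
blob2 = [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(100)]
centers = kmeans(blob1 + blob2, k=2)
```

With this much separation, the two fitted centers land near the two blob means regardless of the (random) initialization.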
Chapter 12
Bayesian Inference
This chapter covers the following topics:
Concepts and methods of Bayesian inference.
Bayesian hypothesis testing and model comparison.
Derivation of the Bayesian information criterion (BIC).
Simulation methods and Markov chain Monte Carlo.
Density Estimation
10/26702 Spring 2014
1
Introduction
Let X_1, . . . , X_n be a sample from a distribution P with density p. The goal of nonparametric
density estimation is to estimate p with as few assumptions about p as possible. We denote
the estima
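A minimal sketch of the kind of estimator discussed in these notes, using a Gaussian kernel (our own illustration; the bandwidth h = 0.3 and the N(0, 1) sample are arbitrary choices):

```python
import math
import random

def kde(data, x, h):
    """Gaussian-kernel estimate p_hat(x) = (1/(n*h)) * sum_i K((x - X_i)/h)."""
    n = len(data)
    k = lambda t: math.exp(-0.5 * t * t) / math.sqrt(2 * math.pi)
    return sum(k((x - xi) / h) for xi in data) / (n * h)

rng = random.Random(0)
sample = [rng.gauss(0, 1) for _ in range(5000)]
est = kde(sample, 0.0, h=0.3)   # true N(0,1) density at 0 is 1/sqrt(2*pi)
```

For a large sample and a small bandwidth, the estimate at 0 should sit close to the true density value 1/√(2π) ≈ 0.399.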
Copyright © 2008–2010 John Lafferty, Han Liu, and Larry Wasserman
Do Not Distribute
Chapter 27
Nonparametric Bayesian Methods
Most of this book emphasizes frequentist methods, especially for nonparametric problems. However, there are Bayesian approaches
Assignment 5
10/36702
Due Friday May 2 at 3:00 pm
Hand in to MariAlice McShane, Baker Hall 229K
1. Given y ∈ R^n and X ∈ R^{n×p}, consider the lasso problem

    min_{β ∈ R^p}  (1/2)‖y − Xβ‖_2² + λ‖β‖_1.

Rewrite this as

    min_{β ∈ R^p, z ∈ R^n}  (1/2)‖y − z‖_2² + λ‖β‖_1   subject to z = Xβ,    (1)

and now, st
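One standard way to solve the unconstrained lasso form is proximal gradient descent (ISTA). The sketch below is our own illustration, not part of the assignment; the step size must be at most 1/λ_max(XᵀX), and the tiny identity design matrix is a made-up example chosen so the answer is known in closed form:

```python
def soft_threshold(v, t):
    """Coordinate-wise prox of t * ||.||_1."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def ista(X, y, lam, step, iters=500):
    """Proximal gradient descent on (1/2)||y - Xb||_2^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(iters):
        r = [sum(X[i][j] * b[j] for j in range(p)) - y[i] for i in range(n)]  # Xb - y
        g = [sum(X[i][j] * r[i] for i in range(n)) for j in range(p)]         # X^T (Xb - y)
        b = soft_threshold([b[j] - step * g[j] for j in range(p)], step * lam)
    return b

# Orthonormal-design sanity check: with X = I the lasso solution is
# coordinate-wise soft-thresholding of y, here b ≈ (2, 0).
b = ista([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0, step=1.0)
```

For X = I the gradient step lands exactly on y, so a single soft-threshold gives the fixed point.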
Assignment 1
10/36702
Due Friday Jan 24 3:00 pm
Hand In to MariAlice McShane, Baker Hall 229K
1. Let X_1, . . . , X_n ∼ P and let μ = E[X_i] and σ² = Var(X_i). Define

    X̄_n = (1/n) ∑_{i=1}^n X_i,    S_n² = (1/n) ∑_{i=1}^n (X_i − X̄_n)².

(a) Prove that S_n² → σ² in probability.
(b) Prove th
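A quick simulation, not part of the assignment, illustrating the claim in (a): with the 1/n-normalized S_n², the estimate settles near σ² as n grows (μ = 2 and σ = 3 are arbitrary choices):

```python
import random

def sample_var(xs):
    """S_n^2 with the 1/n normalization used in the problem."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n

rng = random.Random(0)
xs = [rng.gauss(2.0, 3.0) for _ in range(10000)]   # mu = 2, sigma^2 = 9
v = sample_var(xs)                                 # should settle near 9
```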
Assignment 4
10/36702
Due Friday April 11 at 3:00 pm
Hand in to MariAlice McShane, Baker Hall 229K
1. Here we'll study convexity (and concavity) in exponential families and generalized linear models. Consider an exponential family density (or mass) function
Solutions to Assignment 2
10/36702
1.
(a) For any polynomial q(x) of degree k we have q(x) = β^T b(x), where b(x) = (1, x, . . . , x^k)^T and β contains the coefficients. Then

    ∑_{i=1}^n q(x_i) w_i(x) = ∑_{i=1}^n b(x_i)^T β w_i(x) = w(x)^T B β = b(x)^T (B^T B)^{−1} B^T B β = b(x)^T β = q(x).
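As a numerical sanity check of this reproducing property (our own illustration for the degree-1 case, with basis b(t) = (1, t)): the weights w(x)^T = b(x)^T (B^T B)^{−1} B^T should satisfy ∑_i q(x_i) w_i(x) = q(x) for any degree-1 polynomial q, and in particular sum to 1.

```python
def poly_weights(x, xs):
    """w(x)^T = b(x)^T (B^T B)^{-1} B^T for the linear basis b(t) = (1, t)."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(t * t for t in xs)
    det = n * s2 - s1 * s1             # det(B^T B) for B with rows (1, x_i)
    a0 = (s2 - x * s1) / det           # first entry of b(x)^T (B^T B)^{-1}
    a1 = (-s1 + x * n) / det           # second entry
    return [a0 + a1 * xi for xi in xs]  # multiply by B^T, one column b(x_i) at a time

xs = [0.1 * i for i in range(10)]
w = poly_weights(0.37, xs)
q = lambda t: 2.0 - 3.0 * t                       # an arbitrary degree-1 polynomial
rep = sum(q(xi) * wi for xi, wi in zip(xs, w))    # should equal q(0.37)
```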
Convexity and Optimization
Statistical Machine Learning, Spring 2014
Ryan Tibshirani (with Larry Wasserman)
1
An entirely too brief motivation
1.1
Why optimization?
Optimization problems are ubiquitous in statistics and machine learning. A huge number of
Density Clustering
10/26702 Spring 2014
1
Modes and Clusters
Let p be the density of X ∈ R^d. Assume that p has modes m_1, . . . , m_{k_0} and that p is a Morse function, which means that the Hessian of p at each stationary point is nondegenerate. We can us
Nonparametric Regression
Statistical Machine Learning, Spring 2014
Ryan Tibshirani (with Larry Wasserman)
1
Introduction, and k-nearest-neighbors
1.1
Basic setup, random inputs
Given a random pair (X, Y) ∈ R^d × R, the function

    f_0(x) = E(Y | X = x)

is called the regression function.
Assignment 2
10/36702
Due Friday Feb 19 3:00 pm
1. [20 points: (a) and (b)] In this question we will study k-nearest neighbors regression.
Consider data (x_1, Y_1), . . . , (x_n, Y_n) ∈ R^d × R. To make things simpler, we will assume that
x_1, . . . , x_n are fixed (nonrandom).
Assignment 4
10/36702
Due Friday April 15 at 3:00 pm
1. Let X_1, . . . , X_n ∼ P where X_i ∈ R². Suppose that P is uniform on S_1 ∪ S_2 where S_1 = B(x_1, 1)
and S_2 = B(x_2, 1). Here, B(x, 1) denotes a closed ball of radius 1 centered at x. Assume that
Assignment 1
10/36702
Due Friday Jan 29 3:00 pm
1. Review questions:
(a) Let X_1, . . . , X_n ∼ P and let μ = E[X_i] and σ² = Var(X_i). Define

    X̄_n = (1/n) ∑_{i=1}^n X_i,    S_n² = (1/n) ∑_{i=1}^n (X_i − X̄_n)².

(a) Prove that S_n² → σ² in probability.
(b) Prove that

    √n (X̄_n − μ) / S_n ⇝ N(0, 1)
Assignment 2
10/36702
Due Friday Feb 19 3:00 pm
1. In this question we will study k-nearest neighbors regression. Consider data (x_1, Y_1), . . . , (x_n, Y_n)
∈ R^d × R. To make things simpler, we will assume that x_1, . . . , x_n are fixed (nonrandom).
Furthe
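A minimal sketch of k-nearest-neighbors regression with a fixed one-dimensional design (our own illustration, not part of the assignment; the choice f(x) = x², the noise level, and k = 20 are all arbitrary):

```python
import random

def knn_regress(x, xs, ys, k):
    """Estimate f0(x) = E(Y | X = x) by averaging the k nearest design points."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))[:k]
    return sum(ys[i] for i in nearest) / k

rng = random.Random(0)
xs = [i / 200 for i in range(200)]                 # fixed (nonrandom) design on [0, 1)
ys = [xi ** 2 + rng.gauss(0, 0.05) for xi in xs]   # Y_i = f(x_i) + noise, f(x) = x^2
fhat = knn_regress(0.5, xs, ys, k=20)              # should be close to f(0.5) = 0.25
```

Averaging over the k nearest design points trades a small bias (the window around x) against reduced noise variance, the tradeoff the assignment goes on to analyze.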
Assignment 1  Solutions
10/36702
Due Friday Jan 29 3:00 pm
1. Review questions:
(a) [10 points]
Let X_1, . . . , X_n ∼ P and let μ = E[X_i] and σ² = Var(X_i). Define

    X̄_n = (1/n) ∑_{i=1}^n X_i,    S_n² = (1/n) ∑_{i=1}^n (X_i − X̄_n)².

(a) Prove that S_n² → σ² in probability.
(b) Prove
Assignment 3
10/36702
Due Friday March 18 by 3:00 pm
1. [15 points: (b), (d), and (f)] In this question we investigate the hinge loss used in support
vector machines. Let (X, Y) ∈ R^d × {−1, +1}. Let m(x) = P(Y = 1 | X = x). Given a function H(x)
we define
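To see the hinge loss in code (our own illustration, not part of the assignment): minimizing the conditional expected hinge loss over a grid recovers the standard fact that the population minimizer is sign(2m(x) − 1), here +1 for m(x) = 0.7.

```python
def hinge(y, h):
    """Hinge loss (1 - y*h)_+ for a label y in {-1, +1} and score h."""
    return max(0.0, 1.0 - y * h)

def risk(m, h):
    """Conditional expected hinge loss when P(Y = 1 | X = x) = m."""
    return m * hinge(1, h) + (1 - m) * hinge(-1, h)

# Grid-minimize the conditional risk for m(x) = 0.7; the minimizer
# should be +1, matching sign(2*m(x) - 1).
grid = [i / 100 for i in range(-200, 201)]
best = min(grid, key=lambda h: risk(0.7, h))
```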
Assignment 4
10/36702
Due Friday April 15 at 3:00 pm
1. [30 points] Let X_1, . . . , X_n ∼ P where X_i ∈ R². Suppose that P is uniform on S_1 ∪ S_2 where
S_1 = [−1, 1]² and S_2 = [9, 11]². Let (ε_n) be a sequence of positive numbers satisfying ε_n → 0 and
n ε_n² / log
1. Let X_1, . . . , X_n ∼ P where P has density p. Let p̂ be the kernel density estimator with
kernel K and bandwidth h. Prove that the bias E[p̂(x)] − p(x) is exactly

    ∫ K(t) [p(x + th) − p(x)] dt.

Assume now that |p(x) − p(y)| ≤ L|x − y|. Use this to get an upper bound on th
Assignment 3
10/36702
Due Friday March 21 at 3:00 pm
Hand in to MariAlice McShane, Baker Hall 229K
1. Suppose that X_i ∼ N(θ_i, 1) for i = 1, . . . , n. We then say that W = ∑_{i=1}^n X_i² has a noncentral
χ² distribution with n degrees of freedom and noncentrality
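A small simulation (ours, not part of the assignment) checking one consequence of this definition: since E[X_i²] = 1 + θ_i², the noncentral χ² variable W has mean n + ∑_i θ_i². The particular means θ_i below are made up.

```python
import random

rng = random.Random(0)
theta = [0.5, -1.0, 2.0]      # hypothetical means; sum of theta_i^2 is 5.25
reps = 20000
mean_W = sum(
    sum(rng.gauss(t, 1.0) ** 2 for t in theta)   # one draw of W = sum_i X_i^2
    for _ in range(reps)
) / reps
```

The Monte Carlo average should be close to n + ∑_i θ_i² = 3 + 5.25 = 8.25.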
4.
(a) Let X_(n) denote max_{i=1,...,n} X_i. The likelihood

    L(θ) = ∏_{i=1}^n p(X_i; θ) = (1/θ^n) I(θ ≥ X_(n))

is a decreasing positive function on [X_(n), ∞) and 0 everywhere else. Therefore it is maximized by taking θ̂_n = X_(n).
(b) P(θ̂_n > θ) = 0 for all n so √n (θ̂_n − θ) ca
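The solution in (a) can be checked by simulation (our own sketch, assuming X_i ∼ Uniform(0, θ)): the log-likelihood is −n log t on [X_(n), ∞) and −∞ elsewhere, so the sample maximum is the maximizer, and it sits just below θ.

```python
import math
import random

rng = random.Random(0)
theta, n = 2.0, 1000
xs = [rng.uniform(0, theta) for _ in range(n)]
theta_hat = max(xs)                    # the MLE X_(n): never exceeds theta

def loglik(t):
    """log L(t) = -n log t on [X_(n), inf), -inf elsewhere."""
    return -n * math.log(t) if t >= theta_hat else float("-inf")
```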
Homework 2
10702/36702 Statistical Machine Learning
Due: Friday Feb 4 3:00
Hand in to: Michelle Martin GHC 8001.
1
Convexity and Optimization
1. (Convexity)
(a) Show that 1/g(x) is convex if g is twice-differentiable, concave and positive (hint:
use compo
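A numerical spot-check of the claim, not a proof (our own illustration): take g(x) = √x, which is twice-differentiable, concave, and positive on (0, ∞), and verify midpoint convexity of 1/g on a small grid.

```python
import math

g = math.sqrt                     # twice-differentiable, concave, positive on (0, inf)
def f(x):
    return 1.0 / g(x)             # claimed convex

pts = [0.5, 1.0, 2.0, 3.0, 5.0, 8.0]
ok = all(
    f((a + b) / 2) <= (f(a) + f(b)) / 2 + 1e-12   # midpoint convexity
    for a in pts for b in pts
)
```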
Homework 1
10702/36702 Statistical Machine Learning
Due: Friday Jan 21 3:00
Hand in to: Michelle Martin GHC 8001.
1. (Review of Maximum Likelihood.) Let X_1, . . . , X_n be iid random variables where
X_i ∈ {1, 2, 3, 4}. Let p_j = P(X_i = j). Suppose there e
10702 Homework 2 Solution
Thanks to Akshay Krishnamurthy for providing his solution.
1
Convexity and Optimization
1. (Convexity)
(a) We'll show that the second derivative of 1/g(x) is always positive, which implies that
1/g(x) is convex. First, the second
36702 Homework 1 Solution
Thanks to William Bishop and Rafael Stern for providing their solutions.
Problem 1
(a) Let n(j) = ∑_i I{x_i = j}. Then

    L(θ) = (θ/2)^{n(1)} (θ/3)^{n(2)} θ^{n(3)} ((6 − 11θ)/6)^{n(4)} ∝ θ^{n(1)+n(2)+n(3)} (6 − 11θ)^{n(4)}.

Thus, there exists a constant k such that:

    l(θ) =
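For a likelihood of the form θ^a (6 − 11θ)^{n(4)} with a = n(1) + n(2) + n(3), setting the score to zero gives the stationary point θ̂ = 6a / (11(a + n(4))). The sketch below (our own, with made-up counts) checks this closed form against a grid search over the valid range (0, 6/11).

```python
import math

def loglik(t, a, n4):
    """log of theta^a * (6 - 11*theta)^n4, valid for 0 < theta < 6/11."""
    return a * math.log(t) + n4 * math.log(6 - 11 * t)

a, n4 = 30, 10                                 # hypothetical counts: n(1)+n(2)+n(3) and n(4)
closed = 6 * a / (11 * (a + n4))               # zero of the score equation
grid = [i / 10000 for i in range(1, 5454)]     # grid strictly inside (0, 6/11)
best = max(grid, key=lambda t: loglik(t, a, n4))
```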
Homework 3 10/36702: Due March 22
1. Let X_1, . . . , X_n ∼ P where P has a density p on [0, 1]. Suppose that p ∈ P where

    P = { p : p ≥ 0, ∫₀¹ p = 1, |p(y) − p(x)| ≤ L|x − y| for all x, y ∈ [0, 1] }.

We want to estimate the density p. Let d(p, q) = ∫₀¹ (p(x) − q(x))² dx. Let R_n be the
10/36702 Homework 2
Due: Friday 3/1/2013
Instructions: Hand in your homework to Michelle Martin (GHC 8001) before 3:00pm on
Friday 3/1/2013.
1. Let X_1, . . . , X_n ∼ P where P has density p and 0 ≤ X_i ≤ 1. Find the asymptotic bias
of p̂_h(0) where p̂_h is the ker