Problem Set 2
Due in class on paper, Wednesday October 22, 2014
NB: This problem set is for individual work, not group work.
1. p-value versus n.
Here we investigate the way that the p-value depends on sample size.
Note that the R functions for the F dist
STATS-305, Fall 2012
Solutions for HW 2
Instructor: Professor Trevor Hastie
TAs: Sam, Zhen and Luo
For questions about grading, please see the TAs
1
Weisberg Exercise 3.2.
(a) We consider regression on
X = (x1, ..., xp)
and also on the matrix formed by the first p − 1 columns of X.
STATS-305, Fall 2012
Solutions for HW 1
Instructor: Professor Trevor Hastie
TAs: Sam, Zhen and Luo
For questions about grading, please see the TAs
1
Problem 1
Reading assignment from text-book.
2
Problem 2
(a). We scale by 1/n because the X s are treated
STATS-305, Fall 2012
Solutions for HW 5
Instructor: Professor Trevor Hastie
TAs: Sam, Zhen and Luo
For questions about grading, please see the TAs
Problem 1
(a) The linear and cubic spline fits are shown below (Linear: blue; Cubic: red):
Abalone data wi
STATS-305, Fall 2012
Solutions for HW 3
Instructor: Professor Trevor Hastie
TAs: Sam, Zhen and Luo
For questions about grading, please see the TAs
Problem 1
By reordering the indices, without loss of generality we take j = p. Let X = QR where Q is n × n ort
Statistics 305
Homework 2, due Thursday, October 18,
2012.
All questions with a bold R next to the question number can be solved in
groups up to size three. For such groups a single writeup can be turned in, but
make sure to indicate who the three are. Al
Statistics 305
Homework 1, due Thursday, October 8, 2015 by 5pm.
This problem set is partly aimed at getting you going with R. The coursework webpage points you to a useful R tutorial; the first few chapters of Dalgaard's Introductory Statistics with R would a
STAT305 FALL 2015 Homework 4
Solutions Thanks to Matteo Sesia
Problem 1
We draw a boxplot of IQf for each of the three classes. The boxplot does not detect any
outliers. Moreover, we see that the mean of the response variable IQf i
The Lasso and related
penalization methods
Basic lasso, related approaches, generalizations, novel applications
Lasso and related methods
Linear regression via the Lasso (Tibshirani, 1995)
Outcome variable yi, for cases i = 1, 2, ..., n, features x
Statistics 305
Homework 5, due Thursday, Dec 3, 2015 by 5pm.
All questions with a bold R next to the question number can be solved
in groups up to size three. For such groups a single writeup can be turned
in, but make sure to indicate who the three are.
Statistics 305
Homework 4, due Thursday, November 19, 2015 by 5pm.
All questions with a bold R next to the question number can be solved
in groups up to size three. For such groups a single writeup can be turned
in, but make sure to indicate who the three
Statistics 305
Homework 2, due Thursday, October 22, 2015 by 5pm.
All questions with a bold R next to the question number can be solved
in groups up to size three. For such groups a single writeup can be turned
in, but make sure to indicate who the three
STATS 305 Problem Set 3 Notes
Jeha Yang
1. (a) T (b) F (c) T (d) F
2. (a) We want to find the number of distinct (ordered) solutions of $\sum_{i=1}^{n} x_i = n$, $x_i \in \mathbb{N} \cup \{0\}$, i.e. $\sum_{i=1}^{n} y_i = 2n$, $y_i \in \mathbb{N}$. Now consider $2n$ points in a row, then putting $n - 1$ dividers among $2n - 1$
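The divider argument above is cut off in this excerpt. Assuming it finishes in the standard stars-and-bars way (choose n − 1 divider positions among the 2n − 1 gaps between 2n points), the count would be C(2n − 1, n − 1). The sketch below checks that assumed closed form against brute-force enumeration for small n; the function names are illustrative and not from the original notes.

```python
from itertools import product
from math import comb


def count_solutions_brute_force(n):
    """Count ordered solutions of x_1 + ... + x_n = n with each x_i a non-negative integer."""
    # Each x_i can be at most n, so enumerating {0, ..., n}^n covers every solution.
    return sum(1 for xs in product(range(n + 1), repeat=n) if sum(xs) == n)


def count_solutions_formula(n):
    """Assumed stars-and-bars count: n - 1 dividers placed among 2n - 1 gaps."""
    return comb(2 * n - 1, n - 1)


# Compare the two counts for a few small n (enumeration grows like (n + 1)^n).
for n in range(1, 6):
    assert count_solutions_brute_force(n) == count_solutions_formula(n)
```

The enumeration is only practical for small n, so it serves as a sanity check on the formula rather than a way to compute the count.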
© SLDM III Hastie & Tibshirani - February 23, 2009
Cross-validation and bootstrap
Bootstrap Methods
The bootstrap is a general tool for assessing statistical accuracy.
Given a training set z = (z1, z2, ..., zN) where zi = (xi, yi), the
basic idea is
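The slide text is cut off here. As a minimal sketch of the usual bootstrap recipe (resample the N training pairs with replacement, recompute a statistic on each resample, and use the spread of the replicates as an accuracy estimate), the following uses only the Python standard library. The data values, the statistic (mean of the y values), and all function names are made up for illustration; none of them come from the slides.

```python
import random
import statistics


def bootstrap_standard_error(data, statistic, n_boot=1000, seed=0):
    """Estimate the standard error of `statistic` by resampling `data` with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # One bootstrap dataset: n draws with replacement from the training set.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(sample))
    return statistics.stdev(replicates)


def mean_response(pairs):
    """Hypothetical statistic of interest: the average of the y values."""
    return statistics.mean([y for _, y in pairs])


# Hypothetical training set z_i = (x_i, y_i).
z = [(1, 2.1), (2, 3.4), (3, 1.9), (4, 5.0), (5, 4.2), (6, 3.3), (7, 2.8), (8, 4.9)]
se = bootstrap_standard_error(z, mean_response)
```

Resampling whole pairs keeps each x_i attached to its y_i, which matters once the statistic depends on both coordinates (for example a regression slope).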
© SLDM III Hastie & Tibshirani - September 6, 2009
Wisdom of Crowds
[Figure legend: Consensus, Individual]
10 movie categories, 4 choices in each category
50 people, for each question 15 informed and 35 guessers
For informed people,
Pr(correct)
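The setup above can be simulated in a few lines. The excerpt is cut off before it gives the informed voters' probability of a correct answer, so the sketch below takes it as a free parameter `p_informed`. It also assumes that an informed voter who is wrong picks uniformly among the incorrect choices, that guessers pick uniformly among all choices, and that the consensus is a plurality vote with random tie-breaking. All of these are assumptions for illustration, and the function name is hypothetical.

```python
import random
from collections import Counter


def simulate_crowd(p_informed, n_questions=10, n_choices=4,
                   n_informed=15, n_guessers=35, seed=0):
    """Return (consensus score, average individual score), each out of n_questions."""
    rng = random.Random(seed)
    truth = 0  # label the correct choice 0 without loss of generality
    consensus_correct = 0
    individual_correct = 0.0
    for _ in range(n_questions):
        votes = []
        for _ in range(n_informed):
            if rng.random() < p_informed:
                votes.append(truth)
            else:
                # Assumed: a wrong informed voter picks uniformly among wrong choices.
                votes.append(rng.randrange(1, n_choices))
        for _ in range(n_guessers):
            votes.append(rng.randrange(n_choices))  # guessers pick uniformly
        counts = Counter(votes)
        top = max(counts.values())
        winners = [choice for choice, k in counts.items() if k == top]
        if rng.choice(winners) == truth:  # plurality vote, ties broken at random
            consensus_correct += 1
        # Fraction of people who answered this question correctly.
        individual_correct += votes.count(truth) / len(votes)
    return consensus_correct, individual_correct


consensus, individual = simulate_crowd(p_informed=0.5)
```

Running this over a grid of `p_informed` values and many seeds would give the kind of consensus-versus-individual comparison the figure legend refers to.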
Chapter 1
A Simple, Linear, Mixed-effects Model
In this book we describe the theory behind a type of statistical model called
mixed-effects models and the practice of fitting and analyzing such models
using the lme4 package for R. These models are used in many
Multiple Regression by Successive Orthogonalization.
From Elements of Statistical Learning, Hastie et al, sec 3.2.3.
Data y, x1, x2, ..., xp.
1. Initialize z0 = x0 = 1.
2. For j = 1, 2, ..., p
Regress xj on z0, z1, ..., zj−1 to produce coefficie
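The listing is cut off partway through step 2. As a rough sketch of the procedure as it is usually stated in the cited section (form each residual zj by removing from xj its projections onto z0, ..., zj−1, then regress y on the last residual zp to obtain the coefficient of xp), here is a pure-Python version. The function names and the small dataset are illustrative only, and the final step is taken from the cited reference rather than from the text shown above.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def last_coefficient_by_orthogonalization(columns, y):
    """Multiple-regression coefficient of the last predictor, via successive orthogonalization.

    columns: list of p predictor columns x_1, ..., x_p, each of length n.
    """
    n = len(y)
    z = [[1.0] * n]  # step 1: z_0 = x_0 = 1
    for x in columns:  # step 2: j = 1, ..., p
        residual = list(x)
        for zk in z:
            gamma = dot(zk, x) / dot(zk, zk)  # coefficient of x_j on z_k
            residual = [r - gamma * zki for r, zki in zip(residual, zk)]
        z.append(residual)  # z_j is what remains of x_j after removing z_0..z_{j-1}
    zp = z[-1]
    # Assumed final step (from the cited section): regress y on z_p.
    return dot(zp, y) / dot(zp, zp)


# Hypothetical noise-free data with y = 1 + 2*x1 + 3*x2.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0, 5.0]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]
beta_p = last_coefficient_by_orthogonalization([x1, x2], y)
```

Because the residuals z0, ..., zp are mutually orthogonal, each projection coefficient can be computed from a single inner product, which is what makes the one-variable-at-a-time regressions valid.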
Problem Set 3
Due in class on paper, Wednesday November 5, 2014
This problem set is for joint work (usual rules).
1. ANOVA Cuckoos lay their eggs in nests of other birds. The file
cuckoo.txt has lengths of Cuckoo eggs (in mm) for nests of sparrows,
robins a