STATS 216V
Homework 4 Problem 1
(a) Plot of test MSE for Bagging and Random Forest
(b) Importance of features
Yu Zhang
Two measures of variable importance are reported. The former %IncMSE is based upon the
mean dec
STATS216v Introduction to Statistical Learning
Stanford University, Summer 2016
Problem Set 2
Due: July 15
Remember the university honor code. All work and answers must be your own.
Problem 1
Suppose we collect data for a group of students in a statistics
Stats216: Session 8
Predicting ALS Disease Progression
ALS (amyotrophic lateral sclerosis), or Lou Gehrig's disease, is a fatal neurodegenerative disease with no
known cure and few known causes. In July of 2012, Prize4Life launched a challenge to most accu
STATS216v Introduction to Statistical Learning
Stanford University, Summer 2016
Problem Set 2 (Solutions)
Total: 100 points
Problem 1
10 points. Grading notes: 5 points for each part. Identifying the correct formula is worth
3 points, and the correct form
STATS 216 Introduction to Statistical Learning
Stanford University, Summer 2015
Problem Set 1 Solutions
Total: 65 points
1. 15 points: (a) 6 points (b) 6 points (c) 3 points
Grading notes: For parts (a) and (b), each example is worth 2 points: 1 point for
STATS 216 Introduction to Statistical Learning
Stanford University, Winter 2015
Problem Set 2
Due: Wednesday, February 10, 2016
Remember the university honor code. All work and answers must be your own.
1. Suppose we collect data for a group of students i
STATS 216 Introduction to Statistical Learning
Stanford University, Winter 2016
Problem Set 1 Solutions
Total: 68(+1) points
1. 15 points: (a) 3 points (b) 6 points (c) 6 points
Grading notes: For parts (b) and (c), each example is worth 2 points: 1 point
STATS 216 Introduction to Statistical Learning
Stanford University, Summer 2015
Problem Set 2 Solutions
Total: 65 points
1. 15 points
Grading notes: Each part is worth 3 points, 2 for the correct answer and 1 for a valid
explanation. Student explanations
STATS216v Introduction to Statistical Learning
Stanford University, Summer 2016
Problem Set 1
Due: July 1
Remember the university honor code. All work and answers must be your own.
Problem 1
Explain whether each scenario below is a regression, classificat
STATS216v Introduction to Statistical Learning
Stanford University, Summer 2015
Problem Set 1
Due: Friday, July 3
Remember the university honor code. All work and answers must be your own.
1. In this question we consider some real-life applications of sta
STATS 216 Introduction to Statistical Learning
Stanford University, Summer 2015
Problem Set 3 Solutions
Total: 65 points
1. 14 points: one point for each correct answer (a, b, and c i-v) and one point for each
explanation that correctly explains either th
STATS216v Introduction to Statistical Learning
Stanford University, Summer 2015
Problem Set 2
Due: Friday, July 17
Remember the university honor code. All work and answers must be your own.
1. Suppose we estimate the regression coefficients in a linear regre
STATS 216 Introduction to Statistical Learning
Stanford University, Winter 2015
Problem Set 2 Solutions
Total: 70 points
1. Add me 4 points
Grading notes: 2 points for each part. Identifying the correct formula is worth 1 point,
and the correct formula pl
STATS216v Introduction to Statistical Learning
Stanford University, Summer 2015
Problem Set 4
Due: Wednesday, August 12
Remember the university honor code. All work and answers must be your own.
1. Recall the body dataset from problem 3 of Homework 3. In
STATS216v Introduction to Statistical Learning
Stanford University, Summer 2015
Problem Set 3
Due: Friday, July 31
Remember the university honor code. All work and answers must be your own.
1. We do best, forward, and backward stepwise selection on a sing
Vectorization, Timing, and Parallelization Solutions
Stats 216
27 January 2016
In this class we're going to take a step back from statistical analysis and instead take a closer look at what's
going on under the hood when you run a program in R. This is incr
Regression and Simulation
This is an introductory R session, so it may go slowly if you have never used R before. Do not be discouraged.
A great way to learn a new language like this is to plunge right in.
We will simulate some data suitable for a regress
Vectorization, Timing, and Parallelization
Stats 216
27 January 2016
In this class we're going to take a step back from statistical analysis and instead take a closer look at what's
going on under the hood when you run a program in R. This is incredibly imp
Problem 1.
(e)
il-
ii
iii,
ii,
iii,
ii,
iii
STATS216: Problem Set 1
[Ruixi Lin] — [r1in2]
January 25, 2016
An economist wants to predict the price per square foot based
on the housing areas. This is a regression model where housing
area is the predict
Stats216: Session 2
Linear Regression Analysis of NCAA Basketball Data
In this in-class session, we will analyze a data set containing the outcomes of every game in the 2012-2013
regular season, and the postseason NCAA tournament. There are 5541 games and
Principal Component Analysis
Stats 216 - Session 6
02/10/2016
Using Principal Component Regression
1. Getting Started.
2. A Small Example (Height and Weight).
3. A Bigger Problem
4. So What?
Using Principal Component Regression
This class is designed
Stats216: Session 7
Nonlinear classification
This class will give you some practice in using splines, local regression, and generalized additive models
to perform classification.
1. Getting started
Let's begin by loading the data:
# CHD =
read.csv("http:/s
In the likelihood function,
\( L(\beta_1,\beta_2) = \prod\limits_{i:y_i=1} p(x_i) \prod\limits_{i':y_{i'}=0} (1-p(x_{i'})) \)
The first term of \( \prod\limits_{i':y_{i'}=0} (1-p(x_{i'})) \) is the conditional probability of Y=0 given \( x_i \).
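Taking logarithms turns the product into a sum, which is the form usually maximized in practice. A sketch of that step, assuming only that \( p(x) \) denotes \( \Pr(Y=1 \mid X=x) \) as in the likelihood function, and that \( i' \) indexes the observations whose response is 0:

\[ \ell(\beta_1,\beta_2) = \log L(\beta_1,\beta_2) = \sum_{i:y_i=1} \log p(x_i) + \sum_{i':y_{i'}=0} \log\bigl(1-p(x_{i'})\bigr) = \sum_{i} \bigl[\, y_i \log p(x_i) + (1-y_i)\log\bigl(1-p(x_i)\bigr) \bigr] \]

Each observation therefore contributes \( \log p(x_i) \) when \( y_i = 1 \) and \( \log(1-p(x_i)) \) when \( y_i = 0 \).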