Chapter 2
The Truth about Linear
Regression
We need to say some more about linear regression, and especially about how it
really works and how it can fail. Linear regression is important because
1. it's a fairly straight
Homework Assignment 2: The Advantages of
Backwardness
36-402, Data Analysis, Spring 2013
Due at 11:59 pm on Monday, 27 January 2013
Many theories of economic growth say that it's easier for poor countries to
grow faster than rich countries catching up, or
Homework 1: What's That Got to Do with the
Price of Condos in California?
36-402, Advanced Data Analysis, Spring 2013
Due at 11:59 pm on Monday, 21 January 2013
Problem: As a warm-up and refresher in using linear regression to explore
relationships between
Homework Assignment 4: How the North
American Mammalian Paleofauna Got a Crook
in Its Regression Line
36-402, Advanced Data Analysis, Spring 2013
Due at 11:59 pm on Monday, 11 February 2013
Turn in a single PDF file, with text and all figures, and a file name
i
Chapter 8
Additive Models
8.1
Partial Residuals and Backfitting for Linear Models
The general form of a linear regression model is
$$E[Y \mid X = x] = \beta_0 + \beta \cdot x = \sum_{j=0}^{p} \beta_j x_j \qquad (8.1)$$
where for $j \in 1:p$, the $x_j$ are the components of $x$, and $x_0$ is always the constant 1.
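As a small numerical illustration of equation (8.1), the sketch below checks that the intercept form and the summation form agree once the constant $x_0 = 1$ is prepended to the feature vector. The coefficient values, the input point, and the helper name `linear_predictor` are all hypothetical, chosen only for illustration; they do not come from the text.

```python
import numpy as np

def linear_predictor(x, beta):
    """Evaluate sum_{j=0}^{p} beta[j] * x_j, taking x_0 = 1 (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # Prepend the constant x_0 = 1 so that the intercept beta[0] joins the sum.
    x_aug = np.concatenate(([1.0], x))
    return float(beta @ x_aug)

# Made-up coefficients (beta_0, beta_1, beta_2) and a made-up input point.
beta = np.array([2.0, 0.5, -1.0])
x = np.array([4.0, 3.0])

intercept_form = float(beta[0] + beta[1:] @ x)  # beta_0 + beta . x
sum_form = linear_predictor(x, beta)            # sum over j = 0..p
```

Both expressions compute the same conditional-mean prediction; the summation form simply treats the intercept as the coefficient on a feature that always equals 1.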
Chapter 9
Programming
The ability to read, understand, modify and write simple pieces of code is an essential
skill for modern data analysis. Lots of high-quality software already exists for specific
purposes, which you can and should use, but statisticians
Chapter 4
Using Nonparametric
Smoothing in Regression
Having spent long enough running down linear regression, it is time to turn to constructive alternatives, which are (also) based on smoothing.
Recall the basic kind of smoothing we are interested in: w
Chapter 7
Splines
7.1
Smoothing by Directly Penalizing Curve Flexibility
Let's go back to the problem of smoothing one-dimensional data. We imagine, that
is to say, that we have data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, and we want to find a
func
Chapter 10
Testing Parametric Regression
Specifications with
Nonparametric Regression
10.1
Testing Functional Forms
One important, but under-appreciated, use of nonparametric regression is in testing
whether parametric regressions are well-specified.
The typi
Chapter 6
Moving Beyond Conditional
Expectations: Weighted Least
Squares, Heteroskedasticity,
Local Polynomial Regression
So far, all our estimates have been based on the mean squared error, giving equal importance to all observations. This is appropriate
Homework Assignment 7: Red Brain, Blue Brain
36-402, Advanced Data Analysis, Spring 2013
Due at 11:59 pm on Monday, 25 March 2013
The data set n90_pol.csv contains information on 90 university students who
participated in a psychological experiment design
Homework 5: It's Not the Heat that Gets to
You, It's the Sustained Heat with Pollution
36-402, Advanced Data Analysis
Due at 11:59 pm on Monday, 18 February 2013
The data set chicago, in the package gamair, contains data on the relationship between air poll
Chapter 1
Regression: Predicting and
Relating Quantitative Features
1.1
Statistics, Data Analysis, Regression
Statistics is the branch of mathematical engineering which designs and analyses methods for drawing reliable i
Chapter 3
Evaluating Statistical Models:
Error and Inference
3.1
What Are Statistical Models For? Summaries, Forecasts, Simulators
There are (at least) three levels at which we can use statistical models in data analysis
Exam 2: Choosing a Better History
36-402, Advanced Data Analysis
Due at 11:59 pm on Monday, 15 April 2013
Instructions
Please read the problem background carefully, before beginning the data analysis. Adequate data analysis here will require you to go bey
Exam 1: Nice Demo City, But Will It Scale?
36-402, Advanced Data Analysis
Due at 11:59 pm on Monday, 4 March 2013
Instructions
Please read the problem background carefully, before beginning the data analysis. Adequate data analysis here will require you t
Homework 8: How the Recent Mammals Got
Their Size Distribution
36-402, Advanced Data Analysis
Due at 11:59 pm on Monday, 1 April 2013
Homeworks 4 and 6 used regression to study how the typical mass of (mammalian) species changes over evolution: on average
Homework 3: An Insufficiently Random Walk
Down Wall Street
36-402, Advanced Data Analysis
Due at 11:59 pm on 4 February 2013
Instructions: Submit one PDF file containing your written
answers and all your figures and tables; the filename should include
your Andrew
Homework 6: How the Hyracotherium Got Its
Mass
36-402, Advanced Data Analysis
Due at 11:59 pm on Monday, 25 February 2013
Instructions: Submit a single PDF including all your written
responses and all your figures; include your Andrew ID in the file
name. Put
Homework 11: Growth and Debt
36-402, Advanced Data Analysis
Due at 11:59 pm on Monday, 29 April 2013
An important and controversial question in macroeconomics and political
economy is whether high levels of government debt cause the economy to grow
more
Homework 10: Brought to You by the Letters D,
A and G
36-402, Advanced Data Analysis
Due at 11:59 pm on Monday, 22 April 2013
The file sesame.csv contains data on an experiment which sought to learn
whether regularly watching Sesame Street caused an increas
Chapter 5
The Bootstrap
We are now four chapters into a statistics class and have said basically nothing about
uncertainty. This should seem odd, and may even be disturbing if you are very attached to your p-values and saying variables have significant effe
Final Exam: The Union Makes Us Strong
36-402, Advanced Data Analysis
Due at 10 am on Monday, 9 May 2011
Instructions
You will be sent a data set (CSV format) by e-mail to your Andrew account.
Each data set is slightly different. Work only with your own. It
Homework 9: Patterns of Exchange
36-402, Advanced Data Analysis
Due at the start of class on Tuesday, 5 April 2011
There are many variables which influence the rate at which one currency
is exchanged against another: changes in interest rates, changes in in
Homework 11: Use and Abuse of Conditioning
36-402, Advanced Data Analysis
Due at the start of class, 26 April 2011
1. (30 points) Refer to figure 1 in Homework 10.
(a) (5 points) Using the back door criterion, describe a way to estimate
the causal effect of s
Homework 7: Diabetes
36-402, Advanced Data Analysis
Due at the start of class, 22 March 2011
A classic data set for classification problems, logistic regression and related
methods comes from a study of the correlates of diabetes among the Pima
Indians of A
Homework 6: Nice Demo City, But Will It Scale?
36-402, Advanced Data Analysis
Due at the start of class, 21 February 2011
For data-collection purposes, urban areas of the United States are divided
into several hundred Metropolitan Statistical Areas based
Homework Assignment 5: Bootstrapping Will
Continue Until Morale Improves
36-402, Advanced Data Analysis
Due 15 February 2011
The goal of this homework is to practice using bootstrapping to quantify the
uncertainty in regression models.
The data set cats i