Introduction to Regression Models and Analysis of Variance
STATS 203

Spring 2011
Stat 203
[version: 3/5/12]
Homework 4.
Due in class on Tuesday March 13. Any corrections will be posted on Coursework.
Collaboration on homework problems is fine, but your write-up should be your own.
1. [Simulation example to illustrate collinearity].
(j )
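A minimal sketch of the kind of simulation the problem title suggests, not the assigned exercise itself (its text is cut off above). The function name se_x1, the sample size, the true coefficients, and the two correlation values are all hypothetical choices made for illustration. The sketch fits the same two-predictor model twice, once with nearly uncorrelated predictors and once with nearly collinear ones, and compares the standard error of one slope.

```r
# Hypothetical simulation: every name and parameter value is illustrative.
set.seed(203)

# Simulate y = 1 + x1 + x2 + noise, where cor(x1, x2) is roughly rho,
# and return the estimated standard error of the x1 coefficient.
se_x1 <- function(rho, n = 100) {
  x1 <- rnorm(n)
  x2 <- rho * x1 + sqrt(1 - rho^2) * rnorm(n)
  y  <- 1 + x1 + x2 + rnorm(n)
  fit <- lm(y ~ x1 + x2)
  summary(fit)$coefficients["x1", "Std. Error"]
}

se_low  <- se_x1(rho = 0)     # nearly uncorrelated predictors
se_high <- se_x1(rho = 0.99)  # nearly collinear predictors
c(low = se_low, high = se_high)
```

With rho = 0.99 the variance inflation factor 1/(1 - rho^2) is about 50, so the standard error should be roughly seven times larger than in the uncorrelated case.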
Stat 203
[version: 2/16/12]
Midterm, 2/16/12.
You may use a two-sided sheet of paper with your own notes, but no other aids such as
lecture notes or books or wireless devices.
There are 8 questions, with the points value shown, for a total of 40.
1. [5 pt
In almost every domain of empirical research, showing a relationship between two
important variables is a key to advancing knowledge and to fame and fortune for the
discoverer.
Two examples:
The agricul
Experimental data. One or more of the independent variables can be set (typically
by randomization) by the experimenter. The logic of inference is based on the randomized
assignment.
Example. Blood coagulation times. Dataset comes from a study of blood coa
Statistics 203
1/10/12
Short Diagnostic Quiz
The purpose of this diagnostic quiz is to help the instructor and TAs understand the
background of the class. Related material will be briefly reviewed in the next few classes.
It is anonymous* and does not count
Ignorance of how sample size affects statistical variation
has created havoc for nearly a millennium
Howard Wainer
What constitutes a dangerous equation?
There are two obvious interpretations:
Some equations are dangerous if you know
them, and others are
[version: 2/9/12]
Cumulative list of exercises from Freedman.
The exercises are designed to be a central part of Freedman's book. Most have solutions
starting on p. 235.
Several exercises will be assigned during each lecture while we are covering material
in Freedman. A
Sample Final Problem and Solution
Note: The convention is to include the pieces of output needed to answer the question and
attach the R code (without output) at the end of the answer.
The O-rings in the booster rockets used in space launching play an important part in p
Stat 203
[version: 3/15/12]
Final Exam.
Due: Thursday, March 22 at 10 P.M.
Before Thursday 5 p.m., you can turn in solutions to Angie Martinez in Sequoia 122
during office hours. Between Thursday 7.30 p.m. and 10 p.m., turn in to Iain J. in Sequoia
138. (Bui
Statistics 203
Introduction to Regression and Analysis of Variance
Assignment #1
Due Thursday, January 20
Prof. J. Taylor
Use R for all calculations. Provide copies of your code in the assignment.
Q. 1) (MP 2.7) The purity of oxygen produced by fractionat
Statistics 203
Introduction to Regression and Analysis of Variance
Assignment #3
Due Tuesday, March 1
Prof. J. Taylor
Use R for all calculations. Provide copies of your code in the assignment.
Q. 1) (MP, 9.26) Consider the model
$y_i = \theta_1 - \theta_2 e^{-\theta_3 x_i} + \varepsilon_i, \quad 1 \le i \le n$
Statistics 203
Introduction to Regression and Analysis of Variance
Assignment #4
Due Thursday, March 10
Prof. J. Taylor
Use R for all calculations. Provide copies of your code in the assignment.
Q. 1) Consider the one-way random effects ANOVA model
Yij + i
Statistics 203
Introduction to Regression and Analysis of Variance
Assignment #2
Due Thursday, February 10
Prof. J. Taylor
Use R for all calculations. Provide copies of your code in the assignment.
Q. 1) The dataset http:/wwwstat/jtaylo/courses/stats203/
Statistics 203
Introduction to Regression and Analysis of Variance
Assignment #1 Solutions
January 20, 2005
Q. 1) (MP 2.7)
(a) Let x denote the hydrocarbon percentage, and let y denote the oxygen
purity. The simple linear regression model is y = 77.863 +
A Toy Example for regression with 2 variables.
The Delivery Data. MPV, Ch. 3, describe a simple data set with n = 25 observations
on a response Y and two predictor variables.
A soft drink bottler is analyzing the vending machine service routes in his dist
Fuel Consumption Data: confidence intervals etc.
Let's return to the fuel consumption model we considered last time, as augmented with
the variable GradFreq.
> fgrad.lm <- lm(Fuel ~ Tax + Dlic + Income + logMiles + GradFreq, data = fuel2001)
Estimate Std. Error
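As a hedged illustration of how confidence intervals follow from the Estimate and Std. Error columns of a fitted lm object, the sketch below uses R's built-in mtcars data as a stand-in. It is not the fuel2001 data or the fgrad.lm fit from these notes, and the model mpg ~ wt + hp is an arbitrary choice for the example.

```r
# Stand-in data: R's built-in mtcars, NOT the fuel2001 data from the notes.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient table: estimates, standard errors, t values, p-values.
tab <- coef(summary(fit))
tab

# 95% confidence intervals for all coefficients.
ci <- confint(fit, level = 0.95)
ci

# The same interval for wt by hand: estimate +/- t quantile * standard error,
# with the residual degrees of freedom of the fit.
est <- tab["wt", "Estimate"]
se  <- tab["wt", "Std. Error"]
tq  <- qt(0.975, df = df.residual(fit))
by_hand <- c(est - tq * se, est + tq * se)
by_hand
```

The hand computation should agree with the wt row of confint(fit), which makes explicit that the interval depends only on the estimate, its standard error, and the residual degrees of freedom.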
Fuel Consumption Data: illustrating case diagnostics
First, we compute (i) leverages, (ii) studentized (or jackknife) residuals to check for
outliers, and (iii) Cook's distances to check for influential cases. None of the studentized
residuals exceed the (co
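The three case diagnostics listed above can be computed with base R functions, roughly as in the following sketch. It uses the built-in mtcars data and an arbitrary model as a stand-in for the fuel data, and the Bonferroni cutoff is an assumption made here for illustration, since the cutoff used in the notes is cut off in the text above.

```r
# Stand-in data and model: mtcars, NOT the fuel consumption data from the notes.
fit <- lm(mpg ~ wt + hp, data = mtcars)
n <- nrow(mtcars)
p <- length(coef(fit))          # number of coefficients, including intercept

lev   <- hatvalues(fit)         # (i) leverages: diagonal of the hat matrix
rstud <- rstudent(fit)          # (ii) studentized (jackknife) residuals
cook  <- cooks.distance(fit)    # (iii) Cook's distances

# Assumed outlier cutoff: two-sided Bonferroni at overall level 0.05.
# Each jackknife residual is t-distributed with n - p - 1 degrees of freedom.
cutoff <- qt(1 - 0.05 / (2 * n), df = n - p - 1)

diagnostics <- data.frame(leverage = lev, rstudent = rstud, cook = cook)
head(diagnostics[order(-diagnostics$cook), ], 3)  # three most influential cases
any(abs(rstud) > cutoff)                          # TRUE if any case is flagged
```

A useful sanity check is that the leverages sum to the number of coefficients p, so the average leverage is p/n.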
Comments on Midterm
1
Assumptions:
$Y = X\beta + \varepsilon$.
The $\varepsilon_i$'s are iid with mean zero and variance $\sigma^2$.
$X$ and $\varepsilon$ are orthogonal.
X is full rank.
It is acceptable if you missed the last one. But several people added many assumptions (up to 7).
3
For the first two items, note that
Stat 203
[version: 2/23/12]
Homework 3.
Due in class on Tuesday Feb 28 [note delayed due date]. Any corrections will be
posted on Coursework.
Collaboration on homework problems is fine, but your write-up should be your own.
1. [Prostate cancer surgery datas
Stat 203
[version: 1/31/12]
Homework 2 (complete).
Due in class on Thursday Feb 9. Any corrections will be posted on Coursework.
Collaboration on homework problems is fine, but your write-up should be your own.
1. [$R^2 = r^2$ for simple linear regression.]
Con
Food Cost data and weighted least squares.
Another toy data set from MPV looks at the dependence of income (monthly) from food
sales on advertising expenses (annual) for 30 restaurants. A plot of (absolute) residuals
against AdCost shows that the variance
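A minimal sketch of weighted least squares in R, under assumptions that are not taken from the notes: the data are simulated rather than the MPV restaurant data, the error standard deviation is assumed proportional to the predictor, and the variable names adcost and income are only borrowed for readability. The weights argument of lm takes weights inversely proportional to the error variance.

```r
# Simulated stand-in data: NOT the MPV food cost data.
set.seed(1)
n <- 30
adcost <- runif(n, 3, 20)
# Assumed variance structure: error SD proportional to adcost.
income <- 50 + 8 * adcost + rnorm(n, sd = 0.5 * adcost)

ols <- lm(income ~ adcost)   # ordinary least squares, ignores unequal variance

# Weights inversely proportional to the assumed error variance.
w   <- 1 / adcost^2
wls <- lm(income ~ adcost, weights = w)

# Equivalent formulation: OLS after scaling every column by sqrt(w).
sw    <- sqrt(w)
check <- lm(I(sw * income) ~ 0 + sw + I(sw * adcost))

rbind(ols = coef(ols), wls = coef(wls))
all.equal(unname(coef(wls)), unname(coef(check)))
```

The last line checks that the weighted fit matches ordinary least squares on the rescaled variables, which is the usual way to see why the weights should be proportional to the reciprocal of the variance.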