Topics to be covered this week
Statistics 207
Winter Quarter, 2016
Monday, Jan 25
Three-factor studies (chap 22, App. Lin. Stat. Models).
Wednesday, Jan 27
Random effects model (chaps 25.1-25.4, App. Lin. Stat. Models).
Homework 3 (Due on Monday, Feb 1

Statistics 206
Homework 6
Due: November 24, 2014, in class
1. Model Selection Procedures and Model Validation. Diabetes data. These data
consist of 19 variables on 403 of the 1046 subjects who were interviewed in a
study to understand the prevalenc

14.44) You need to show sufficient work in deriving the second partial derivatives to receive credit if this problem is chosen to be graded. After obtaining the expressions, evaluate them in R to calculate the covariance matrix. DO NOT simply use the vcov()

14.22
I wanted to clarify that x1^2:x2^2 is not a second-order term. However, I included it because the student solution manual for your text includes 6 terms for this problem. The 6 terms are not defined, and x1^2:x2^2 was the only thing I could think of

Here are some tips on the use of R for Homework 5.
(a) You may use the lm.ridge command in the MASS package. It will also give you the GCV values.
(b) For ridge regression it is possible to get a VIF smaller than 1.
(c) Parts (c) and (d) of question 3: in order to
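
Tip (b) can be illustrated with a small sketch. It assumes one common definition of the ridge VIFs, namely the diagonal elements of (R + cI)^{-1} R (R + cI)^{-1}, where R is the correlation matrix of the predictors and c is the ridge constant. The two-predictor setup, the correlation 0.9, the values of c, and the helper names are all hypothetical and are not taken from Homework 5.

```python
# Minimal sketch: ridge VIFs for two standardized predictors.
# Assumption: VIFs are the diagonal of (R + cI)^{-1} R (R + cI)^{-1}.
# The correlation r = 0.9 and the ridge constants below are made up.

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def ridge_vif(r, c):
    """Ridge VIFs for two predictors with correlation r and ridge constant c."""
    R = [[1.0, r], [r, 1.0]]
    A = inv2([[1.0 + c, r], [r, 1.0 + c]])
    M = matmul2(matmul2(A, R), A)
    return [M[0][0], M[1][1]]

for c in (0.0, 0.1, 1.0):
    print(c, ridge_vif(0.9, c))
```

With c = 0 the formula reduces to the ordinary VIF 1/(1 - r^2), and as c grows the values shrink and eventually fall below 1.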

Handout 5
Unbalanced two-factor ANOVA
Consider the Hay Fever Relief data, but suppose that a few observations are missing, so that
this is an example of an unbalanced two-factor study.
                          Factor B (ingredient 2)
Factor A (ingredient 1)   j=1               j=2   j=3
i=1                       2.4, 2.7, 2.3,

Handout 2
Pooling of sums of squares:
When the interaction effects can be ignored, either because of prior knowledge or because of an F-test for
interactions, an alternative estimator of $\sigma^2$ can be provided. Without the interaction effects, the
model becomes: $Y_{ijk} = \mu + \alpha_i + \beta_j + \varepsilon_{ijk}$.
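
For reference, the pooled estimator is usually written as below. This is a sketch under assumptions not stated in this excerpt: a balanced design with a levels of factor A, b levels of factor B, and n observations per cell, and the standard names SSAB and SSE for the interaction and error sums of squares.

```latex
\hat{\sigma}^2_{\text{pooled}}
  = \frac{SSAB + SSE}{(a-1)(b-1) + ab(n-1)}
  = \frac{SSAB + SSE}{abn - a - b + 1}
```

The denominator is the sum of the interaction and error degrees of freedom, which is why dropping the interaction term from the model gives more degrees of freedom for estimating the error variance.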

Handout 6
Three-factor ANOVA
We will deal with three-factor models here. Note that once we know how to
analyze the two-factor case, this knowledge basically allows us to tackle three-factor
or higher-order studies, except that the notation becomes increasingly messy

Topics covered this week
Statistics 207
Winter Quarter, 2016
Wednesday, Feb 17
Longitudinal data analysis (chap 9, Extending the Linear Model with R).
Homework 5 (Due on Wednesday, Feb 24).
(From App. Lin. Stat. Models) Problems 26.4, 26.5, 26.6, 26.19, 2

Handout 13
Ridge Regression
Consider the car price data, which contain information on the cars sold in the U.S. in 1993. The response
variable is price and there are 6 other variables: city MPG, hwy MPG, engine size, HP, tank size and weight.
The goal is t

Handout 16
Logistic Regression.
So far we have only dealt with the cases where the response variable is quantitative and the mathematical
assumption is that the response is a continuous variable. Let us look at two data sets. The first data set has a
binary re

Handout 14
Partial Least Squares
Partial least squares (PLS) is a method that is quite popular among many scientists. Like ridge
regression, this method is useful when there is a substantial amount of collinearity among the independent
variables. It i

Handout 15
Lasso
Lasso is similar in spirit to ridge regression except that the penalty is different. For the purpose of
discussion we will assume that all the variables have been standardized and call them $Y, X_1, \ldots, X_6$. In ridge
regression we ha
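
The difference in penalties can be written out as follows. The handout's own notation is cut off here, so the symbols are assumptions: n observations indexed by i, coefficients $\beta_1, \ldots, \beta_6$, a penalty constant $\lambda \ge 0$, and no intercept because the variables are standardized.

```latex
\text{Ridge:}\quad
\min_{\beta}\; \sum_{i=1}^{n}\Big(Y_i - \sum_{j=1}^{6}\beta_j X_{ij}\Big)^2
  + \lambda \sum_{j=1}^{6} \beta_j^2

\text{Lasso:}\quad
\min_{\beta}\; \sum_{i=1}^{n}\Big(Y_i - \sum_{j=1}^{6}\beta_j X_{ij}\Big)^2
  + \lambda \sum_{j=1}^{6} |\beta_j|
```

The two criteria share the same residual sum of squares and differ only in the penalty: squared coefficients for ridge, absolute values for lasso. The absolute-value penalty can set some coefficients exactly to zero, which ridge generally does not do.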

Handout 12
Longitudinal Data Analysis.
Longitudinal data analysis is related to repeated measures designs. Consider a few examples below. In
each case, you have individuals or subjects being observed over time. For example, in the rat growth data,
each ra

Handout 1
Two-factor studies (balanced)
Consider a two-factor case where factor A has a levels, factor B has b levels, and there are n observations
for each of the ab combinations of the factors. For the Hay Fever Relief example, 9 compounds for hay fever

Handout 10
Repeated measures design
This design is often employed in practice when the same subject (or object) is subjected to a number of
treatments. It is a two-factor model in which one factor (subject) is usually treated as random whereas the
treatme

Handout 11
Repeated Measures and Split-plot designs.
This handout lists a few more repeated measures designs which are a bit more complicated than the ones
in Handout 6. In addition, it presents the basic ideas behind Split-Plot designs, which are qui

Handout 9
Nested Designs
Nested design (both factors fixed):
Consider the data set given later. A company runs three schools for mechanics, one in each of the
following three cities: Atlanta (i=1), Chicago (i=2) and San Francisco (i=3). The instructors are:

Handout 7
Random effects model
We will consider a one-factor model where the factor effects are random. The following example is useful.
Coil winding machines: A plant contains a large number of coil winding machines. A production analyst
studied a certain ch
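
A one-factor random effects model is commonly written as below. The notation is an assumption, since the handout's own symbols are not visible in this excerpt: i indexes the randomly selected factor levels (here, machines) and j indexes observations within a level.

```latex
Y_{ij} = \mu_i + \varepsilon_{ij}, \qquad
\mu_i \sim N(\mu_{\cdot}, \sigma_{\mu}^2), \qquad
\varepsilon_{ij} \sim N(0, \sigma^2),
\qquad \mu_i \text{ and } \varepsilon_{ij} \text{ independent}

\sigma^2\{Y_{ij}\} = \sigma_{\mu}^2 + \sigma^2
```

Under these assumptions the variance of an observation splits into a between-level component $\sigma_{\mu}^2$ and a within-level component $\sigma^2$.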

Handout 8
Random and Mixed Effects Models.
Two-factor models (both factors random, ANOVA model II).
For this model we have two factors A and B, both random. An example (data on "miles per gallon") is
given later where both the factors driver (factor A) and d