Homework 1
Due Sept. 9th
1. If fish lengths have a normal distribution with mean 20 and standard deviation of 10,
what is the sampling distribution of the sample mean for samples of size 400?
2. If the number of people who like Amy's Ice Cream out of 200 is

General Linear Test Approach
We will explain the general test approach in terms of the simple linear
regression model and testing $H_0\colon \beta_1 = 0$ vs $H_a\colon \beta_1 \neq 0$.
The general linear test approach involves three basic steps: the full
model, the reduced model, and the
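The preview cuts off before the third step. As a minimal sketch, assuming the third step is the usual F statistic that compares the error sums of squares of the two models, the test of $H_0\colon \beta_1 = 0$ in simple linear regression could be computed as follows. The data values and function names are hypothetical, chosen only for illustration.

```python
def sse_full(x, y):
    """SSE for the full model Y = b0 + b1*X + error, fitted by least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


def sse_reduced(y):
    """SSE for the reduced model Y = b0 + error (the model under H0: beta1 = 0)."""
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)


def general_linear_f(x, y):
    """F* = [(SSE(R) - SSE(F)) / (df_R - df_F)] / [SSE(F) / df_F]."""
    n = len(y)
    df_full = n - 2      # two parameters estimated in the full model
    df_reduced = n - 1   # one parameter estimated in the reduced model
    sse_f = sse_full(x, y)
    sse_r = sse_reduced(y)
    return ((sse_r - sse_f) / (df_reduced - df_full)) / (sse_f / df_full)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
f_star = general_linear_f(x, y)
print(f_star)
```

A large F* means the full model reduces the error sum of squares by much more than chance alone would explain, which is evidence against the reduced model.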

Homework 1
Due Sept. 9th
1. If fish lengths have a normal distribution with mean 20 and standard deviation of 10,
what is the sampling distribution of the sample mean for samples of size 400?
The sample mean is distributed normal with mean 20 and standard
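The preview is cut off here. As a short sketch of the arithmetic behind this kind of answer, the standard deviation of the sample mean is the population standard deviation divided by the square root of the sample size. The variable names below are illustrative.

```python
import math

mu = 20.0      # population mean from the problem statement
sigma = 10.0   # population standard deviation from the problem statement
n = 400        # sample size from the problem statement

# For a normal population, the sample mean is normal with the same mean
# and with standard deviation sigma / sqrt(n).
mean_of_sample_mean = mu
sd_of_sample_mean = sigma / math.sqrt(n)

print(mean_of_sample_mean, sd_of_sample_mean)
```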

Homework 2
Due Sept 16
1) You want to know whether adults in your country think the ideal number of children
is 2 or if adults in your country think the ideal number of children is either higher or
lower than 2.
(a) Define notation and state your hypothese

Homework 3
Due Sept 26
For any problem that requires using SAS, turn in your code and output. You must
also indicate what your answers are, not just turn in SAS output. For all hypothesis
tests, assume alpha=.05 unless it is stated otherwise.
Use SAS to d

Generalized Linear Models
Generalized linear models, GLMs, describe patterns of association
and interaction.
The models help us evaluate which explanatory variables affect the
response, while controlling for effects of possible confounding
variables.
For

Logistic Regression Models
The odds of success is the probability of success divided by the
probability of failure:
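$\text{odds} = \dfrac{\pi(x)}{1 - \pi(x)}$
And
$\log\left(\dfrac{\pi(x)}{1 - \pi(x)}\right) = \alpha + \beta x$
$\dfrac{\pi(x)}{1 - \pi(x)} = \exp(\alpha + \beta x) = \exp(\alpha)\,\exp(\beta)^{x}$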
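The relationship between a probability, its odds, and its log-odds can be sketched in a few lines. This is an illustration only; the function names and the probability 0.8 are assumptions, not part of the original slides.

```python
import math


def odds(p):
    """Odds of success: probability of success over probability of failure."""
    return p / (1.0 - p)


def logit(p):
    """Log-odds, the quantity that logistic regression models linearly in x."""
    return math.log(odds(p))


def inverse_logit(eta):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


p = 0.8  # hypothetical probability of success
print(odds(p))                  # about 4: success is four times as likely as failure
print(logit(p))                 # natural log of the odds
print(inverse_logit(logit(p)))  # recovers the original probability (up to rounding)
```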
Logistic Regression Models
An odds ratio is the ratio of two odds. Here we look at the od

Weighted Least Squares
If the error terms are normally distributed but the variance of the
error term is not constant, a standard remedial measure is to use
weighted least squares.
When using weighted least squares, the regression model assumes
that the v
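The preview ends mid-sentence. As a hedged sketch, assuming each observation has a known weight $w_i$ (commonly taken as the reciprocal of its error variance), weighted least squares for one predictor minimizes $\sum w_i (Y_i - b_0 - b_1 X_i)^2$. The function name and data below are hypothetical.

```python
def weighted_least_squares(x, y, w):
    """Fit Y = b0 + b1*X by minimizing sum(w_i * (y_i - b0 - b1*x_i)**2)."""
    total_w = sum(w)
    # Weighted means replace the ordinary means used in unweighted least squares.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data: the error standard deviation grows with x,
# so later observations receive smaller weights (assumed w_i = 1 / sigma_i**2).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.5]
sigma = [0.5, 0.5, 1.0, 1.0, 2.0]
w = [1.0 / s ** 2 for s in sigma]

b0, b1 = weighted_least_squares(x, y, w)
print(b0, b1)
```

With all weights equal, the function reduces to ordinary least squares.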

Polynomial Regression Models
Polynomial regression models have two basic types of uses:
1)
When the true curvilinear response function is really a polynomial
function.
2)
When the true curvilinear response function is unknown (or
complex) but a polynomial

Model Selection
For a set of $p-1$ predictors, there are $2^{p-1}$ possible regression
models that can be constructed, based on the fact that each
predictor can be either included or excluded from a particular
model.
For example, if there are 4 predictors, then
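The counting argument can be checked by listing every subset of predictors. This is a small sketch; the predictor names are placeholders.

```python
from itertools import combinations


def all_subsets(predictors):
    """Return every subset of the predictors, including the empty (intercept-only) one."""
    subsets = []
    for k in range(len(predictors) + 1):
        subsets.extend(combinations(predictors, k))
    return subsets


predictors = ["X1", "X2", "X3", "X4"]  # hypothetical predictor names
models = all_subsets(predictors)

print(len(models))  # number of candidate models for 4 predictors
for model in models[:5]:
    print(model)
```

Each predictor is either in or out of a model, so the count doubles with every predictor added, which is why exhaustive search becomes impractical quickly.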

Multiple Regression
There are many situations where a single predictor variable in the model
would provide an inadequate description since a number of key
variables affect the response variable.
Furthermore, in these situations, predictions of the response

Homework 4
Due Oct. 12th
For any problem that requires using SAS, turn in your code and output. You must
also indicate what your answers are, not just turn in SAS output. For all hypothesis
tests, assume alpha=.05 unless it is stated otherwise.
1) Refer t

Homework 5
Due Oct. 19th
For any problem that requires using SAS, turn in your code and output. You must
also indicate what your answers are, not just turn in SAS output. For all hypothesis
tests, assume alpha=.05 unless it is stated otherwise.
1) For con

Homework 6
Due Oct. 24th
For any problem that requires using SAS, turn in your code and output. You must
also indicate what your answers are, not just turn in SAS output. For all hypothesis
tests, assume alpha=.05 unless it is stated otherwise.
1. We are

Homework 7
Due Nov. 9th
1)
For the homework6 data (the same SAS file as for last homework)
Calculate the adjusted coefficient of multiple determination for the linear model and the
quadratic model. You can use the output from the ANOVA table in SAS to plu
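For reference, the adjusted coefficient of multiple determination can be computed from ANOVA table quantities as $R_a^2 = 1 - \frac{n-1}{n-p}\cdot\frac{SSE}{SSTO}$, where $p$ counts the regression parameters including the intercept. The sketch below uses made-up numbers, not the homework data.

```python
def adjusted_r_squared(sse, ssto, n, p):
    """Adjusted R^2 = 1 - ((n - 1) / (n - p)) * (SSE / SSTO).

    n is the number of observations and p the number of regression
    parameters, counting the intercept.
    """
    return 1.0 - ((n - 1) / (n - p)) * (sse / ssto)


# Hypothetical ANOVA quantities, for illustration only.
n = 20
sse = 40.0
ssto = 200.0

r2_adj_linear = adjusted_r_squared(sse, ssto, n, p=2)     # intercept + X
r2_adj_quadratic = adjusted_r_squared(sse, ssto, n, p=3)  # intercept + X + X^2
print(r2_adj_linear, r2_adj_quadratic)
```

With the same SSE, the model with more parameters gets the lower adjusted value, which is the penalty the adjustment is meant to apply.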

Homework 8
Due Nov. 23rd
For any problem that requires using SAS, turn in your code and output. You must
also indicate what your answers are, not just turn in SAS output. For all hypothesis
tests, assume alpha=.05 unless it is stated otherwise.
1) For the

Standardized Multiple Regression
Given a fitted multiple regression model, a user typically wants to
compare predictors in terms of the magnitudes of their effects on the
response variable.
For example, given a model for gas consumption of a car with the

Multiple Regression
Recall our model for multiple regression is
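$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i$
And assuming that $E\{\varepsilon_i\} = 0$, the regression function is
$E\{Y\} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_{p-1} X_{p-1}$.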
To fit the model, we use the least squares method. The LS est

Inferences on $\beta_1$
We are assuming the normal error regression model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
where
$\beta_0$ and $\beta_1$ are parameters
$X_i$ are known constants
$\varepsilon_i$ are independent $N(0, \sigma^2)$
Frequently, we are interested in drawing inferences about $\beta_1$, the
slope of the regression line.
Infe

How different is our sample from the
true population due to chance?
[Figure: a population and a sample drawn from it]
Sampling variability
Sampling variability: The variability among random samples
from the same population
[Figure: a population and three samples drawn from it: Sample1, Sample2, Sample3]
A sampling distribution of a
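Sampling variability can be seen directly by simulation. In this sketch the population (normal with mean 20 and standard deviation 10), the sample size of 25, and the seed are all assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical population of 10,000 values.
population = [rng.gauss(20, 10) for _ in range(10_000)]


def sample_mean(pop, n, rng):
    """Mean of a simple random sample of size n drawn without replacement."""
    return statistics.mean(rng.sample(pop, n))


# Three random samples from the same population give three different means.
sample_means = [sample_mean(population, 25, rng) for _ in range(3)]
print(sample_means)

# Repeating the sampling many times approximates the sampling distribution
# of the sample mean; its spread is much smaller than the population's.
many_means = [sample_mean(population, 25, rng) for _ in range(2000)]
print(statistics.stdev(many_means), statistics.pstdev(population))
```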

Inferences on $\beta_1$
Recall that
$t^* = \dfrac{b_1 - \beta_1}{s\{b_1\}} \sim t(n-2)$
We use this t distribution of our statistic to do hypothesis tests.
What value do we use for $\beta_1$ in this statistic and why?
If we want to show there is a linear association between X and Y,
what are ou
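As a minimal sketch of the statistic above, assuming the hypotheses of interest are $H_0\colon \beta_1 = 0$ vs $H_a\colon \beta_1 \neq 0$, the value substituted for $\beta_1$ is the one specified under the null hypothesis. The data and names are hypothetical.

```python
import math


def slope_t_statistic(x, y, beta1_null=0.0):
    """t* = (b1 - beta1_null) / s{b1}, where s{b1}^2 = MSE / Sxx."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)              # n - 2 degrees of freedom for error
    s_b1 = math.sqrt(mse / sxx)      # estimated standard deviation of b1
    return (b1 - beta1_null) / s_b1


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
t_star = slope_t_statistic(x, y)
print(t_star)
```

The result is compared with critical values of the t distribution with n - 2 degrees of freedom.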

Simple Linear Regression
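$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
The response $Y_i$ is the sum of two parts: (1) the constant term $\beta_0 + \beta_1 X_i$
and (2) the random term $\varepsilon_i$.
Thus $Y_i$ is a random variable.
Since $E\{\varepsilon_i\} = 0$ and $\beta_0 + \beta_1 X_i$ is a constant term, we have that
$E\{Y_i\} = E\{\beta_0 + \beta_1 X_i + \varepsilon_i\}$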

Estimation of regression function
The Gauss-Markov theorem states that the least squares estimators
$b_0$ and $b_1$ are unbiased and have minimum variance among all
unbiased linear estimators.
Thus $E\{b_0\} = \beta_0$ and $E\{b_1\} = \beta_1$
It can be shown that b0 and b1 are lin
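The preview is truncated here. Assuming the sentence goes on to describe the estimators as linear combinations of the $Y_i$, the slope can be written as $b_1 = \sum k_i Y_i$ with $k_i = (X_i - \bar{X}) / \sum (X_j - \bar{X})^2$. The sketch below checks this numerically on hypothetical data.

```python
# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Usual least squares formula for the slope.
b1_direct = sxy / sxx

# The same slope written as a linear combination of the responses.
k = [(xi - x_bar) / sxx for xi in x]
b1_linear = sum(ki * yi for ki, yi in zip(k, y))

# Two properties of the weights that underlie the unbiasedness of b1:
sum_k = sum(k)                                   # equals 0
sum_kx = sum(ki * xi for ki, xi in zip(k, x))    # equals 1

print(b1_direct, b1_linear, sum_k, sum_kx)
```

Because the weights depend only on the fixed $X_i$, taking expectations term by term gives $E\{b_1\} = \beta_1 \sum k_i X_i + \beta_0 \sum k_i = \beta_1$.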