How different is our sample from the true
population due to chance?
Sampling variability: The variability among random samples
from the same population
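Sampling variability can be seen in a small simulation. The sketch below is illustrative only: the population (normal, mean 50, standard deviation 10), the sample size of 25, and the number of samples are all made-up assumptions, not values from these notes.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Simulated population; the mean and standard deviation are invented for illustration.
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw several random samples from the same population and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 25))
    for _ in range(5)
]

print("population mean:", round(population_mean, 2))
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean:", round(m, 2))
```

Each sample mean differs from the others and from the population mean, even though every sample comes from the same population.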
A sampling distribution of a
A predictor variable may be redundant, in the sense that it
overlaps with other variables. That is, it can be predicted well by
other predictor variables and doesn't add much new to the model.
For example, suppose we have two species, gia
For a set of $p-1$ predictors, there are $2^{p-1}$ possible regression
models that can be constructed, based on the fact that each
predictor can be either included or excluded from a particular
For example, if there are 4 predictors, then
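The include-or-exclude counting argument can be checked by enumerating every subset of predictors. This is a minimal sketch; the predictor names `X1` to `X4` and the helper `enumerate_models` are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def enumerate_models(predictors):
    """Return every subset of the predictors, including the empty (intercept-only) one."""
    models = []
    for size in range(len(predictors) + 1):
        for subset in combinations(predictors, size):
            models.append(subset)
    return models


# Hypothetical predictor names, used only for illustration.
predictors = ["X1", "X2", "X3", "X4"]
models = enumerate_models(predictors)

# Each predictor is either in or out, so the count is 2 ** (number of predictors).
print(len(models), 2 ** len(predictors))
```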
Recall our model for multiple regression is
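$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i$$
And assuming that $E\{\varepsilon_i\} = 0$, the regression function is
$$E\{Y\} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_{p-1} X_{p-1}.$$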
To fit the model, we use the least squares method. The LS est
We will explain the general linear test approach in terms of the simple
linear regression model and testing $H_0: \beta_1 = 0$ vs $H_a: \beta_1 \neq 0$.
The general linear test approach involves three basic steps: the full
model, the reduced model,
Inferences on $\beta_1$
$$\frac{b_1 - \beta_1}{s\{b_1\}} \sim t(n-2)$$
We use this t distribution of our test statistic to do hypothesis tests.
What value do we use for $\beta_1$ in this statistic and why?
If we want to show there is a linear association between X and Y,
Prediction of mean of new observations
Occasionally, we may want to predict the mean of m new
observations for a given level of a predictor variable.
For example, suppose the Toluca Company has been asked to
bid on a contract that calls for m=3 production
There are many situations where a single predictor variable in the model
would provide an inadequate description since a number of key
variables affect the response variable.
Furthermore, in these situations, predictions of the response
Simple Linear Regression
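$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
The response $Y_i$ is the sum of two parts: (1) the constant term $\beta_0 + \beta_1 X_i$
and (2) the random term $\varepsilon_i$.
Thus $Y_i$ is a random variable.
Since $E\{\varepsilon_i\} = 0$ and $\beta_0 + \beta_1 X_i$ is a constant term, we have that
$$E\{Y_i\} = E\{\beta_0 + \beta_1 X_i + \varepsilon_i\}$$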
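Written out step by step, the expectation uses only the linearity of expectation and the assumption that the error term has mean zero. The following is a sketch of that chain, with $\beta_0$, $\beta_1$ as the regression parameters and $\varepsilon_i$ as the random error term:

```latex
\begin{align*}
E\{Y_i\} &= E\{\beta_0 + \beta_1 X_i + \varepsilon_i\} \\
         &= \beta_0 + \beta_1 X_i + E\{\varepsilon_i\} \\
         &= \beta_0 + \beta_1 X_i
\end{align*}
```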
The major uses of regression analysis are making inferences about
regression parameters, estimating the mean response for a given X,
and predicting a new observation Y for a given X.
We will now discuss some cautions
Lack of Fit
We use a lack of fit test to determine whether a specific type of
regression function adequately fits the data.
The test assumes that the observations Y for a given X are
independent and normally distributed and that the distributions
1. Of the last 500 customers entering a supermarket, 50 have
purchased a wireless phone. If the relative frequency
approach for assigning probabilities is used, the probability
that the next customer will purchase a wireless phone is
CELL PHONE USERS: GET ON WIFI, not on 4G!
Log on to Squarecap.com
What is one thing you hope to learn in this class?
Office hours begin this week
Day 1 Survey due Tuesday
Pay for your Squarecap subscription!
Put up a picture (of
Polynomial regression models have two basic types of uses:
When the true curvilinear response function is really a polynomial
When the true curvilinear response function is unknown (or
complex) but a polynomial
Given a fitted multiple regression model, a user may want to
compare the magnitudes of the effects of predictors on the response
For example, for a model for gas consumption of a car with the car
weight and horse
Sums of Squares
When we wish to test whether a predictor variable Xk has a linear
effect on Y, we test
$H_0: \beta_k = 0$ vs $H_a: \beta_k \neq 0$
We already know that using a t-test is appropriate for this test.
Equivalently, we can use the general linear test approach,
We want to estimate parameters from our data. A point estimate is
a single number that is used to estimate the parameter.
An interval estimate is an interval of numbers around the point
estimate, within which the parameter value is believ
Like a confidence interval for a proportion, the confidence interval for a
mean has the form: point estimate ± margin of error.
The margin of error for $\mu$ is $z_{1-\alpha/2} \cdot \sigma_{\bar{y}}$, where
Thus, in the long run
$$\bar{y} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$$
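A 95% interval of the form "point estimate ± margin of error" can be computed directly. The sketch below assumes the population standard deviation is known and uses the multiplier 1.96; the sample values, the value of `sigma`, and the helper name `z_interval` are invented for illustration.

```python
import math


def z_interval(sample, sigma, z=1.96):
    """Return (low, high) for: sample mean +/- z * sigma / sqrt(n)."""
    n = len(sample)
    y_bar = sum(sample) / n
    margin = z * sigma / math.sqrt(n)
    return y_bar - margin, y_bar + margin


# Made-up measurements and an assumed known standard deviation.
sample = [4.8, 5.1, 5.3, 4.9, 5.0, 5.2, 4.7, 5.0, 5.1]
sigma = 0.3

low, high = z_interval(sample, sigma)
print("95% interval:", round(low, 3), "to", round(high, 3))
```

When the standard deviation has to be estimated from the sample, a t multiplier is normally used in place of 1.96.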
Relations between variables
A functional relationship between two variables is expressed by a
For an independent variable X and a dependent variable Y, a
functional relationship has the form
For example, if we sell widgets fo
Suppose we have
Ho: It is not raining
Ha: It is raining
Now suppose everyone is soaking wet.
We then conclude that it is extremely unlikely that, if the null were
true and it was not raining, everyone would be soaking wet.
The Gauss-Markov theorem states that the least squares estimators
b0 and b1 are unbiased and have minimum variance among all
unbiased linear estimators.
Thus $E\{b_0\} = \beta_0$ and $E\{b_1\} = \beta_1$
It can be shown that b0 and b1 are lin
The maximum likelihood estimate (mle) of a parameter is the
parameter value for which the probability of observing our data is
For instance, say weights follow a normal distribution and the average
weight of a samp
Inferences on $\beta_1$
We are assuming the normal error regression model
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
$\beta_0$ and $\beta_1$ are parameters
$X_i$ are known constants
$\varepsilon_i$ are independent $N(0, \sigma^2)$
Frequently, we are interested in drawing inferences about $\beta_1$, the
slope of the regression line.
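Inference about the slope can be sketched numerically under the normal error model. The code below computes the least squares slope b1, its estimated standard error, and the t statistic for a hypothesized slope value. The data and the function name `slope_t_statistic` are made up for illustration.

```python
import math


def slope_t_statistic(x, y, beta1_null=0.0):
    """Return (b1, s{b1}, t*) for simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least squares slope
    b0 = y_bar - b1 * x_bar             # least squares intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)                 # error variance estimate, n - 2 df
    se_b1 = math.sqrt(mse / sxx)        # estimated standard error of b1
    t_star = (b1 - beta1_null) / se_b1  # compared against t(n - 2)
    return b1, se_b1, t_star


# Invented data, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

b1, se_b1, t_star = slope_t_statistic(x, y)
print("b1 =", round(b1, 4), " s{b1} =", round(se_b1, 4), " t* =", round(t_star, 2))
```

With n = 6 there are 4 degrees of freedom, so a two-sided test at the 5% level compares |t*| with the 0.975 quantile of t(4), about 2.776.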
Whenever data are obtained in a time sequence or some other type of
sequence, such as for adjacent geographic areas, it is a good idea to
prepare a plot of the residuals against time, called a sequence plot of
The purpose of plo
Estimation of $E\{Y_h\}$
A common objective is to estimate the mean for one or more
conditional probability distributions of Y. That is, to estimate the
mean of Y for a given value of X.
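The point estimate of the mean response at a given level X_h is the fitted line evaluated at X_h. A minimal sketch, assuming a simple linear regression fitted by least squares; the data, the level `x_h`, and the function names are invented for illustration.

```python
def fit_line(x, y):
    """Return the least squares intercept b0 and slope b1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def estimate_mean_response(x, y, x_h):
    """Point estimate of the mean of Y at X = x_h: b0 + b1 * x_h."""
    b0, b1 = fit_line(x, y)
    return b0 + b1 * x_h


# Invented data and an invented level of X, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
x_h = 4

y_hat_h = estimate_mean_response(x, y, x_h)
print("estimated mean response at X =", x_h, "is", round(y_hat_h, 3))
```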
For example, the Toluca Company was interested in the mean
number of wo
Degrees of Freedom
Corresponding to the partitioning of the total sum of squares
SSTO, there is a partitioning of the associated degrees of freedom.
There are n-1 df associated with SSTO. We lose a degree
because we are using the sample
Suppose we run a regression analysis and we want to create 95%
confidence intervals for both $\beta_0$ and $\beta_1$.
If we were to create two separate confidence intervals, they would not
provide 95% confidence that both $\beta_0$ and $\beta_1$ were in their res
General Linear Model
in Matrix Terms
In matrix terms, the general linear regression model is* (the
dimensions in the book are wrong)
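$$\mathbf{X} = \begin{bmatrix}
1 & X_{11} & X_{12} & \cdots & X_{1,p-1} \\
1 & X_{21} & X_{22} & \cdots & X_{2,p-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & X_{n1} & X_{n2} & \cdots & X_{n,p-1}
\end{bmatrix}$$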
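The design matrix has one row per observation: a leading 1 for the intercept, followed by that observation's predictor values. The sketch below builds such a matrix and solves the least squares normal equations (X'X)b = X'Y in plain Python. The data are invented and noise-free (generated as Y = 1 + 2 X1 + 3 X2) so the result is easy to check, and all helper names are hypothetical.

```python
def design_matrix(rows):
    """Prepend the intercept column of 1s to each row of predictor values."""
    return [[1.0] + [float(v) for v in row] for row in rows]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, rhs):
    """Solve a z = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


# Invented predictor values (X1, X2) and noise-free responses Y = 1 + 2*X1 + 3*X2.
predictors = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [9.0, 8.0, 19.0, 18.0, 29.0]

X = design_matrix(predictors)                            # n rows, p = 3 columns
Xt = transpose(X)
XtX = matmul(Xt, X)                                      # p x p
XtY = [row[0] for row in matmul(Xt, [[v] for v in y])]   # length p
b = solve(XtX, XtY)                                      # least squares estimates
print([round(v, 6) for v in b])
```

With n observations and p - 1 predictors, X has n rows and p columns, so X'X is p by p.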
for lack of fit
If a simple linear regression model is not appropriate for the data,
there are two basic choices:
Use a more appropriate model
Use a transformation on the data so that the linear regression model
is appropriate for t