Time Series Data
STAT 563
Autocorrelation
The fundamental assumptions in linear
regression with regard to the errors are
Mean zero
Constant variance
Uncorrelated
That is,
E(εᵢ) = 0,  Var(εᵢ) = σ²,  E(εᵢεⱼ) = 0 for i ≠ j
For inference purposes, we make the additional
normality assumption
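These assumptions can be checked on the residuals. A minimal sketch (simulated data; the AR(1) coefficient 0.8 and the sample size are arbitrary choices for illustration) that fits OLS and computes the Durbin-Watson statistic, a standard check for autocorrelated errors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a regression with AR(1) errors (rho = 0.8),
# deliberately violating the "uncorrelated errors" assumption.
n = 200
x = np.linspace(0, 10, n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.8 * e[t - 1] + rng.normal(scale=1.0)
y = 2.0 + 0.5 * x + e

# Fit ordinary least squares and compute residuals.
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

# Durbin-Watson statistic: near 2 suggests uncorrelated errors,
# near 0 suggests strong positive autocorrelation.
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
print(f"Durbin-Watson statistic: {dw:.2f}")
```

With positively autocorrelated errors the statistic falls well below 2, which is the signature one looks for in practice.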
Logistic Regression
STAT 563
General Linear Models
Family of Regression Models
The outcome variable determines the choice of
the model
Example
Data
AGE in years
Presence or absence of evidence of significant coronary
heart disease (CHD) for 100 subjects
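A sketch of fitting a logistic regression to data of this shape. The real 100-subject AGE/CHD dataset is not reproduced here, so the data below are simulated with made-up coefficients; the fit uses Newton-Raphson on the log-likelihood rather than any particular package:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-in for AGE/CHD data (hypothetical true model:
# logit P(CHD) = -5 + 0.1 * AGE).
n = 100
age = rng.uniform(20, 70, n)
p_true = 1.0 / (1.0 + np.exp(-(-5.0 + 0.1 * age)))
chd = rng.binomial(1, p_true)

# Fit logistic regression by Newton-Raphson.
X = np.column_stack([np.ones(n), age])
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))   # fitted probabilities
    grad = X.T @ (chd - p)                # score vector
    W = p * (1 - p)
    hess = X.T @ (X * W[:, None])         # observed information
    beta += np.linalg.solve(hess, grad)

print("Estimated (intercept, slope):", beta)
```

A positive estimated slope means the odds of CHD increase with age, which is the usual finding for this example.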
Binomial Distribution
Example
Assume 5% of the population has Coronary Heart
Disease (CHD). If we pick 500 people at random, the
number with CHD follows a Binomial(n = 500, p = 0.05)
distribution.
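The binomial pmf and moments can be computed directly; a sketch for this example (the "at most 20 cases" query is just an illustrative question):

```python
from math import comb

# Number of CHD cases among 500 randomly chosen people,
# assuming prevalence p = 0.05: X ~ Binomial(n = 500, p = 0.05).
n, p = 500, 0.05

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

mean = n * p                 # E(X) = np = 25
var = n * p * (1 - p)        # Var(X) = np(1-p) = 23.75
p_at_most_20 = sum(binom_pmf(k, n, p) for k in range(21))
print(mean, var, round(p_at_most_20, 4))
```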
Nonlinear Regression
STAT 563
Regression Model
Recall that we can write the normal theory regression model as
y = f(x, θ) + ε
where x is an n-vector of input variables, θ is a k-vector of
parameters, and the errors ε are independent N(0, σ²)
The assumption that
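As an illustration of fitting such a model, a sketch using SciPy's `curve_fit` with a hypothetical exponential mean function f(x, θ) = a·exp(bx); the function, the true parameter values, and the simulated data are all assumptions for illustration, not from the text:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2)

# Hypothetical nonlinear mean function with theta = (a, b).
def f(x, a, b):
    return a * np.exp(b * x)

# Simulated data from f with a = 2, b = 0.5 and N(0, 0.2^2) errors.
x = np.linspace(0, 4, 50)
y = f(x, 2.0, 0.5) + rng.normal(scale=0.2, size=x.size)

# Nonlinear least squares (Levenberg-Marquardt by default);
# p0 is the starting guess for theta.
theta_hat, cov = curve_fit(f, x, y, p0=[1.0, 0.1])
print("theta_hat:", theta_hat)
```

Unlike the linear case, the estimates are found iteratively, so a reasonable starting value `p0` matters.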
Robust Regression
STAT 563
Introduction
Under least squares regression for the model
Y = Xβ + ε, we assume that ε ~ N(0, σ²I)
Departures from this assumption can be
spotted in the behavior of residuals and can
be corrected, say, using transformations.
If the depa
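One common robust alternative is Huber M-estimation, computed by iteratively reweighted least squares. A sketch with simulated data (the contamination pattern and tuning constant c = 1.345, the usual choice for 95% efficiency under normality, are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Linear data with a few gross outliers in y, where OLS is distorted.
n = 100
x = np.linspace(0, 10, n)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=n)
y[::20] += 30.0                    # contaminate every 20th point

X = np.column_stack([np.ones(n), x])

# Huber M-estimation via iteratively reweighted least squares (IRLS).
c = 1.345
beta = np.linalg.lstsq(X, y, rcond=None)[0]    # start from OLS
for _ in range(50):
    r = y - X @ beta
    s = np.median(np.abs(r)) / 0.6745          # robust scale (MAD)
    u = np.abs(r / s)
    w = np.minimum(1.0, c / np.maximum(u, 1e-12))  # Huber weights
    WX = X * w[:, None]
    beta = np.linalg.solve(X.T @ WX, WX.T @ y)

print("Huber estimate:", beta)
```

The outliers receive small weights, so the slope stays close to the value that generated the clean data.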
Chapter 11
Multicollinearity
Recap
When the predictors are correlated or
exhibit near-linear dependencies, we face
the problem of multicollinearity
The primary sources of multicollinearity
The data collection method employed
Constraints on the model or
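Multicollinearity is usually diagnosed with variance inflation factors. A sketch (the near-linear dependence between the first two predictors is built in deliberately; the VIF > 10 cutoff is a common rule of thumb):

```python
import numpy as np

rng = np.random.default_rng(4)

# Two nearly collinear predictors plus one independent predictor.
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # near-linear dependence
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing x_j
# on the other predictors; equivalently, the diagonal of the
# inverse of the predictor correlation matrix.
R = np.corrcoef(X, rowvar=False)
vif = np.diag(np.linalg.inv(R))
print("VIFs:", np.round(vif, 1))
```

The two dependent predictors show very large VIFs while the independent one stays near 1.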
Chapter 10
Validation of Regression Models
10.1 Introduction
What a regression equation was created
for may not always be what it is used for.
Model Adequacy Checking: residual
analysis, lack-of-fit testing, determining
influential observations. Check
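One simple validation device is to fit on one portion of the data and measure predictive performance on data the model never saw. A sketch with simulated data (the 80/40 split is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(12)

# Simulated linear data; fit on a training split, validate on the rest.
n = 120
x = rng.uniform(0, 10, n)
y = 4.0 + 1.2 * x + rng.normal(size=n)

train = np.arange(n) < 80
test = ~train
Xtr = np.column_stack([np.ones(train.sum()), x[train]])
beta, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)

# Prediction R^2 on the held-out split.
Xte = np.column_stack([np.ones(test.sum()), x[test]])
pred = Xte @ beta
ss_res = np.sum((y[test] - pred) ** 2)
ss_tot = np.sum((y[test] - y[test].mean()) ** 2)
r2_pred = 1 - ss_res / ss_tot
print(f"prediction R^2 = {r2_pred:.2f}")
```

A prediction R² far below the in-sample R² is a warning that the model is being used outside what it was built for.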
Chapter 9
Variable Selection and
Model Building
9.1 Introduction
9.1.1 The Model-Building Problem
Two conflicting goals in regression model building:
1. Want as many regressors as possible so that the
information content in the variables will influence y
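One standard compromise between these goals is stepwise search. A sketch of greedy forward selection on simulated data (the 5% relative-RSS stopping rule is a crude illustrative choice, not a recommendation from the text):

```python
import numpy as np

rng = np.random.default_rng(5)

# Five candidate regressors; only the first two actually affect y.
n, k = 80, 5
X = rng.normal(size=(n, k))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=n)

def rss(cols):
    """Residual sum of squares for an intercept-plus-cols model."""
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return r @ r

# Forward selection: at each step add the regressor that most
# reduces RSS; stop when the relative improvement is small.
selected, remaining = [], list(range(k))
current = rss(selected)
while remaining:
    best = min(remaining, key=lambda j: rss(selected + [j]))
    new = rss(selected + [best])
    if (current - new) / current < 0.05:   # crude stopping rule
        break
    selected.append(best)
    remaining.remove(best)
    current = new

print("Selected regressors:", sorted(selected))
```

In practice the stopping rule would use a partial F statistic or a criterion such as AIC rather than a fixed percentage.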
Chapter 8
Indicator Variables
General Concept
Generally multiple regression
accommodates only quantitative variables
Dummy (or indicator) variables are useful
in incorporating qualitative (or categorical)
variables in the model
Simplest Case
Inclusion
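In the simplest case, a single 0/1 indicator for a two-level factor shifts the intercept of the regression line. A sketch with simulated data (the two-level factor and all coefficients are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(6)

# Quantitative predictor x plus a two-level qualitative factor
# coded as an indicator d in {0, 1}.
n = 60
x = rng.uniform(0, 10, n)
d = np.arange(n) % 2               # alternate the two groups
y = 5.0 + 2.0 * x + 4.0 * d + rng.normal(size=n)

# The indicator shifts the intercept: the group-0 line is 5 + 2x,
# the group-1 line is (5 + 4) + 2x, with a common slope.
X = np.column_stack([np.ones(n), x, d])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("intercept, slope, group effect:", np.round(beta, 2))
```

Adding a d·x product term would additionally let the slope differ between groups.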
Chapter 6
Diagnostics for Leverage
and Influence
Regression Analysis 4e, Montgomery, Peck & Vining
6.1 Importance of Detecting Influential
Observations
Leverage point: an unusual x-value that, by
itself, has very little effect on the regression
coefficients.
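Leverage is measured by the diagonal of the hat matrix. A sketch with one deliberately extreme x-value (the 2p/n cutoff is the usual rule of thumb):

```python
import numpy as np

rng = np.random.default_rng(7)

# Twenty points with x in [0, 1] plus one leverage point at x = 10.
x = np.append(rng.uniform(0, 1, 20), 10.0)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=21)

X = np.column_stack([np.ones(21), x])

# Hat matrix H = X (X'X)^{-1} X'; the diagonal h_ii measures how
# far observation i's x-value lies from the bulk of the data.
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)

# Rule of thumb: flag observations with h_ii > 2p/n.
n, p = X.shape
print("max leverage:", round(h.max(), 3), "cutoff:", round(2 * p / n, 3))
```

Note that the diagonal elements sum to p, so a leverage near 1 means one point nearly determines its own fitted value.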
Chapter 5
Transformations and Weighting
Model Assumptions
Common violations are:
Expression for the expected value of Y is not correct
The variance is not constant over the range of the
data
The data are not normally distributed
One remedy to all the
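A transformation is often that remedy. A sketch showing how a log transform can stabilize a variance that grows with the mean (the multiplicative-error model and group sizes below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(8)

# Response whose standard deviation grows proportionally with its
# mean -- a common violation of the constant-variance assumption.
x = np.repeat([1.0, 2.0, 4.0, 8.0], 200)
mu = 10.0 * x
y = mu * np.exp(rng.normal(scale=0.2, size=x.size))

# Compare group standard deviations before and after the transform.
for label, v in [("raw", y), ("log", np.log(y))]:
    sds = [v[x == g].std() for g in [1.0, 2.0, 4.0, 8.0]]
    print(label, np.round(sds, 2))
```

On the raw scale the spread grows roughly eight-fold across the groups; on the log scale it is nearly constant, so ordinary least squares assumptions are far more plausible after transforming.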
Chapter 4
Model Adequacy Checking
4.1 Introduction
Assumptions
1. Relationship between response and regressors
is linear (at least approximately).
2. Error term ε has zero mean
3. Error term ε has constant
Chapter 3
Multiple Linear Regression
3.1 Multiple Regression Models
Suppose that the yield in pounds of conversion in
a chemical process depends on temperature and
the catalyst concentration. A multiple
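A sketch of fitting such a two-regressor model; the data below are simulated with made-up coefficients, not the textbook's chemical-process data:

```python
import numpy as np

rng = np.random.default_rng(9)

# Hypothetical yield data: yield (lb) depends on temperature and
# catalyst concentration (all coefficients are invented).
n = 50
temp = rng.uniform(150, 200, n)
conc = rng.uniform(1, 5, n)
yield_lb = 10.0 + 0.3 * temp + 5.0 * conc + rng.normal(scale=2.0, size=n)

# Multiple linear regression: y = b0 + b1*temp + b2*conc + e,
# solved by least squares on the design matrix [1, temp, conc].
X = np.column_stack([np.ones(n), temp, conc])
beta, *_ = np.linalg.lstsq(X, yield_lb, rcond=None)
print("b0, b1, b2 =", np.round(beta, 2))
```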
Tests on Subsets
Consider Y = Xβ + ε where
Y is n×1
X is n×p, p = k + 1
β is p×1
ε is n×1
We need to determine whether some subset of
r < k predictors contributes significantly to the
regression model.
Tests on subsets
Partition β as
β = [β₁′, β₂′]′
where β₁ is (p − r)×1 and β₂ is r×1
We wish to test
H0: β₂ = 0
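The test compares the residual sums of squares of the full and reduced models (the extra-sum-of-squares, or partial, F test). A sketch with simulated data in which the tested subset truly contributes nothing, so F should look like an F(r, n − p) draw:

```python
import numpy as np

rng = np.random.default_rng(10)

# Full model has k = 4 regressors; test whether the last r = 2
# contribute, i.e. H0: beta_2 = 0, via the partial F statistic.
n, k, r = 100, 4, 2
X = rng.normal(size=(n, k))
y = 1.0 + 2.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(size=n)

def rss(Z):
    """Residual sum of squares of the least-squares fit of y on Z."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    e = y - Z @ beta
    return e @ e

full = np.column_stack([np.ones(n), X])
reduced = full[:, : 1 + (k - r)]       # drop the r tested columns
p = k + 1

# F = [ (RSS_reduced - RSS_full) / r ] / [ RSS_full / (n - p) ]
F = ((rss(reduced) - rss(full)) / r) / (rss(full) / (n - p))
print(f"F = {F:.2f} on ({r}, {n - p}) df")
```

Under H0 this statistic follows an F(r, n − p) distribution, so large values lead to rejecting H0 and keeping the subset in the model.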
Chapter 2
Simple Linear Regression
2.1 Simple Linear Regression Model
Single regressor, x; response, y
y = β₀ + β₁x + ε
Population
regression model
β₀, the intercept: if x = 0 is in the range of the data,
then β₀ is the mean of y when x = 0
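The least-squares estimates have closed forms, b₁ = Sxy/Sxx and b₀ = ȳ − b₁x̄. A sketch computing them on simulated data (the true coefficients 3 and 1.5 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(11)

# Simulated data from y = 3 + 1.5 x + e with N(0, 1) errors.
n = 40
x = rng.uniform(0, 10, n)
y = 3.0 + 1.5 * x + rng.normal(size=n)

# Closed-form least-squares estimates:
#   b1 = Sxy / Sxx,  b0 = ybar - b1 * xbar
xbar, ybar = x.mean(), y.mean()
Sxx = np.sum((x - xbar) ** 2)
Sxy = np.sum((x - xbar) * (y - ybar))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
```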