STAT5044: Regression and Anova
Inyoung Kim
Outline
1. Goal of Regression and Anova class
2. Course structure
Goal of this class
We will learn how to describe the relationship between two quantitative
(numerical) variables or among variables.
R
STAT5044: Regression and Anova
Inyoung Kim
Outline
1. Collinearity
Collinearity
A near-linear relationship (high correlation coefficient) among
covariates
Does not reduce bias much (because it can be explained roughly
by a linear combination of other covaria
STAT5044: Regression and Anova
Inyoung Kim
Outline
1. Polynomial Regression
Polynomial Regression (nonlinearity)
By a Taylor series expansion, a smooth function can be approximated by a polynomial.
$m$th-order polynomial:
$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_m x_i^m + \epsilon_i$,
$\epsilon_i \sim N(0, \sigma^2)$.
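As a rough illustration of the polynomial model above, the following sketch fits an $m$th-order polynomial by ordinary least squares using the normal equations $(X^\top X)\hat\beta = X^\top y$. The function names (`solve_linear_system`, `fit_polynomial`) and the data are hypothetical and chosen only for this example; the course labs use R, so this is a Python sketch under those assumptions rather than the course's own code.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations update both sides together.
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_polynomial(xs, ys, m):
    """Least-squares estimates (b0, ..., bm) for an m-th order polynomial."""
    # Each design-matrix row is (1, x, x^2, ..., x^m).
    design = [[x ** k for k in range(m + 1)] for x in xs]
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(m + 1)]
           for j in range(m + 1)]
    xty = [sum(row[j] * y for row, y in zip(design, ys))
           for j in range(m + 1)]
    return solve_linear_system(xtx, xty)


# Made-up, noise-free data generated from y = 1 + 2x + 3x^2 (illustration only).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
coef = fit_polynomial(xs, ys, m=2)
print([round(b, 6) for b in coef])
```

Because the data here contain no noise, the fitted coefficients should match the generating values up to rounding. With real data, the powers of $x$ are often highly correlated, so the normal equations can be ill-conditioned for large $m$; centering $x$ or using orthogonal polynomials is a common remedy.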
The
STAT5044: Regression and Anova
Inyoung Kim
Outline
1. Regression
2. Simple Linear regression
3. Basic concepts in regression
4. How to estimate unknown parameters
5. Properties of Least Squares Estimators: Gauss-Markov theorem
Regression
A way to model
STAT5044: lab 8
Inyoung Kim
Outline
1. How to handle collinearity
Example
Car drivers like to adjust the seat position for their own comfort.
Car designers would find it helpful to know whether different
drivers will position the seat depending
STAT5044: Lab4
Inyoung Kim
Outline
1. How to estimate WLS using R
Example
A health researcher, interested in studying the relationship between diastolic
blood pressure and age among healthy adult women 20 to 60 years old,
collected data on 54 subjects. (Te
STAT5044: lab 2
Inyoung Kim
Outline
1. How to estimate the regression line and make inference
Example
A substance used in biological and medical research is shipped
by airfreight to users in cartons of 1,000 ampules.
The data, involving 10 shipments, were
STAT5044: lab 1
Inyoung Kim
Outline
1. How to estimate the regression line in R
Example
A substance used in biological and medical research is shipped
by airfreight to users in cartons of 1,000 ampules.
The data, involving 10 shipments, were collected on t
STAT5044: Regression and ANOVA, Fall 2012
Final Exam on Dec 17
Your Name:
Please make sure to specify all of your notations in each problem
GOOD LUCK!
Problem# 1. Answer each question.
A study was conducted to examine the effectiveness of a program usin
STAT5044: Regression and ANOVA, Fall 2013
Exam 2 on Dec 09
Your Name:
Please make sure to specify all of your notations in each problem
GOOD LUCK!
Problem# 1.
M1: $y_i = \beta_1 x_i + \epsilon_i$
M2: $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$
M3: $y_i = \beta_1 x_i + g_i \epsilon_i$
where $\epsilon_i \sim [0, \sigma^2]$ and $g_i$ is know
STAT5044: Regression and ANOVA, Fall 2013
Final exam on Dec 16
Your Name:
Please make sure to specify all of your notations in each problem
GOOD LUCK!
Problem# 1.
Hastie and Tibshirani (1990) described a study to determine risk factors for kyphosis, sev
STAT5044: Regression and ANOVA, Fall 2011
Final Exam on Dec 14
Your Name:
Please make sure to specify all of your notations in each problem
GOOD LUCK!
Problem# 1.
Consider the following model,
$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{1i}^2 + \beta_3 x_{2i} + \epsilon_i, \quad i = 1, \ldots, n$
where E(i
STAT5044: Regression and Anova
Inyoung Kim
Outline
1. Matrix Expression
2. Linear and quadratic forms
3. Properties of quadratic form
4. Properties of estimates
5. Distributional properties
Matrix Expression
If we have $p$ variables $x_{i1}, \ldots, x_{ip}$ for
STAT5044: Regression and Anova
Inyoung Kim
Outline
1. Prediction
Prediction
Two meanings:
Predict the conditional mean of Y given $x_{new}$:
the estimated conditional mean is $\hat\beta_0 + \hat\beta_1 x_{new}$.
Predict a new observation Y given $x_{new}$
Y
= 0 + 1 xn
STAT5044: Regression and Anova
Inyoung Kim
Outline
1. Multiple Linear Regression
Basic Idea
An extra sum of squares measures the marginal reduction in the error sum
of squares when one or several predictor variables are added to
the regression model,
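
To make the definition above concrete, here is a minimal sketch that computes the extra sum of squares for adding a second predictor to a model that already contains one predictor and an intercept. It uses the common textbook notation SSR(X2 | X1) = SSE(X1) - SSE(X1, X2), which is an assumption here since the slide text is cut off. The data and function names are made up for illustration, and the code is Python rather than the R used in the course labs.

```python
def centered_cross_product(u, v):
    """Sum of (u_i - mean(u)) * (v_i - mean(v))."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def sse_one_predictor(y, x1):
    """Error sum of squares for y = b0 + b1*x1 + error."""
    syy = centered_cross_product(y, y)
    s11 = centered_cross_product(x1, x1)
    s1y = centered_cross_product(x1, y)
    return syy - s1y ** 2 / s11


def sse_two_predictors(y, x1, x2):
    """Error sum of squares for y = b0 + b1*x1 + b2*x2 + error."""
    syy = centered_cross_product(y, y)
    s11 = centered_cross_product(x1, x1)
    s22 = centered_cross_product(x2, x2)
    s12 = centered_cross_product(x1, x2)
    s1y = centered_cross_product(x1, y)
    s2y = centered_cross_product(x2, y)
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("x1 and x2 are perfectly collinear")
    # Slopes from the 2x2 normal equations on the centered variables.
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return syy - b1 * s1y - b2 * s2y


# Made-up data for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 8.2, 7.8, 13.1, 11.9]

sse_reduced = sse_one_predictor(y, x1)      # SSE(X1)
sse_full = sse_two_predictors(y, x1, x2)    # SSE(X1, X2)
extra_ss = sse_reduced - sse_full           # SSR(X2 | X1)
print(sse_reduced, sse_full, extra_ss)
```

Adding a predictor can never increase the error sum of squares, so `extra_ss` is nonnegative up to floating-point error; a large value relative to the full model's mean squared error suggests that `x2` explains variation in `y` beyond what `x1` already explains.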