Chapter05

Transformations and Weighting STAT 563 Spring 2007

Model Assumptions
Common violations are:
• Expression for the expected value of Y is not correct
• The variance is not constant over the range of the data
• The data are not normally distributed
One remedy for all these violations is to transform the data:
• Reasonable to develop a model in terms of some function of the response
• Or transform the predictors
Heteroscedasticity
• Constancy of error variance is one of the standard assumptions (homoscedasticity)
• When the error variance is not constant, the error is said to be heteroscedastic
• Residuals tend to have a funnel-shaped distribution, either fanning out or closing in with the values of X

Hypothetical Example
[Figure: residuals plotted against X]
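A funnel-shaped residual pattern of this kind can be mimicked with simulated data. The following is only a sketch under stated assumptions: the data are hypothetical (not from the slides), and the error standard deviation is taken to be proportional to x, so the residual spread should be larger for large x than for small x.

```r
# Hypothetical simulated data (not from the slides): error sd grows with x
set.seed(1)
n <- 200
x <- runif(n, min = 1, max = 10)
y <- 2 + 3 * x + rnorm(n, mean = 0, sd = 0.5 * x)

# Ordinary least squares fit and its residuals
fit <- lm(y ~ x)
res <- resid(fit)

# Compare the residual spread for the smaller and the larger x values
low     <- x <= median(x)
sd_low  <- sd(res[low])
sd_high <- sd(res[!low])
print(c(sd_low = sd_low, sd_high = sd_high))
```

Calling `plot(x, res)` afterwards would show the residuals fanning out as x increases.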
Heteroscedasticity
We will learn how to:
• Detect heteroscedasticity
• Understand its effects on the analysis
• Remove heteroscedasticity

Simple Example
• The number of injury incidents (y) and the proportion of total flights (n) for nine major airlines in a single year are given
• If $f_i$ denotes the total flights for the $i$th airline, then the proportion of total flights $n_i$ made by the $i$th airline is
  $n_i = f_i / \sum_j f_j$
• If all airlines are equally safe, the injury incidents can be explained by the model
  $y_i = \beta_0 + \beta_1 n_i + \varepsilon_i$

Comments
• Residuals are seen to increase with $n_i$
• The assumption of homoscedasticity seems to be violated
• This is not surprising: injury incidents may behave as a Poisson variable, which has a variance proportional to its mean
• Try a square root transformation of the response
• This made the residual plot a little better, but $R^2$ is still only 48%
• Consider other factors (besides proportion of total flights) for a better explanation of injury incidents
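The effect of the square root transformation on a Poisson response can be checked by simulation. This is a minimal sketch, assuming hypothetical Poisson counts at three arbitrary means (not the airline data): the raw variance should track the mean, while the variance of the square root should stay roughly constant, near 1/4.

```r
# Hypothetical simulation (not the airline data): Poisson counts at several means
set.seed(42)
means <- c(10, 40, 160)
sims  <- lapply(means, function(m) rpois(10000, lambda = m))

# Variance on the raw scale grows with the mean
var_raw <- sapply(sims, var)

# Variance on the square root scale is roughly constant
var_sqrt <- sapply(sims, function(y) var(sqrt(y)))

print(round(rbind(mean = means, var_raw = var_raw, var_sqrt = var_sqrt), 3))
```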

Transformations to stabilize variance
Probability distribution of Y | Var(Y) in terms of its mean $\mu$ | Transformation
Poisson                       | $\mu$                             | $\sqrt{Y}$ or $(\sqrt{Y} + \sqrt{Y+1})$
Binomial                      | $\mu(1-\mu)/n$                    | $\sin^{-1}\sqrt{Y}$
Negative Binomial             | $\mu + \lambda^2\mu^2$            | $\lambda^{-1}\sinh^{-1}(\lambda\sqrt{Y})$
Here the variance is a function of the mean response
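The transformations listed for each distribution can be motivated by a first-order (delta-method) approximation. The sketch below is a standard textbook argument rather than something stated on the slides; the symbols $g$ (the transformation) and $V(\mu)$ (the variance written as a function of the mean) are notation introduced here for illustration.

```latex
% Delta method: the variance of g(Y) is approximately constant when
% g'(\mu) is proportional to 1 / \sqrt{V(\mu)}
\operatorname{Var}\{g(Y)\} \approx [g'(\mu)]^2 \, V(\mu)
\quad\Longrightarrow\quad
g(\mu) \propto \int \frac{d\mu}{\sqrt{V(\mu)}}

% Poisson: V(\mu) = \mu
\int \frac{d\mu}{\sqrt{\mu}} = 2\sqrt{\mu}

% Binomial proportion: V(\mu) = \mu(1-\mu)/n
\int \frac{d\mu}{\sqrt{\mu(1-\mu)}} = 2\sin^{-1}\sqrt{\mu}

% Negative binomial: V(\mu) = \mu + \lambda^2\mu^2
\int \frac{d\mu}{\sqrt{\mu + \lambda^2\mu^2}} = \frac{2}{\lambda}\sinh^{-1}\!\left(\lambda\sqrt{\mu}\right)
```

Constant factors do not affect whether the variance is stabilized, which is why they can be dropped from the transformation.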

Weighted Least Squares
• Used for stabilizing variance when the variance is a simple function of one of the predictors or is dependent on a known set of weights
• For example, based on some empirical evidence, the standard deviation of the residuals is proportional to X:
  $\mathrm{Var}(\varepsilon_i) = k^2 x_i^2, \quad k > 0$
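Weighted Least Squares
• For a simple linear regression model,
  $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
• Divide both sides by $x_i$ and get
  $\frac{y_i}{x_i} = \frac{\beta_0}{x_i} + \beta_1 + \frac{\varepsilon_i}{x_i}$
• Define a new set of variables
  $Y^* = Y/X, \quad X^* = 1/X, \quad \beta_0^* = \beta_1, \quad \beta_1^* = \beta_0, \quad \varepsilon^* = \varepsilon/X$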

WLS
• In terms of the new variables,
  $y_i^* = \beta_0^* + \beta_1^* x_i^* + \varepsilon_i^*$
• Note that for the transformed model
  $\mathrm{Var}(\varepsilon_i^*) = \mathrm{Var}(\varepsilon_i / x_i) = \frac{1}{x_i^2}\mathrm{Var}(\varepsilon_i) = \frac{1}{x_i^2} k^2 x_i^2 = k^2$, which is constant
• If our assumption regarding the variance holds, then we should work with the transformed model
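The link between the transformed model and the name "weighted least squares" can be made explicit by writing out the least squares criterion. This is a short worked step under the slide's assumption that the error variance is $k^2 x_i^2$; the weight symbol $w_i$ is notation introduced here.

```latex
% Ordinary least squares on the transformed variables y_i/x_i and 1/x_i ...
\sum_{i} \left( \frac{y_i}{x_i} - \beta_1 - \frac{\beta_0}{x_i} \right)^2
% ... equals a weighted sum of squares in the original variables
= \sum_{i} \frac{1}{x_i^2} \left( y_i - \beta_0 - \beta_1 x_i \right)^2
= \sum_{i} w_i \left( y_i - \beta_0 - \beta_1 x_i \right)^2,
\qquad w_i = \frac{1}{x_i^2}
```

Observations with large $x_i$, whose errors are more variable, therefore receive less weight.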
Example
Study of 27 industrial establishments
• Number of supervisors (Y)
• Number of workers (X)

Clear indication of increasing trend for residual variance with X
Fit Y/X on 1/X

# Weighted least squares fit with weights 1/x^2
fitw <- lm(super ~ work, weights = 1 / work^2)
# Studentized residuals and fitted values from the weighted fit
e.res <- rstudent(fitw)
e.fit <- fitted(fitw)
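A weighted fit with weights 1/X^2 should give the same coefficient estimates as an ordinary fit of Y/X on 1/X, with the roles of intercept and slope swapped. The sketch below checks this numerically. It assumes simulated data: the values of `work` and `super` generated here, and the constants used to generate them, are hypothetical stand-ins and not the data from the study.

```r
# Hypothetical simulated data standing in for the supervisors/workers study
set.seed(7)
n     <- 27
work  <- round(runif(n, min = 200, max = 1600))
super <- 15 + 0.1 * work + rnorm(n, sd = 0.02 * work)

# Weighted least squares with weights 1 / x^2
fitw <- lm(super ~ work, weights = 1 / work^2)

# Ordinary least squares on the transformed variables Y/X and 1/X
ystar <- super / work
xstar <- 1 / work
fitt  <- lm(ystar ~ xstar)

# In the transformed fit the intercept estimates beta1 and the slope estimates beta0
b_wls   <- unname(coef(fitw))       # ordered as (beta0, beta1)
b_trans <- unname(rev(coef(fitt)))  # reordered to (beta0, beta1)
print(rbind(b_wls, b_trans))
```

The two printed rows should agree up to numerical rounding.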

This note was uploaded on 03/08/2009 for the course 960 563 taught by Professor Unknown during the Spring '07 term at Rutgers.
