Chapter05 - Transformations and Weighting STAT 563 Spring...

Transformations and Weighting STAT 563 Spring 2007

Model Assumptions
  • Common violations are:
      - The expression for the expected value of Y is not correct
      - The variance is not constant over the range of the data
      - The data are not normally distributed
  • One remedy for all of these violations is to transform the data
  • It is reasonable to develop a model in terms of some function of the response
  • Or to transform the predictors

Heteroscedasticity
  • Constancy of the error variance (homoscedasticity) is one of the standard assumptions
  • When the error variance is not constant, the errors are said to be heteroscedastic
  • Residuals then tend to have a funnel-shaped distribution, either fanning out or closing in as X increases
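The funnel shape can be reproduced with simulated data. The following is a minimal sketch, not taken from the lecture: the intercept, slope, noise scale, seed, and the helper `fit_line` are all assumptions chosen for illustration. It fits a straight line to data whose error standard deviation grows with x and compares the residual spread for small and large x.

```python
import random
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

rng = random.Random(563)  # fixed seed so the run is reproducible

# Simulated data (hypothetical): x runs from 1 to 10, error SD is 0.3 * x.
x = [1 + 9 * i / 199 for i in range(200)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 0.3 * xi) for xi in x]

b0, b1 = fit_line(x, y)
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# x is sorted, so the first 100 residuals belong to the smaller x values.
sd_low = statistics.stdev(resid[:100])
sd_high = statistics.stdev(resid[100:])
print(f"residual SD for small x: {sd_low:.2f}, for large x: {sd_high:.2f}")
```

A residual spread that is clearly larger in the upper half of the x range is the numerical counterpart of residuals fanning out in the plot.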

Hypothetical Example
[Figure: residuals (vertical axis) plotted against X (horizontal axis)]

Heteroscedasticity
We will learn:
  • How to detect heteroscedasticity
  • Its effects on the analysis
  • How to remove heteroscedasticity

Simple Example
  • The number of injury incidents (y) and the proportion of total flights (n) are given for nine major airlines in a single year
  • If $f_i$ denotes the total flights for the $i$th airline, then the proportion of total flights $n_i$ made by the $i$th airline is
      $$n_i = \frac{f_i}{\sum_j f_j}$$
  • If all airlines are equally safe, the injury incidents can be explained by the model
      $$y_i = \beta_0 + \beta_1 n_i + \varepsilon_i$$
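To see why a straight line in $n_i$ is the natural model under equal safety, the sketch below computes the proportions and the expected incident counts for a common per-flight rate. The flight counts and the rate are invented for illustration (they are not the lecture's data for the nine airlines), and the variable names are assumptions.

```python
# Hypothetical flight counts for five airlines (invented for illustration;
# these are not the nine airlines from the lecture's data set).
flights = [1200, 850, 400, 2300, 950]

total_flights = sum(flights)
proportions = [f / total_flights for f in flights]  # n_i = f_i / sum_j f_j

# Assumed common injury rate per flight (hypothetical value).
rate_per_flight = 0.002
expected_incidents = [rate_per_flight * f for f in flights]

# Under equal safety the expected count is linear in n_i:
# E(y_i) = (rate * total flights) * n_i.
slope = rate_per_flight * total_flights
for n_i, e_i in zip(proportions, expected_incidents):
    print(f"n_i = {n_i:.3f}  expected = {e_i:.2f}  slope * n_i = {slope * n_i:.2f}")
```

In this idealized setting the intercept is zero and the slope is the common rate times the total number of flights; the fitted model keeps an intercept $\beta_0$ so that the data can show whether that holds.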

Comments
  • Residuals are seen to increase with $n_i$
  • The assumption of homoscedasticity seems to be violated
  • This is not surprising: injury incidents may behave as a Poisson variable, whose variance is proportional to its mean
  • Try a square root transformation of the response
  • This made the residual plot a little better, but $R^2$ is still only 48%
  • Consider other factors (besides the proportion of total flights) for a better explanation of injury incidents
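The effect of the square root transformation on Poisson counts can be checked by simulation. This is a minimal sketch under stated assumptions: the means 5, 20, and 80, the sample size, the seed, and the helper `poisson_draw` (a simple multiplication-based sampler, since the standard library has no Poisson generator) are all choices made here, not part of the lecture.

```python
import math
import random
import statistics

def poisson_draw(rng, mu):
    """Draw one Poisson(mu) value by multiplying uniforms until the
    product falls to exp(-mu) or below (adequate for moderate mu)."""
    limit = math.exp(-mu)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

rng = random.Random(2007)  # fixed seed so the run is reproducible
var_raw, var_sqrt = {}, {}
for mu in (5, 20, 80):
    ys = [poisson_draw(rng, mu) for _ in range(4000)]
    var_raw[mu] = statistics.variance(ys)
    var_sqrt[mu] = statistics.variance([math.sqrt(y) for y in ys])
    print(f"mu = {mu:2d}  Var(Y) = {var_raw[mu]:6.2f}  Var(sqrt(Y)) = {var_sqrt[mu]:.3f}")
```

The variance of Y grows with the mean, while the variance of the square root stays close to 1/4 across all three means, which is what "stabilizing the variance" refers to.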

Transformations to stabilize variance

Probability distribution of Y    Var(Y) in terms of its mean μ    Transformation
Poisson                          $\mu$                            $\sqrt{Y}$ or $\sqrt{Y} + \sqrt{Y+1}$
Binomial                         $\mu(1-\mu)/n$                   $\sin^{-1}\sqrt{Y}$
Negative Binomial                $\mu + \lambda^2\mu^2$           $\lambda^{-1}\sinh^{-1}(\lambda\sqrt{Y})$

Here the variance is a function of the mean response
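One way to see where these transformations come from is the first-order (delta method) approximation, which is a standard argument rather than something stated on the slide. For a smooth transformation $g$, the variance of $g(Y)$ is approximately constant when $g'(\mu)^2$ cancels the dependence of $\mathrm{Var}(Y)$ on $\mu$:

```latex
\mathrm{Var}\bigl[g(Y)\bigr] \approx \bigl[g'(\mu)\bigr]^2 \, \mathrm{Var}(Y)

% Poisson: g(Y) = \sqrt{Y}, \quad g'(\mu) = \frac{1}{2\sqrt{\mu}}
\mathrm{Var}\bigl[\sqrt{Y}\bigr] \approx \frac{1}{4\mu}\,\mu = \frac{1}{4}

% Binomial proportion: g(Y) = \sin^{-1}\sqrt{Y}, \quad
% g'(\mu) = \frac{1}{2\sqrt{\mu(1-\mu)}}
\mathrm{Var}\bigl[\sin^{-1}\sqrt{Y}\bigr]
  \approx \frac{1}{4\mu(1-\mu)} \cdot \frac{\mu(1-\mu)}{n} = \frac{1}{4n}

% Negative binomial: g(Y) = \lambda^{-1}\sinh^{-1}(\lambda\sqrt{Y}), \quad
% g'(\mu) = \frac{1}{2\sqrt{\mu}\,\sqrt{1+\lambda^2\mu}}
\mathrm{Var}\bigl[\lambda^{-1}\sinh^{-1}(\lambda\sqrt{Y})\bigr]
  \approx \frac{\mu + \lambda^2\mu^2}{4\mu(1+\lambda^2\mu)} = \frac{1}{4}
```

In each case the approximate variance no longer depends on $\mu$.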

Weighted Least Squares
  • Used for stabilizing variance when the variance is a simple function of one of the predictors or depends on a known set of weights
  • For example, suppose empirical evidence indicates that the standard deviation of the residuals is proportional to X:
      $$\mathrm{Var}(\varepsilon_i) = k^2 x_i^2, \quad k > 0$$

Weighted Least Squares
  • For a simple linear regression model,
      $$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
  • Divide both sides by $x_i$ to get
      $$\frac{y_i}{x_i} = \frac{\beta_0}{x_i} + \beta_1 + \frac{\varepsilon_i}{x_i}$$
  • Define a new set of variables
      $$Y^* = \frac{Y}{X}, \quad X^* = \frac{1}{X}, \quad \beta_0^* = \beta_1, \quad \beta_1^* = \beta_0, \quad \varepsilon^* = \frac{\varepsilon}{X}$$

WLS
  • In terms of the new variables,
      $$y_i^* = \beta_0^* + \beta_1^* x_i^* + \varepsilon_i^*$$
  • Note that for the transformed model $\mathrm{Var}(\varepsilon_i^*)$ is constant:
      $$\mathrm{Var}(\varepsilon_i^*) = \mathrm{Var}\!\left(\frac{\varepsilon_i}{x_i}\right) = \frac{1}{x_i^2}\,\mathrm{Var}(\varepsilon_i) = \frac{1}{x_i^2}\,k^2 x_i^2 = k^2$$
  • If our assumption regarding the variance holds, then we should work with the transformed model
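Ordinary least squares on the transformed variables gives the same estimates as weighted least squares on the original variables with weights $1/x_i^2$, because both minimize $\sum_i (y_i - \beta_0 - \beta_1 x_i)^2 / x_i^2$. The sketch below checks this numerically on simulated data. The true coefficients, the value of k, the seed, and the helper names `ols` and `wls` are assumptions made for illustration only.

```python
import random
import statistics

def ols(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def wls(x, y, w):
    """Weighted least squares: minimizes sum of w_i * (y_i - b0 - b1 * x_i)^2."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

rng = random.Random(2007)  # fixed seed so the run is reproducible
k = 0.2                    # hypothetical proportionality constant
x = [rng.uniform(1, 20) for _ in range(200)]
y = [3.0 + 1.5 * xi + rng.gauss(0, k * xi) for xi in x]  # error SD is k * x

# Route 1: OLS on the transformed variables y* = y / x and x* = 1 / x.
y_star = [yi / xi for xi, yi in zip(x, y)]
x_star = [1.0 / xi for xi in x]
b0_star, b1_star = ols(x_star, y_star)
beta0_t, beta1_t = b1_star, b0_star  # beta0* = beta1 and beta1* = beta0

# Route 2: WLS on the original variables with weights 1 / x^2.
beta0_w, beta1_w = wls(x, y, [1.0 / xi ** 2 for xi in x])

print(f"transformed OLS: beta0 = {beta0_t:.4f}, beta1 = {beta1_t:.4f}")
print(f"weighted LS:     beta0 = {beta0_w:.4f}, beta1 = {beta1_w:.4f}")
```

Note the role swap when reading the transformed fit: its intercept estimates $\beta_1$ and its slope estimates $\beta_0$.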
Example
  • Study of 27 industrial establishments
  • Number of supervisors (Y)
  • Number of workers (X)

Clear indication of an increasing trend in the residual variance with X