PPT10 Residual Analysis 2

McGill University Advanced Business Statistics MGSC-272
Residual Analysis II
More on outliers

As we saw, an outlier may be due to an error in measurement or data entry. When an outlier is not due to an error but represents an accurate value, we have to investigate the following possible causes in the model itself:

- Omission of important variables in a model
- Omission of higher-order terms
If the outlier represents a real and plausible value in your data set (e.g. the annual income of a multimillionaire in a sample of people who work in the computer industry) it is necessary to decide how to handle the data. One possibility is to exclude the data from the statistical analysis and write an exception report to explain the absence of the extreme value from the data set.
Influential Observations and Leverage

An influential observation is one whose removal would substantially affect the regression equation. Leverage is a measure of how influential an observation is: the larger the leverage value, the more influence the observed y value has on its predicted value.
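To illustrate what "removal would substantially affect the regression equation" means, the sketch below fits a straight line twice: once on all points and once with a single point dropped. The data are made up for illustration (nine points exactly on y = 2x + 1 plus one distant point), and `fit_line` is a hypothetical helper name, not something from the course material.

```python
import numpy as np

# Made-up data: nine points lying exactly on y = 2x + 1,
# plus one observation whose x value is far from the others.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30], dtype=float)
y = np.array([3, 5, 7, 9, 11, 13, 15, 17, 19, 10], dtype=float)

def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x (hypothetical helper)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

# Fit with every observation, then again without the last one.
slope_all, intercept_all = fit_line(x, y)
slope_drop, intercept_drop = fit_line(x[:-1], y[:-1])

print(f"all points:     y = {intercept_all:.3f} + {slope_all:.3f} x")
print(f"last point out: y = {intercept_drop:.3f} + {slope_drop:.3f} x")
```

If the two fitted equations differ markedly, as they do here, the dropped observation is a candidate influential observation.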
Computing leverage

In regression analysis it is known that the predicted value for the i-th observation, $\hat{y}_i$, can be written as a linear combination of the n observed values $y_1, y_2, \dots, y_n$. Thus, for each value $y_i$, i = 1 to n, we have the equation

$$\hat{y}_i = h_1 y_1 + h_2 y_2 + h_3 y_3 + \dots + h_i y_i + \dots + h_n y_n \quad \text{for } i = 1, 2, \dots, n$$

The coefficient $h_i$ measures the influence of the observed $y_i$ value on its own predicted value $\hat{y}_i$. This value $h_i$ is called the leverage of the i-th observation.
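The linear-combination property can be checked numerically. The sketch below assumes the standard matrix formulation of least squares, in which the weights form the "hat matrix" H = X (X'X)^(-1) X' and the leverage of observation i is the i-th diagonal entry of H. The slides do not spell out this matrix form, so treat it as an assumption; `hat_matrix` is a hypothetical helper name and the data are made up.

```python
import numpy as np

def hat_matrix(x):
    """Return (X, H) for a regression with intercept (hypothetical helper).

    x: predictor values, shape (n,) for one predictor or (n, k) for k predictors.
    H: n-by-n matrix whose row i holds the weights that turn the observed
       y values into the fitted value for observation i.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x.reshape(-1, 1)
    n = x.shape[0]
    X = np.column_stack([np.ones(n), x])       # add the intercept column
    H = X @ np.linalg.solve(X.T @ X, X.T)      # X (X'X)^(-1) X'
    return X, H

# Made-up data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 20.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 15.0])

X, H = hat_matrix(x)
leverage = np.diag(H)            # h_i: weight of y_i in its own fitted value
fitted_from_H = H @ y            # each fitted value as a weighted sum of the y's

# Cross-check against an ordinary least-squares fit.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted_from_fit = X @ beta

print("leverages:", np.round(leverage, 3))
print("same fitted values:", np.allclose(fitted_from_H, fitted_from_fit))
```

Under this formulation the leverages depend only on the x values, not on y, and they sum to the number of fitted coefficients (here 2: intercept and slope).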
Minitab says…

Leverage values provide information about whether an observation has unusual predictor values compared to the rest of the data. Leverages are a measure of the distance between the x-values for an observation and the mean of x-values for all observations. A large leverage value indicates that the x-values of an observation are far from the center of x-values for all observations. Observations with large leverage may exert considerable influence on the fitted value, and thus the regression model.
Leverage values fall between 0 and 1. A leverage value greater than 3(k+1)/n, where k is the number of predictors (independent variables) and n is the number of observations, is considered large and should be examined. Minitab identifies observations with leverage over
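As a minimal sketch of the 3(k+1)/n rule, the code below computes leverages for made-up data with one predictor and flags the observations that exceed the cutoff. The leverage computation assumes the usual hat-matrix diagonal, and `flag_large_leverage` is a hypothetical helper name; this is not Minitab's own implementation.

```python
import numpy as np

def flag_large_leverage(x):
    """Return (leverage, cutoff, flagged indices) using the 3(k+1)/n rule.

    Hypothetical helper: x has shape (n,) for one predictor or (n, k) for k.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x.reshape(-1, 1)
    n, k = x.shape
    X = np.column_stack([np.ones(n), x])                    # intercept + predictors
    leverage = np.diag(X @ np.linalg.solve(X.T @ X, X.T))   # hat-matrix diagonal
    cutoff = 3 * (k + 1) / n
    flagged = np.flatnonzero(leverage > cutoff)
    return leverage, cutoff, flagged

# Made-up data: the last x value lies far from the others.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30], dtype=float)

leverage, cutoff, flagged = flag_large_leverage(x)
print("cutoff 3(k+1)/n:", cutoff)
print("flagged observations (0-based index):", flagged)
```

Note that with very small samples the cutoff can exceed 1 (for example n = 5, k = 1 gives 1.2), in which case no observation can be flagged by this rule.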

This note was uploaded on 02/28/2012 for the course MANAGEMENT MGSC 272 taught by Professor Smith during the Spring '12 term at McGill.
