

# PPT10 Residual Analysis 2 - McGill University Advanced...

Residual Analysis II
More on outliers

As we saw, an outlier may be due to an error in measurement or data entry. When an outlier is not due to an error but represents an accurate value, we have to investigate the following possible causes in the model itself: omission of important variables from the model, or omission of higher-order terms.
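As a rough illustration of the second cause, the sketch below fits a straight line to data that actually follow a quadratic curve. The data and the helper `fit_line` are hypothetical and not from the slides. Every observation is exact, yet the end points show the largest residuals and could be mistaken for outliers.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a one-predictor regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data generated exactly from y = x^2 (no measurement error).
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

# Fit a model that omits the squared term.
intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(residuals)  # [5.0, 0.0, -3.0, -4.0, -3.0, 0.0, 5.0]
```

The residuals are positive at both ends and negative in the middle. That curved pattern points to a missing higher-order term rather than to bad data.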

If the outlier represents a real and plausible value in your data set (e.g. the annual income of a multimillionaire in a sample of people who work in the computer industry) it is necessary to decide how to handle the data. One possibility is to exclude the data from the statistical analysis and write an exception report to explain the absence of the extreme value from the data set.
Influential Observations and Leverage

An influential observation is one whose removal would substantially affect the regression equation. Leverage is a measure of how influential an observation is: the larger the leverage value, the more influence the observed y value has on its predicted value.
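The definition can be checked directly by fitting the regression with and without a suspect point. The following is a minimal sketch with made-up data and a hypothetical helper `fit_line`; it assumes a single predictor.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a one-predictor regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: five points on the line y = 2x, plus one point
# that lies far out in x and well off that line.
xs = [1, 2, 3, 4, 5, 20]
ys = [2, 4, 6, 8, 10, 10]

full_fit = fit_line(xs, ys)                 # all six observations
reduced_fit = fit_line(xs[:-1], ys[:-1])    # last observation removed

print("slope with the point:   ", round(full_fit[1], 3))
print("slope without the point:", round(reduced_fit[1], 3))
```

Removing the last point moves the slope from roughly 0.31 back to 2, so by the definition above that observation is influential.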

Computing leverage

In regression analysis it is known that the predicted value for the i-th observation, $\hat{y}_i$, can be written as a linear combination of the n observed values $y_1, y_2, \dots, y_n$. Thus, for each value $y_i$, i = 1 to n, we have the equation

$$\hat{y}_i = h_1 y_1 + h_2 y_2 + h_3 y_3 + \dots + h_i y_i + \dots + h_n y_n \quad \text{for } i = 1, 2, \dots, n$$

The coefficient $h_i$ measures the influence of the observed $y_i$ value on its own predicted value $\hat{y}_i$. This value $h_i$ is called the leverage of the i-th observation.
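The linear-combination form can be verified numerically. The sketch below assumes a regression with an intercept and a single predictor, and uses made-up data; the helper `hat_row` is hypothetical. Because the weights differ for each i, they are often written with two subscripts, $h_{ij}$, and the leverage of observation i is then the weight on its own response, $h_{ii}$.

```python
def hat_row(xs, i):
    """Weights h_i1..h_in with y_hat_i = sum_j h_ij * y_j (one predictor plus intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return [1 / n + (xs[i] - x_bar) * (xj - x_bar) / sxx for xj in xs]


xs = [1.0, 2.0, 3.0, 4.0, 10.0]  # hypothetical predictor values
ys = [1.2, 1.9, 3.2, 3.8, 9.5]   # hypothetical responses
n = len(xs)

# Each fitted value as a weighted sum of all observed y values.
fitted = [sum(h * y for h, y in zip(hat_row(xs, i), ys)) for i in range(n)]

# Leverage of observation i: the weight placed on its own y value.
leverages = [hat_row(xs, i)[i] for i in range(n)]

# Cross-check against the usual slope and intercept formulas.
x_bar, y_bar = sum(xs) / n, sum(ys) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar
direct = [intercept + slope * x for x in xs]

print([round(h, 3) for h in leverages])
```

The two sets of fitted values agree up to rounding, and the observation at x = 10 carries the largest leverage.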
Minitab says…

Leverage values provide information about whether an observation has unusual predictor values compared to the rest of the data. Leverages are a measure of the distance between the x-values for an observation and the mean of x-values for all observations. A large leverage value indicates that the x-values of an observation are far from the center of x-values for all observations. Observations with large leverage may exert considerable influence on the fitted value, and thus the regression model.
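For the special case of one predictor and an intercept (an assumption made here for illustration; the description above covers several predictors as well), the distance interpretation is visible in the standard closed-form expression for the leverage:

```latex
h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}
```

The leverage depends only on the x-values, not on y. It takes its smallest value, 1/n, when $x_i$ equals the mean $\bar{x}$, and it grows with the squared distance of $x_i$ from that mean.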

Detecting influence with leverage

Leverage values fall between 0 and 1.
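The bounds can be checked on small examples. This sketch assumes one predictor plus an intercept and uses the standard closed form for that case; the function name `leverages` and both data sets are hypothetical.

```python
def leverages(xs):
    """Leverage of each observation for a one-predictor regression with intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return [1 / n + (x - x_bar) ** 2 / sxx for x in xs]


mild = leverages([1, 2, 3, 4, 5])        # evenly spread x-values
extreme = leverages([1, 2, 3, 4, 1000])  # one x-value far from the rest

for name, hs in (("mild", mild), ("extreme", extreme)):
    # Every leverage stays inside [0, 1]; with an intercept it is at least 1/n.
    in_range = all(0.0 <= h <= 1.0 for h in hs)
    print(name, [round(h, 4) for h in hs], "all in [0, 1]:", in_range)
```

In both cases the leverages sum to 2, the number of estimated coefficients (intercept and slope). In the second data set the isolated x-value takes a leverage very close to 1, which leaves little for the other observations.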