# Stat 373, Chapter 4: Assessing Model Fit

Adapted from Stat 371 Course Notes © R.J. MacKay, University of Waterloo, 2005

To this point, we have built, fit and used a model for a given set of data without questioning any of the underlying assumptions. In this chapter, we examine the problem of model fit. Are the assumptions reasonably well met and, if not, what do we do about it?

In fitting the model

$$y = \hat\beta_0 \mathbf{1} + \hat\beta_1 x_1 + \cdots + \hat\beta_p x_p + \hat r$$

and using the corresponding estimators to construct formal statistical procedures, we are making a number of assumptions about the underlying probability model

$$Y = \beta_0 \mathbf{1} + \beta_1 x_1 + \cdots + \beta_p x_p + R, \qquad R \sim N(0, \sigma^2 I).$$

For example, we are assuming that:

- the mean vector $E(Y)$ is the specified linear function of the explanatory variates
- the residuals are Gaussian and independent, with constant standard deviation for each unit in the sample

We can assess these assumptions in several ways. If we have units in the sample in which the explanatory variates are identical, we can use ANOVA to assess the fit. We can also add extra terms (squares, cross products, etc.) to the proposed model and test whether the additional terms have significant effects. If not, then we have greater confidence in the form of the mean function in the original model.

## Looking at the Estimated Residuals

We also assess fit by looking for patterns that would be unusual if the model is "true". If we find such patterns, we are suspicious about the assumptions underlying the model. This approach to assessing fit is informal and subjective, so we need to be careful not to over-interpret the plots when looking for patterns.

The estimated residuals, the components of the vector $\hat r$, are derived from the given model:

$$\hat r = y - \hat\mu = y - X\hat\beta$$

The corresponding estimator

$$\tilde r = Y - X\tilde\beta = (I - H)R$$

is a linear combination of the components of $R$ and hence, according to the model, $\tilde r \sim N(0, \sigma^2 (I - H))$. Recall that $H = X(X^t X)^{-1} X^t$ depends only on $X$. We also know that $\hat r$ and $\hat\mu$ are orthogonal and, according to the model, $\tilde r$ and $\tilde\mu$ are independent.

If we plot the individual components, the estimated residual $\hat r_i$ versus the fitted value $\hat\mu_i$ for $i = 1, \ldots, n$, we should see a plot with no obvious patterns.
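The distribution of the residual estimator, and its relationship to the fitted values, can be checked in a few lines. The sketch below assumes the model is written as $Y = X\beta + R$ with $R \sim N(0, \sigma^2 I)$ and that the fitted-value estimator is $\tilde\mu = HY$; it uses only the facts that the hat matrix $H = X(X^t X)^{-1} X^t$ is symmetric, satisfies $H^2 = H$, and gives $HX = X$.

$$
\begin{aligned}
\tilde r &= (I - H)Y = (I - H)(X\beta + R) = (I - H)R
  && \text{since } (I - H)X = 0, \\
\operatorname{Var}(\tilde r) &= (I - H)\,\sigma^2 I\,(I - H)^t = \sigma^2 (I - H)
  && \text{since } I - H \text{ is symmetric and idempotent}, \\
\hat\mu^t \hat r &= (Hy)^t (I - H)\,y = y^t (H - H^2)\,y = 0
  && \text{so } \hat r \text{ and } \hat\mu \text{ are orthogonal}, \\
\operatorname{Cov}(\tilde\mu, \tilde r) &= H\,\sigma^2 I\,(I - H)^t = \sigma^2 (H - H^2) = 0 .
\end{aligned}
$$

Because $\tilde\mu$ and $\tilde r$ are both linear functions of the same Gaussian vector, zero covariance implies independence. This is why a plot of estimated residuals against fitted values should show no systematic pattern when the model holds.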

### Example 1

Consider again the assessment data discussed in previous chapters and found in the file assessment.txt. If we fit a model with 5 explanatory variates (size, age, office, ratio and location) to the measured value, we can create a plot of the estimated residuals versus the fitted values with the R code

```r
# Fit the model with all five explanatory variates
b <- lm(value ~ size + age + ratio + office + location)
# Plot the estimated residuals against the fitted values
plot(fitted(b), resid(b))
```

Does this plot raise any suspicions about the proposed model? The answer is yes, since it would be surprising (assuming that the model is correct) if the two largest estimated residuals corresponded to the two largest fitted values, as seen in the plot. The remedy here is to repeat the fitting and analysis with these cases removed to see if the conclusions are substantially affected. If the cases are influential, then we need to decide (not on a statistical basis) how to proceed. Otherwise, we can ignore the poor fit.
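The remedy described above, refitting with the suspect cases removed and comparing the conclusions, might look like the following sketch. It does not use assessment.txt: the data frame `dat` is simulated purely for illustration, only two of the variate names (`size`, `age`) are borrowed, and the names `b_all`, `b_sub`, `suspect` and `comparison` are hypothetical. The suspect cases are chosen here as the two with the largest absolute estimated residuals, which is one reasonable reading of "these cases".

```r
# Simulated stand-in data (hypothetical values, NOT the course file assessment.txt)
set.seed(373)
n <- 40
dat <- data.frame(size = runif(n, 1, 10), age = runif(n, 0, 50))
dat$value <- 20 + 5 * dat$size - 0.3 * dat$age + rnorm(n, sd = 2)

# Fit the model to all cases
b_all <- lm(value ~ size + age, data = dat)

# Row indices of the two cases with the largest absolute estimated residuals
suspect <- order(abs(resid(b_all)), decreasing = TRUE)[1:2]

# Refit the same model with those two cases removed
b_sub <- lm(value ~ size + age, data = dat[-suspect, ])

# Put the two sets of estimated coefficients side by side
comparison <- cbind(all_cases = coef(b_all), cases_removed = coef(b_sub))
print(round(comparison, 3))
```

If the two columns lead to the same practical conclusions, the removed cases are not influential. If they differ substantially, the decision about how to proceed rests on subject-matter grounds rather than on the statistics alone.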
## This note was uploaded on 05/03/2011 for the course ECON 202 taught by Professor Na during the Spring '11 term at University of Toronto.
