lecture4 - ISYE6414 Summer 2010 Lecture 4 The Linear Model...

ISYE6414 Summer 2010
Lecture 4
The Linear Model: OLS Regression
Some Diagnostics and Etc.

Dr. Kobi Abayomi
June 3, 2010

1 Diagnostics

The theme of diagnostics in the linear regression model is departures from the assumptions, which are chiefly: normality of the error, constancy of the error variance (homoscedasticity), and independence of the error. Violations of these assumptions can also show up as non-linearity of the regression function, or can result from the exclusion of latent predictors.

The first thing you should always do is make plots of the data! Remember that a linear regression is a linear model: in many cases the relationship in the data is not linear, or not linear for all of the data.

1.1 Checking Observations: Predictors and Response

Any data point that is far from the majority of the data (in both x and y) is called an outlier. An outlier is an observation that is unusually small or large. Remember that our estimates for the regression parameters came from our attempt to fit the mean line through the y’s at each value of x.

Data points that are far from the mean of the x’s are called leverage points. A data point that is far from the mean of both the y’s and the x’s is often influential and can change the value of the estimated parameters significantly.

The upshot: often a regression line will differ greatly depending upon which subsets of the data are included. Sometimes there are good reasons for excluding subsets (there were errors in the data entry; there were errors in the experiment). Sometimes the outlier belongs in the data. Outliers should always be examined.
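As a rough illustration of the distinction between leverage and influence, the sketch below computes two standard numerical summaries for the shoe size and IQ values used in the R example of Section 1.2. The functions hatvalues() and cooks.distance() are base R, but they do not appear in the lecture itself, and the 2p/n cutoff for "high leverage" is a common rule of thumb assumed here for illustration only.

```r
# Shoe size / IQ values from Section 1.2, including the two added points
# (observation 10 = (20, 115), observation 11 = (15, 140)).
shoesize <- c(8.25, 8.5, 12, 12, 7.75, 12.5, 12, 9.5, 9.7, 20, 15)
IQ       <- c(90, 90, 110, 120, 80, 130, 125, 95, 100, 115, 140)

fit <- lm(IQ ~ shoesize)

# Leverage: diagonal of the hat matrix. It depends only on how far
# each x is from the mean of the x's, not on y.
lev <- hatvalues(fit)

# Cook's distance: combines leverage with the size of the residual,
# so it measures how much the fit moves when a point is dropped.
cook <- cooks.distance(fit)

# Assumed rule of thumb (not from the lecture): flag leverage > 2p/n.
p <- length(coef(fit))
n <- length(IQ)
high_leverage <- lev > 2 * p / n

diag_table <- data.frame(shoesize, IQ,
                         leverage = round(lev, 3),
                         cooks_d  = round(cook, 3),
                         high_leverage)
print(diag_table)
```

A point can have high leverage without being influential: leverage only says the point is able to pull the line, while Cook's distance indicates whether it actually does.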
1.2 In R

###outlier example
###The original shoe IQ data
IQ<-c(90,90,110,120,80,130,125,95,100)
shoesize<-c(8.25,8.5,12,12,7.75,12.5,12,9.5,9.7)

###to the original shoe IQ data
###I add an outlier that is influential
shoesize1<-c(shoesize,20)
IQ1<-c(IQ,115)

###to the original shoe IQ data
###I add an outlier that is not influential
shoesize2<-c(shoesize,15)
IQ2<-c(IQ,140)

###all of the data with the affected regression lines
shoesize3<-c(shoesize,20,15)
IQ3<-c(IQ,115,140)
shoeIQdata<-as.data.frame(cbind(shoesize3,IQ3))
names(shoeIQdata)<-c("shoesize","IQ")

###regline1: original 9 points; regline2: plus the influential point (row 10);
###regline3: plus the non-influential point (row 11), dropping row 10
regline1<-lm(IQ ~ shoesize,data=shoeIQdata[1:9,])
regline2<-lm(IQ ~ shoesize,data=shoeIQdata[1:10,])
regline3<-lm(IQ ~ shoesize,data=shoeIQdata[-10,])

###empty plot frame, then the original points (black) and the influential point (red)
plot(shoesize3,IQ3,type="n")
points(shoesize3[1:9],IQ3[1:9],col="black")
points(shoesize3[10],IQ3[10],col="red")
###the non-influential point (green), then the three fitted lines
points(shoesize3[11],IQ3[11],col="green")
lines(shoesize3[1:9],fitted(regline1),col="black",lwd=2)
lines(shoesize3[1:10],fitted(regline2),col="red",lwd=2)
lines(shoesize3[-10],fitted(regline3),col="green",lwd=2)

Some diagnostic plots you can use:

Dotplot

par(mfrow=c(2,1))
dotchart(shoeIQdata[,1],main="shoesize-predictor")
dotchart(shoeIQdata[,2],main="IQ-response")

Stem and Leaf plot

stem(shoeIQdata[,1])
stem(shoeIQdata[,2])

Box Plot

boxplot(shoeIQdata[,1],main="shoesize-predictor")
boxplot(shoeIQdata[,2],main="IQ-response",horizontal=TRUE)
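The plot shows the three fitted lines visually; one way to see the same effect numerically is to put the estimated coefficients side by side. The sketch below is an illustration, not part of the lecture: it refits the three models on the same data and reports how far each slope moves from the fit on the original nine points. The names fits, coef_table and slope_shift are introduced here for convenience.

```r
# Same data as above: 9 original points plus the two added points.
shoesize <- c(8.25, 8.5, 12, 12, 7.75, 12.5, 12, 9.5, 9.7, 20, 15)
IQ       <- c(90, 90, 110, 120, 80, 130, 125, 95, 100, 115, 140)
shoeIQdata <- data.frame(shoesize, IQ)

# The three fits from the example: original data, original plus
# point 10, and original plus point 11 (row 10 dropped).
fits <- list(
  original      = lm(IQ ~ shoesize, data = shoeIQdata[1:9, ]),
  with_point_10 = lm(IQ ~ shoesize, data = shoeIQdata[1:10, ]),
  with_point_11 = lm(IQ ~ shoesize, data = shoeIQdata[-10, ])
)

# One row per fit, columns for intercept and slope.
coef_table <- t(sapply(fits, coef))
print(round(coef_table, 2))

# Change in slope relative to the fit on the original 9 points.
slope_shift <- coef_table[, "shoesize"] - coef_table["original", "shoesize"]
print(round(slope_shift, 2))
```

Adding point 10 should move the slope much more than adding point 11, which matches the labels "influential" and "not influential" in the example.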