Lecture 5, 20th Sept 2018
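ANOVA

| Source of Variability | Degrees of Freedom (df) | Sum of Squares | Mean Square | F-Ratio | P-value |
|---|---|---|---|---|---|
| Regression (Model) | $(n-1)-(n-2)=1$ | $SSR = \sum_{i=1}^{n} (\hat{\mu}_i - \bar{y})^2$ | $MSR = \frac{SSR}{1}$ | $F = \frac{MSR}{MSE}$ | |
| Residual (Error) | $n-2$ | $SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{\mu}_i)^2$ | $MSE = \frac{SSE}{n-2}$ | | |
| Total (corrected) | $n-1$ | $SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$ | | | |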
Why is this F ratio informative?

$$F = \frac{(SSR/\sigma^2)/1}{(SSE/\sigma^2)/(n-2)} = \frac{MSR}{MSE} = \frac{\text{explained variation}}{\text{unexplained variation}}$$
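The ANOVA quantities and the F ratio can be computed directly from their definitions. The following is a minimal sketch for simple linear regression in plain Python; the function name `anova_simple_regression` and the data values are made up for illustration and do not come from the lecture.

```python
def anova_simple_regression(x, y):
    """Return the ANOVA quantities for the least-squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares estimates of slope and intercept.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Fitted values (the estimated means mu-hat_i).
    fitted = [b0 + b1 * xi for xi in x]

    ssr = sum((m - y_bar) ** 2 for m in fitted)            # regression (model)
    sse = sum((yi - m) ** 2 for yi, m in zip(y, fitted))   # residual (error)
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total (corrected)

    msr = ssr / 1          # regression df = (n-1) - (n-2) = 1
    mse = sse / (n - 2)    # residual df = n - 2
    return {"SSR": ssr, "SSE": sse, "SST": sst,
            "MSR": msr, "MSE": mse, "F": msr / mse}


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

table = anova_simple_regression(x, y)
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```

A large F means the variation explained by the model is large relative to the unexplained variation. The P-value is not computed here; it would come from comparing F with an F distribution on 1 and n-2 degrees of freedom.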
Coefficient of determination: $R^2$

Consider again SST = SSR + SSE. The ratio

$$R^2 = \frac{SSR}{SST} = \frac{\text{explained variation}}{\text{total variation}} = 1 - \frac{SSE}{SST}$$

This is also used to assess the "fit" of the regression model. In particular it gives us the proportion of the total variation in the response that is explained by the regression model.

$0 \le R^2 \le 1$; $R^2 = 0$ indicates that none of the variability in $y$ is explained by the linear regression model. Hence the higher $R^2$ the better.
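As a small illustration of the two extremes, the sketch below computes the coefficient of determination as 1 - SSE/SST from observed and fitted values. The helper name `r_squared` and the numbers are assumptions made for this example only.

```python
def r_squared(y, fitted):
    """Coefficient of determination computed as 1 - SSE/SST."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


# Hypothetical responses, for illustration only.
y = [2.0, 4.0, 6.0, 8.0]
y_bar = sum(y) / len(y)

# Predicting the sample mean for every observation explains nothing.
r2_mean_only = r_squared(y, [y_bar] * len(y))

# Fitted values equal to the observations explain everything.
r2_perfect = r_squared(y, y)

print(r2_mean_only, r2_perfect)
```

The form 1 - SSE/SST is used here because it agrees with SSR/SST only when the fitted values come from a least-squares fit that includes an intercept.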
Adjusted $R^2$:

Adding a new explanatory variable to the model will always keep the $R^2$ value the same or increase it. But just because the value has increased, this does not mean that the addition of the variable is resulting in a better model (interpret with caution).

Adjusted $R^2$ imposes a "penalty" for each new variable that is added to the model in an effort to make models of different sizes comparable. The adjusted $R^2$ can decrease when a new variable is added to the model, or increase when a variable is removed from the model.

$$R^2_{adj} = \left(R^2 - \frac{p}{n-1}\right)\frac{n-1}{n-p-1} = 1 - \frac{SSE/(n-p-1)}{SST/(n-1)}$$
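The penalty can be seen numerically. The sketch below evaluates both algebraic forms of the adjusted coefficient of determination, taking p to be the number of explanatory variables and n the number of observations; the function names and the values (n = 20, a raw value rising from 0.800 to 0.805 when a second variable is added) are hypothetical and chosen only to illustrate the effect.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 written as 1 - (SSE/(n-p-1)) / (SST/(n-1)),
    using SSE/SST = 1 - R^2."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def adjusted_r2_alt(r2, n, p):
    """Equivalent form: (R^2 - p/(n-1)) * (n-1)/(n-p-1)."""
    return (r2 - p / (n - 1)) * (n - 1) / (n - p - 1)


# Hypothetical values, for illustration only.
n = 20
adj_one_var = adjusted_r2(0.800, n, p=1)   # model with one explanatory variable
adj_two_var = adjusted_r2(0.805, n, p=2)   # raw R^2 barely increases with a second

print(f"one variable:  {adj_one_var:.4f}")
print(f"two variables: {adj_two_var:.4f}")
```

In this made-up case the raw value goes up slightly while the adjusted value goes down, which suggests the extra variable does not earn its place in the model.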
Example

A car dealer is interested in modeling the relationship between the weekly number of cars sold and the daily average number of salespeople who work on the showroom floor during that week.

