Goodness of fit (Cont'd)

Review: sample correlation as a measure of goodness of fit.

Second measure of goodness of fit: the coefficient of determination $R^2$. It is based on a comparison of "variation accounted for" by the line versus the "raw variation" of $y$.

Ideas: The quantity

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - \frac{1}{n}\Big(\sum_{i=1}^{n} y_i\Big)^2 = SST \quad \text{(Total Sum of Squares)}$$

is a measure of the variability of $y$ (figure below). Notice that $SST = (n-1) \cdot s_y^2$, where $s_y^2$ is the sample variance of $Y$.
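As a quick numerical check of the two expressions for SST and of its relation to the sample variance, here is a minimal sketch in Python. The data values and the function names `sst_definition` and `sst_shortcut` are made up for illustration.

```python
from statistics import variance

def sst_definition(y):
    """SST as the sum of squared deviations of y from its mean."""
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)

def sst_shortcut(y):
    """SST via the shortcut form: sum(y_i^2) - (sum(y_i))^2 / n."""
    n = len(y)
    return sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

y = [2.0, 4.0, 5.0, 4.0, 5.0]  # hypothetical data, not from the notes
n = len(y)

sst_a = sst_definition(y)
sst_b = sst_shortcut(y)
sst_c = (n - 1) * variance(y)  # statistics.variance uses the n - 1 divisor

print(sst_a, sst_b, sst_c)
```

For this data set all three values agree (up to floating-point rounding) at 6.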

After fitting the line $\hat{y} = b_0 + b_1 x$, one no longer predicts $y$ as $\bar{y}$ and suffers the errors of prediction above, but rather only the errors $\hat{y}_i - y_i =: e_i$, and

$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = SSE \quad \text{(Sum of Squares of Errors)}$$

is a measure of the remaining/residual/error variation. The fact is that $SST \ge SSE$, so that $SSR := SST - SSE \ge 0$ is taken as a measure of the "variation accounted for" in the fitting of the line.
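The decomposition can be illustrated with a small sketch in Python. It assumes the line is fitted by ordinary least squares (the usual closed-form slope and intercept). The data and the helper names `fit_line` and `sums_of_squares` are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 (assumed fitting method)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sums_of_squares(x, y):
    """Return (SST, SSE, SSR) for the least-squares line through (x, y)."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    # Residuals e_i = y_hat_i - y_i; the sign does not matter once squared.
    residuals = [(b0 + b1 * xi) - yi for xi, yi in zip(x, y)]
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum(e ** 2 for e in residuals)
    return sst, sse, sst - sse

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical data, not from the notes
y = [2.0, 4.0, 5.0, 4.0, 5.0]

sst, sse, ssr = sums_of_squares(x, y)
print(sst, sse, ssr)
```

With this data the fitted line is $\hat{y} = 2.2 + 0.6x$, giving SST = 6, SSE ≈ 2.4 and SSR ≈ 3.6, so SST ≥ SSE as stated.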

