# Q3 - But if we add a variable whose t-statistic is less...

… the Akaike Information Criterion (AIC)

## R Squared and Adjusted R Squared

The R Squared is $R^2 = 1 - \mathrm{ESS}/\mathrm{TSS}$. It is 1 if and only if ESS = 0, i.e. there are no errors… y = yhat… the fit is exact at every data point.

In addition, there is a crucial fact pertaining to the use of R Squared as a measure of improvement: if we add a variable to a regression, the R Squared cannot decrease. It doesn't have to go up, but it cannot get smaller. In other words, as a measure of improvement, R Squared says we're never worse off adding a variable. It can never recommend against adding a variable (until we have k > n, at which point the regression will fail, because X'X cannot be inverted).

The Adjusted R Squared is $\bar{R}^2 = 1 - \dfrac{\mathrm{ESS}/(n-k)}{\mathrm{TSS}/(n-1)}$. As I have said before, it can be interpreted as 1 minus the ratio of two estimates of variance. Like the R Squared, it is 1 if there are no errors.
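To make the two definitions concrete, here is a minimal sketch in plain Python (not the Mathematica used in the post). The data are made up for illustration, the helper names `solve` and `fit_stats` are hypothetical, and I am assuming that k counts every fitted coefficient, including the constant term. It fits a one-variable model and then a two-variable model to the same response, so the two R Squared values can be compared directly.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_stats(columns, y):
    """Least squares with a constant term; returns (R2, adjusted R2).

    Assumption: k is the number of fitted coefficients, constant included.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(XtX, Xty)
    yhat = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    ess = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))  # error sum of squares
    ybar = sum(y) / n
    tss = sum((yi - ybar) ** 2 for yi in y)               # total sum of squares
    r2 = 1 - ess / tss
    adj_r2 = 1 - (ess / (n - k)) / (tss / (n - 1))
    return r2, adj_r2


# Made-up illustrative data (not the Hald data).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

r2_small, adj_small = fit_stats([x1], y)
r2_big, adj_big = fit_stats([x1, x2], y)
print("x1 only:  R2 =", r2_small, " adjusted R2 =", adj_small)
print("x1 + x2:  R2 =", r2_big, " adjusted R2 =", adj_big)
```

Whatever the data, the second R Squared should be at least as large as the first, while the Adjusted R Squared is free to move in either direction. Solving the normal equations directly is fine for a toy example like this; a numerically careful fit would use a QR or SVD routine instead.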

But if we add a variable whose t-statistic is less than 1 (in absolute value), then the Adjusted R Squared will decrease. So it does balance complexity (more variables in the model) against smaller total squared error. In contrast to every other measure we will see here, we want its maximum rather than its minimum.

## The AIC

The AIC, as Mathematica uses it, appears to be defined as…

It is more than convenient to use rules at this point, so that I can write symbolic equations involving n, k, and ESS without having Mathematica substitute their numerical values. Let me clear the numerical values… set some rules… then ask for the AIC for the regression (our main one, the Hald data with X1, X2, and X4, which can be found here)… and finally compute it directly using the equation…

Additional Selection Criteria...
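Returning to the claim that a variable with a t-statistic below 1 in absolute value lowers the Adjusted R Squared: it can be checked with a short derivation. The notation here is my own assumption, not the post's. Write ESS_k for the error sum of squares of the model with k coefficients, ESS_{k+1} for the model with one variable added, and Δ = ESS_k − ESS_{k+1} ≥ 0 for the reduction. I also use the standard result that the squared t-statistic of the added variable equals the F-statistic for dropping it. Since TSS/(n−1) is the same for both models, the Adjusted R Squared rises exactly when the error-variance estimate falls:

```latex
\begin{align*}
\bar{R}^2_{k+1} > \bar{R}^2_{k}
  &\iff \frac{\mathrm{ESS}_{k+1}}{n-k-1} < \frac{\mathrm{ESS}_{k}}{n-k} \\
  &\iff (n-k)\,\mathrm{ESS}_{k+1} < (n-k-1)\,\bigl(\mathrm{ESS}_{k+1} + \Delta\bigr) \\
  &\iff \mathrm{ESS}_{k+1} < (n-k-1)\,\Delta \\
  &\iff t^2 = \frac{\Delta}{\mathrm{ESS}_{k+1}/(n-k-1)} > 1 .
\end{align*}
```

So the Adjusted R Squared goes up when |t| > 1, is unchanged when |t| = 1, and goes down when |t| < 1.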
