5/10/10 Summary of Multiple Regression
Parameter Significance
Sum of Squared Residuals: $SSE = \sum e^2$. Larger SSE = “noisier” data and less precise prediction.
Regression Sum of Squares: $SSR = \sum (\hat{y} - \bar{y})^2$. Larger SSR = stronger model correlation.
Total Sum of Squares: $SST = \sum (y - \bar{y})^2$, or equivalently $SST = SSR + SSE$. Larger SST = larger variability in y, due to “noisier” data (SSE) and/or stronger model correlation (SSR).
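The three sums of squares can be checked numerically. The following is a minimal sketch in plain Python, assuming a single predictor so that the least-squares fit can be written without a linear-algebra library. The data values and the helper names `fit_line` and `sums_of_squares` are hypothetical, chosen only for illustration.

```python
# Hypothetical data: one predictor x and a response y (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for a single predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sums_of_squares(x, y):
    """Return (SSE, SSR, SST) for the least-squares line through (x, y)."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual variation
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # variation explained by the fit
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation in y
    return sse, ssr, sst


sse, ssr, sst = sums_of_squares(x, y)
print(f"SSE={sse:.4f}  SSR={ssr:.4f}  SST={sst:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
```

For a least-squares fit that includes an intercept, the last printed value should agree with SST up to floating-point rounding. The same decomposition holds with several predictors, although the fit itself then requires solving the normal equations.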
Relationship between F and $R^2$
$R^2$ in Multiple Regression: $R^2$ = fraction of the total variation in y accounted for by the model (all the predictor variables included):
$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
F and $R^2$: By using the expressions for SSE, SSR, SST, and $R^2$, it can be shown that:
$F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}$
where n is the number of observations and k is the number of predictor variables.
So, testing whether F = 0 is equivalent to testing whether $R^2$ = 0.
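The link between F and $R^2$ follows in a few steps from the definitions. The sketch below assumes the usual overall F statistic for a model with an intercept, n observations, and k predictor variables (the symbols n and k are taken from the adjusted $R^2$ formula; the mean-square notation MSR and MSE is an assumption introduced here).

```latex
\begin{align*}
F &= \frac{MSR}{MSE}
   = \frac{SSR / k}{SSE / (n - k - 1)} \\[4pt]
  &= \frac{(SSR / SST) / k}{(SSE / SST) / (n - k - 1)}
   && \text{divide numerator and denominator by } SST \\[4pt]
  &= \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}
   && \text{since } R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}
\end{align*}
```

Because the right-hand side is zero exactly when $R^2 = 0$ and increases as $R^2$ increases, a large F corresponds to a large $R^2$ for fixed n and k.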

$R^2$ and Adjusted $R^2$
- Adding new predictor variables to a model never decreases $R^2$ and may increase it.
- But each added variable increases the model complexity, which may not be desirable.
- Adjusted $R^2$ imposes a “penalty” on the correlation strength of larger models, discounting their $R^2$ values to account for an undesired increase in complexity:
$R^2_{adj} = 1 - (1 - R^2) \frac{n - 1}{n - k - 1}$
where n is the number of observations and k is the number of predictor variables.
Adjusted $R^2$ permits a more equitable comparison between models of different sizes.
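The effect of the penalty can be seen with a small calculation. This is a minimal sketch, assuming n is the number of observations and k is the number of predictors. The function name `adjusted_r2` and the $R^2$ values being compared are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with n observations and k predictors."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 for adjusted R^2 to be defined")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical comparison: the larger model has a slightly higher R^2,
# but uses four more predictors on the same 30 observations.
small = adjusted_r2(0.80, n=30, k=2)
large = adjusted_r2(0.81, n=30, k=6)
print(f"small model: {small:.4f}")
print(f"large model: {large:.4f}")
```

In this made-up case the larger model's small gain in $R^2$ does not offset the penalty for its extra predictors, so its adjusted $R^2$ comes out lower than the smaller model's.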
Chapter 23

Observational Studies
A statistical study is observational when it is conducted using pre-existing data that were collected without any particular design.
Example: Many companies collect a variety of data via registration or warranty cards. These data might later be used in an observational study that looks for correlations among the collected variables.
An observational study is retrospective if it studies an outcome in the present by examining historical records.
An observational study is prospective if it seeks to identify subjects in advance and collects data as events unfold.
Example: Follow a sample of smokers and runners to discover the occurrence of emphysema.
An experiment is a study in which the experimenter manipulates attributes of the study participants and observes the consequences. The attributes, called factors, are

The Four Principles of
1) Control. We control sources of variation other than the factors we are testing by making conditions as similar as possible for all treatment groups.
