McGill University
Advanced Business Statistics MGSC272

Regression Pitfalls
Identify and correct potential problems in multiple regression (MR).

Observational vs Experimental Data
Observational data: values of the independent variable are uncontrolled.
Experimental data: values of the independent variable are controlled via a designed experiment.

Advertising Example
The variables ADVERT, SPECIAL and PLACE are experimental variables. The general manager decides how much advertising budget to allocate to each store and how many in-house specials to assign. He also selects stores in Montreal and Toronto.
The variable SIZE is observational: we assume that the stores sampled are already in business, and the manager has no control over the size of the stores in the random sample selected in the two locations.

Advantages of Using Experimental Variables
The user controls the experiment.
Variable values can be assigned in such a way that the independent variables are not correlated, so that multicollinearity can be eliminated.
Cause-and-effect relationships can be inferred.
Randomization can be controlled by assigning a desired range of values to the independent variables.

Observational Data Warning
When using observational data, it is harder to establish a causal relationship between a response variable $y$ and a predictor variable $x$.

Deviating from the Assumptions
Recall that the multiple regression model is
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$
The model assumptions are:
1. $E(\varepsilon) = 0$
2. $\varepsilon$ is normally distributed
3. $\sigma_\varepsilon^2 = \sigma^2$ (a constant)
4. The $\varepsilon_i$ are statistically independent
In summary, the $\varepsilon_i$ are independent $N(0, \sigma^2)$ variables.

Corrective measures when assumptions are not satisfied
Serious violations of any of the four assumptions will invalidate inferences drawn from the regression model. In future chapters we will discuss specific actions that can be taken to correct departures from the assumptions.
For example, when the assumption of homogeneity of variance ($\sigma^2$ is a constant) is violated, a transformation of the data (e.g. a square root, reciprocal, or exponential transformation of one or more variables) may correct the problem. For now, it is sufficient to realize that it is important to check the assumptions and take corrective action if necessary before using a regression model to make inferences.

Interpreting regression coefficients
Recall that for the regression coefficient $b_i$, the standardized regression coefficient $b_i^*$ is given by
$$b_i^* = b_i \, \frac{s_{x_i}}{s_y}$$
where $s_{x_i}$ and $s_y$ are the standard deviations of the $x_i$ and $y$ variables, respectively.

MODEL: Profit vs Size and Place

Calculating the standardized regression coefficients in Minitab
We choose the following option from the Minitab menu: Stat => Regression => Partial Least Squares

The Storage Option
Coefficients and standardized coefficients in C4 and C5...
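The standardized-coefficient formula can also be checked outside Minitab. The following is a minimal sketch in Python with NumPy, not the Minitab procedure from the slides. The function name `standardized_coefficients` is hypothetical, and the SIZE, PLACE and profit values are synthetic, invented only to echo the Profit vs Size and Place model. It fits ordinary least squares and then rescales each slope by $s_{x_i}/s_y$.

```python
import numpy as np


def standardized_coefficients(X, y):
    """Fit y on the columns of X by ordinary least squares.

    Returns (b, b_star): the raw slopes (intercept excluded) and the
    standardized slopes b_i * s_{x_i} / s_y.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the model includes an intercept.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    b = coef[1:]
    # Sample standard deviations (n - 1 in the denominator).
    s_x = X.std(axis=0, ddof=1)
    s_y = y.std(ddof=1)
    return b, b * s_x / s_y


# Synthetic data (an assumption for illustration, not the course data set).
rng = np.random.default_rng(0)
size = rng.uniform(1000, 5000, 40)            # hypothetical store sizes
place = rng.integers(0, 2, 40).astype(float)  # hypothetical 0/1 location dummy
profit = 5.0 + 0.01 * size + 8.0 * place + rng.normal(0, 3, 40)

X = np.column_stack([size, place])
b, b_star = standardized_coefficients(X, profit)
print("raw coefficients:         ", b)
print("standardized coefficients:", b_star)
```

Because the standardized coefficients are unit-free, they can be compared across predictors measured on very different scales, which the raw slopes for SIZE and PLACE cannot. They equal the slopes obtained by regressing the z-scored response on the z-scored predictors.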
 Spring '12
 smith
