Assumptions Underlying Multiple Regression Analysis

Multiple regression analysis requires meeting several assumptions. We will: (1) identify some of these assumptions; (2) describe how to tell whether they have been met; and (3) suggest how to overcome or adjust for violations of the assumptions, if violations are detected.

Some Assumptions Underlying Multiple Regression
1. No specification error
2. Continuous variables
3. Additivity
4. No multicollinearity
5. Normally distributed error term
6. No heteroscedasticity

A. Absence of Specification Error

This refers to three different things:
1. No relevant variable is absent from the model.
2. No unnecessary variable is present in the model.
3. The model is estimated using the proper functional form.

A1. No relevant variable is absent from the model.

There is no statistical check for this type of error. Theory and knowledge of the dependent variable (that is, the phenomenon of interest) are the only checks. The strength of causal claims is directly proportional to the adequacy and completeness of the model specification.

A2. No unnecessary variable is present in the model.

Another, less serious, type of specification error is the inclusion of one or more variables that are NOT associated with the dependent variable. You discover this when an independent variable proves NOT to be statistically significant. However, if theory and knowledge of the subject demanded that the variable be included, then this is not really a specification error. Be careful not to remove statistically insignificant variables simply in order to reestimate the model without them; this smacks of one of the sins of multiple regression analysis, stepwise regression.

A3. Proper functional form.

A third aspect of proper model specification relates to what is called the functional form of the analysis. Multiple regression analysis assumes that the model has been estimated using the correct mathematical function.
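As a small numeric illustration of the A2 point above, an irrelevant regressor tends to come back with a coefficient near zero. This is a hedged sketch in pure Python (the notes themselves use SAS); the data, variable names, and the helper ols() are all hypothetical, and ordinary least squares is solved here from the normal equations X'X b = X'y by Gaussian elimination.

```python
# Hypothetical illustration: y depends only on x1; x2 is an irrelevant
# variable included by mistake. Its estimated coefficient should be ~0.

def ols(X, y):
    """OLS coefficients for design matrix X (rows include an intercept 1)."""
    k = len(X[0])
    # Augmented normal-equations matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] +
         [sum(r[i] * yv for r, yv in zip(X, y))] for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= f * A[col][c]
    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (A[i][k] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b

x1 = [1, 2, 3, 4, 5, 6]
x2 = [3, 1, 4, 1, 5, 9]              # unrelated filler values
y = [1 + 2 * v for v in x1]          # y is exactly 1 + 2*x1
X = [[1.0, a, c] for a, c in zip(x1, x2)]
b0, b1, b2 = ols(X, y)
print(b0, b1, b2)                    # b2 should be approximately 0
```

With real (noisy) data the unnecessary coefficient will not be exactly zero; the point of A2 is that its t-test will fail to reach significance, which is what PROC REG's output lets you check.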
Recall that our discussions of the line of best fit, etc., have all emphasized the idea that the data can be described by a straight line, however imperfectly, rather than by some other mathematical function. This is what functional form is all about. To determine whether the assumption of linear form is violated, simply create scatterplots of the relationship between each independent variable and the dependent variable. Examine each scatterplot to see if there is overwhelming evidence of nonlinearity. Use PROC PLOT for this:

   libname old 'a:\';
   libname library 'a:\';
   proc plot data=old.cities;
      plot crimrate * (policexp incomepc stress74);
      title1 'Plots of Dependent and All Independent Variables';
   run;

[Output: PROC PLOT scatterplot of NUMBER OF SERIOUS CRIMES PER 1,000 (vertical axis, roughly 10 to 80) against POLICE EXPENDITURES PER CAPITA (horizontal axis, 10 to 80); the plotting symbols A, B, C mark 1, 2, or 3 observations at a point.]
 Fall '07
 Velez