# lec4 - IV. Selecting Variables


IV. Selecting Variables

How do we go about selecting variables for regression models? In fact, we've already spent considerable time on this topic (questions of causality within a multivariate framework). Most fundamentally, we should include variables only when a sound conceptual framework tells us that: (1) we want to find out how they affect the dependent variable; or (2) we want to control for their effects on the dependent variable.
So, we include independent variables only within sound conceptual frameworks that lead us to hypothesize that the variables: (1) have causal effects on the dependent variable; or (2) are correlated with the explanatory variables whose effects we want to estimate. (see Allison, pages 49-52)

Let's keep in mind that a properly conducted, randomized experimental design automatically imposes controls. That is, it automatically ensures that there's no correlation between the treatment variable and the characteristics of the subjects. (see Allison, page 50)
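The balancing effect of randomization can be illustrated with a quick simulation (a Python sketch, not part of the lecture; the variable names and numbers are made up for illustration): when treatment is assigned by coin flip, its sample correlation with any pre-existing subject characteristic shrinks toward zero as the sample grows.

```python
import numpy as np

# Illustrative simulation: random assignment breaks the link between
# treatment status and subject characteristics.
rng = np.random.default_rng(42)
n = 100_000
age = rng.normal(40, 10, n)        # a pre-existing subject characteristic
treat = rng.integers(0, 2, n)      # coin-flip treatment assignment, independent of age
r = np.corrcoef(treat, age)[0, 1]  # sample correlation; expected to be near zero
```

With 100,000 subjects the sample correlation `r` lands very close to zero, which is exactly the "automatic control" that randomization buys us.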
Today we'll introduce some variable-selection procedures that most of us do not recommend using (see, e.g., Allison, pages 92-93; Mendenhall). More important, we'll then examine a non-automated, conceptually guided & systematic approach to selecting variables: the way we should do things.

## Automated Procedures

What if we have lots of potential explanatory variables but no clear conceptual reasons to guide us in selecting them for a model? An automated approach to the problem is stepwise regression:

```stata
sw regress y x1 x2 x3 ... xk, options
```

```stata
. use stepwise, clear
. corr x*
. collin x*
```

(`collin` reports a set of collinearity statistics.)

Forward stepwise selection:

```stata
. sw regress y x1 x2 x3 x4 x5 x6, pe(.99)
```

Set the significance level for entry, `pe`, to .99 so that all the variables will enter & their p-value order can be observed.

```stata
. sw regress y x1 x2 x3 x4 x5 x6, pe(.25)
```

Set the significance level for entry to .25 so that only variables with p-value <= .25 are retained. Logic of forward stepwise selection: (1) Stepwise begins by fitting a model of y on the constant alone. (2) It considers adding x1, then x2, then ... x6. (3) At each step it adds the x-variable that is most significant statistically, provided its p-value is at or below the entry threshold (.25 in our example), and repeats until no remaining variable qualifies. `pe`: the significance level a variable must meet to be eligible for addition.
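To make the forward-selection logic concrete, here is a minimal Python sketch (not Stata, and a simplification: it screens candidates by the absolute t-statistic of the newly added coefficient, with a fixed t-threshold standing in for Stata's `pe()` p-value cutoff; the function names and synthetic data are my own):

```python
import numpy as np

def ols_t_stats(X, y):
    """Fit OLS of y on X (X must include the constant) and return t-statistics."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    sigma2 = (resid @ resid) / (n - k)        # residual variance
    se = np.sqrt(sigma2 * np.diag(XtX_inv))   # coefficient standard errors
    return beta / se

def forward_stepwise(X, y, t_enter=2.0):
    """Greedy forward selection: repeatedly add the candidate whose coefficient
    has the largest |t|, stopping when no candidate clears t_enter."""
    n, k = X.shape
    selected, remaining = [], list(range(k))
    while remaining:
        best_j, best_t = None, 0.0
        for j in remaining:
            design = np.column_stack([np.ones(n)] + [X[:, c] for c in selected + [j]])
            t = abs(ols_t_stats(design, y)[-1])  # t-stat of the candidate just added
            if t > best_t:
                best_j, best_t = j, t
        if best_t < t_enter:
            break          # no remaining candidate is significant enough
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Synthetic demo: only the first two of six candidates actually drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 5 * X[:, 0] + 3 * X[:, 1] + rng.normal(size=200)
selected = forward_stepwise(X, y)  # should pick up columns 0 and 1
```

Note that this sketch shares the weakness the readings warn about: with many noise candidates and repeated testing, a spurious variable can clear the threshold by chance.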
Backward stepwise selection:

```stata
. sw regress y x1 x2 x3 x4 x5 x6, pr(.99)
```

Set the significance level for removal, `pr`, to .99 so that essentially all the variables stay in the model & their p-value order can be observed.

```stata
. sw regress y x1 x2 x3 x4 x5 x6, pr(.25)
```

Set the significance level for removal to .25 so that only variables with p-value below .25 are retained. `pr`: the significance level at which a variable becomes eligible for removal.

Logic of backward stepwise selection: (1) Stepwise begins by fitting the full model of y on x1 ... x6. (2) It considers dropping x1, then x2, and so on. (3) At each step it removes the least significant variable if its p-value is at or above the removal threshold, refits, and repeats until every remaining variable meets the threshold.
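The backward logic can be sketched the same way (again a Python illustration, not Stata; the |t| threshold stands in for Stata's `pr()` p-value cutoff, and the data are synthetic):

```python
import numpy as np

def ols_t_stats(X, y):
    """Fit OLS of y on X (X must include the constant) and return t-statistics."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    sigma2 = (resid @ resid) / (n - k)
    se = np.sqrt(sigma2 * np.diag(XtX_inv))
    return beta / se

def backward_stepwise(X, y, t_remove=2.0):
    """Start from the full model and repeatedly drop the least significant
    variable until every remaining coefficient's |t| is at least t_remove."""
    n, k = X.shape
    selected = list(range(k))
    while selected:
        design = np.column_stack([np.ones(n)] + [X[:, c] for c in selected])
        t = np.abs(ols_t_stats(design, y))[1:]  # skip the intercept's t-stat
        worst = int(np.argmin(t))
        if t[worst] >= t_remove:
            break               # everything left is significant enough
        selected.pop(worst)     # drop the weakest variable and refit
    return selected

# Synthetic demo: same setup as the forward sketch; only columns 0 and 1 drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 5 * X[:, 0] + 3 * X[:, 1] + rng.normal(size=200)
kept = backward_stepwise(X, y)
```

On this data the strong predictors (columns 0 and 1) survive the pruning; which noise columns get dropped first depends on the sample.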

## This note was uploaded on 07/11/2011 for the course SYA 6306 taught by Professor Tardanico during the Spring '09 term at FIU.
