STA 4107/5107 Chapter 4 Regression Analysis

1 Key Terms

Please review and learn these terms.

2 What is Multiple Regression

Multiple regression refers to regression with multiple explanatory variables (but just one response variable). Technically, it’s not a multivariate technique unless the response variable is multivariate, but it is a technique that can help us make sense of a large number of variables.

Multiple regression is an amazingly flexible tool which can be used to model linear and nonlinear relationships. Don’t be fooled by the “linear” in “linear regression”: you may have already seen how simple linear regression can be used to model nonlinear relationships by transforming one or both of the explanatory and response variables. Multiple regression offers more ways to do this. It’s even possible to incorporate categorical variables into multiple regression models.

Regression analyses are used for studying the relationship between a response or dependent variable, usually denoted $Y$ and plotted on the vertical axis, and a set of predictor or explanatory variables, usually denoted $X_i$ and plotted on the horizontal axis.

Researchers usually have one of two goals when approaching data with a regression analysis: descriptive/mechanistic, where they would like to elucidate underlying mechanisms; or predictive, where they would like to be able to predict one variable using the others. However, in my opinion this is a false dichotomy in most cases. We should almost always be testing a mechanistic model by testing how predictive it is.

2.1 Examples

1. One explanatory variable, with a quadratic term:

   $$\mu(Y \mid X) = \beta_0 + \beta_1 X + \beta_2 X^2$$

   We can include higher order powers of $X$, although this is unusual unless there is a theoretical reason for it. Note: we always include lower order terms when a higher order term is in a model. For example, we always include $X$ if $X^2$ is in the model.

2. Two or more explanatory variables:

   $$\mu(Y \mid X) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p$$

3. Two (or more) explanatory variables with an interaction:

   $$\mu(Y \mid X) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \beta_{p+1} X_1 X_2$$

   The term $X_1 X_2$ is the product of the two variables. We’ll see why this is called an interaction below.

4. Categorical explanatory variables: The explanatory variables can be binary (0,1). In fact, the ANOVA and pooled two-sample t models can be written as special cases of the linear regression model.

3 SMSA Data Set

For this chapter we will consider a data set that contains environmental and social variables on US cities and will try to select the best model for predicting mortality. Properties of 60 Standard Metropolitan Statistical Areas (a standard Census Bureau designation of the region around a city) in the United States were collected from a variety of sources...
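As a rough illustration of the model forms listed under 2.1 Examples, the sketch below fits a quadratic model, an interaction model, and a model with a single binary explanatory variable by ordinary least squares. Everything in it is an assumption made for illustration: the data are synthetic (this is not the SMSA data set), the coefficient values are arbitrary, and the helper `fit` is a hypothetical name, not something from the course materials. The point is only that each model is still linear in the $\beta$ coefficients, so all three are fitted the same way once the right columns are placed in the design matrix.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; synthetic data, not the SMSA data
n = 200
x1 = rng.uniform(-2.0, 2.0, size=n)
x2 = rng.uniform(-2.0, 2.0, size=n)


def fit(columns, y):
    """Least-squares coefficients for y on an intercept plus the given columns.

    Hypothetical helper for this sketch; returns [beta_0, beta_1, ...].
    """
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


# Example 1: quadratic term. The lower order term x1 is kept alongside x1**2.
# No noise is added, so the fit should recover the chosen coefficients.
y_quad = 1.0 + 2.0 * x1 - 0.5 * x1**2
beta_quad = fit([x1, x1**2], y_quad)

# Example 3: two explanatory variables plus their product (the interaction).
y_int = 1.0 + 2.0 * x1 + 3.0 * x2 + 1.5 * x1 * x2
beta_int = fit([x1, x2, x1 * x2], y_int)

# Example 4: one binary (0,1) explanatory variable, with noise added.
# The fitted slope equals the difference between the two group means,
# which is the quantity a two-sample t procedure compares.
g = (rng.uniform(size=n) < 0.5).astype(float)
y_bin = 10.0 + 4.0 * g + rng.normal(scale=1.0, size=n)
beta_bin = fit([g], y_bin)
mean_diff = y_bin[g == 1].mean() - y_bin[g == 0].mean()

print("quadratic coefficients:  ", beta_quad)
print("interaction coefficients:", beta_int)
print("binary slope vs. difference in group means:", beta_bin[1], mean_diff)
```

In the binary case the fitted intercept is the mean of the group coded 0 and the slope is the difference between the group means, which is one way to see the connection to the two-sample comparison mentioned in item 4.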
- Spring '08
- Regression Analysis