Qualitative Variable with k=4

The example in your book shows a qualitative variable with k=3 regions (A, B, C). Using the k-1 rule, the authors create 2 dummy variables to code that qualitative variable. For your reading enjoyment, I have created the following example where k=4 (the qualitative variable has 4 levels).

If the variable REGION has k=4 levels (N, S, E, W), the k-1 rule tells you that you need 4-1=3 dummy variables to code REGION. Let's call those dummy variables South, East, and West (as they appear in the table below).

Assume you have data on 4 stores: the region each store is in, the age of the store, and the store's annual sales. Since REGION is a qualitative variable, we can't use it directly as an independent variable in a regression equation because
we MUST have numerical variables in regression analysis. Therefore, we code 3 dummy variables to represent the 4 levels of REGION. See the sample data below.

Store  Region  South  East  West  Age (years)  Sales ($ millions)
1      North   0      0     0     18           15
2      South   1      0     0     13           22
3      East    0      1     0     2            8
4      West    0      0     1     25           35

I arbitrarily selected North as the level that does not get its own dummy variable, so a store in the North has 0 for South, East, and West. I could just as well have created the dummies North, East, and West, or any other combination of three.

Also, the above is just a sample problem with a few observations so you can get a feel for how such data must be entered. If you try to run this problem in Excel, you will get an error because the sample is too small for so many independent variables: with only 4 observations, there are 5 coefficients to estimate (the intercept plus South, East, West, and the store's age).

Dr. J.
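If you would like to see the k-1 rule carried out step by step, here is a minimal sketch in plain Python using the four stores above. The helper name dummy_code and the variable names are my own choices for illustration and do not come from your book or from Excel. The sketch only builds the dummy columns and counts what a regression would have to estimate; it does not fit a model.

```python
# Hypothetical helper (illustration only): code a qualitative variable
# with k levels as k-1 dummy variables, leaving out a chosen baseline level.
def dummy_code(values, baseline):
    # Collect the non-baseline levels in the order they first appear.
    levels = []
    for value in values:
        if value != baseline and value not in levels:
            levels.append(value)
    # Each observation gets a 1 in the column matching its level, 0 elsewhere.
    rows = [[int(value == level) for level in levels] for value in values]
    return levels, rows


# Sample data from the table above.
regions = ["North", "South", "East", "West"]
age = [18, 13, 2, 25]
sales = [15, 22, 8, 35]  # dependent variable, in $ millions

# North is the baseline, so it has no dummy column of its own.
levels, dummies = dummy_code(regions, baseline="North")

# Independent variables for each store: South, East, West, Age.
X = [dummy_row + [a] for dummy_row, a in zip(dummies, age)]

n_obs = len(X)
n_coef = 1 + len(X[0])  # intercept plus 4 independent variables

print("Dummy variables:", levels)
for store, row in enumerate(X, start=1):
    print("Store", store, row)
print("Observations:", n_obs, "Coefficients to estimate:", n_coef)
```

The last line shows why this tiny sample cannot be run as a regression: there are more coefficients to estimate than there are stores.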
