Economics 140A
Regressor Specification

We begin our exploration of the assumptions underlying the classic linear regression model with the assumption that the regression model is correctly specified. The correct specification is linear in the parameters (coefficients) with an additive error term. To correctly specify a model we must:

1) select the correct independent variables
2) select the correct functional form
3) select the correct stochastic error distribution.

We focus on 1) today.

To motivate how to select regressors, consider a model of primary and secondary school teacher salaries. We begin with selection of qualitative regressors. Qualitative regressors capture qualitative characteristics, such as sex, race, and union status, and are often indicator variables (that take the value 0 or 1). We will see that there is a distinction between regressions with qualitative regressors and cell averaging.

Example. Let $Y_i$ be the annual salary of teacher $i$, in thousands of dollars, and let

$$X_{i,2} = \begin{cases} 1 & \text{if teacher } i \text{ is male} \\ 0 & \text{if teacher } i \text{ is female.} \end{cases}$$

The model is

$$Y_i = \beta_1 + \beta_2 X_{i,2} + U_i.$$

The expected salary of a female teacher is $E(Y_i \mid X_{i,2} = 0) = \beta_1$ and the expected salary of a male teacher is $E(Y_i \mid X_{i,2} = 1) = \beta_1 + \beta_2$.
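The link between the single-indicator regression and the group averages can be checked numerically. The sketch below is only an illustration: the six salaries are hypothetical numbers chosen for the example (they do not come from these notes), and the variable names are our own. It fits the regression of salary on a constant and the male indicator by least squares and compares the intercept, and the intercept plus the coefficient on the indicator, with the two group means.

```python
import numpy as np

# Hypothetical salaries in thousands of dollars (illustrative values, not from the notes).
salary = np.array([42.0, 45.0, 48.0, 50.0, 55.0, 60.0])
# Indicator regressor: 1 if the teacher is male, 0 if the teacher is female.
male = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Design matrix with a column of ones (intercept) and the indicator.
X = np.column_stack([np.ones_like(male), male])

# Least squares estimates of the intercept and the coefficient on the indicator.
beta_hat, *_ = np.linalg.lstsq(X, salary, rcond=None)

# Cell averages computed directly from the data.
female_mean = salary[male == 0].mean()
male_mean = salary[male == 1].mean()

print("intercept:", beta_hat[0], "| female average:", female_mean)
print("intercept + coefficient:", beta_hat[0] + beta_hat[1], "| male average:", male_mean)
```

With a single indicator regressor the two printed pairs agree up to floating-point error, because the model has as many parameters as there are cells.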

(graph)

The expected salaries from the regression model equal the cell averages, that is, the average salary of females and the average salary of males. If we add an additional qualitative regressor, the results are altered. Let

$$X_{i,3} = \begin{cases} 1 & \text{if teacher } i \text{ is a union member} \\ 0 & \text{if teacher } i \text{ is not a union member} \end{cases}$$

and

$$Y_i = \beta_1 + \beta_2 X_{i,2} + \beta_3 X_{i,3} + U_i.$$

The expected salaries are

$\beta_1$ for nonunion females
$\beta_1 + \beta_2$ for nonunion males
$\beta_1 + \beta_3$ for union females
$\beta_1 + \beta_2 + \beta_3$ for union males.

Suppose that there are an equal number of individuals in each cell and the cell averages (in dollars) are

50,000 male, union
20,000 male, nonunion
50,000 female, union
10,000 female, nonunion.

Our least squares estimates are

$\hat{\beta}_1 = 12{,}500$: the intercept, which equals the overall average of 32,500 less half of each premium
$\hat{\beta}_2 = 5{,}000$: the average, over union and nonunion workers, male wage premium
$\hat{\beta}_3 = 35{,}000$: the average, over female and male workers, union wage premium.

The implied expected salaries are 12,500 for nonunion females, 17,500 for nonunion males, 47,500 for union females, and 52,500 for union males, so our point estimates do not directly replicate the cell averages. In fact, we would
