Stata Walkthrough #5: Some Empirical Tricks

This walkthrough explains some common tricks we use for modeling certain situations. To follow along, you will need the data on fuel efficiency (car.dta) from my website.

Trick #1: Including qualitative data as an explanatory variable.

Sometimes we have an explanatory variable that is qualitative; that is, it takes non-numerical values. For example, we might want to know if there is a difference in the fuel efficiency of US-made and foreign-made vehicles; “country of origin” is a qualitative variable. We have this recorded in the variable origin in the database. Type tab origin to see its distribution:

      Place of |
  Origin: 1=US, |
      2=Europe, |
         3=Asia |      Freq.     Percent        Cum.
----------------+-----------------------------------
              1 |         85       54.84       54.84
              2 |         26       16.77       71.61
              3 |         44       28.39      100.00
----------------+-----------------------------------
          Total |        155      100.00

You should not simply include this variable in the regression. The numbers don’t mean anything numerical; they are simply codes for different categories. (A value of “two” doesn’t measure anything; going from a 1 to a 2 or from a 2 to a 3 is not a one-unit increase in anything real.) Instead, you should create a “dummy variable” for the category. (Remember that a dummy variable takes a value of one if some condition is satisfied and a value of zero otherwise.)

Let’s suppose that we’re interested in the fuel efficiency of US-made cars, relative to the rest of the world. We would create a dummy variable for being US made:

    gen usmade = (origin == 1)

Look at the distribution of this variable by typing tab usmade:

     usmade |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         70       45.16       45.16
          1 |         85       54.84      100.00
------------+-----------------------------------
      Total |        155      100.00

This variable takes a value of 1 for the 85 observations that are US made; it takes a value of 0 for the other 70 car makes. You can include this variable in the regression.
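If you want to guard against missing values when building the dummy, here is a minimal sketch. It assumes car.dta sits in your current working directory (adjust the path if it does not). The if !missing(origin) qualifier is an extra safeguard, not part of the command above: without it, Stata would code a car with a missing origin as 0, that is, as foreign made. The tabulation of origin shows all 155 cars have a value, so in this dataset both versions should give the same variable.

```stata
* Sketch only: assumes car.dta is in the current working directory
use car.dta, clear

* Dummy for US-made cars; the if-qualifier leaves usmade missing
* when origin is missing, instead of silently coding it as 0
generate usmade = (origin == 1) if !missing(origin)

* Cross-check the dummy against the original category codes
tabulate origin usmade, missing
```

In the cross-tabulation, every origin = 1 car should fall in the usmade = 1 column, and every origin = 2 or 3 car in the usmade = 0 column.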
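Type reg mpg usmade:

      Source |       SS       df       MS              Number of obs =     154
-------------+------------------------------           F(  1,   152) =   60.32
       Model |  2365.50441     1  2365.50441           Prob > F      =  0.0000
    Residual |  5961.24909   152   39.218744           R-squared     =  0.2841
-------------+------------------------------           Adj R-squared =  0.2794
       Total |  8326.75351   153  54.4232255           Root MSE      =  6.2625

------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      usmade |  -7.881125   1.014783    -7.77   0.000    -9.886026   -5.876225
       _cons |   33.14348   .7539148    43.96   0.000     31.65397    34.63298
------------------------------------------------------------------------------

(Ignore for now that you shouldn’t be doing the univariate regression.) You can interpret the coefficient on usmade as “the difference between the average fuel efficiencies of US-made and foreign-made vehicles...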
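One way to convince yourself of this interpretation is to compute the two group means directly. The sketch below assumes car.dta is in your working directory and rebuilds usmade as defined earlier. With a single dummy regressor, the constant is the average mpg of the usmade = 0 group, and the constant plus the coefficient on usmade is the average mpg of the usmade = 1 group.

```stata
* Sketch only: assumes car.dta is in the current working directory
use car.dta, clear
generate usmade = (origin == 1)

* Average mpg within each group (usmade = 0 and usmade = 1)
tabulate usmade, summarize(mpg)

* Equal-variance two-sample t test of the same difference in means
ttest mpg, by(usmade)

* The regression from the walkthrough, for comparison
regress mpg usmade
```

Note that ttest reports the difference as the usmade = 0 mean minus the usmade = 1 mean, so its sign is the opposite of the regression coefficient; the magnitude and the absolute value of the t statistic should match.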
This note was uploaded on 03/19/2010 for the course ECON 400 taught by Professor Turchi during the Spring '08 term at UNC.
