statawalk5

statawalk5 - Stata Walkthrough #5: Some Empirical Tricks...


Stata Walkthrough #5: Some Empirical Tricks

This walkthrough explains some common tricks we use to model certain situations. To follow along, you will need the data on fuel efficiency (car.dta) from my website.

Trick #1: including qualitative data as an explanatory variable.

Sometimes we have an explanatory variable that is qualitative; that is, it takes non-numerical values. For example, we might want to know if there is a difference in the fuel efficiency of US-made and foreign-made vehicles; “country of origin” is a qualitative variable. We have this recorded in the variable origin in the database. Type tab origin to see its distribution:

       Place of |
        Origin: |
          1=US, |
      2=Europe, |
         3=Asia |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |         85       54.84       54.84
              2 |         26       16.77       71.61
              3 |         44       28.39      100.00
    ------------+-----------------------------------
          Total |        155      100.00

You should not simply include this variable in the regression. The numbers don’t mean anything numerical; they are simply codes for different categories. (A value of “two” doesn’t measure anything; going from a 1 to a 2 or from a 2 to a 3 is not a one-unit increase in anything real.) Instead, you should create a “dummy variable” for the category. (Remember that a dummy variable takes a value of one if some condition is satisfied, and a value of zero otherwise.)

Let’s suppose that we’re interested in the fuel efficiency of US-made cars, relative to the rest of the world. We would create a dummy variable for being US-made:

    gen usmade = (origin == 1)

Look at the distribution of this variable by typing tab usmade:

         usmade |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         70       45.16       45.16
              1 |         85       54.84      100.00
    ------------+-----------------------------------
          Total |        155      100.00

This variable takes a value of 1 for the 85 observations that are US-made; it takes a value of 0 for the other 70 car makes.
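One detail worth checking when building a dummy this way is how missing values are handled. In Stata, the expression (origin == 1) evaluates to 0 when origin is missing, so a car with no recorded origin would be counted as foreign-made. The sketch below illustrates this on a small made-up dataset (the five values are hypothetical, not taken from car.dta), and the name usmade_naive is introduced only for the comparison:

```stata
* Toy data: hypothetical values for illustration, NOT from car.dta
clear
input origin
1
2
3
.
1
end

* Naive dummy: a missing origin is coded 0, because (. == 1) evaluates to 0
generate usmade_naive = (origin == 1)

* Guarded dummy: left missing wherever origin is missing
generate usmade = (origin == 1) if !missing(origin)

* Cross-check both dummies against the original codes, showing missing values
tabulate origin usmade_naive, missing
tabulate origin usmade, missing
```

If origin has no missing values in your data, the two versions are identical and the simpler command is fine.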
You can include this variable in the regression. Type reg mpg usmade:

          Source |       SS       df       MS              Number of obs =     154
    -------------+------------------------------           F(  1,   152) =   60.32
           Model |  2365.50441     1  2365.50441           Prob > F      =  0.0000
        Residual |  5961.24909   152   39.218744           R-squared     =  0.2841
    -------------+------------------------------           Adj R-squared =  0.2794
           Total |  8326.75351   153  54.4232255           Root MSE      =  6.2625

    ------------------------------------------------------------------------------
             mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          usmade |  -7.881125   1.014783    -7.77   0.000    -9.886026   -5.876225
           _cons |   33.14348   .7539148    43.96   0.000     31.65397    34.63298
    ------------------------------------------------------------------------------

(Ignore for now that you shouldn’t be doing the univariate regression.) You can interpret the coefficient on usmade as “the difference between the average fuel efficiencies of US-made and foreign-made vehicles...
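The difference-in-means reading of the dummy coefficient can be checked directly. In a regression of an outcome on a single 0/1 dummy, the constant equals the mean of the outcome in the group coded 0, and the coefficient on the dummy equals the mean in the group coded 1 minus the mean in the group coded 0. A minimal sketch on made-up numbers (the five observations and the scalar names mean_foreign and mean_us are hypothetical, chosen only for illustration):

```stata
* Toy data: hypothetical values for illustration, NOT from car.dta
clear
input mpg usmade
30 0
34 0
32 0
24 1
26 1
end

* Regression of mpg on the dummy alone
regress mpg usmade

* Group means computed by hand (summarize leaves the regression results intact)
summarize mpg if usmade == 0
scalar mean_foreign = r(mean)
summarize mpg if usmade == 1
scalar mean_us = r(mean)

* The two pairs of numbers should agree
display "difference in means:   " scalar(mean_us) - scalar(mean_foreign)
display "coefficient on usmade: " _b[usmade]
display "mean for usmade == 0:  " scalar(mean_foreign)
display "constant:              " _b[_cons]
```

Applied to the output above, this reading says foreign-made cars average about 33.14 mpg (the constant) and US-made cars average about 33.14 - 7.88 = 25.26 mpg.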

This note was uploaded on 03/19/2010 for the course ECON 400 taught by Professor Turchi during the Spring '08 term at UNC.

