
# day2prlm - PADP 8130: Linear Models Simple Linear...



PADP 8130: Linear Models

Simple Linear Regression Model: PRACTICE

Angela Fertig, Ph.D.

## OLS in Stata

    . use "/Users/afertig/Documents/Teaching/PADP8130_Spring2012/day2/day2.dta"
    . drop if faminc>=200000
    (335 observations deleted)
    . drop if faminc==0
    (52 observations deleted)

    . reg faminc educhd

          Source |       SS       df       MS              Number of obs =    7910
    -------------+------------------------------           F(  1,  7908) = 1180.26
           Model |  1.8083e+12     1  1.8083e+12           Prob > F      =  0.0000
        Residual |  1.2116e+13  7908  1.5321e+09           R-squared     =  0.1299
    -------------+------------------------------           Adj R-squared =  0.1298
           Total |  1.3924e+13  7909  1.7605e+09           Root MSE      =   39142

    ------------------------------------------------------------------------------
          faminc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          educhd |    5901.34   171.7755    34.35   0.000     5564.614    6238.065
           _cons |  -18893.11   2260.571    -8.36   0.000    -23324.42   -14461.79
    ------------------------------------------------------------------------------

## Thinking about OLS graphically: Scatter plots

    . set seed 5000
    . bsample 50
    . scatter faminc educhd || lfit faminc educhd

[Figure: scatter plot of total family income in 2008 (y-axis, 0 to 200000) against COMPLETED ED-HD (x-axis, 8 to 18), with the fitted values line.]

Eyeballing it: the y-intercept is negative and the slope is about 50000/8 = 6250.

## Another useful graph: Histograms

    . clear
    . use day2.dta
    . drop if faminc>=200000
    (335 observations deleted)
    . drop if faminc==0
    (52 observations deleted)
    . histogram faminc
    (bin=39, start=0, width=5115.3846)

[Figure: histogram of total family income in 2008 (x-axis, 0 to 200000; y-axis, density).]

## Can change the number of bins

    . histogram faminc, bin(10)
    (bin=10, start=0, width=19950)

[Figure: histogram of total family income in 2008 with 10 bins (x-axis, 0 to 200000; y-axis, density).]

A line/smooth version of the histogram is the kernel density graph:

    kdensity faminc
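The bin settings that Stata reports for the two histograms can be checked by hand, and the kernel density estimate has a compact textbook definition. This is only a sketch: the symbols $k$ (number of bins), $w$ (bin width), $h$ (bandwidth), $K$ (kernel function), and $x_i$ (the $n$ observed incomes) are notation introduced here, not taken from the slides.

$$
k \times w:\qquad 39 \times 5115.3846 \approx 199{,}500, \qquad 10 \times 19950 = 199{,}500
$$

$$
\hat{f}(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)
$$

Both histograms therefore cover the same span of about 199,500, and changing `bin()` only changes how finely that span is cut. The kernel density estimate replaces the bins with a smooth weight $K$ centered on each observation, with the bandwidth $h$ playing the role that the bin width plays in a histogram.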
[Figure: kernel density estimate of total family income in 2008 (x-axis, 0 to 200000; y-axis, density). kernel = epanechnikov, bandwidth = 6.1e+03]

## Let's look at unbiasedness with fancy Stata simulations

Let's assume that our PSID 2009 sample is the true population.

    use day2.dta

We'll take head's education and make a "true" relationship between education and income.

    keep educhd
    set seed 5000
    gen err=rnormal(-2500,2500)
    gen faminc=4000*educhd+err
    save sim.dta, replace
    clear

## For an estimator to be unbiased, the mean of the estimates should be the true value

So, let's sample 100 people from our population 50 times and save the 50 estimates from our OLS estimator.

    forvalues i=1/50 {
        use sim.dta
        bsample 100
        reg faminc educhd
        gen b`i'=_b[educhd]
        drop faminc educhd err
        drop if _n~=1
        save samp`i'.dta, replace
    }
    clear

## Let's look at the distribution of betas and the mean: unbiased!

    use samp1.dta
    forvalues i=2/50 {
        append using samp`i'.dta
    }
    save samp.dta, replace
    gen id=_n
    gen bmean=.
    forvalues i=1/50 {
        replace bmean=b`i' if id==`i'
    }
    sum bmean
    kdensity bmean, xline(4000)

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
           bmean |        50    4000.031    108.5178   3714.805   4303.958

[Figure: kernel density estimate of bmean (x-axis, 3600 to 4400; y-axis, density). kernel = epanechnikov, bandwidth = 35.0667]

Back to Stata tips.

## Fancy graph tips: titles

    scatter faminc educhd || lfit faminc educhd, title("Scatter Plot of Family Income and Head's Education in 2008") caption("Figure 1: Family income rises with education.")

[Figure: the scatter plot with fitted values line, titled "Scatter Plot of Family Income and Head's Education in 2008" and captioned "Figure 1: Family income rises with education."]

## Fancy graph tips: side by sides

    histogram faminc, by(marriedhd)
[Figure: side-by-side histograms of total family income in 2008 (x-axis, 0 to 200000; y-axis, density), with panels "No" and "Yes". Graphs by "Head of household in 2009 is married".]

## Fancy graph tips: combinations

    twoway (histogram faminc) (kdensity faminc)

[Figure: histogram of faminc overlaid with its kernel density (x-axis, 0 to 200000). Legend: Density, kdensity faminc.]

## Organize do files

- I usually have 3 do files for each project:
  - Extract.do: takes raw data and brings it into Stata
    - See PSIDsheet2 for directions on how to get PSID data in a more reproducible way
  - Mkvars.do: cleans up the Stata file, creating the variables I need
  - Analysis.do: takes the clean Stata file and runs sample statistics and regressions for the paper
- Anytime I make big changes to the file, I give it a new name (usually the date)
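The unbiasedness simulation earlier in these notes can also be written more compactly with Stata's `simulate` prefix, which collects one result per replication without saving 50 separate files. The following is a minimal sketch under stated assumptions, not the approach used in the slides: the program name `onedraw` is hypothetical, and a synthetic population (integer education values from 8 to 18, chosen here for illustration) stands in for day2.dta so that the block runs on its own.

```stata
* Sketch only: a synthetic stand-in population replaces day2.dta (assumption).
clear all
set seed 5000
set obs 8000
* Hypothetical education values: integers from 8 to 18 years
gen educhd = 8 + floor(11*runiform())
* Same "true" relationship as in the slides: slope 4000 plus a normal error
gen faminc = 4000*educhd + rnormal(-2500, 2500)
tempfile pop
save `pop'

* One replication: resample 100 people and return the OLS slope
program define onedraw, rclass
    args popfile
    use "`popfile'", clear
    bsample 100
    regress faminc educhd
    return scalar b = _b[educhd]
end

* 50 replications, collected into a dataset with one variable, b
simulate b = r(b), reps(50) nodots: onedraw "`pop'"
summarize b
```

After `simulate` finishes, the data in memory hold one row per replication, so `summarize b` plays the role of `sum bmean` in the slides, and `kdensity b, xline(4000)` would draw the distribution of slopes with the true value marked.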

## This note was uploaded on 03/28/2012 for the course PADP 8130 taught by Professor Fertig during the Spring '12 term at LSU.
