
# Day1prlm - PADP 8130 Linear Models: Simple Linear Regression Model PRACTICE, Angela Fertig, Ph.D., OLS in Stata


1/13/12. PADP 8130: Linear Models. Simple Linear Regression Model: PRACTICE. Angela Fertig, Ph.D.

## OLS in Stata

    . use "/Users/afertig/Documents/Teaching/PADP8130_Spring2012/day2/day2.dta"
    . drop if faminc>=200000
    (335 observations deleted)
    . drop if faminc==0
    (52 observations deleted)

    . reg faminc educhd

          Source |       SS       df       MS              Number of obs =    7910
    -------------+------------------------------           F(  1,  7908) = 1180.26
           Model |  1.8083e+12     1  1.8083e+12           Prob > F      =  0.0000
        Residual |  1.2116e+13  7908  1.5321e+09           R-squared     =  0.1299
    -------------+------------------------------           Adj R-squared =  0.1298
           Total |  1.3924e+13  7909  1.7605e+09           Root MSE      =   39142

    ------------------------------------------------------------------------------
          faminc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          educhd |    5901.34   171.7755    34.35   0.000     5564.614    6238.065
           _cons |  -18893.11   2260.571    -8.36   0.000    -23324.42   -14461.79
    ------------------------------------------------------------------------------

## Thinking about OLS graphically: scatter plots

    . set seed 5000
    . bsample 50
    . scatter faminc educhd || lfit faminc educhd

[Figure: scatter plot of total family income in 2008 (y-axis, 0 to 200000) against COMPLETED ED-HD (x-axis, 8 to 18), with the fitted values line.]

Eyeballing it: the Y-intercept is negative and the slope is about 50000/8 = 6250.

## Another useful graph: histograms

    . clear
    . use day2.dta
    . drop if faminc>=200000
    (335 observations deleted)
    . drop if faminc==0
    (52 observations deleted)
    . histogram faminc
    (bin=39, start=0, width=5115.3846)

[Figure: histogram of total family income in 2008 (x-axis, 0 to 200000; y-axis, Density).]

## Can change the number of bins

    . histogram faminc, bin(10)
    (bin=10, start=0, width=19950)

[Figure: histogram of total family income in 2008 with 10 bins (x-axis, 0 to 200000; y-axis, Density).]

A line/smooth version of the histogram is the kernel density graph:

    kdensity faminc
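A quick arithmetic check on the two histogram notes above may help when reading them. In both cases the number of bins times the reported width gives the same span, from the reported start of 0 up to 199,500, just under the 200,000 cutoff in the `drop` command. The default of 39 bins is also consistent with the rule I recall from Stata's `histogram` documentation, the smaller of $\sqrt{N}$ and $10\log_{10}N$, applied to the $N = 7910$ remaining observations. Treat that rule as an assumption to confirm against the manual for your Stata version.

$$
39 \times 5115.3846 \approx 199{,}500, \qquad 10 \times 19{,}950 = 199{,}500
$$

$$
k = \min\left\{\sqrt{7910},\; 10\log_{10} 7910\right\} \approx \min\{88.9,\; 38.98\} \;\Rightarrow\; k = 39 \text{ after rounding}
$$

So `bin(10)` changes only how finely the same range is cut, not the range itself.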
[Figure: kernel density estimate of total family income in 2008 (x-axis, 0 to 200000; y-axis, Density); kernel = epanechnikov, bandwidth = 6.1e+03.]

## Let’s look at unbiasedness with fancy Stata simulations

• Let’s assume that our PSID 2009 sample is the true population

    use day2.dta

• We’ll take head’s education and make a “true” relationship between education and income

    keep educhd
    set seed 5000
    gen err=rnormal(-2500,2500)
    gen faminc=4000*educhd+err
    save sim.dta, replace
    clear

## For an estimator to be unbiased, the mean of the estimates should be the true value

• So, let’s sample 100 people from our population 50 times and save the 50 estimates from our OLS estimator

    forvalues i=1/50 {
        use sim.dta
        * draw 100 observations with replacement and estimate the slope
        bsample 100
        reg faminc educhd
        gen b`i'=_b[educhd]
        * keep a single row holding only this sample's estimate
        drop faminc educhd err
        drop if _n~=1
        save samp`i'.dta, replace
    }
    clear

## Let’s look at the distribution of betas and the mean → unbiased!

    use samp1.dta
    forvalues i=2/50 {
        append using samp`i'.dta
    }
    save samp.dta, replace
    gen id=_n
    gen bmean=.
    * collect the 50 estimates b1..b50 into one variable
    forvalues i=1/50 {
        replace bmean=b`i' if id==`i'
    }
    sum bmean
    * xline marks the true slope (4000) on the bmean axis
    kdensity bmean, xline(4000)

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
           bmean |        50    4000.031    108.5178   3714.805   4303.958

[Figure: kernel density estimate of bmean (x-axis, 3600 to 4400; y-axis, Density); kernel = epanechnikov, bandwidth = 35.0667.]

## Back to Stata tips

## Fancy graph tips: titles

    scatter faminc educhd || lfit faminc educhd, title("Scatter Plot of Family Income and Head's Education in 2008") caption("Figure 1: Family income rises with education.")

[Figure: "Scatter Plot of Family Income and Head's Education in 2008": total family income in 2008 (y-axis, 0 to 200000) against COMPLETED ED-HD (x-axis, 8 to 18), with the fitted values line. Caption: "Figure 1: Family income rises with education."]

## Fancy graph tips: side by sides

    histogram faminc, by(marriedhd)
[Figure: histograms of total family income in 2008 (x-axis, 0 to 200000; y-axis, Density) in two panels, No (0) and Yes (1). Graphs by "Head of household in 2009 is married".]

## Fancy graph tips: combinations

    twoway (histogram faminc) (kdensity faminc)

[Figure: histogram of faminc (Density) overlaid with kdensity faminc (x-axis, 0 to 200000).]

## Organize do files

• I usually have 3 do files for each project:
  – Extract.do: takes raw data and brings it into Stata
    • See PSIDsheet2 for directions on how to get PSID data in a more reproducible way
  – Mkvars.do: cleans up the Stata file, creating the variables I need
  – Analysis.do: takes the clean Stata file and runs sample statistics and regressions for the paper
• Anytime I make big changes to the file, I give it a new name (usually the date)
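One way to tie the three do files together is a small driver file that runs them in order and logs the output. This is only a sketch: the file name `master.do` and the log name `master.log` are hypothetical and do not come from the notes, and it assumes the three do files sit in the current working directory.

```stata
* master.do: hypothetical driver, not part of the original notes
version 12
clear all
set more off

* close any log left open by an earlier run, then start a fresh one
capture log close
log using master.log, replace text

do Extract.do      // raw data into Stata
do Mkvars.do       // build the analysis variables
do Analysis.do     // sample statistics and regressions

log close
```

Running `do master.do` then rebuilds everything from the raw data, which makes it easier to check that a renamed, dated copy of a file still produces the same results.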
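The unbiasedness exercise earlier in these notes saves 50 small files and appends them. A more compact alternative, offered here as a sketch rather than the approach used in class, is Stata's `simulate` prefix, which collects the returned estimates in memory directly. The program name `olsdraw` is made up for illustration, and the code assumes `sim.dta` has already been created as shown in the notes. Because the random draws are sequenced differently, the mean will not reproduce the notes' figures digit for digit, though it should again land close to 4000.

```stata
* Hypothetical helper: one bootstrap draw of 100 and one OLS slope
capture program drop olsdraw
program define olsdraw, rclass
    use sim.dta, clear              // reload the simulated "population"
    bsample 100                     // 100 observations, with replacement
    regress faminc educhd
    return scalar b = _b[educhd]    // hand the slope back to simulate
end

* Repeat the draw 50 times; the results replace the data in memory
simulate b = r(b), reps(50) seed(5000): olsdraw

summarize b
kdensity b, xline(4000)             // vertical line at the true slope
```

The design trade-off is that `simulate` leaves no intermediate `samp` files behind, which is tidier but also means there is nothing to inspect if a single replication looks odd.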

## This note was uploaded on 03/28/2012 for the course PADP 8130 taught by Professor Fertig during the Spring '12 term at LSU.
