1/31/11

PADP 8120: Data Analysis and Statistical Modeling
Statistical Inference 1
Confidence Intervals PRACTICE
Spring 2011
Angela Fertig, Ph.D.


Open up day2.dta in Stata
    sum agehd
    reg agehd

Open up Excel and fill in:
    mean
    standard deviation
    sample size
Now calculate (matching Stata results):
    standard error
    t
    95% CI low
    95% CI high


Go back to Stata
    bsample 11
    sum agehd
    reg agehd

In a new worksheet in Excel, fill in:
    mean
    standard deviation
    sample size
Now calculate (matching Stata results):
    standard error
    t
    95% CI low
    95% CI high

Because of the small sample size, we have to use the t distribution instead of the normal to get the CI.


Clear Stata and open up day2.dta again
    sum femalehd
    reg femalehd

Open up Excel and fill in:
    mean
    standard deviation
    sample size
Now calculate (matching Stata results):
    standard error
    t
    95% CI low
    95% CI high

Because femalehd is a binary variable, we have to use a different SE formula.


Here's how I drew the graph from lecture today
    set seed 2010
    drawnorm pop, n(100000) means(0) sds(1)
    kdensity pop, student(2) title("") legend()

Try this
    Open day2.dta
    Graph faminc (for incomes < 200000) and overlay a normal distribution on the same graph


Some useful data management commands: append, merge & reshape

Append
    Combines 2 data sets by stacking the observations; it just increases the sample size.

Merge
    Combines 2 data sets by increasing the variables observed on each observation; it increases what you know about each observation.
    So, if you made your beautiful data file for your research project and it took hours, and then you realize that you are missing one variable, merge can fix it. Just get the additional variable in a separate data file, then merge your data file with the new one.

Reshape
    Converts data from long to wide, or vice versa.
    E.g., one observation has information on husbands and wives, but you want husbands and wives to each be their own observation: you need to reshape from wide to long.

    Wide:
    id    wagehd    wagewf
    1     100000    10000
    2     35000     15000

    Long:
    id    person    wage
    1     1         100000
    1     2         10000
    2     1         35000
    2     2         15000


Try it out

Go to the PSID data center, find the wage data for the head and wife, and make the data set.

Merge it with day2.dta
    use day2.dta
    sort fid
    save day2.dta, replace
    use wage.dta
    sort fid
    save wage.dta, replace
    clear
    use day2.dta
    merge fid using wage.dta
    save day4.dta, replace

Then reshape long so husbands and wives are separate observations
    use day4.dta
    reshape long wage, i(fid) j(person)
    save day4r.dta, replace

Make sure the beginning of the variable name is the same for both husband and wife. fid is the unique identifier in the wide format. person is a new variable that will be created to identify the husbands vs. the wives for each fid. If the endings of the variable names are not numbers (for example wagehd and wagewf), either rename them to wage1 and wage2 first or add the string option: j(person) string.


For next time
    Read A&F Ch. 6
    Go to the UCLA Stata module webpage: http://www.ats.ucla.edu/stat/stata/modules/
    Read through the intermediate data management modules (combining data & reshape)
    Homework 3 due next time
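
For reference, the quantities computed by hand in the confidence-interval practice (standard error, t, and the 95% interval bounds) can be written out as below. This is a sketch in standard textbook notation, not taken from the slides: \bar{x}, s, n, \hat{p} and t^{*} are symbols assumed here for the mean, standard deviation, sample size, sample proportion and t critical value. The proportion formula is the usual large-sample one and is presumably the "different SE formula" meant for the binary variable; the slides do not state it.

```latex
% Assumed notation: \bar{x} = sample mean, s = sample standard deviation,
% n = sample size, t^{*} = 97.5th percentile of the t distribution with n-1 df.
\[
  \mathrm{SE} = \frac{s}{\sqrt{n}}, \qquad
  t = \frac{\bar{x}}{\mathrm{SE}}, \qquad
  \text{95\% CI} = \bar{x} \pm t^{*}_{\,n-1}\,\mathrm{SE}
\]

% Binary (0/1) variable: the mean is the sample proportion \hat{p}.
\[
  \mathrm{SE}_{\hat{p}} = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
\]

% For a 0/1 variable the sample variance is s^2 = \frac{n}{n-1}\,\hat{p}(1-\hat{p}),
% so the s/\sqrt{n} version differs only in the denominator:
\[
  \frac{s}{\sqrt{n}} = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n-1}}
\]
```

Here t is the statistic that reg reports for a test of mean = 0, while the interval uses the critical value t*. With a constant-only regression such as reg femalehd, Stata reports the s/sqrt(n) version of the standard error, so a hand calculation with the proportion formula will differ slightly from the Stata output when n is small.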
This note was uploaded on 01/18/2012 for the course PADP 8120 taught by Professor Fertig during the Summer '11 term at UGA.

