1/4/11

PADP 8120: Data Analysis and Statistical Modeling
Introduction
PRACTICE
Spring 2011
Angela Fertig, Ph.D.

Panel Study of Income Dynamics
- Nationally representative longitudinal study of nearly 9000 US families
- Has followed the same families since 1968
- Collects data on economic, health and social behavior
- Also includes a supplement on child development, collecting information on education, health, cognitive and behavioral development and time use

PSID sampling
Consists of 2 independent samples:
- A national sample (SRC): equal probability sample of households from 48 states
- A low-income sample (SEO): sample of low-income households with heads < age 60, from SMSAs in the north and non-SMSAs in the south
Family level questions asked about family, head and "wife"
Individual level questions asked about each family member

Data structure

Let's get some data
- You must use PSID to do your independent research project
- While the data is longitudinal, choose only one cross-section of data
- You can use family or individual questions, but family is easier

Now let's learn some Stata
When you open Stata, four windows should open automatically:
1. The command window is where all your commands are typed.
2. The review window logs all commands (from the command window) as they are entered. Click on an old command, and it will appear again in the command window.
3. The variables window lists all variables in the working file. Click on a variable, and it will appear in the command window.
4. The results window displays results. It can display only a limited number of lines at a time. If your results are going to be very long, use a log file.

Two other important windows
The do-file editor is a workspace where you can write, edit, and save Stata commands. Rather than entering these commands in the command window, you can run them from the do-file editor. The advantage is that you can easily edit and save all your commands.
Click on the pencil and pad icon in the toolbar to view the do-file editor.

The data editor allows you to enter, view, or edit your data file. It looks like a spreadsheet. Typically, variables are listed across the top, and cases are listed down the side. This window must be closed in order to run commands in Stata. Click on the pencil and spreadsheet icon in the toolbar to view the data editor.

Simple Syntax

    [by varlist:] command [varlist] [=exp] [if exp] [in range] [weight] [, options]

- by varlist: Repeat a command for each subset of the data for which the values of the variables in this varlist are equal.
- varlist: The list of variables.
- =exp: Specify the value to be assigned to a variable.
- if exp: Restrict the scope of the command to those observations for which the value of the expression is true.
- in range: Restrict the scope of the command to a specific observation range.
- weight: Indicate the weight to be attached to each observation.
- options: Additional command-specific options.

Syntax, cont.
- Everything in square brackets is optional, so most of your code will be much simpler.
- This description will help you understand the help files.

Let's open up our data in Stata
- Open Stata
- Open up the do-file editor
- Open up the do file downloaded from the PSID
- Change the [path] to your computer's path:
    In Windows, this will be something like "C:\\Documents and Settings\whatever\"
    In Mac, this will be something like "\Users\afertig\Documents\whatever\"
- Add this line to the very bottom of the file:
    save [path]\day1.dta, replace;
- Save this do file
- Run it (click the last icon in the toolbar)

Let's look at the data
There should be 1 command in the review window, a list of variables (ER36002, etc.) in the variables window, and a record of what just happened in the results window.
Useful commands to look at your data:
Try each of these commands. There are a lot of observations. You can see more by hitting the space bar or the green down arrow, or stop it by hitting the red X icon.
- list varlist: lists values of variables
- inspect varlist: provides a summary of a variable, including a histogram
- codebook varlist: produces a codebook describing the data
- describe varlist: describes contents of data
- summarize varlist: provides summary statistics
- tabulate varname1 [varname2]: provides frequencies of responses

Transform/Recode the data
Data are usually not available in precisely the form you might want, so we need to manipulate the data.
Useful commands to transform your data:
- recode: changes/consolidates/rearranges values of a variable, can create a new variable
- generate and replace: creates a new variable and changes values
- drop and keep: deletes variables or observations
- rename: renames first listed variable with second listed name
- tabulate varname, gen(newname): makes dichotomous variables for each value of the variable

Before we do more, let's learn about do files
- We can type commands in the command line and change the data
- But when we close Stata, those commands are gone, and to do it again we'd have to retype everything
- If we want to keep a record of what we have done and be able to re-do it, we should write a program with all of the Stata code; this is a do file

Your first do file
Open a blank do file in the do file editor
Here are the first few lines I always include:
    clear all
    cap log close
    set memory 400M
    set matsize 550
    set more 1
    cd "\Users\afertig\Documents\whatever\"
    log using mkvars20110111.log, replace
    use filename.dta
Save the file; give it a useful name like mkvars20110111.do
Note: it is sometimes useful to write comments in the do file which you don't want Stata to do anything with. Just put a * in front of the line and Stata will ignore it.

Now let's make our variables using the do file
Open up the pdf codebook; let's go through each variable in order
    rename ER36002 fid
    rename ER36017 agehd
    replace agehd=. if agehd==999
    recode ER36018 (2=1) (1=0), into(femalehd)
    recode ER36023 (2/5=0) (8/9=.), into(marriedhd)
Create the following variables:
- Continuous: fid, agehd, faminc
- Dichotomous: femalehd, marriedhd, fstamps, tanf, hisp, black, hsdropout, somecoll, collgrad, educhd
Then, drop all of the ER variables and summarize the new variables
Then, save the new data file:
    save day2.dta, replace

Labels
So you don't forget what you have created or changed, it is useful to give things labels.
Here are the commands:
- To give a variable a label, type
    label variable varname "label"
- To give the values of a variable labels, type
    label define lblname # "label" [# "label"...]
    label values varname [lblname]

Label examples
    label variable fid "2007 family id number"
    label variable femalehd "Head of household in 2007 is female"
    label define noyes 0 "no" 1 "yes"
    label values femalehd noyes
    label variable marriedhd "Head of household in 2007 is married"
    label values marriedhd noyes
(Note: you may not need to label variables if you use rename.)
Create appropriate labels for all of your variables.

Homework
- Read A&F Chs 1-3
- Go to UCLA Stata Starter Kit webpage: http://www.ats.ucla.edu/stat/stata/sk/modules_sk.htm
- Read through the following modules:
    Descriptive information and statistics
    Getting help
    Using "if" for subsetting with Stata commands
    Using and saving Stata data files
    Labeling data, variables and values
    Creating and recoding variables
    Subsetting variables and observations
- Homework 1:
    Go to PSID data center: http://simba.isr.umich.edu/
    Create a new dataset on your own with at least 5 variables different from the ones we did today. Pick variables you might be interested in using for your original research.
    Bring the data into Stata and recode the variables into a nice format. Turn in a summary of your variables (means/percents) that is easily understandable to the average person (me).
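
Practice sketch
As a recap of the syntax, recoding and labeling slides, here is a minimal do-file sketch you can run without the PSID download. It is an illustration under stated assumptions, not part of the original slides: it uses the auto.dta example data that ships with Stata instead of the PSID extract, the variable names expensive and lowrep are made up for the example, and recode is written with its documented generate() option.

```stata
* Sketch only: uses the auto.dta example data shipped with Stata,
* not the PSID extract from the slides.
clear all
sysuse auto

* Look at the data
describe price mpg foreign
summarize price mpg
tabulate foreign

* [in range] and [if exp] restrict the observations a command uses
list make price in 1/5
summarize price if foreign == 1

* [by varlist:] repeats the command for each group;
* bysort sorts the data first, which by requires
bysort foreign: summarize mpg

* Dichotomous variable from a continuous one
* (expensive is a hypothetical name; the cutoff is arbitrary)
generate expensive = (price > 6000) if !missing(price)

* Collapse a 5-point rating into a 0/1 variable
* (lowrep is a hypothetical name; missing ratings stay missing)
recode rep78 (1/2 = 1) (3/5 = 0), generate(lowrep)

* Label the new variables and their values
label variable expensive "Price above 6,000 dollars"
label variable lowrep "Repair record is 1 or 2"
label define noyes 0 "no" 1 "yes"
label values expensive noyes
label values lowrep noyes

* Check the result, including missing values
tabulate expensive lowrep, missing
```

To adapt this to your PSID file, replace sysuse auto with use and your own data file, and swap in the ER variable names and codes from the pdf codebook.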
This note was uploaded on 01/18/2012 for the course PADP 8120 taught by Professor Fertig during the Summer '11 term at University of Georgia Athens.