Semester Project SYA 6305
See the ‘Data Management in Stata’ document.
Present appropriate hypotheses to guide your project.
Keep all of the data and files for this, or any, project in its own exclusive folder.
Save a backup copy of the original (‘master’) data set.
Never save changes to the original (‘master’) data set. Instead,
make the changes by means of do-files and save the results as new,
working versions of the data set (see below on nested do-files).
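As a minimal sketch of this workflow, the do-file below loads a data set, makes one change, and saves the result under a new name. Stata's bundled `auto` data stands in for the project's master file, and the file name `working_v1.dta` and the variable `price_thousands` are illustrative assumptions, not names required by the project.

```stata
* Sketch only: sysuse auto stands in for the project's master data set.
* In a real project this line would be something like:
*     use "master_data.dta", clear      // hypothetical file name
sysuse auto, clear

* Make every change here, in the do-file, never by hand in the Data Editor.
generate price_thousands = price / 1000
label variable price_thousands "Price in thousands of dollars"

* Save under a new name so the master file is never overwritten.
save "working_v1.dta", replace
```

Because the master file is only ever read, rerunning the do-file rebuilds the working version from scratch.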
Log file, command file, and do-file
Make a log file, command file, and do-file for every step.
Nest the do-files: nested, shorter do-files are more manageable than long
do-files (see ‘Nested Do-Files’).
Every step you take must be documented and replicable.
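One possible layout for nested do-files is a short top-level do-file that opens a log and then runs each step in turn. The sketch below assumes hypothetical file names (`master.do`, `01_make_working_data.do`, and so on); the nested files must exist in the project folder for it to run.

```stata
* master.do: hypothetical top-level do-file that runs shorter, nested ones.
capture log close              // close any log left open by an earlier run
log using "project_master.log", text replace

* Each nested do-file handles one step (file names are illustrative).
do "01_make_working_data.do"
do "02_restrict_variables.do"
do "03_inspect_data.do"

log close
```

For interactive work, `cmdlog using "session_commands.txt", replace` starts a command file that records the commands typed in the session, and `cmdlog close` ends it. The file name here is again only an example.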
If helpful, restrict the working version of the data set to the variables that
are relevant to your project (including variables that will help you explore
Log, command, and do-files are necessary for this step, as well as for all of the
help keep; help drop
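A brief sketch of `keep` and `drop`, again using Stata's bundled `auto` data as a stand-in for a working data set. The variables chosen and the output file name are assumptions made for illustration.

```stata
* Stand-in data; in a project, load the working version instead.
sysuse auto, clear

* keep retains only the listed variables.
keep make price mpg weight rep78 foreign

* drop removes listed variables, or observations that meet a condition.
drop weight
drop if missing(rep78)

* Confirm what remains, then save as a new working version.
describe
save "working_restricted.dta", replace
```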
The Stata commands ‘inspect’ and ‘codebook’ provide excellent overviews of
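For illustration, the two commands can be tried on Stata's bundled `auto` data. The variables named below belong to that example data set, not to the project data.

```stata
sysuse auto, clear

* inspect: counts of negative, zero, positive, and missing values,
* plus a small histogram, for each numeric variable listed.
inspect price mpg

* codebook: type, range, number of unique values, missing values,
* and value labels for each variable listed.
codebook rep78 foreign

* A one-line-per-variable summary of the whole data set.
codebook, compact
```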
Key questions to answer about the data
Who funded the collection of the data, and who collected the data and for
Do your answers suggest any possible biases in the
Who do the data represent, how were the data collected, and how many
observations are there?
In view of these questions, to what extent are
the data adequate or not for your study?
What are the data’s variables?
In view of your study’s purpose, are
potentially important variables not included?
If so, what variables are
absent, and why are they important?
How is each of the variables defined and measured?
To what extent are
the definitions and measurements valid or not, and to what extent are the
measurements likely to be reliable (if you have some way of judging)?
(See, e.g., Babbie, The Practice of Social Research, on ‘validity’ and
‘reliability’.)
Insofar as the operationalization could be improved, are
there mitigating circumstances, and if there are, what are they? Are there
ways that you might be able to improve the operationalization of the