PU/DSS/OTR

Linear Regression (ver. 6.0)
Oscar Torres-Reyna
Data Consultant
otorres@princeton.edu
http://dss.princeton.edu/training/

Regression: a practical approach (overview)

We use regression to estimate the unknown effect of changing one variable over another (Stock and Watson, 2003, ch. 4).

When running a regression we are making two assumptions: 1) there is a linear relationship between two variables (i.e. X and Y), and 2) this relationship is additive (i.e. Y = x1 + x2 + ... + xN).

Technically, linear regression estimates how much Y changes when X changes by one unit.

In Stata use the command regress, type:

    regress [dependent variable] [independent variable(s)]
    regress y x

In a multivariate setting we type:

    regress y x1 x2 x3

Before running a regression it is recommended to have a clear idea of what you are trying to estimate (i.e. which are your outcome and predictor variables). A regression makes sense only if there is a sound theory behind it.

Regression: a practical approach (setting)

Example: Are SAT scores higher in states that spend more money on education, controlling for other factors?*

Outcome (Y) variable:
    SAT scores, variable csat in dataset

Predictor (X) variables:
    Per pupil expenditures primary & secondary (expense)
    % HS graduates taking SAT (percent)
    Median household income (income)
    % adults with HS diploma (high)
    % adults with college degree (college)
    Region (region)

* Source: Data and examples come from the book Statistics with Stata (updated for version 9) by Lawrence C. Hamilton (chapter 6). Download the data or search for it at http://www.duxbury.com/highered/. Use the file states.dta (educational data for the U.S.).
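As a minimal sketch of how the general regress syntax might map onto this example, the commands below regress csat first on expense alone and then on the other continuous predictors. The data URL is the one the slides themselves use to load states.dta. Region is left out here because it is categorical, and this sketch makes no assumption about how it should be coded.

```stata
* Load the example data (URL as given in the source slides)
use http://dss.princeton.edu/training/states.dta, clear

* Bivariate: mean composite SAT score on per-pupil expenditure
regress csat expense

* Multivariate: add the other continuous predictors as controls
regress csat expense percent income high college
```

Comparing the coefficient on expense across the two runs is one way to see what "controlling for other factors" does to the estimate.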
Regression: variables

It is recommended first to examine the variables in the model to check for possible errors, type:

    use http://dss.princeton.edu/training/states.dta
    describe csat expense percent income high college region
    summarize csat expense percent income high college region

. describe csat expense percent income high college region

                storage  display   value
variable name   type     format    label    variable label
csat            int      %9.0g              Mean composite SAT score
expense         int      %9.0g              Per pupil expenditures prim&sec
percent         byte     %9.0g              % HS graduates taking SAT
income          double   %10.0g             Median household income, $1,000
high            float    %9.0g              % adults HS diploma
college         float    %9.0g              % adults college degree
region          byte     %9.0g     region   Geographical region

Variable   Obs   Mean       Std. Dev.   Min      Max
region     50    2.54       1.128662    1        4
college    51    20.02157   4.16578     12.3     33.3
high       51    76.26078   5.588741    64.3     86.6
income     51    33.95657   6.423134    23.465   48.61851
33....
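The summary output reports 50 observations for region against 51 for the other variables shown, so one record appears to have no region code. A hedged sketch of how one might look into that, using only built-in Stata commands (how to handle the gap is a modelling decision these slides do not settle here):

```stata
* Load the example data (URL as given in the source slides)
use http://dss.princeton.edu/training/states.dta, clear

* Count observations with a missing region code
count if missing(region)

* Tabulate region, listing missing values as their own category
tabulate region, missing
```

Any regression that includes region would drop observations with a missing region code, which is worth knowing before comparing models.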
This note was uploaded on 12/10/2010 for the course ECON 3101 taught by Professor Staff during the Spring '08 term at Minnesota.