# lecture4 - Economics 10: Introduction to Statistical...

This preview shows pages 1–13. Sign up to view the full content.

Economics 10: Introduction to Statistical Methods

Class #4

- More on Bivariate Statistics
- Data Transformations

Correlation vs. Causation

From The Elusive Quest for Growth: There are many stories about going astray mistaking correlation for causality. The most common story involves nineteenth-century Russian peasants. Supposedly the peasants noticed that the villages with a lot of smallpox also had more doctors’ visits than villages without smallpox. They drew the natural conclusion and started shooting the doctors.

Correlation vs. Causation

History of malaria: In the nineteenth century, doctors did not understand what caused malaria. Based on observation, they developed an “empirical theory”: they observed that people who lived or traveled close to swamps caught malaria. Hence they turned the association between the incidence of malaria and the presence of swamps into a causal relationship, that the incidence of malaria was CAUSED by swamps, and elaborated the theory by arguing that malaria was transmitted by mists, bad airs, and miasmas emitted by swamps and bogs.

Last Class

Bivariate statistics: relationship between variables

- Graphical
  - Scatter plot, “best fit” (least squares) line, smooth fit (local linear regression)
  - In STATA:
    - scatter yvar xvar
    - lfit yvar xvar
    - lowess yvar xvar
- Numerical
  - Covariance, correlation, least-squares regression slope
  - In STATA…

Sample covariance in STATA

$$s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$

The correlate command with the covariance option calculates this in STATA:

correlate educ lnwage, covariance

For short: correl educ lnwage, cov (or, more generally: correl xvar yvar, cov)

Reports the “variance-covariance matrix”…

|   | X | Y |
|---|---|---|
| X | $s_x^2$ = Var(X) | |
| Y | $s_{xy}$ = Cov(X,Y) | $s_y^2$ = Var(Y) |
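
As a cross-check on the covariance formula, here is a minimal Python sketch (not STATA) that computes the three entries of a variance-covariance matrix by hand. The function name `sample_covariance` and the five data points are hypothetical and chosen only for illustration; they are not the lecture's data.

```python
from statistics import mean, variance


def sample_covariance(x, y):
    """Sample covariance of paired data, using the n - 1 divisor."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("need at least two observations")
    x_bar, y_bar = mean(x), mean(y)
    cross_products = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return cross_products / (len(x) - 1)


# Hypothetical data (assumption): years of education and log wage.
educ = [10, 12, 12, 14, 16]
lnwage = [1.8, 2.0, 2.3, 2.4, 3.0]

s_x2 = variance(educ)                    # Var(X), also uses n - 1
s_y2 = variance(lnwage)                  # Var(Y)
s_xy = sample_covariance(educ, lnwage)   # Cov(X, Y)

print("Var(X) =", s_x2)
print("Var(Y) =", s_y2)
print("Cov(X,Y) =", s_xy)
```

The covariance of a variable with itself is its variance, which is why the diagonal of the matrix holds $s_x^2$ and $s_y^2$.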

Sample covariance in STATA

[Figure: STATA output annotated with $n$, $s_x^2$, $s_{xy}$, and $s_y^2$]

Correlation

Could use numbers in the variance-covariance matrix to calculate the correlation by hand…

$$r = \frac{s_{xy}}{s_x s_y} = \frac{s_{xy}}{\sqrt{s_x^2 s_y^2}} = \frac{0.6225}{\sqrt{5.158 \times 0.6519}} = 0.3395$$

…But STATA can also give it to you directly:

correl lnwage educ

Reports the “correlation matrix”
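
The by-hand calculation can be reproduced in a few lines of Python (a sketch, not STATA). The three input numbers are the ones on the slide; the function name `correlation_from_cov` is hypothetical, and assigning 5.158 to $s_x^2$ and 0.6519 to $s_y^2$ is an assumption that does not affect the result, because only their product is used.

```python
import math


def correlation_from_cov(s_xy, s_x2, s_y2):
    """Correlation coefficient from a covariance and two variances."""
    if s_x2 <= 0 or s_y2 <= 0:
        raise ValueError("variances must be positive")
    return s_xy / math.sqrt(s_x2 * s_y2)


# Values taken from the slide (rounded as printed there).
r = correlation_from_cov(s_xy=0.6225, s_x2=5.158, s_y2=0.6519)
print(round(r, 4))
```

Because the inputs are themselves rounded, the result can only be expected to match the reported correlation to about four decimal places.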

[Figure: STATA output with the sample correlation coefficient $r$ labeled]

$R^2$ in “regress” output

[Figure: STATA “regress” output with $R^2$ labeled]

R-squared…

- R-squared is the fraction of variation in y “explained by” variation in x
  - $R^2 = 1$: a perfect fit; all points on regression line
  - $R^2 = 0$: no fit
  - $0 < R^2 < 1$: usual case; higher → better fit
- Related to sample correlation coefficient
  - In bivariate linear regression, it is the square of the sample correlation coefficient:

$$R^2 = r^2$$

Size of $R^2$ says nothing about causality!!

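
The identity between R-squared and the squared correlation in bivariate regression can be checked numerically. The Python sketch below (not STATA) fits a least-squares line by hand, computes R-squared as one minus the ratio of residual to total variation, and compares it with the squared correlation. The function name `fit_r2_and_r` and the data are hypothetical.

```python
from statistics import mean


def fit_r2_and_r(x, y):
    """Return (R-squared, r) for a least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")

    # Least-squares slope and intercept.
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variation left over after fitting the line.
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    r_squared = 1 - ss_resid / syy
    r = sxy / (sxx * syy) ** 0.5
    return r_squared, r


# Hypothetical data (assumption), not from the lecture.
educ = [10, 12, 12, 14, 16]
lnwage = [1.8, 2.0, 2.3, 2.4, 3.0]

r_squared, r = fit_r2_and_r(educ, lnwage)
print(r_squared, r ** 2)
```

The two printed numbers agree up to floating-point error. The check says nothing about whether x causes y; it only confirms the algebraic link between the two measures of fit.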
Data Transformations

Sometimes we want to create new variables in our data that are functions of old variables:

- Conversion: e.g., we want to convert data on height in inches into height in feet
- Standardization: convert fraction of questions answered correctly on the SAT into a score from 400 to 800
- Other, nonlinear transformations: e.g., squares, logs, etc.; more examples below.

In STATA you create a new variable with the “gen”
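
The three kinds of transformation listed above can be sketched in Python (not STATA) as follows. All data values and names are hypothetical. In particular, the straight-line mapping from a fraction correct to the 400 to 800 range is an assumption made for illustration; the slide does not say which mapping is intended.

```python
import math

# Hypothetical data (assumption), not from the lecture.
heights_in = [60.0, 66.0, 72.0]
wages = [7.5, 12.0, 30.0]

# Conversion: inches to feet (a linear rescaling).
heights_ft = [h / 12 for h in heights_in]


def fraction_to_score(frac, low=400, high=800):
    """Map a fraction in [0, 1] onto [low, high] with a straight line.

    The linear form is an assumption for illustration only.
    """
    if not 0 <= frac <= 1:
        raise ValueError("fraction must be between 0 and 1")
    return low + (high - low) * frac


# Nonlinear transformations: squares and natural logs.
heights_sq = [h ** 2 for h in heights_in]
lnwages = [math.log(w) for w in wages]

print(heights_ft)
print(fraction_to_score(0.5))
print(heights_sq)
print(lnwages)
```

Each new list plays the role of a new variable built from an old one, which is the same idea as generating a new column in a dataset.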

## This note was uploaded on 05/05/2010 for the course ECON 010 taught by Professor Giummo during the Spring '08 term at Dartmouth.
