paneldatanotes

# Regression with Panel Data (SW Ch. 8)


A panel dataset contains observations on multiple entities (individuals), where each entity is observed at two or more points in time.

Examples:

- Data on 420 California school districts in 1999 and again in 2000, for 840 observations total.
- Data on 50 U.S. states, each observed in 3 years, for a total of 150 observations.
- Data on 1000 individuals in four different months, for 4000 observations total.

## Notation for panel data

A double subscript distinguishes entities (states) and time periods (years):

- $i$ = entity (state), $n$ = number of entities, so $i = 1, \dots, n$
- $t$ = time period (year), $T$ = number of time periods, so $t = 1, \dots, T$

Suppose we have 1 regressor. The data are:

$(X_{it}, Y_{it})$, $i = 1, \dots, n$, $t = 1, \dots, T$

Panel data with $k$ regressors:

$(X_{1it}, X_{2it}, \dots, X_{kit}, Y_{it})$, $i = 1, \dots, n$, $t = 1, \dots, T$

## Some jargon

- Another term for panel data is longitudinal data.
- Balanced panel: no missing observations.
- Unbalanced panel: some entities (states) are not observed for some time periods (years).

## Why are panel data useful?

With panel data we can control for factors that:

- vary across entities (states) but do not vary over time;
- could cause omitted variable bias if they are omitted;
- are unobserved or unmeasured, and therefore cannot be included in the regression using multiple regression.

Here's the key idea: if an omitted variable does not change over time, then any changes in $Y$ over time cannot be caused by the omitted variable.

## Example of a panel data set: Traffic deaths and alcohol taxes

Observational unit: a year in a U.S. state

- 48 U.S. states, so $n$ = # of entities = 48
- 7 years (1982, …, 1988), so $T$ = # of time periods = 7
- Balanced panel, so total # of observations = 7 × 48 = 336

Variables:

- Traffic fatality rate (# of traffic deaths in that state in that year, per 10,000 state residents)
- Tax on a case of beer
- Other (legal driving age, drunk driving laws, etc.)

[Figure: Traffic death data for 1982. Higher alcohol taxes, more traffic deaths?]

[Figure: Traffic death data for 1988. Higher alcohol taxes, more traffic deaths?]

Why might there be more traffic deaths in states that have higher alcohol taxes?

Other factors that determine the traffic fatality rate:

- Quality (age) of automobiles
- Quality of roads
- "Culture" around drinking and driving
- Density of cars on the road

These omitted factors could cause omitted variable bias.

Example #1: traffic density. Suppose:

(i) High traffic density means more traffic deaths.

(ii) (Western) states with lower traffic density have lower alcohol taxes.

Then the two conditions for omitted variable bias are satisfied. Specifically, "high taxes" could reflect "high traffic density", so the OLS coefficient would be biased positively: high taxes, more deaths.

Panel data lets us eliminate omitted variable bias when the omitted variables are constant over time within a given state.
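The key idea can be illustrated with a small simulation. The sketch below uses made-up data, not the actual traffic fatality data: the names `Z`, `true_effect`, and `ols_slope`, and all numeric values, are assumptions chosen for illustration. It builds a balanced panel with $n = 48$ and $T = 7$ in which a time-invariant omitted variable (think "traffic density") raises both the tax and the fatality rate, then compares a regression on levels with a regression on changes between the first and last year.

```python
import numpy as np

rng = np.random.default_rng(0)

n, T = 48, 7          # entities (states) and time periods (years)
true_effect = -0.5    # assumed causal effect of the tax on the fatality rate

# Hypothetical time-invariant omitted variable Z_i (e.g. traffic density):
# one value per state, constant across all years.
Z = rng.normal(size=n)

# X[i, t]: tax in state i, year t. Its level is positively related to Z_i.
X = Z[:, None] + rng.normal(size=(n, T))

# Y[i, t]: fatality rate. Depends on the tax, on Z_i, and on noise.
Y = true_effect * X + 2.0 * Z[:, None] + rng.normal(scale=0.5, size=(n, T))


def ols_slope(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    x = np.ravel(x)
    y = np.ravel(y)
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


n_obs = X.size  # balanced panel: n * T observations

# Regression on levels, ignoring Z: picks up the omitted variable.
pooled = ols_slope(X, Y)

# Regression of the change in Y on the change in X (last year minus first).
# Z_i is the same in both years, so it drops out of the differences.
changes = ols_slope(X[:, -1] - X[:, 0], Y[:, -1] - Y[:, 0])

print("observations:", n_obs)
print("slope using levels: ", round(pooled, 3))
print("slope using changes:", round(changes, 3))
```

Under these assumptions the levels regression yields a positive slope even though the simulated effect is negative, while the regression on changes lands close to `true_effect`, because the time-invariant `Z` cancels when each state is compared with itself over time.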