lecture_8_slides

# lecture_8_slides - 10-1 Data on 1000 individuals in four...

10-1 Regression with Panel Data (SW Chapter 10)

A panel dataset contains observations on multiple entities (individuals), where each entity is observed at two or more points in time.

Hypothetical examples:

- Data on 420 California school districts in 1999 and again in 2000, for 840 observations total.
- Data on 50 U.S. states, each observed in 3 years, for a total of 150 observations.
- Data on 1000 individuals, in four different months, for 4000 observations total.

10-2 Notation for panel data

A double subscript distinguishes entities (states) and time periods (years):

- $i$ = entity (state), $n$ = number of entities, so $i = 1, \dots, n$
- $t$ = time period (year), $T$ = number of time periods, so $t = 1, \dots, T$

Data: Suppose we have 1 regressor. The data are: $(X_{it}, Y_{it})$, $i = 1, \dots, n$, $t = 1, \dots, T$

10-3 Panel data notation, ctd.

Panel data with $k$ regressors: $(X_{1it}, X_{2it}, \dots, X_{kit}, Y_{it})$, $i = 1, \dots, n$, $t = 1, \dots, T$

- $n$ = number of entities (states)
- $T$ = number of time periods (years)

Some jargon:

- Another term for panel data is longitudinal data.
- Balanced panel: no missing observations (all variables are observed for all entities [states] and all time periods [years]).
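
The balanced-panel definition can be checked mechanically. The following is a minimal Python sketch, not part of the lecture: the function name `is_balanced` and the toy records are hypothetical, and it assumes each observed row is identified by an (entity, period) pair and that a row that is present has all of its variables recorded.

```python
from itertools import product


def is_balanced(observations):
    """Return True if every entity is observed in every time period.

    `observations` is an iterable of (entity, period) pairs, one per
    observed row. Only row presence is checked; missing values inside
    a row are assumed away.
    """
    observed = set(observations)
    entities = {i for i, _ in observed}
    periods = {t for _, t in observed}
    # A balanced panel has exactly n * T rows: one per (i, t) combination.
    return observed == set(product(entities, periods))


# Hypothetical toy panel: n = 2 entities, T = 3 periods.
balanced = [("CA", 1982), ("CA", 1983), ("CA", 1984),
            ("TX", 1982), ("TX", 1983), ("TX", 1984)]
unbalanced = balanced[:-1]  # drop the (TX, 1984) row

print(is_balanced(balanced))    # True
print(is_balanced(unbalanced))  # False
```

Comparing against the full Cartesian product of entities and periods mirrors the definition directly: a panel is balanced exactly when no $(i, t)$ combination is missing.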

10-4 Why are panel data useful?

With panel data we can control for factors that:

- vary across entities (states) but do not vary over time;
- could cause omitted variable bias if they are omitted;
- are unobserved or unmeasured, and therefore cannot be included in the regression using multiple regression.

Here’s the key idea: If an omitted variable does not change over time, then any changes in Y over time cannot be caused by the omitted variable.

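
The key idea can be written out as a short derivation. This is a sketch under an assumed linear model with one regressor. The omitted variable $Z_i$, the coefficients $\beta_0, \beta_1, \beta_2$, the error term $u_{it}$, and the second time period $s$ are symbols introduced here for illustration; they do not appear on the slide. Because $Z_i$ carries no time subscript, it is the same in periods $t$ and $s$ and drops out of the difference:

$$
\begin{aligned}
Y_{it} &= \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}, \\
Y_{is} &= \beta_0 + \beta_1 X_{is} + \beta_2 Z_i + u_{is}, \\
Y_{it} - Y_{is} &= \beta_1 (X_{it} - X_{is}) + (u_{it} - u_{is}).
\end{aligned}
$$

Under these assumptions the change in $Y$ depends only on the change in $X$ and the change in the error term, not on $Z_i$.
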
10-5 Example of a panel data set: Traffic deaths and alcohol taxes

Observational unit: a year in a U.S. state

- 48 U.S. states, so n = # of entities = 48
- 7 years (1982,…, 1988), so T = # of time periods = 7
- Balanced panel, so total # observations = 7 × 48 = 336

Variables:

- Traffic fatality rate (# traffic deaths in that state in that year, per 10,000 state residents)
- Tax on a case of beer
- Other (legal driving age, drunk driving laws, etc.)
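
The fatality-rate definition and the observation count can be sketched in a few lines of Python. The function name `fatality_rate_per_10k` and the input numbers are hypothetical, chosen only to illustrate the arithmetic; they are not values from the dataset.

```python
def fatality_rate_per_10k(deaths, population):
    """Traffic deaths per 10,000 residents for one state in one year."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing so the scaling stays exact for integer inputs.
    return deaths * 10_000 / population


# Made-up inputs for a single state-year (not actual data).
rate = fatality_rate_per_10k(deaths=800, population=4_000_000)
print(rate)  # 2.0

# Balanced panel: one observation per (state, year) pair.
n_states = 48
years = range(1982, 1989)  # 1982 through 1988 inclusive, so T = 7
n_obs = n_states * len(years)
print(n_obs)  # 336
```

Expressing the rate per 10,000 residents makes states of very different sizes comparable, which a raw death count would not.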

10-6 U.S. traffic death data for 1982: Higher alcohol taxes, more traffic deaths?
10-7 U.S. traffic death data for 1988: Higher alcohol taxes, more traffic deaths?

10-8 Why might there be more traffic deaths in states that have higher alcohol taxes?

Other factors that determine the traffic fatality rate:

- Quality (age) of automobiles
- Quality of roads
- “Culture” around drinking and driving
- Density of cars on the road

10-9 These omitted factors could cause omitted variable bias.

Example #1: traffic density.

