day1lm - PADP 8130: Linear Models Introduction ...

PADP 8130: Linear Models
Introduction
Spring 2012
Angela Fertig, Ph.D.

Greetings
•  Partner with someone
•  Find something you have in common
•  Introduce each other

3 hour survival plan
•  Snacks – sign up sheet
   –  During break, we’ll go outside to eat/drink
•  Only 1-1.5 hours of lecture
   –  This means you have to read before class.
•  Last half of class will involve “practice”:
   –  We are going to work in Stata
   –  Work with data
   –  Work some problems

Course Mechanics
•  Prerequisites: 8120 (or some basic statistics & matrix algebra)
•  Required Texts: Greene (any edition); Kennedy (any edition)
•  Grading:
   –  Almost weekly homework sets (10%)
      •  Can work in groups, turn in separate work
   –  2 exams: in-class midterm, take-home final (30% each)
   –  1 group presentation (10%)
      •  Explain a published empirical paper’s results (group grading)
   –  1 paper (20%)
      •  Data/Methods/Results section of your own original research
•  Office hours: Thursdays 10-noon, or by appt
•  Website: http://hogwarts.spia.uga.edu/~afertig/lmodels.html

Course Overview
•  Introduction to linear regression techniques to analyze the relationship between a hypothesized cause and its effect using different types of data
•  Goal is two-fold:
   –  You will be able to understand and criticize the research of others
   –  You will be able to do your own research
Any questions about course before we dive in?

What is Statistics?
Methods for:
–  Designing and conducting empirical research studies
–  Describing collected data
–  Making decisions/inferences about phenomena represented by data

What is Econometrics?
“Field of economics that applies mathematical statistics and the tools of statistical inference to the empirical measurement of relationships postulated by economic theory.”

Arguing causality is a goal
For causation, we need 3 things:
1.  Association: i.e. a statistically significant relationship between the two variables we are interested in
2.  Time ordering: i.e. cause comes before effect. Difficult for social science because we can’t do experiments and we often have “fixed” variables like race.
3.  No alternative explanations, i.e. is it possible…?

For an association, we need an estimator
Key Terms
•  Parameters: characteristics of the population about which we make inferences using sample data (the “truth”)
•  Statistics: corresponding characteristics of the sample data upon which we base our inferences about parameters (estimates of the “truth”)
•  Estimators: the formulas by which the data are transformed into a statistic or an estimate

For time ordering, we need panel data
Types of data
•  Cross-sectional: a random sample where each observation is a different individual/firm with information at a point in time
•  Time series: separate observation for each time period (e.g. stock prices, GDP, unemployment rate)
•  Panel or longitudinal: a random sample where each observation is followed over time

To address alternate explanations, we need multiple regression
•  The relationship could be spurious.
   –  e.g. Ice cream consumption and spousal abuse complaints are associated – should we ban ice cream? No. There is no causal relationship, because both are caused by another variable – hot weather.
•  The relationship could work through another variable (a chain relationship)
   –  e.g. Being employed may be associated with more preventative health care. Why would that be? There is a mediating variable – health insurance. Employed people are much more likely to have health insurance and thus get preventative care.
•  The relationship could be conditional on another variable.
   –  e.g. As the price of cigarettes goes up, cigarette consumption goes down for young adults. There is almost no effect for older smokers (who are more likely to be very addicted). Thus the relationship between cigarette price and consumption is conditional on age.
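
The spurious-relationship case can be sketched with simulated data. This is only an illustration under invented assumptions: the variable names, sample size, and every number in the data-generating process are hypothetical and do not come from the course materials. Temperature drives both ice cream consumption and complaints, and ice cream has no direct effect. A simple regression should then show an association that largely disappears once temperature is added as a second regressor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # hypothetical sample size

# Hypothetical data-generating process: all coefficients are invented.
temperature = rng.normal(25, 5, n)
ice_cream = 2.0 * temperature + rng.normal(0, 5, n)
# Complaints depend on temperature only; there is no ice_cream term.
complaints = 1.5 * temperature + rng.normal(0, 5, n)


def ols(y, *regressors):
    """Return least-squares coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Simple regression: complaints on ice cream alone.
simple = ols(complaints, ice_cream)
# Multiple regression: complaints on ice cream and temperature.
multiple = ols(complaints, ice_cream, temperature)

print(f"ice cream slope, simple regression:   {simple[1]:.3f}")
print(f"ice cream slope, multiple regression: {multiple[1]:.3f}")
print(f"temperature slope, multiple regression: {multiple[2]:.3f}")
```

In this setup the ice cream slope is clearly positive in the simple regression and close to zero once temperature is controlled for, which is the sense in which multiple regression addresses an alternative explanation.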
Criteria of preferred estimators
•  A main focus of this course is knowing how to choose an appropriate estimator
•  We’ll discuss 5 criteria for judging estimators; each researcher has to evaluate the importance of each of these criteria for their particular project

1. Minimizing weighted sum of residuals
First, some terms:
–  Deterministic: a relationship that is exactly determined by some function
–  Stochastic: a relationship that is approximated by some function, but includes some error
–  Disturbance/error/residual term: a term that captures the size of the errors in a stochastic relationship
   •  Errors arise not because our function is a bad one
   •  They arise because measures may not be perfect and because of variability across people

Example
•  Say we could run this experiment:
   –  We have 8 low-income families, each with one daughter, age 10, who scored poorly on a standardized school test
   –  We move these 8 families to different neighborhoods with different poverty rates, and after a year, have the girls take the test again
•  We have 2 variables:
   –  Test score 1 year after the move. This is the dependent variable. This is what we are interested in predicting.
   –  Neighborhood poverty rate. This is the independent variable. This is what we think predicts the dependent variable.
•  Note that we think we have a clear causal “story”. We change the neighborhood poverty rate and the girls’ school performance changes. Generally, causality can be more difficult to ascertain.

Here’s our data

Girl        Poverty rate   Test score
Ava              4%            85
Bella            6%            80
Clara            8%            83
Dolores         10%            75
Evie            12%            60
Fern            14%            70
Gabbie          16%            55
Hermione        18%            50

It appears that higher poverty rates result in lower test scores.

Scatter-plot
I fit a line “by eye” for now by trying to minimize the differences between each point and the line.
[Figure: scatter plot of test score (vertical axis, 50 to 90) against poverty rate (horizontal axis, 0 to 20) with a fitted line; the residual for Evie is marked.]

2. Unbiasedness
First, some terms:
–  Population distribution: We don’t know this, but we want to know about it (e.g. the true mean/parameter).
–  Sample distribution: We know this, and calculate statistics such as the sample mean and the sample standard deviation from it.
–  Sampling distribution: This describes the variability in the value of the sample means amongst all of the possible samples of a certain size.
   •  E.g. draw 2000 repeated samples from the population distribution and plot the distribution of the 2000 sample means
An estimator is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
–  That is, if we could take a large number of samples, we would get the correct estimate “on average” using this estimator.

3. Efficiency
An estimator is efficient if its sampling distribution has small variance.
–  The unbiased estimator with the smallest variance is called the best unbiased estimator.
–  Because it is difficult to determine mathematically which unbiased estimator has the smallest variance, and it is more tractable to find the unbiased linear estimator with the smallest variance, econometricians often focus on the best linear unbiased estimator (BLUE).

4. Mean Square Error (MSE)
MSE combines variance and squared bias (MSE = variance + bias²) so that biased estimators with really low variance can be considered as well.
–  Only used when all unbiased estimators have high variance

5. Asymptotic properties
An estimator may be biased or have high variance for small sample sizes, but it may have “good” properties in extremely large samples (asymptotically).
–  A consistent estimator can be thought to have, in the limit, zero bias and zero variance (large sample equivalent of the minimum MSE)
–  An asymptotically efficient estimator has a variance that goes to zero faster than the variance of any other consistent estimator.

Organizing principle of econometrics
The Classical Linear Regression Model makes 5 assumptions:
1.  The functional form is Y = α+βX+ε
2.  Zero mean of the disturbance/error
3.  Disturbance terms have the same variance (homoskedasticity) & are not correlated with one another (non-autocorrelation)
4.  Uncorrelatedness of regressor and disturbance (regressors fixed in repeated samples)
5.  No exact linear relationships between regressors
...
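
As a rough sketch, the functional form Y = α+βX+ε in assumption 1 can be fit to the eight-family example from the data slide. The slides fit the line by eye. Ordinary least squares, which minimizes the equally weighted sum of squared residuals, is swapped in here as one concrete estimator and is an assumption of this example rather than the method used in the slides. The variable names are illustrative.

```python
import numpy as np

# Data as listed on the slide: poverty rate (%) and test score for each girl.
names = ["Ava", "Bella", "Clara", "Dolores", "Evie", "Fern", "Gabbie", "Hermione"]
poverty = np.array([4, 6, 8, 10, 12, 14, 16, 18], dtype=float)
score = np.array([85, 80, 83, 75, 60, 70, 55, 50], dtype=float)

# OLS estimates for Y = alpha + beta * X + eps:
#   beta  = sum((x - xbar) * (y - ybar)) / sum((x - xbar)^2)
#   alpha = ybar - beta * xbar
x_dev = poverty - poverty.mean()
y_dev = score - score.mean()
beta = (x_dev * y_dev).sum() / (x_dev ** 2).sum()
alpha = score.mean() - beta * poverty.mean()

# Residual = observed score minus the score predicted by the fitted line.
residuals = score - (alpha + beta * poverty)

print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")
for name, resid in zip(names, residuals):
    print(f"{name:<9} residual = {resid:6.2f}")
```

The estimated slope is negative, matching the slide's reading that higher poverty rates go with lower test scores, and Evie's residual is negative because her score lies below the fitted line.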

This note was uploaded on 03/28/2012 for the course PADP 8130 taught by Professor Fertig during the Spring '12 term at LSU.
