The mean and variance of all of the $\bar{Y}$ values are the mean and variance of its
sampling distribution, E($\bar{Y}$) and var($\bar{Y}$).
(remember: $\bar{Y}$ is a sample statistic.)
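A minimal sketch of this idea, assuming a Bernoulli population with a made-up success probability p = 3/10 and a sample size of n = 2 (neither value comes from the slides). It enumerates every possible sample, builds the exact distribution of the sample mean, and computes the mean and variance of that distribution. The function name sampling_distribution is hypothetical.

```python
from fractions import Fraction
from itertools import product


def sampling_distribution(p, n):
    """Exact distribution of the sample mean of n i.i.d. Bernoulli(p) draws."""
    dist = {}
    for sample in product([0, 1], repeat=n):
        # Probability of this particular sample under independence.
        prob = Fraction(1)
        for y in sample:
            prob *= p if y == 1 else 1 - p
        ybar = Fraction(sum(sample), n)
        dist[ybar] = dist.get(ybar, Fraction(0)) + prob
    return dist


p = Fraction(3, 10)  # assumed population P(Y = 1), illustrative only
n = 2                # assumed sample size, illustrative only

dist = sampling_distribution(p, n)

# Mean and variance of the sampling distribution of the sample mean.
mean_ybar = sum(value * prob for value, prob in dist.items())
var_ybar = sum((value - mean_ybar) ** 2 * prob for value, prob in dist.items())

for value in sorted(dist):
    print(f"P(Ybar = {value}) = {dist[value]}")
print("E(Ybar)   =", mean_ybar)
print("var(Ybar) =", var_ybar)
```

For a Bernoulli population the two results should equal p and p(1 - p)/n, which gives a quick check on the enumeration.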
VIP: The concept of the sampling distribution underpins all of inference in
-1 ≤ corr(X,Z) ≤ 1
corr(X,Z) = 1 means perfect positive linear association
corr(X,Z) = -1 means perfect negative linear association
corr(X,Z) = 0 means no linear association
The correlation coefficient is unitless, so it avoids the problems of the covariance
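A short sketch of why a unitless measure is convenient, using the sample analogues of the covariance and correlation. The data values and the helper names covariance and correlation are made up for illustration. Rescaling X (here, metres to centimetres) multiplies the covariance by the same factor but leaves the correlation unchanged.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(x, z):
    """Sample covariance with the usual n - 1 divisor."""
    mx, mz = mean(x), mean(z)
    return sum((a - mx) * (b - mz) for a, b in zip(x, z)) / (len(x) - 1)


def correlation(x, z):
    """Sample correlation: covariance divided by the two standard deviations."""
    return covariance(x, z) / math.sqrt(covariance(x, x) * covariance(z, z))


# Hypothetical data: X measured in metres, Z in arbitrary units.
x_m = [1.0, 2.0, 3.0, 4.0, 5.0]
z = [2.1, 3.9, 6.2, 8.1, 9.8]

# The same X values expressed in centimetres.
x_cm = [100 * v for v in x_m]

print("cov  (metres):     ", covariance(x_m, z))
print("cov  (centimetres):", covariance(x_cm, z))
print("corr (metres):     ", correlation(x_m, z))
print("corr (centimetres):", correlation(x_cm, z))
```

An exactly linear relationship with positive slope gives a correlation of 1 with these helpers, and one with negative slope gives -1.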
= P(Y=1| X=0) = ? (hint: recall the answer above)
Conditional Distribution (cont.)
Question from previous slide (cont.)
prob. of short commute (Y=1) if you know it's raining (X=0)
Now, check your answer by calculation
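The calculation can be sketched as follows. The joint probabilities below are made up for illustration and are not the numbers from the slides; only the coding (X=0 rain, Y=1 short commute) follows the text. The conditional probability is the joint probability divided by the marginal probability of the conditioning event.

```python
from fractions import Fraction

# Hypothetical joint distribution P(X = x, Y = y), keyed by (x, y).
# X = 0: rain, X = 1: no rain; Y = 0: long commute, Y = 1: short commute.
joint = {
    (0, 0): Fraction(10, 100),
    (0, 1): Fraction(20, 100),
    (1, 0): Fraction(15, 100),
    (1, 1): Fraction(55, 100),
}


def marginal_x(joint, x):
    """P(X = x): sum the joint probabilities over all values of Y."""
    return sum(prob for (xi, _), prob in joint.items() if xi == x)


def conditional_y_given_x(joint, y, x):
    """P(Y = y | X = x) = P(X = x, Y = y) / P(X = x)."""
    return joint[(x, y)] / marginal_x(joint, x)


p_short_given_rain = conditional_y_given_x(joint, y=1, x=0)
print("P(Y=1 | X=0) =", p_short_given_rain)  # 2/3 for these made-up numbers
```

Substituting the probabilities from the actual joint distribution table gives the answer to the question on the slide.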
Moments (cont.) (note 1-17)
kurtosis = $E[(Y - \mu_Y)^4] / \sigma_Y^4$
= measure of mass in tails
= measure of probability of large values
kurtosis = 3: normal distribution
kurtosis > 3: heavy tails (leptokurtic)
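A rough simulation sketch of these two cases, assuming the standard definition of kurtosis as the fourth central moment divided by the squared variance. The heavy-tailed example is a Laplace distribution, generated as the difference of two independent exponential draws; that choice, the seed, and the sample size are illustrative assumptions, not taken from the slides.

```python
import random


def kurtosis(values):
    """Fourth central moment divided by the squared second central moment."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    return m4 / m2 ** 2


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

# Normal draws: kurtosis should be close to 3.
normal_draws = [rng.gauss(0, 1) for _ in range(n)]

# Laplace draws (difference of two exponentials): heavier tails than the normal.
laplace_draws = [rng.expovariate(1) - rng.expovariate(1) for _ in range(n)]

k_normal = kurtosis(normal_draws)
k_laplace = kurtosis(laplace_draws)
print("kurtosis, normal sample: ", k_normal)
print("kurtosis, Laplace sample:", k_laplace)
```

The estimates vary from sample to sample, so the normal case will be near 3 rather than exactly 3.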
Two random variables
Random variables X and Y
(a) Single random variable (note 1-10)
The group or collection of all possible entities of interest (school districts)
We will think of populations as infinitely large ($\infty$ is an approximation to "very big")
What's a sample?
Standard deviation across districts = 19.1
Is this a big enough difference to be important for school reform discussions, for parents,
or for a school committee?
What does this tell us about the population?
2. Hypothesis testing (note 1
Learn to evaluate the regression analysis of others (this means you will be able to
read/understand empirical economics papers in other econ courses);
Get some hands-on experience with regression analysis in your problem sets.
Introduction to Econometrics is the title of the text
What is econometrics?
What is it?
Science (& art!)
Broadly, using theory and statistical methods to analyze data
What are some uses?
Forecast values (e.g., firms' sales, unemp
var($\bar{Y}$) is inversely proportional to n
the spread (standard deviation) of the sampling
distribution is proportional to $1/\sqrt{n}$
Thus the sampling uncertainty associated with $\bar{Y}$ is
proportional to $1/\sqrt{n}$ (larger samples, less uncertain
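The square-root scaling can be checked with a small simulation. This is only a sketch: the Uniform(0, 1) population, the sample sizes 25 and 100, the number of replications, and the helper name sd_of_sample_mean are all assumptions made for illustration. Quadrupling n should roughly halve the standard deviation of the sample mean.

```python
import math
import random
import statistics


def sd_of_sample_mean(n, reps, rng):
    """Standard deviation of the sample mean across `reps` simulated samples of size n."""
    means = []
    for _ in range(reps):
        draws = [rng.random() for _ in range(n)]  # Uniform(0, 1) population (assumed)
        means.append(sum(draws) / n)
    return statistics.pstdev(means)


rng = random.Random(1)  # fixed seed so the run is repeatable
reps = 2000

sd_25 = sd_of_sample_mean(25, reps, rng)
sd_100 = sd_of_sample_mean(100, reps, rng)

# Population standard deviation of Uniform(0, 1) is sqrt(1/12).
sigma = math.sqrt(1 / 12)

print("simulated sd, n = 25: ", sd_25, " theory:", sigma / math.sqrt(25))
print("simulated sd, n = 100:", sd_100, " theory:", sigma / math.sqrt(100))
print("ratio (should be near 2):", sd_25 / sd_100)
```

The ratio is near 2 rather than 4, which is the practical meaning of the square-root scaling: four times the data buys only half the sampling uncertainty.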
We will assume simple random sampling
Choose an individual (district, entity) at random from the population
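A minimal sketch of simple random sampling, assuming a finite stand-in population of 1,000 made-up district test scores (the values, the population size, and the sample size are all hypothetical). Each entity has the same chance of being selected, and a different random draw yields a different sample.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: one made-up test score per district.
population = [round(rng.gauss(650, 20), 1) for _ in range(1000)]

# Simple random sample: every district is equally likely to be chosen.
n = 10
sample = rng.sample(population, n)

sample_mean = sum(sample) / n
print("sample:     ", sample)
print("sample mean:", sample_mean)
```

Here random.Random.sample draws without replacement, which is a reasonable stand-in when the population is much larger than the sample.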
Randomness and data
Prior to sample selection, the value of Y is random because the individual selected is
Once the indivi