Multiple Regression
Analysis: Inference
•
Statistical inference in the regression model
•
Hypothesis tests about population parameters
•
Construction of confidence intervals
•
Sampling distributions of the OLS estimators
•
The OLS estimators are random variables
•
We already know their expected values and their variances
•
However, for hypothesis tests we need to know their distribution
•
In order to derive their distribution we need additional assumptions
•
Assumption about distribution of errors: normal distribution
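The idea that the OLS estimators are random variables can be illustrated with a small Monte Carlo sketch. Everything here is an assumption made for illustration: the helper names `ols` and `simulate_estimates`, the true coefficients, the sample size, and the use of NumPy. The regressors are held fixed across replications, so only the normally distributed errors change from sample to sample.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for y = X b + u (X already contains a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def simulate_estimates(n=50, reps=2000, beta=(1.0, 0.5, -0.3), sigma=1.0, seed=0):
    """Re-estimate the same model on many samples that differ only in the errors."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta)
    # Hypothetical design: a constant and two regressors, fixed across replications.
    X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
    estimates = np.empty((reps, beta.size))
    for r in range(reps):
        u = rng.normal(scale=sigma, size=n)   # normal errors, as in MLR.6
        estimates[r] = ols(X, X @ beta + u)
    return X, estimates

X, est = simulate_estimates()
# Variance of the estimators for this fixed design: sigma^2 * diag((X'X)^{-1}), sigma = 1.
theory_var = np.linalg.inv(X.T @ X).diagonal()

print("mean of estimates:    ", est.mean(axis=0))
print("simulated variances:  ", est.var(axis=0, ddof=1))
print("theoretical variances:", theory_var)
```

The estimates differ from replication to replication, their averages should lie close to the true coefficients, and their simulated variances should lie close to the theoretical ones. What the mean and variance alone do not reveal is the shape of the distribution, which is what the normality assumption supplies.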

•
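Assumption MLR.6 (Normality of error terms)
$$u \sim \mathrm{Normal}(0, \sigma^2) \quad \text{independently of } x_1, x_2, \dots, x_k$$
It is assumed that the unobserved factors are normally distributed around the population regression function. The form and the variance of the distribution do not depend on any of the explanatory variables.
It follows that:
$$y \mid x_1, \dots, x_k \sim \mathrm{Normal}(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k, \ \sigma^2)$$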

•
Discussion of the normality assumption
•
The error term is the sum of “many” different unobserved factors
•
Sums of many independent factors are approximately normally distributed (CLT)
•
Problems:
•
How many different factors? Number large enough?
•
Possibly very heterogeneous distributions of individual factors
•
How independent are the different factors?
•
The normality of the error term is an empirical question
•
At least the error distribution should be “close” to normal
•
In many cases, normality is questionable or impossible by definition
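The CLT argument and its caveats can be explored with a rough simulation. The sketch below builds a hypothetical error term as a sum of independent, mean-zero factors drawn from three quite different distributions and tracks how skewness and excess kurtosis (both zero for a normal distribution) change with the number of factors. The factor distributions, the function names, and the draw counts are all assumptions chosen for illustration.

```python
import random
import statistics

def skewness(xs):
    m, s = statistics.fmean(xs), statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

def excess_kurtosis(xs):
    m, s = statistics.fmean(xs), statistics.pstdev(xs)
    return sum(((x - m) / s) ** 4 for x in xs) / len(xs) - 3.0

def simulated_error(rng, n_factors):
    """One draw of an error term: a sum of independent, heterogeneous factors."""
    total = 0.0
    for j in range(n_factors):
        kind = j % 3
        if kind == 0:
            total += rng.expovariate(1.0) - 1.0            # skewed, centered at zero
        elif kind == 1:
            total += rng.uniform(-1.0, 1.0)                # symmetric and bounded
        else:
            total += 1.0 if rng.random() < 0.5 else -1.0   # two-point factor
    return total

rng = random.Random(0)
results = {}
for n_factors in (1, 3, 30, 300):
    draws = [simulated_error(rng, n_factors) for _ in range(5000)]
    results[n_factors] = (skewness(draws), excess_kurtosis(draws))
    print(n_factors, [round(v, 3) for v in results[n_factors]])
```

With a single skewed factor the sum is clearly non-normal, while with a few hundred factors both measures should be close to zero. The sketch builds in independence by construction, so it says nothing about the third problem listed above: correlated factors are not covered by this argument.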

•
Discussion of the normality assumption (cont.)
•
Examples where normality cannot hold:
•
Wages (nonnegative; also: minimum wage)
•
Number of arrests (takes on a small number of integer values)
•
Unemployment (indicator variable, takes on only 1 or 0)
•
In some cases, normality can be achieved through transformations of the dependent variable (e.g. use log(wage) instead of wage)
•
Under normality, OLS is the best unbiased estimator, even among nonlinear estimators
•
Important: For the purposes of statistical inference, the assumption of normality can be replaced by a large sample size
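The last point, that a large sample can stand in for normality, can be checked informally by simulation. The sketch below draws skewed (centered exponential) errors, tests a coefficient whose true value is zero using the normal critical value 1.96, and records how often the test rejects. The data-generating process, the function name `rejection_rate`, and the sample size of 500 are illustrative assumptions.

```python
import numpy as np

def rejection_rate(n, reps=2000, seed=0):
    """Share of replications with |t| > 1.96 for a coefficient that is truly zero."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
        u = rng.exponential(1.0, size=n) - 1.0     # skewed, mean-zero, non-normal errors
        y = 1.0 + 0.5 * X[:, 1] + u                # the coefficient on X[:, 2] is zero
        XtX_inv = np.linalg.inv(X.T @ X)
        b = XtX_inv @ X.T @ y
        resid = y - X @ b
        sigma2_hat = resid @ resid / (n - X.shape[1])   # degrees of freedom: n - k - 1
        se = np.sqrt(sigma2_hat * np.diag(XtX_inv))
        rejections += int(abs(b[2] / se[2]) > 1.96)
    return rejections / reps

rate_large = rejection_rate(n=500)
print("rejection rate with n = 500:", rate_large)
```

If the large-sample approximation works, the rejection rate should be near the nominal 5 percent even though the errors are far from normal. This is a single design, so it illustrates the claim rather than establishing it.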

•
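Terminology
MLR.1 – MLR.5: “Gauss-Markov assumptions”
MLR.1 – MLR.6: “Classical linear model (CLM) assumptions”
•
Theorem 4.1 (Normal sampling distributions)
Under assumptions MLR.1 – MLR.6:
$$\hat{\beta}_j \sim \mathrm{Normal}\big(\beta_j, \ \mathrm{Var}(\hat{\beta}_j)\big)$$
The estimators are normally distributed around the true parameters with the variance that was derived earlier.
$$\frac{\hat{\beta}_j - \beta_j}{\mathrm{sd}(\hat{\beta}_j)} \sim \mathrm{Normal}(0, 1)$$
The standardized estimators follow a standard normal distribution.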

•
Testing hypotheses about a single population parameter
•
Theorem 4.2 (t-distribution for the standardized estimators)
Under assumptions MLR.1 – MLR.6:
$$\frac{\hat{\beta}_j - \beta_j}{\mathrm{se}(\hat{\beta}_j)} \sim t_{n-k-1}$$
If the standardization is done using the estimated standard deviation (= standard error), the normal distribution is replaced by a t-distribution.
Note: The t-distribution is close to the standard normal distribution if $n-k-1$ is large.
•
Null hypothesis (for more general hypotheses, see below)
$$H_0\colon \beta_j = 0$$
The population parameter is equal to zero, i.e. after controlling for the other independent variables, there is no effect of $x_j$ on $y$.
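As a minimal sketch of how the t statistic for the null hypothesis that a single coefficient is zero could be computed, the code below estimates a regression on simulated data and divides each coefficient by its standard error. The function name `ols_t_stats`, the simulated data, and the true coefficients are assumptions for illustration and do not come from the slides.

```python
import numpy as np

def ols_t_stats(X, y):
    """X: (n, k) regressors without a constant.
    Returns coefficients, standard errors, t statistics for H0: beta_j = 0, and n - k - 1."""
    n, k = X.shape
    Z = np.column_stack([np.ones(n), X])          # add the intercept column
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    beta_hat = ZtZ_inv @ Z.T @ y
    resid = y - Z @ beta_hat
    df = n - k - 1
    sigma2_hat = resid @ resid / df               # unbiased estimate of the error variance
    se = np.sqrt(sigma2_hat * np.diag(ZtZ_inv))   # standard errors
    return beta_hat, se, beta_hat / se, df

# Hypothetical data: the first regressor matters, the second has a true coefficient of zero.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 2))
y = 1.0 + 0.8 * X[:, 0] + 0.0 * X[:, 1] + rng.normal(size=n)

beta_hat, se, t_stats, df = ols_t_stats(X, y)
print("degrees of freedom:", df)
print("t statistics (intercept, x1, x2):", t_stats)
```

Each t statistic would be compared with a critical value of the t-distribution with n-k-1 degrees of freedom. With 197 degrees of freedom that critical value is very close to the standard normal one, in line with the note above.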
