Appendix C
Ordinary Least Squares and Poisson Regression Models
by
Luc Anselin
University of Illinois
Champaign-Urbana, IL
This note provides a brief description of the statistical background, estimators and model
characteristics for a regression specification, estimated by means of both Ordinary Least Squares
(OLS) and Poisson regression.
Ordinary Least Squares Regression
With an assumption of normality for the regression error term, OLS also corresponds to
Maximum Likelihood (ML) estimation. The note contains the statistical model and all expressions
that are needed to carry out estimation and essential model diagnostics.
Both concise matrix notation and more extensive full summation notation are employed, to provide
a direct link to “loop” structures in the software code, except when full summation is too
unwieldy (e.g., for the matrix inverse). Some references are provided for general methodological
descriptions.
Statistical Issues
The classical multivariate linear regression model stipulates a linear relationship between
a dependent variable (also called a response variable) and a set of explanatory variables (also
called independent variables, or covariates). The relationship is stochastic, in the sense that the
model is not exact, but subject to random variation, as expressed in an error term (also called
disturbance term).
Formally, for each observation $i$, the value of the dependent variable, $y_i$, is related to a sum
of $K$ explanatory variables, $x_{ih}$, with $h = 1, \ldots, K$, each multiplied by a regression
coefficient, $\beta_h$, and the random error term, $\varepsilon_i$:
$$
y_i = \sum_{h=1}^{K} x_{ih} \beta_h + \varepsilon_i \tag{C-1}
$$
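The summation in (C-1) maps directly onto a nested loop over observations and explanatory variables. The following is a minimal sketch of that correspondence, not the software's actual code; the function name `linear_predictor_loop` and the small data values are hypothetical and chosen only for illustration.

```python
def linear_predictor_loop(x, beta):
    """Return sum_h x[i][h] * beta[h] for every observation i (explicit loops)."""
    n = len(x)        # number of observations, N
    k = len(beta)     # number of explanatory variables, K
    xb = [0.0] * n
    for i in range(n):            # loop over observations
        total = 0.0
        for h in range(k):        # loop over explanatory variables
            total += x[i][h] * beta[h]
        xb[i] = total
    return xb

# Hypothetical illustration data: first column is the constant term.
x = [[1.0, 2.0, 0.5],
     [1.0, 3.0, 1.5],
     [1.0, 5.0, 2.5]]
beta = [1.0, 0.5, -2.0]      # assumed coefficients
eps = [0.1, -0.2, 0.05]      # assumed error terms

xb = linear_predictor_loop(x, beta)
# Dependent variable as in (C-1): systematic part plus error.
y = [xb[i] + eps[i] for i in range(len(xb))]
```

In an actual application the error terms are unobserved; they are fixed here only so that every term of (C-1) is visible.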
Typically, the first explanatory variable is set equal to one, and referred to as the constant
term. Its coefficient is referred to as the intercept; the other coefficients are slopes. Using a
constant term amounts to extracting a mean effect and is equivalent, as far as the slope estimates
are concerned, to using all variables as deviations from their mean. In practice, it is highly
recommended to always include a constant term.
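The equivalence between including a constant term and working with deviations from the mean can be checked numerically. The sketch below assumes NumPy and uses `numpy.linalg.lstsq` as a generic least-squares solver; the simulated data, the seed, and all variable names are hypothetical and are not taken from the note.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# Simulated dependent variable with assumed intercept 2.0 and slopes 1.5, -0.7.
y = 2.0 + 1.5 * x1 - 0.7 * x2 + rng.normal(scale=0.3, size=n)

# Fit 1: explicit constant term as the first column of X.
X_const = np.column_stack([np.ones(n), x1, x2])
b_const, *_ = np.linalg.lstsq(X_const, y, rcond=None)

# Fit 2: no constant term, all variables as deviations from their mean.
X_dev = np.column_stack([x1 - x1.mean(), x2 - x2.mean()])
y_dev = y - y.mean()
b_dev, *_ = np.linalg.lstsq(X_dev, y_dev, rcond=None)

# The slope estimates coincide; the intercept is recovered from the means.
slopes_match = np.allclose(b_const[1:], b_dev)
intercept_recovered = y.mean() - b_dev @ np.array([x1.mean(), x2.mean()])
```

Both fits yield the same slopes, and the intercept of the first fit equals the mean of the dependent variable minus the slope-weighted means of the explanatory variables.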
In matrix notation, which summarizes all observations, $i = 1, \ldots, N$, into a single compact
expression, an $N$ by 1 vector of values for the dependent variable, $y$, is related to an $N$ by $K$
matrix of values for the explanatory variables, $X$, a $K$ by 1 vector of regression coefficients,
$\beta$, and an $N$ by 1 vector of random error terms, $\varepsilon$:
$$
y = X \beta + \varepsilon \tag{C-2}
$$
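The matrix expression (C-2) and the summation form (C-1) describe the same computation. The sketch below, assuming NumPy, builds both and confirms they agree; the dimensions, coefficient values, and seed are hypothetical, and vectors are stored as one-dimensional arrays of length N or K rather than as explicit N by 1 or K by 1 matrices.

```python
import numpy as np

N, K = 4, 3                      # hypothetical dimensions
rng = np.random.default_rng(1)   # arbitrary seed

# N by K matrix of explanatory variables; the first column is the constant term.
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
beta = np.array([1.0, 0.5, -2.0])        # assumed K coefficients
eps = rng.normal(scale=0.1, size=N)      # assumed N error terms

# Matrix form (C-2): y = X beta + eps.
y = X @ beta + eps

# Summation form (C-1), one observation at a time.
y_loop = np.array([
    sum(X[i, h] * beta[h] for h in range(K)) + eps[i]
    for i in range(N)
])

forms_agree = np.allclose(y, y_loop)
```

The matrix product replaces the inner loop over h for all observations at once, which is why the compact notation is convenient when full summation becomes unwieldy.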