University of California, Los Angeles
Department of Statistics
Statistics C173/C273
Instructor: Nicolas Christou
Spatial statistics
•
Why spatial statistics? Noel Cressie (“Statistics for Spatial Data”) writes that “why, how,
when” are not enough: we need to add “where”.
•
Today, spatial statistics models appear in areas such as mining, geology, hydrology,
ecology, environmental science, medicine, image processing, crop science, epidemiology,
forestry, atmospheric science, etc.
•
We need to develop models that deal with data collected at different spatial locations.
•
The basic components are the spatial locations $\{s_1, s_2, \ldots, s_n\}$ and the data observed
at these locations, denoted as $\{Z(s_1), Z(s_2), \ldots, Z(s_n)\}$.
•
The distance between the observations is important in analyzing spatial data. By
distance we usually mean “Euclidean distance”. However, there are other forms of
distance (e.g. road miles, travel time, etc.). Such distances are modeled through
multidimensional scaling. Here we will consider mostly (if not always) Euclidean distances.
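As a small illustration of the Euclidean distance between locations, the following Python sketch computes all pairwise distances for a set of two-dimensional coordinates. The coordinates and the function names `euclidean_distance` and `distance_matrix` are hypothetical, chosen only for this example.

```python
import math


def euclidean_distance(s_a, s_b):
    """Euclidean distance between two locations given as (x, y) tuples."""
    return math.hypot(s_a[0] - s_b[0], s_a[1] - s_b[1])


def distance_matrix(locations):
    """n x n list of lists holding all pairwise Euclidean distances."""
    return [[euclidean_distance(a, b) for b in locations] for a in locations]


# Hypothetical coordinates s_1, s_2, s_3 (made up for illustration only).
locations = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
D = distance_matrix(locations)

for row in D:
    print(row)
```

The matrix is symmetric with zeros on the diagonal, since the distance from a location to itself is zero.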
•
Consider the following example taught in all introductory statistics courses:
Let the spatial data $Z(s_1), Z(s_2), \ldots, Z(s_n)$ be an i.i.d. sample from $N(\mu, \sigma_0)$. The
MVUE of $\mu$ is
$$\bar{Z} = \frac{1}{n} \sum_{i=1}^{n} Z(s_i)$$
We know that $\bar{Z} \sim N\left(\mu, \frac{\sigma_0}{\sqrt{n}}\right)$, and therefore we can easily construct a 95% confidence
interval for $\mu$ as follows:
$$\bar{Z} \pm 1.96 \frac{\sigma_0}{\sqrt{n}}$$
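The interval above can be computed in a few lines. This is a minimal sketch that assumes the standard deviation $\sigma_0$ is known and the sample is i.i.d.; the data values and the function name `mean_confidence_interval` are hypothetical.

```python
import math


def mean_confidence_interval(z, sigma0, z_crit=1.96):
    """Interval z_bar +/- z_crit * sigma0 / sqrt(n) for a known sigma0.

    z_crit = 1.96 gives the 95% interval used in the text.
    """
    n = len(z)
    z_bar = sum(z) / n                             # sample mean
    half_width = z_crit * sigma0 / math.sqrt(n)    # 1.96 * sigma0 / sqrt(n)
    return z_bar - half_width, z_bar + half_width


# Hypothetical observations Z(s_1), ..., Z(s_4) and an assumed known sigma0.
z = [9.0, 11.0, 10.0, 12.0]
sigma0 = 2.0

lower, upper = mean_confidence_interval(z, sigma0)
print(lower, upper)
```

With these made-up numbers the sample mean is 10.5 and the half-width is 1.96, so the interval runs from about 8.54 to about 12.46.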
•
The previous example assumes an i.i.d. sample. This can be too simplistic for spatial
data. A more realistic assumption is that the data exhibit some spatial correlation.
Suppose this spatial correlation is represented through the covariance function
$$\mathrm{cov}(Z(s_i), Z(s_j)) = \sigma_0^2 \rho^{|i-j|}$$
In the i.i.d. case, $\mathrm{cov}(Z(s_i), Z(s_j)) = 0$ for $i \neq j$ (the observations are independent, therefore
the covariance is zero).
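To make the contrast concrete, the sketch below builds the covariance matrix under the assumption that the covariance function has the form $\sigma_0^2 \rho^{|i-j|}$, and compares it with the i.i.d. case ($\rho = 0$). It also evaluates the variance of the sample mean, using the general identity $\mathrm{var}(\bar{Z}) = \frac{1}{n^2} \sum_{i}\sum_{j} \mathrm{cov}(Z(s_i), Z(s_j))$. The values of $n$, $\sigma_0$ and $\rho$, and the function names, are hypothetical choices for illustration.

```python
def correlated_covariance(n, sigma0, rho):
    """n x n matrix with entries sigma0^2 * rho^|i - j| (assumed form)."""
    return [[sigma0 ** 2 * rho ** abs(i - j) for j in range(n)]
            for i in range(n)]


def variance_of_mean(cov):
    """var(Z_bar) = (sum of all covariance entries) / n^2."""
    n = len(cov)
    return sum(sum(row) for row in cov) / n ** 2


n, sigma0 = 4, 2.0  # hypothetical sample size and standard deviation

# rho = 0 reproduces the i.i.d. case: sigma0^2 on the diagonal, 0 elsewhere.
cov_iid = correlated_covariance(n, sigma0, rho=0.0)
# rho = 0.5 is an arbitrary positive correlation chosen for illustration.
cov_corr = correlated_covariance(n, sigma0, rho=0.5)

var_iid = variance_of_mean(cov_iid)    # equals sigma0^2 / n
var_corr = variance_of_mean(cov_corr)  # larger when rho > 0
print(var_iid, var_corr)
```

For positive $\rho$ the off-diagonal covariances are positive, so the variance of $\bar{Z}$ exceeds $\sigma_0^2 / n$ and the i.i.d. interval would be too narrow.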