March 18, 2003

1.3 Bayes decision theory.

The distinguishing feature of Bayesian statistics is that a probability distribution $\pi$, called a prior, is given on the parameter space $(\Theta, \mathcal{T})$. Sometimes, priors are also considered which may be infinite, such as Lebesgue measure on the whole real line, but such priors will not be treated here, at least for the time being.

A Bayesian statistician chooses a prior $\pi$ based on whatever information on the unknown $\theta$ is available in advance of making any observations in the current experiment. In general, no definite rules are prescribed for choosing $\pi$.

Priors are often useful as technical tools in reaching non-Bayesian conclusions such as admissibility in Theorems 1.2.5 and 1.2.6.

Bayes decision rules were defined near the end of the last section as rules which minimize the Bayes risk and for which the risk is finite. Bayes tests of $P$ vs. $Q$, treated in Theorem 1.1.8, are a special case of Bayes decision rules. We saw in that case that Bayes rules need not be randomized (Remark 1.1.9). The same is true quite generally in Bayes decision theory: if, in a given situation, it is Bayes to choose at random among two or more possible decisions, then the decisions must have equal risks (conditional on the observations) and we may as well just take one of them. Theorem 1.3.1 will give a more precise statement.

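The averaging argument can be checked on a small finite problem. The following is a minimal sketch, not part of the notes: the two-point parameter space, the prior, the likelihood values for one fixed observation, and the loss table are all hypothetical numbers chosen for illustration. For a fixed observation, the posterior expected loss of a randomized choice is a weighted average of the posterior expected losses of the individual actions, so it cannot be smaller than the smallest of them.

```python
from fractions import Fraction as F

# Hypothetical finite problem (illustrative numbers, not from the text).
prior = {"theta0": F(1, 2), "theta1": F(1, 2)}       # prior pi on Theta
likelihood = {"theta0": F(1, 4), "theta1": F(3, 4)}  # P_theta(observed x)
loss = {                                             # L(theta, action)
    "theta0": {"a0": F(0), "a1": F(1)},
    "theta1": {"a0": F(2), "a1": F(0)},
}
actions = ["a0", "a1"]


def posterior(prior, likelihood):
    """Posterior on Theta given the observation, by Bayes' formula."""
    total = sum(prior[t] * likelihood[t] for t in prior)
    return {t: prior[t] * likelihood[t] / total for t in prior}


def posterior_loss(post, action):
    """Expected loss of one fixed action under the posterior."""
    return sum(post[t] * loss[t][action] for t in post)


def mixed_loss(post, weights):
    """Expected loss when the action is drawn with the given probabilities."""
    return sum(w * posterior_loss(post, a) for a, w in weights.items())


post = posterior(prior, likelihood)
pure = {a: posterior_loss(post, a) for a in actions}
best = min(pure.values())
```

With these numbers the posterior puts mass 1/4 on `theta0` and 3/4 on `theta1`, the pure actions have posterior losses 3/2 and 1/4, and every mixture of the two has loss at least 1/4. A mixture can only match the minimum by putting all its weight on actions that already attain it.
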
In game theory, randomization is needed to have a strategy that is optimal even if the opponent knows it and can choose a strategy accordingly. If one knows the opponent’s strategy then it is not necessary to randomize. Sometimes, statistical decision theory is viewed as a game against an opponent called “Nature.” Unlike an opponent in game theory, “Nature” is viewed as neutral, not trying to win the game. Assuming a prior, as in Bayes decision theory, is to assume in effect that “Nature” follows a certain strategy.

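The contrast between the two situations can be illustrated with matching pennies, a standard textbook game that is used here only as an assumed example and does not appear in the notes. The sketch below compares what the row player can guarantee when the opponent knows the row strategy with what the row player can obtain when the opponent's strategy `q` is known, which plays the role of a prior.

```python
from fractions import Fraction as F

# Payoff to the row player in matching pennies (illustrative game,
# not taken from the text): +1 if the coins match, -1 otherwise.
payoff = [[F(1), F(-1)],
          [F(-1), F(1)]]


def expected_payoff(p, q):
    """Row plays row 0 with probability p; column plays column 0 with probability q."""
    row = [p, 1 - p]
    col = [q, 1 - q]
    return sum(row[i] * col[j] * payoff[i][j]
               for i in range(2) for j in range(2))


def guaranteed(p):
    """Worst case for the row player when the opponent knows p.

    The payoff is linear in q, so the minimum is attained at q = 0 or q = 1.
    """
    return min(expected_payoff(p, F(0)), expected_payoff(p, F(1)))


def best_pure_response(q):
    """Best payoff from a pure row strategy against a known q."""
    return max(expected_payoff(F(0), q), expected_payoff(F(1), q))


def best_mixed_response(q, steps=100):
    """Best payoff over a grid of mixed row strategies against a known q."""
    return max(expected_payoff(F(k, steps), q) for k in range(steps + 1))
```

Here `guaranteed(F(0))` and `guaranteed(F(1))` are both -1, while `guaranteed(F(1, 2))` is 0, so against an adversary randomizing helps. Against a known `q`, for instance `F(3, 4)`, the best pure response already achieves the best value found over the grid of mixed strategies.
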
In showing that randomization isn’t needed, it will be helpful to formulate randomization in a fuller way, where we not only choose a probability distribution over the possible actions, but then also choose an action according to that distribution, in a measurable way, as follows:

Definition. A randomized decision rule $d : X \to D_{\mathcal{E}}$ is realizable if there is a probability space $(\Omega, \mathcal{F}, \mu)$ and a jointly measurable function $\delta : X \times \Omega \to A$ such that for each $x$ in $X$, $\delta(x, \cdot)$ has distribution $d(x)$, in other words $d(x)$ is the image measure of $\mu$ by $\delta(x, \cdot)$, $d(x) = \mu \circ \delta(x, \cdot)^{-1}$.

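Written out on sets, the image-measure condition reads as follows. This is only a restatement of the definition; the symbol $\mathcal{E}$ for the σ-algebra on the action space $A$ is an assumption read off from the notation $D_{\mathcal{E}}$.

```latex
\[
  d(x)(B) \;=\; \mu\bigl(\{\omega \in \Omega : \delta(x,\omega) \in B\}\bigr)
  \qquad \text{for all } x \in X,\ B \in \mathcal{E}.
\]
```
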
For example, a randomized test as in Sec. 1.1 is always a realizable rule, where we can take $\Omega$ as the interval $[0, 1]$ with Lebesgue measure and let $\delta(x, t) = d_Q$ if $t \le f(x)$ and $d_P$ otherwise.

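This construction can be sketched in a few lines. In the sketch, the particular function `f` is a hypothetical randomized test chosen for illustration, the decisions are represented by the strings `"d_Q"` and `"d_P"`, and a seeded pseudorandom generator stands in for Lebesgue measure on $[0, 1]$. The point is that the set of $t$ with $\delta(x, t) = d_Q$ has measure $f(x)$, so $\delta(x, \cdot)$ has the distribution prescribed by the randomized test.

```python
import random


def f(x):
    """Hypothetical randomized test: probability of deciding d_Q given x."""
    return min(1.0, max(0.0, x))


def delta(x, t):
    """Realization delta(x, t), with t ranging over Omega = [0, 1]."""
    return "d_Q" if t <= f(x) else "d_P"


def frequency_of_dQ(x, n=100_000, seed=0):
    """Monte Carlo estimate of mu{t : delta(x, t) = d_Q}, which should be near f(x)."""
    rng = random.Random(seed)
    hits = sum(delta(x, rng.random()) == "d_Q" for _ in range(n))
    return hits / n
```

For a value such as `x = 0.3`, the estimated frequency of `"d_Q"` should be close to `f(0.3)`, up to Monte Carlo error.
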
It is shown in the next section that decision rules are realizable under conditions wide enough to cover a great many cases, for example whenever the action space is a subset of a space $\mathbb{R}^k$ with Borel σ-algebra.

It will be shown next that randomization is unnecessary for realizable Bayes rules. The idea is that the Bayes risk of a realizable randomized Bayes rule $d(\cdot)$ is an average of Bayes risks of