CSE 555 Spring 2010 Homework 1: Bayesian Decision Theory
Jason J. Corso
Computer Science and Engineering
SUNY at Buffalo
[email protected]
Date Assigned 13 Jan 2010
Date Due 1 Feb 2010
Homework must be submitted in class. No late work will be accepted.
Problem 1: Bayesian Decision Rule (30%)
Suppose the task is to classify the input signal x into one of K classes ω ∈ {1, 2, ..., K}, such that the action α(x) = i means classifying x into class i. The Bayesian decision rule is to maximize the posterior probability:

α_Bayes(x) = ω* = arg max_ω p(ω | x).
Suppose we replace it by a randomized decision rule, which classifies x into class i with the posterior probability P(ω = i | x), i.e.,

α_rand(x) = ω ∼ p(ω | x).
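As an illustration (not part of the original assignment), the two decision rules can be sketched in a few lines of Python; the posterior array below is a made-up example:

```python
import numpy as np

rng = np.random.default_rng(0)

def alpha_bayes(posterior):
    """Bayes decision rule: pick the class with the largest posterior."""
    return int(np.argmax(posterior))

def alpha_rand(posterior):
    """Randomized decision rule: sample the class label from the posterior."""
    return int(rng.choice(len(posterior), p=posterior))

# Hypothetical posterior p(omega | x) over K = 3 classes for one input x.
posterior = np.array([0.5, 0.3, 0.2])
print(alpha_bayes(posterior))  # always class 0
print(alpha_rand(posterior))   # 0, 1, or 2, drawn with probabilities 0.5 / 0.3 / 0.2
```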
Solution:
Maximizing the posterior probability is equivalent to minimizing the overall risk. Using the zero-one loss function, the overall risk for the Bayes decision rule is:
R_Bayes = ∫ R(α_Bayes(x) | x) p(x) dx = ∫ [1 − max_{j=1,...,K} P(ω_j | x)] p(x) dx.
For simplicity, the class with the maximum posterior probability is abbreviated as ω_max, and we get:
R_Bayes = ∫ (1 − P(ω_max | x)) p(x) dx.
1. What is the overall risk R_rand for this decision rule? Derive it in terms of the posterior probability using the zero-one loss function.
Solution:
For any given x, the probability that class j = 1, ..., K is the correct class is P(ω_j | x). The randomized rule selects class j with probability P(ω_j | x), so when class j is the correct one it selects the correct class with probability P(ω_j | x) and the wrong class with probability 1 − P(ω_j | x).
Averaging over the classes, the zero-one conditional risk becomes

∑_j P(ω_j | x) (1 − P(ω_j | x)).

Therefore,
R_rand = ∫ ∑_j P(ω_j | x) (1 − P(ω_j | x)) p(x) dx
       = ∫ [∑_j P(ω_j | x) − ∑_j P(ω_j | x)²] p(x) dx
       = ∫ [1 − ∑_j P(ω_j | x)²] p(x) dx.
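For a single point x the two conditional risks are easy to compute directly; a minimal numeric check with a hypothetical posterior (not from the assignment):

```python
import numpy as np

# Hypothetical posterior P(omega_j | x) at a single point x, K = 4 classes.
p = np.array([0.4, 0.3, 0.2, 0.1])

# Conditional Bayes risk under zero-one loss: 1 - max_j P(omega_j | x).
r_bayes = 1.0 - p.max()        # ~0.6

# Conditional randomized risk: 1 - sum_j P(omega_j | x)^2.
r_rand = 1.0 - np.sum(p ** 2)  # ~0.7, larger than the Bayes risk

print(r_bayes, r_rand)
```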
2. Show that this risk R_rand is always no smaller than the Bayes risk R_Bayes. Thus, we cannot benefit from the randomized decision.
Solution:
Proving R_rand ≥ R_Bayes is equivalent to proving ∑_j P(ω_j | x)² ≤ P(ω_max | x):
∑_j P(ω_j | x)² ≤ ∑_j P(ω_j | x) P(ω_max | x) = P(ω_max | x),

which proves the claim: R_rand is always no smaller than R_Bayes.
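The inequality can also be sanity-checked numerically by sampling posteriors at random from the probability simplex (an illustration, not part of the original solution):

```python
import numpy as np

rng = np.random.default_rng(42)

# Sample 10,000 posteriors uniformly from the K-simplex (Dirichlet(1,...,1))
# and verify sum_j P(omega_j|x)^2 <= max_j P(omega_j|x) on every draw.
K = 5
posteriors = rng.dirichlet(np.ones(K), size=10_000)
sum_sq = (posteriors ** 2).sum(axis=1)
p_max = posteriors.max(axis=1)
print(bool(np.all(sum_sq <= p_max + 1e-12)))  # True
```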
3. Under what conditions on the posterior are the two decision rules the same?
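Solution: (The preview ends before this part is answered; the following sketch, not from the original document, follows directly from the proof in part 2.) Equality R_rand = R_Bayes holds exactly when, for (almost) every x,

∑_j P(ω_j | x)² = P(ω_max | x),   i.e.   ∑_j P(ω_j | x) [P(ω_max | x) − P(ω_j | x)] = 0.

Every term in the last sum is nonnegative, so each class with nonzero posterior must satisfy P(ω_j | x) = P(ω_max | x): the posterior must be uniform over the classes it assigns nonzero probability. In particular, the deterministic case P(ω_max | x) = 1 qualifies.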