CS 70: Discrete Mathematics and Probability Theory
Spring 2010, Alistair Sinclair
Lecture 18: Multiple Random Variables and Applications to Inference
In many probability problems we have to deal with multiple r.v.'s defined on the same probability space. We have already seen examples of this: for instance, when computing the expectation and variance of a binomial r.v. $X$, it is easier to write it as a sum $X = \sum_{i=1}^{n} X_i$, where $X_i$ represents the result of the $i$-th trial. In inference problems, where we observe certain quantities and use this information to infer other hidden quantities, multiple r.v.'s arise naturally in modeling the situation. We will see some examples of such problems after we go through some of the basics of handling multiple r.v.'s.
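As an illustration (not part of the original notes; the values of $n$, $p$, and the sample count are arbitrary), here is a minimal Python sketch of the indicator decomposition: it simulates $X = \sum_{i=1}^{n} X_i$ and compares the empirical mean and variance against the formulas $E(X) = np$ and $\mathrm{Var}(X) = np(1-p)$.

```python
# Minimal simulation sketch: a binomial r.v. X as a sum of n indicator
# variables X_i, one per trial. Parameters are illustration values only.
import random

n, p = 10, 0.3
num_samples = 100_000

samples = []
for _ in range(num_samples):
    # X_i = 1 if the i-th trial succeeds, else 0; X is the sum of the X_i.
    x = sum(1 for _ in range(n) if random.random() < p)
    samples.append(x)

mean = sum(samples) / num_samples
var = sum((xi - mean) ** 2 for xi in samples) / num_samples

print(f"estimated  E(X) = {mean:.3f},  Var(X) = {var:.3f}")
print(f"predicted  E(X) = {n * p:.3f},  Var(X) = {n * p * (1 - p):.3f}")
```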
Joint Distributions
Consider two random variables $X$ and $Y$ defined on the same probability space. By linearity of expectation, we know that $E(X+Y) = E(X) + E(Y)$. Since $E(X)$ can be calculated if we know the distribution of $X$, and $E(Y)$ can be calculated if we know the distribution of $Y$, this means that $E(X+Y)$ can be computed knowing only the two individual distributions. No information is needed about the relationship between $X$ and $Y$. This is not true if we need to compute, say, $E((X+Y)^2)$, as when we computed the variance of a binomial r.v. This is because $E((X+Y)^2) = E(X^2) + 2E(XY) + E(Y^2)$, and $E(XY)$ depends on the relationship between $X$ and $Y$. How can we capture such a relationship?
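To see concretely that $E(XY)$ is not determined by the individual distributions, here is a small sketch (the two joint distributions are hypothetical examples, not from the notes). Both give $X$ and $Y$ the same individual distributions, so $E(X+Y)$ agrees, yet $E(XY)$, and hence $E((X+Y)^2)$, differs.

```python
# Two joint distributions over {0,1} x {0,1} with identical individual
# distributions for X and Y: one where they are independent, one where
# Y = X with certainty. E(X+Y) agrees, but E(XY) does not.

def expect(joint, f):
    """Expectation of f(x, y) under a joint distribution {(x, y): prob}."""
    return sum(prob * f(x, y) for (x, y), prob in joint.items())

independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
identical   = {(0, 0): 0.50, (1, 1): 0.50}  # Y = X with certainty

for name, joint in [("independent", independent), ("Y = X", identical)]:
    e_sum = expect(joint, lambda x, y: x + y)
    e_xy  = expect(joint, lambda x, y: x * y)
    e_sq  = expect(joint, lambda x, y: (x + y) ** 2)
    print(f"{name:12s} E(X+Y)={e_sum:.2f}  E(XY)={e_xy:.2f}  "
          f"E((X+Y)^2)={e_sq:.2f}")
```

Running this prints $E(X+Y) = 1$ in both cases, but $E(XY) = 0.25$ versus $0.5$, so $E((X+Y)^2) = 1.5$ versus $2$.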
Recall that the distribution of a single random variable $X$ is the collection of the probabilities of all events $X = a$, for all possible values $a$ that $X$ can take on. When we have two random variables $X$ and $Y$, we can think of $(X, Y)$ as a "two-dimensional" random variable, in which case the events of interest are $X = a \wedge Y = b$ for all possible values $(a, b)$ that $(X, Y)$ can take on. Thus, a natural generalization of the notion of distribution to multiple random variables is the following.
Definition 18.1 (joint distribution): The joint distribution of two discrete random variables $X$ and $Y$ is the collection of values $\{(a, b, \Pr[X = a \wedge Y = b]) : (a, b) \in A \times B\}$, where $A$ and $B$ are the sets of all possible values taken by $X$ and $Y$ respectively.
This notion obviously generalizes to three or more random variables. Since we will write $\Pr[X = a \wedge Y = b]$ quite often, we will abbreviate it to $\Pr[X = a, Y = b]$.
Just like the distribution of a single random variable, the joint distribution is normalized, i.e.,
$$\sum_{a \in A,\, b \in B} \Pr[X = a, Y = b] = 1.$$
This follows from noticing that the events $X = a \wedge Y = b$, for $a \in A$, $b \in B$, partition the sample space.
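As a quick sanity check (a sketch with made-up numbers, not from the notes), one can represent a joint distribution as a mapping from value pairs $(a, b)$ to probabilities and verify normalization directly:

```python
# Represent a joint distribution as {(a, b): Pr[X = a, Y = b]} and check
# that the probabilities sum to 1. The entries are illustrative only.
import math

joint = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

total = sum(joint.values())
assert math.isclose(total, 1.0), "probabilities must sum to 1"
print(f"sum over all (a, b) of Pr[X = a, Y = b] = {total}")
```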
The joint distribution of two random variables fully describes their statistical relationship.