3 Axioms of Probability
The question here is: how can we mathematically define a random experiment? What we have
are outcomes (which tell you exactly what happens), events (sets comprising certain
outcomes), and probability (which attaches to every event the likelihood that it happens).
We need to agree on which properties these objects must have in order to compute with them
and develop a theory.
When we have finitely many equally likely outcomes all is pretty clear and we have already
seen many examples. However, as is common in mathematics, infinite sets are much harder to
deal with. For example, we will see soon what it means to choose a random point within a unit
circle. On the other hand, we will also see that there is no way to choose at random a positive
integer — remember that “at random” means all choices are equally likely, unless otherwise
specified.
A probability space is then a triple $(\Omega, \mathcal{F}, P)$. The first object $\Omega$
is an arbitrary set of outcomes, sometimes called a sample space.
The second object $\mathcal{F}$ is the collection of all events, that is, a set of subsets
of $\Omega$. Therefore, $A \in \mathcal{F}$ necessarily means that $A \subset \Omega$. Can
we just say that each $A \subset \Omega$ is an event? In this course, you can assume so
without worry, although there are good reasons for not assuming so in general! I will give
the definition of what properties $\mathcal{F}$ needs to satisfy, but this is just for
illustration and you should take a course in measure theory to understand what is really
going on. Namely, $\mathcal{F}$ needs to be a $\sigma$-algebra, which means
(1) $\emptyset \in \mathcal{F}$,
(2) $A \in \mathcal{F} \implies A^c \in \mathcal{F}$, and
(3) $A_1, A_2, \ldots \in \mathcal{F} \implies \bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$.
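To make conditions (1)-(3) concrete, here is a minimal sketch that tests them for a family of subsets of a finite set. The function name `is_sigma_algebra` and the sample families are my own illustrations, not part of the definition. On a finite set a family has only finitely many members, so closure under countable unions reduces to closure under pairwise unions, and that is what the code checks.

```python
from itertools import combinations

def is_sigma_algebra(omega, family):
    """Check conditions (1)-(3) for a family of subsets of a finite set omega."""
    omega = frozenset(omega)
    family = {frozenset(a) for a in family}
    # Every member of the family must be a subset of omega.
    if any(not a <= omega for a in family):
        return False
    # (1) The empty set is an event.
    if frozenset() not in family:
        return False
    # (2) The complement of every event is an event.
    if any(omega - a not in family for a in family):
        return False
    # (3) Unions of events are events; for a finite family,
    #     checking pairwise unions is enough.
    return all(a | b in family for a, b in combinations(family, 2))

omega = {1, 2, 3, 4}
trivial = [set(), omega]
two_blocks = [set(), {1, 2}, {3, 4}, omega]
not_closed = [set(), {1}, omega]  # the complement {2, 3, 4} is missing

print(is_sigma_algebra(omega, trivial))     # True
print(is_sigma_algebra(omega, two_blocks))  # True
print(is_sigma_algebra(omega, not_closed))  # False
```

The third family fails condition (2), which shows that not every collection of subsets qualifies as a collection of events.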
What is important is that you can take the complement $A^c$ of an event $A$ (i.e., $A^c$
happens when $A$ does not happen), unions of two or more events (i.e., $A_1 \cup A_2$
happens when either $A_1$ or $A_2$ happens), and intersections of two or more events (i.e.,
$A_1 \cap A_2$ happens when both $A_1$ and $A_2$ happen). We call events $A_1, A_2, \ldots$
pairwise disjoint if $A_i \cap A_j = \emptyset$ whenever $i \neq j$; that is, at most one
of these events can occur.
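These operations on events are ordinary set operations, so they can be tried out directly. The sketch below assumes a single roll of a die as the experiment; the events `a1` and `a2` and the helper `pairwise_disjoint` are hypothetical names chosen for illustration.

```python
from itertools import combinations

omega = set(range(1, 7))   # outcomes of one die roll
a1 = {2, 4, 6}             # event: the roll is even
a2 = {4, 5, 6}             # event: the roll is at least 4

complement_a1 = omega - a1  # happens when a1 does not happen
union = a1 | a2             # happens when a1 or a2 happens
intersection = a1 & a2      # happens when both a1 and a2 happen

def pairwise_disjoint(events):
    """Return True if no two of the given events share an outcome."""
    return all(not (a & b) for a, b in combinations(events, 2))

print(sorted(complement_a1))                   # [1, 3, 5]
print(sorted(union))                           # [2, 4, 5, 6]
print(sorted(intersection))                    # [4, 6]
print(pairwise_disjoint([{1}, {2, 3}, {6}]))   # True
print(pairwise_disjoint([a1, a2]))             # False
```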
Finally, the probability $P$ is a number attached to every event $A$ and satisfies the
following three axioms:
Axiom 1. For every event $A$, $P(A) \geq 0$.
Axiom 2. $P(\Omega) = 1$.
Axiom 3. If $A_1, A_2, \ldots$ is a sequence of pairwise disjoint events, then
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i).$$
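As a sanity check, the three axioms can be verified by brute force on a small finite space. The sketch below assumes a fair die, with every subset of outcomes treated as an event; the names `weight`, `prob`, and `all_events` are illustrative only. On a finite space Axiom 3 reduces to additivity over two disjoint events, so the code checks only disjoint pairs.

```python
from fractions import Fraction
from itertools import combinations

omega = frozenset(range(1, 7))
weight = {w: Fraction(1, 6) for w in omega}  # assumption: a fair die

def prob(event):
    """P(event) as the sum of the weights of its outcomes."""
    return sum((weight[w] for w in event), Fraction(0))

def all_events(outcomes):
    """Every subset of the outcomes, taken here as the collection of events."""
    items = list(outcomes)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

events = all_events(omega)

axiom1 = all(prob(a) >= 0 for a in events)
axiom2 = prob(omega) == 1
# Axiom 3 restricted to two disjoint events (finite additivity).
axiom3 = all(prob(a | b) == prob(a) + prob(b)
             for a, b in combinations(events, 2)
             if not (a & b))

print(axiom1, axiom2, axiom3)  # True True True
print(prob({2, 4, 6}))         # 1/2
```

Exact fractions are used instead of floating-point numbers so that the equality tests are not affected by rounding.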
Whenever we have an abstract definition like this, the first thing to do is to look for examples.
Here are some.