Review of Probability Theory

Arian Maleki and Tom Do
Stanford University

Probability theory is the study of uncertainty. Throughout this class, we will rely on concepts from probability theory to derive machine learning algorithms. These notes attempt to cover the basics of probability theory at a level appropriate for CS 229. The mathematical theory of probability is very sophisticated and delves into a branch of analysis known as measure theory. In these notes, we provide a basic treatment of probability that does not address these finer details.

1 Elements of probability

In order to define a probability on a set, we need a few basic elements:

• Sample space $\Omega$: The set of all outcomes of a random experiment. Here, each outcome $\omega \in \Omega$ can be thought of as a complete description of the state of the real world at the end of the experiment.
• Set of events (or event space) $\mathcal{F}$: A set whose elements $A \in \mathcal{F}$ (called events) are subsets of $\Omega$ (i.e., $A \subseteq \Omega$ is a collection of possible outcomes of an experiment).¹
• Probability measure: A function $P : \mathcal{F} \to \mathbb{R}$ that satisfies the following properties:
  - $P(A) \geq 0$ for all $A \in \mathcal{F}$
  - $P(\Omega) = 1$
  - If $A_1, A_2, \ldots$ are disjoint events (i.e., $A_i \cap A_j = \emptyset$ whenever $i \neq j$), then $P(\cup_i A_i) = \sum_i P(A_i)$

These three properties are called the Axioms of Probability.

Example: Consider the experiment of tossing a six-sided die. The sample space is $\Omega = \{1, 2, 3, 4, 5, 6\}$. We can define different event spaces on this sample space. For example, the simplest event space is the trivial event space $\mathcal{F} = \{\emptyset, \Omega\}$. Another event space is the set of all subsets of $\Omega$. For the first event space, the unique probability measure satisfying the requirements above is given by $P(\emptyset) = 0$, $P(\Omega) = 1$.
For the second event space, one valid probability measure assigns each set in the event space the probability $\frac{i}{6}$, where $i$ is the number of elements of that set; for example, $P(\{1, 2, 3, 4\}) = \frac{4}{6}$ and $P(\{1, 2, 3\}) = \frac{3}{6}$.

Properties:
- $A \subseteq B \implies P(A) \leq P(B)$.
- $P(A \cap B) \leq \min(P(A), P(B))$.
- (Union Bound) $P(A \cup B) \leq P(A) + P(B)$.
- $P(\Omega \setminus A) = 1 - P(A)$.
- (Law of Total Probability) If $A_1, \ldots, A_k$ are a set of disjoint events such that $\cup_{i=1}^{k} A_i = \Omega$, then $\sum_{i=1}^{k} P(A_i) = 1$.

¹ $\mathcal{F}$ should satisfy three properties: (1) $\emptyset \in \mathcal{F}$; (2) $A \in \mathcal{F} \implies \Omega \setminus A \in \mathcal{F}$; and (3) $A_1, A_2, \ldots \in \mathcal{F} \implies \cup_i A_i \in \mathcal{F}$.

1.1 Conditional probability and independence

Let $B$ be an event with non-zero probability. The conditional probability of any event $A$ given $B$ is defined as

$$P(A \mid B) \triangleq \frac{P(A \cap B)}{P(B)}.$$

In other words, $P(A \mid B)$ is the probability measure of the event $A$ after observing the occurrence of event $B$. Two events are called independent if and only if $P(A \cap B) = P(A) P(B)$ (or equivalently, $P(A \mid B) = P(A)$). Therefore, independence is equivalent to saying that observing B...
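The die example and the definitions above can be checked numerically. The following is a minimal Python sketch, not part of the original notes: it assumes the uniform measure $P(A) = |A|/6$ on the set of all subsets of $\Omega$, and the names `prob`, `cond_prob`, `independent`, and `power_set` are hypothetical helpers introduced only for illustration. It verifies the listed properties by brute force over all 64 events and then evaluates one conditional probability and two independence checks.

```python
from fractions import Fraction
from itertools import combinations

# Sample space of a single six-sided die (from the example in the notes).
OMEGA = frozenset(range(1, 7))


def prob(event):
    """Uniform measure assumed for this sketch: P(A) = |A| / 6."""
    return Fraction(len(event & OMEGA), len(OMEGA))


def cond_prob(a, b):
    """P(A | B) = P(A ∩ B) / P(B); only defined when P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event must have non-zero probability")
    return prob(a & b) / prob(b)


def independent(a, b):
    """A and B are independent iff P(A ∩ B) = P(A) P(B)."""
    return prob(a & b) == prob(a) * prob(b)


def power_set(s):
    """All subsets of s: the largest event space on a finite sample space."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


EVENTS = power_set(OMEGA)  # 2**6 = 64 events

# Axioms: non-negativity and P(Omega) = 1.
assert all(prob(a) >= 0 for a in EVENTS)
assert prob(OMEGA) == 1

# Properties, checked exhaustively over every pair of events.
for a in EVENTS:
    assert prob(OMEGA - a) == 1 - prob(a)                 # complement rule
    for b in EVENTS:
        assert prob(a & b) <= min(prob(a), prob(b))       # intersection bound
        assert prob(a | b) <= prob(a) + prob(b)           # union bound
        if a <= b:
            assert prob(a) <= prob(b)                     # monotonicity
        if not (a & b):
            assert prob(a | b) == prob(a) + prob(b)       # additivity (disjoint)

even = frozenset({2, 4, 6})
low = frozenset({1, 2})
at_most_3 = frozenset({1, 2, 3})

print(cond_prob(even, low))          # 1/2
print(independent(even, low))        # True:  1/6 == 1/2 * 1/3
print(independent(even, at_most_3))  # False: 1/6 != 1/2 * 1/2
```

Exact `Fraction` arithmetic is used instead of floats so that the equality tests in the axioms and in the independence definition are not affected by rounding.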
