Chapter 3

Building Infinite Processes from Regular Conditional Probability Distributions

Section 3.1 introduces the notion of a probability kernel, which is a useful way of systematizing and extending the treatment of conditional probability distributions you will have seen in 36-752.

Section 3.2 gives an extension theorem (due to Ionescu Tulcea) which lets us build infinite-dimensional distributions from a family of finite-dimensional distributions. Rather than assuming topological regularity of the space, as in Section 2.2, we assume that the FDDs can be derived from one another recursively, through applying probability kernels. This is the same as assuming regularity of the appropriate conditional probabilities.

3.1 Probability Kernels

Definition 30 (Probability kernel) A probability kernel from a measurable space $\Xi, \mathcal{X}$ to another measurable space $\Upsilon, \mathcal{Y}$ is a function $\kappa : \Xi \times \mathcal{Y} \to [0, 1]$ such that

1. for any $Y \in \mathcal{Y}$, $\kappa(x, Y)$ is $\mathcal{X}$-measurable as a function of $x$; and
2. for any $x \in \Xi$, $\kappa(x, Y) \equiv \kappa_x(Y)$ is a probability measure on $\Upsilon, \mathcal{Y}$.

We will write the integral of a function $f : \Upsilon \to \mathbb{R}$, with respect to this measure, as $\int f(y)\,\kappa(x, dy)$, $\int f(y)\,\kappa_x(dy)$, or, most compactly, $\kappa f(x)$.

If condition 1 is satisfied and, for fixed $x$, $\kappa(x, Y)$ is a measure but not a probability measure, then $\kappa$ is called a measure kernel or even just a kernel.

Notice that we can represent any distribution on $\Xi$ as a kernel where the first argument is irrelevant: $\kappa(x_1, Y) = \kappa(x_2, Y)$ for all $x_1, x_2 \in \Xi$. The "kernels" in kernel density estimation are probability kernels, as are the stochastic transition matrices of Markov chains. (The kernels in support vector machines, however, generally are not.) Regular conditional probabilities, which you will remember from 36-752, are all probability kernels. This fact suggests how we define the composition of kernels.
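For a concrete instance of Definition 30, it may help to look at the finite case, where a probability kernel is just a row-stochastic table and the integral κf(x) reduces to a weighted sum. The following is only a minimal sketch: the state spaces {0, 1} and {0, 1, 2}, the table KAPPA, and the function names kernel and integrate are illustrative assumptions, not part of the notes.

```python
# Hypothetical finite example: Xi = {0, 1}, Upsilon = {0, 1, 2}.
# Row x of KAPPA is the probability measure kappa_x on Upsilon.
KAPPA = [
    [0.5, 0.25, 0.25],  # kappa_0
    [0.0, 0.5, 0.5],    # kappa_1
]


def kernel(x, Y):
    """kappa(x, Y): the probability that kappa_x assigns to the set Y."""
    return sum(KAPPA[x][y] for y in Y)


def integrate(f, x):
    """kappa f(x): the integral of f against kappa_x, here a finite sum."""
    return sum(f(y) * p for y, p in enumerate(KAPPA[x]))


if __name__ == "__main__":
    # Each kappa_x is a probability measure: total mass is 1.
    print(kernel(0, {0, 1, 2}), kernel(1, {0, 1, 2}))
    # Expected value of the identity function under kappa_0.
    print(integrate(lambda y: y, 0))
```

With finite spaces, measurability in x (condition 1) holds automatically, so the only thing to check is that every row sums to one (condition 2). This is exactly the stochastic transition matrix of a Markov chain when the two spaces coincide.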
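Definition 31 (Composition of probability kernels) Let $\kappa_1$ be a kernel from $\Xi$ to $\Upsilon$, and $\kappa_2$ a kernel from $\Xi \times \Upsilon$ to $\Gamma$. Then we define $\kappa_1 \otimes \kappa_2$ as the kernel from $\Xi$ to $\Upsilon \times \Gamma$ such that

$$(\kappa_1 \otimes \kappa_2)(x, B) = \int \kappa_1(x, dy) \int \kappa_2(x, y, dz)\, \mathbf{1}_B(y, z)$$

for every measurable $B \subseteq \Upsilon \times \Gamma$ (where $z$ ranges over the space $\Gamma$)....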
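In the finite case the two integrals in Definition 31 become nested sums, which makes the composition easy to compute by hand or by machine. The sketch below assumes all three spaces are {0, 1}; the tables K1 and K2 and the function compose are hypothetical names chosen for illustration.

```python
# Hypothetical finite example with Xi = Upsilon = Gamma = {0, 1}.
# K1[x][y] = kappa_1(x, {y}): a kernel from Xi to Upsilon.
K1 = {
    0: {0: 0.5, 1: 0.5},
    1: {0: 0.25, 1: 0.75},
}

# K2[(x, y)][z] = kappa_2(x, y, {z}): a kernel from Xi x Upsilon to Gamma.
K2 = {
    (0, 0): {0: 0.5, 1: 0.5},
    (0, 1): {0: 0.25, 1: 0.75},
    (1, 0): {0: 1.0, 1: 0.0},
    (1, 1): {0: 0.5, 1: 0.5},
}


def compose(x, B):
    """(kappa_1 (x) kappa_2)(x, B) for a set B of (y, z) pairs.

    The outer sum over y plays the role of the integral against
    kappa_1(x, dy); the inner sum over z plays the role of the integral
    against kappa_2(x, y, dz); the membership test is the indicator 1_B.
    """
    return sum(
        p_y * p_z
        for y, p_y in K1[x].items()
        for z, p_z in K2[(x, y)].items()
        if (y, z) in B
    )


if __name__ == "__main__":
    everything = {(y, z) for y in (0, 1) for z in (0, 1)}
    print(compose(0, everything))        # total mass of the composed measure
    print(compose(0, {(1, 1)}))          # mass of the single pair (y, z) = (1, 1)
    print(compose(1, {(0, 0), (1, 0)}))  # mass of the event {z = 0}
```

Read probabilistically, compose(x, B) is the chance that a pair (y, z) lands in B when y is drawn from kappa_1(x, ·) and then z is drawn from kappa_2(x, y, ·), which is how a joint distribution is assembled from a marginal and a regular conditional distribution.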