4703-10-Notes-IS

Copyright © 2010 by Karl Sigman

1 Rare event simulation and importance sampling

Suppose we wish to use Monte Carlo simulation to estimate a probability $p = P(A)$ when the event $A$ is rare (e.g., when $p$ is very small). An example would be $p = P(M_k > b)$ with a very large $b$, where $M_k = \max_{j \le k} R_j$ is the maximum over the first $k$ steps of a random walk. We could naively simulate $n$ (large) iid copies of $A$, denoted by $A_1, A_2, \ldots, A_n$, then set $X_i = I\{A_i\}$ and use the crude estimate
$$p \approx \hat{p}(n) = \frac{1}{n}\sum_{i=1}^{n} X_i. \qquad (1)$$
But this is not a good idea: $\mu \overset{\text{def}}{=} E(X_i) = P(A) = p$ and $\sigma^2 \overset{\text{def}}{=} Var(X_i) = p(1-p)$, and so, since $p$ is assumed very small, the ratio $\sigma/\mu = \sqrt{p(1-p)}/p \approx 1/\sqrt{p} \to \infty$ as $p \to 0$; relative to $\mu$, $\sigma$ is of a much larger magnitude. This is very bad when constructing confidence intervals, $\hat{p}(n) \pm z_{\alpha/2}\,\sigma/\sqrt{n}$: the length of the interval is in units of $\sigma$, so if $\sigma$ is much larger than what we are trying to estimate, $\mu$, then the confidence interval will be far too wide to be of any use. It would be like saying "I am 95% confident that he weighs 140 pounds, plus or minus 500 pounds." To make matters worse, increasing the number $n$ of copies in the Monte Carlo so as to reduce the interval length, while sounding reasonable, could be impractical since $n$ would end up having to be enormous.

Importance sampling is a technique that gets around this problem by changing the probability distributions of the model so as to make the rare event happen often instead of rarely. To understand the basic idea, suppose we wish to compute $E(h(X)) = \int h(x) f(x)\,dx$ for a continuous random variable $X$ distributed with density $f(x)$. For example, if $h(x) = I\{x > b\}$ for a given large $b$, then $h(X) = I\{X > b\}$ and $E(h(X)) = P(X > b)$. Now let $g(x)$ be any other density such that $f(x) = 0$ whenever $g(x) = 0$, and observe that we can rewrite
$$E(h(X)) = \int h(x) f(x)\,dx = \int \Big[\, h(x)\,\frac{f(x)}{g(x)} \Big] g(x)\,dx = \widetilde{E}\Big[\, h(X)\,\frac{f(X)}{g(X)} \Big],$$
where $\widetilde{E}$ denotes expected value when $g$ is used as the distribution of $X$ (as opposed to the original distribution $f$). In other words: if $X$ has distribution $g$, then the expected value of $h(X) f(X)/g(X)$ is the same as what we originally wanted. The ratio $L(X) = f(X)/g(X)$ is called the likelihood ratio. We can write
$$E(h(X)) = \widetilde{E}\big(h(X)\, L(X)\big); \qquad (2)$$
the left-hand side uses distribution $f$ for $X$, while the right-hand side uses distribution $g$ for $X$. To make this work in our favor, we would want to choose $g$ so that the variance of $h(X) L(X)$ (under $g$) is small relative to its mean.

We can easily generalize this idea to multi-dimensions: suppose $h = h(X_1,\ldots,X_k)$ is real-valued, where $(X_1,\ldots,X_k)$ has joint density $f(x_1,\ldots,x_k)$. Then for an alternative joint density $g(x_1,\ldots,x_k)$, we once again can write
$$E\big(h(X_1,\ldots,X_k)\big) = \widetilde{E}\big(h(X_1,\ldots,X_k)\, L(X_1,\ldots,X_k)\big),$$
where $L(X_1,\ldots,X_k) = f(X_1,\ldots,X_k)/g(X_1,\ldots,X_k)$.
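To see the problem with the crude estimate (1) concretely, here is a minimal sketch in Python. The specific setup (a standard normal $X$, threshold $b = 4$, and sample size $n = 10^5$) is my own illustrative choice, not from the notes; it simply makes $p \approx 3 \times 10^{-5}$ so the blow-up of $\sigma/\mu$ is visible.

```python
import numpy as np

# Crude Monte Carlo estimate (1) of p = P(X > b) for a rare event.
# Illustrative assumptions: X ~ N(0,1), b = 4, n = 1e5 (p is about 3.2e-5).
rng = np.random.default_rng(0)
n = 100_000
b = 4.0

X = (rng.standard_normal(n) > b).astype(float)   # X_i = I{A_i}
p_hat = X.mean()                                  # the estimate (1)
sigma_hat = X.std(ddof=1)

# 95% confidence interval: p_hat +/- z_{alpha/2} * sigma / sqrt(n).
half_width = 1.96 * sigma_hat / np.sqrt(n)
print(f"crude estimate: {p_hat:.2e} +/- {half_width:.2e}")

# Since sigma/mu is roughly 1/sqrt(p), the half-width is on the same order as
# p itself (or larger), and many runs of this size observe no successes at all.
```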
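And here is a sketch of the importance-sampling identity (2) for the same toy problem. The choice of the alternative density $g$ as a normal shifted to mean $b$ is an assumption made for illustration (it makes the event $\{X > b\}$ happen about half the time); the notes do not prescribe this particular $g$.

```python
import numpy as np
from scipy.stats import norm

# Importance sampling per (2): draw from g, weight by the likelihood ratio
# L(Y) = f(Y)/g(Y). Illustrative assumptions: f = N(0,1) density,
# h(x) = I{x > b} with b = 4, and g = N(b,1) so the rare region is hit often.
rng = np.random.default_rng(0)
n = 100_000
b = 4.0

Y = rng.normal(loc=b, scale=1.0, size=n)                        # draws from g
L = norm.pdf(Y, loc=0, scale=1) / norm.pdf(Y, loc=b, scale=1)   # L(Y) = f(Y)/g(Y)
W = (Y > b) * L                                                 # h(Y) L(Y)

p_hat = W.mean()                                  # estimates E(h(X)) = P(X > b)
half_width = 1.96 * W.std(ddof=1) / np.sqrt(n)
print(f"importance-sampling estimate: {p_hat:.2e} +/- {half_width:.2e}")
print(f"exact value:                  {norm.sf(b):.2e}")
```

With this shifted $g$, the weights $h(Y)L(Y)$ have a variance that is small relative to their mean, so the confidence interval is narrow in relative terms even though $p$ is tiny.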