Chapter 3 - Monte Carlo Methods with R: Monte Carlo Integration

Monte Carlo Methods with R: Monte Carlo Integration

[1] Chapter 3: Monte Carlo Integration

“Every time I think I know what’s going on, suddenly there’s another layer of complications. I just want this damn thing solved.”
— John Scalzi, The Last Colony

This Chapter
◮ This chapter introduces the major concepts of Monte Carlo methods
◮ The validity of Monte Carlo approximations relies on the Law of Large Numbers
◮ The versatility of the representation of an integral as an expectation

[2] Monte Carlo Integration - Introduction
◮ We will be concerned with evaluating integrals of the form
    ∫_𝒳 h(x) f(x) dx,
  ⊲ f is a density
  ⊲ We can produce an almost infinite number of random variables from f
◮ We apply probabilistic results
  ⊲ Law of Large Numbers
  ⊲ Central Limit Theorem
◮ The alternative: deterministic numerical integration
  ⊲ R functions area and integrate
  ⊲ OK in low (one) dimensions
  ⊲ Usually needs some knowledge of the function

[3] Classical Monte Carlo Integration - The Monte Carlo Method
◮ The generic problem: evaluate
    E_f[h(X)] = ∫_𝒳 h(x) f(x) dx,
  ⊲ X takes its values in the space 𝒳
◮ The Monte Carlo method
  ⊲ Generate a sample (X_1, . . . , X_n) from the density f
  ⊲ Approximate the integral with the empirical average
    h̄_n = (1/n) Σ_{j=1}^n h(x_j)

[4] Classical Monte Carlo Integration - Validating the Monte Carlo Method
◮ The convergence
    h̄_n = (1/n) Σ_{j=1}^n h(x_j) → ∫_𝒳 h(x) f(x) dx = E_f[h(X)]
  ⊲ Is valid by the Strong Law of Large Numbers
◮ When h²(X) has a finite expectation under f,
    (h̄_n − E_f[h(X)]) / √v_n → N(0, 1)
  ⊲ Follows from the Central Limit Theorem
  ⊲ v_n = (1/n²) Σ_{j=1}^n [h(x_j) − h̄_n]²
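As a sketch in R of the estimator h̄_n and the error estimate √v_n above (it uses the chapter's first test function h(x) = [cos(50x) + sin(20x)]² against the Uniform(0,1) density; the seed and sample size are arbitrary choices of this sketch):

```r
# Classical Monte Carlo: approximate the integral of h over [0,1]
# (f = Uniform(0,1) density) by the empirical average h-bar_n,
# together with the running CLT error estimate sqrt(v_n).
h <- function(x) (cos(50 * x) + sin(20 * x))^2
set.seed(1)
n    <- 1e4
hval <- h(runif(n))                                # h(X_1), ..., h(X_n)
estint <- cumsum(hval) / (1:n)                     # running h-bar_n
esterr <- sqrt(cumsum((hval - estint)^2)) / (1:n)  # running sqrt(v_n)
c(estimate = estint[n], std.error = esterr[n])
integrate(h, 0, 1)$value                           # deterministic check
```

Plotting `estint` against the iteration index, with bands `estint ± 2*esterr`, reproduces the convergence-monitoring figure the slides describe.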
[5] Classical Monte Carlo Integration - A First Example
◮ Look at the function
    h(x) = [cos(50x) + sin(20x)]²
◮ Monitoring convergence
◮ R code

[6] Classical Monte Carlo Integration - A Caution
◮ The confidence band produced in this figure is not a 95% confidence band in the classical sense
◮ The bounds are confidence intervals were you to stop at a chosen number of iterations

[7] Classical Monte Carlo Integration - Comments
◮ The evaluation of the Monte Carlo error is a bonus
◮ It assumes that v_n is a proper estimate of the variance of h̄_n
◮ If v_n does not converge, or converges too slowly, a CLT may not apply

[8] Classical Monte Carlo Integration - Another Example
◮ Normal probability:
    Φ̂(t) = (1/n) Σ_{i=1}^n I(x_i ≤ t) → Φ(t) = ∫_{−∞}^t (1/√(2π)) e^{−y²/2} dy
  ⊲ The exact variance is Φ(t)[1 − Φ(t)]/n
  ⊲ Conservative bound: Var ≈ 1/(4n)
  ⊲ For a precision of four decimals
  ⊲ Want 2 × √(1/(4n)) ≤ 10⁻⁴
  ⊲ Take n = (10⁴)² = 10⁸ simulations
◮ This method breaks down for tail probabilities

[9–10] Importance Sampling - Introduction
◮ Importance sampling is based on an alternative formulation of the SLLN:
    E_f[h(X)] = ∫_𝒳 h(x) [f(x)/g(x)] g(x) dx = E_g[h(X) f(X)/g(X)]
  ⊲ f is the target density
  ⊲ g is the candidate density
  ⊲ Sound familiar?
  ⊲ Just like Accept–Reject
◮ So
    (1/n) Σ_{j=1}^n [f(X_j)/g(X_j)] h(X_j) → E_f[h(X)]
◮ As long as
  ⊲ Var(h(X) f(X)/g(X)) < ∞
  ⊲ supp(g) ⊃ supp(h × f)

[11] Importance Sampling - Revisiting Normal Tail Probabilities
◮ Z ~ N(0, 1) and we are interested in the probability P(Z > 4.5)
◮ > pnorm(-4.5, log = TRUE)
  [1] -12.59242
◮ Simulating Z_i ~ N(0, 1) only produces a hit once in about 300,000 iterations!
  ⊲ A very rare event for the normal
  ⊲ Not so rare for a distribution sitting out there!
◮ Take g = Exp(1) truncated at 4.5:
    g(y) = e^{−y} / ∫_{4.5}^∞ e^{−x} dx = e^{−(y−4.5)},  y > 4.5
◮ The IS estimator is
    (1/n) Σ_{i=1}^n f(Y_i)/g(Y_i) = (1/n) Σ_{i=1}^n e^{−Y_i²/2 + Y_i − 4.5} / √(2π)
◮ R code

[12] Importance Sampling - Normal Tail Variables
◮ The importance sampler does not give us a sample ⇒ can use Accept–Reject
◮ Sample Z ~ N(0, 1), Z > a ⇒ use an exponential candidate:
    [(1/√(2π)) exp(−x²/2)] / exp(−(x − a)) = (1/√(2π)) exp(−x²/2 + x − a)
                                           ≤ (1/√(2π)) exp(−a*²/2 + a* − a)
  ⊲ Where a* = max{a, 1}
◮ Normals > 20
◮ The Twilight Zone
◮ R code

[13] Importance Sampling - Comments
◮ Importance sampling has little restriction on the choice of the candidate
◮ g can be chosen from distributions that are easy to simulate
  ⊲ Or efficient in the approximation of the integral
◮ Moreover, the same sample (generated from g) can be used repeatedly
  ⊲ Not only for different functions h but also for different densities f
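The normal-tail importance sampler of slide [11] can be sketched in a few lines of R (the seed and sample size are arbitrary; the weight formula is exactly the displayed ratio f(Y)/g(Y)):

```r
# Importance sampling for P(Z > 4.5), Z ~ N(0,1), with the Exp(1)
# candidate truncated at 4.5: g(y) = exp(-(y - 4.5)) for y > 4.5.
set.seed(1)
n <- 1e4
y <- rexp(n) + 4.5                            # draws from g
w <- exp(-y^2 / 2 + y - 4.5) / sqrt(2 * pi)   # importance weights f(y)/g(y)
c(is.estimate = mean(w), exact = pnorm(-4.5))
```

With only 10^4 candidate draws the estimate already agrees with `pnorm(-4.5)` to a few percent, whereas plain Monte Carlo with the same n would almost surely return 0.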
[14] Importance Sampling - Easy Model, Difficult Distribution
Example: Beta posterior importance approximation
◮ We have an observation x from a beta B(α, β) distribution,
    x ~ [Γ(α + β) / (Γ(α)Γ(β))] x^{α−1} (1 − x)^{β−1} I_{[0,1]}(x)
◮ There exists a family of conjugate priors on (α, β) of the form
    π(α, β) ∝ [Γ(α + β) / (Γ(α)Γ(β))]^λ x0^α y0^β,
  where λ, x0, y0 are hyperparameters
◮ The posterior is then equal to
    π(α, β|x) ∝ [Γ(α + β) / (Γ(α)Γ(β))]^{λ+1} [x x0]^α [(1 − x) y0]^β

[15] Importance Sampling - Easy Model, Difficult Distribution (2)
◮ The posterior distribution is intractable:
    π(α, β|x) ∝ [Γ(α + β) / (Γ(α)Γ(β))]^{λ+1} [x x0]^α [(1 − x) y0]^β
  ⊲ Difficult to deal with the gamma functions
  ⊲ Simulating directly from π(α, β|x) is impossible
◮ What candidate to use?
◮ Contour plot
◮ Suggest a candidate?
◮ R code

[16] Importance Sampling - Easy Model, Difficult Distribution (3)
◮ Try a bivariate Student's t (or normal)
◮ Trial and error
  ⊲ A Student's T(3, μ, Σ) distribution with μ = (50, 45) and
      Σ = ( 220  190 )
          ( 190  180 )
  ⊲ produces a reasonable fit
  ⊲ R code
◮ Note that we are using the fact that if X ~ f(x′x), then
    Σ^{1/2} X + μ ~ f((x − μ)′ Σ^{−1} (x − μ))

[17] Importance Sampling - Easy Model, Difficult Distribution: Posterior Means
◮ The posterior mean of α is
    ∫∫ α π(α, β|x) dα dβ = ∫∫ α [π(α, β|x)/g(α, β)] g(α, β) dα dβ
                         ≈ (1/M) Σ_{i=1}^M α_i π(α_i, β_i|x)/g(α_i, β_i),
  where
  ⊲ π(α, β|x) ∝ [Γ(α + β)/(Γ(α)Γ(β))]^{λ+1} [x x0]^α [(1 − x) y0]^β
  ⊲ g(α, β) = T(3, μ, Σ)
◮ Note that π(α, β|x) is not normalized, so we have to calculate the ratio
    ∫∫ α π(α, β|x) dα dβ / ∫∫ π(α, β|x) dα dβ
      ≈ [Σ_{i=1}^M α_i π(α_i, β_i|x)/g(α_i, β_i)] / [Σ_{i=1}^M π(α_i, β_i|x)/g(α_i, β_i)]
◮ The same samples can be used for every posterior expectation
◮ R code

[18] Importance Sampling - Probit Analysis
Example: Probit
posterior importance sampling approximation
◮ The y_i are binary variables, and we have covariates x ∈ R^p such that
    Pr(y = 1|x) = 1 − Pr(y = 0|x) = Φ(x^T β),  β ∈ R^p
◮ We return to the dataset Pima.tr, with x = BMI
◮ A GLM estimation of the model is (using centered x):
    > glm(formula = y ~ x, family = binomial(link = "probit"))
    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept) -0.44957    0.09497  -4.734 2.20e-06 ***
    x            0.06479    0.01615   4.011 6.05e-05 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  So BMI has a significant impact on the possible presence of diabetes.

[19] Importance Sampling - Bayesian Probit Analysis
◮ From a Bayesian perspective, we use a vague prior
  ⊲ β = (β1, β2), each component having a N(0, 100) distribution
◮ With Φ the normal cdf, the posterior is proportional to
    Π_{i=1}^n [Φ(β1 + (x_i − x̄)β2)]^{y_i} [Φ(−β1 − (x_i − x̄)β2)]^{1−y_i} × e^{−(β1² + β2²)/(2·100)}
◮ Level curves of the posterior
◮ MLE in the center
◮ R code

[20] Importance Sampling - Probit Analysis Importance Weights
◮ Normal candidate centered at the MLE - no finite variance guarantee
◮ The importance weights are rather uneven, if not degenerate
◮ Right side = reweighted candidate sample (R code)
◮ Somewhat of a failure ...
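A sketch of the experiment behind slides [18]–[20]. To stay self-contained it simulates synthetic data in place of Pima.tr (true coefficients chosen near the GLM fit above), and it uses independent normal proposals with the MLE standard errors rather than the full covariance; both are simplifying assumptions of this sketch, not the slides' exact setup. The effective sample size at the end is one way to see the uneven weights the slides warn about.

```r
# Probit posterior importance sampling on synthetic data standing in
# for Pima.tr (x = a centered covariate playing the role of BMI).
set.seed(2)
n  <- 200
x  <- rnorm(n, 0, 6)                      # centered covariate
b0 <- c(-0.45, 0.065)                     # truth, near the GLM estimates
y  <- rbinom(n, 1, pnorm(b0[1] + b0[2] * x))
fit <- glm(y ~ x, family = binomial(link = "probit"))
mle <- coef(fit)
se  <- sqrt(diag(vcov(fit)))

# log posterior: probit likelihood times N(0, 100) priors on (beta1, beta2)
logpost <- function(b1, b2) {
  p <- pnorm(b1 + b2 * x)
  sum(dbinom(y, 1, p, log = TRUE)) - (b1^2 + b2^2) / 200
}

# Normal candidate centered at the MLE (independent components here)
M    <- 1e3
bs1  <- rnorm(M, mle[1], se[1])
bs2  <- rnorm(M, mle[2], se[2])
logg <- dnorm(bs1, mle[1], se[1], log = TRUE) +
        dnorm(bs2, mle[2], se[2], log = TRUE)
logw <- mapply(logpost, bs1, bs2) - logg
w    <- exp(logw - max(logw))             # stabilize before normalizing
w    <- w / sum(w)
ess  <- 1 / sum(w^2)                      # effective sample size out of M
ess
```

An `ess` well below M signals the weight degeneracy described on slide [20].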
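Returning to the beta-posterior example of slides [14]–[17], a sketch of the self-normalized importance sampler. The hyperparameter values x = 0.6, λ = 1, x0 = y0 = 0.5 are illustrative choices not fixed by the slides, and the T(3, μ, Σ) draws are generated by hand so that no extra package is needed.

```r
# Self-normalized importance sampling for E[alpha | x] under the
# intractable beta posterior, with the slides' T(3, mu, Sigma) candidate.
set.seed(3)
x <- 0.6; lambda <- 1; x0 <- 0.5; y0 <- 0.5     # illustrative values
logpost <- function(a, b)                        # log unnormalized posterior
  (lambda + 1) * (lgamma(a + b) - lgamma(a) - lgamma(b)) +
    a * log(x * x0) + b * log((1 - x) * y0)

mu    <- c(50, 45)
Sigma <- matrix(c(220, 190, 190, 180), 2, 2)
L     <- t(chol(Sigma))                          # Sigma = L %*% t(L)
M     <- 1e4
z     <- matrix(rnorm(2 * M), 2, M)
s     <- sqrt(rchisq(M, df = 3) / 3)
th    <- mu + sweep(L %*% z, 2, s, "/")          # draws from T(3, mu, Sigma)
a <- th[1, ]; b <- th[2, ]

# log candidate density up to a constant (constants cancel in the ratios)
d    <- colSums(forwardsolve(L, sweep(th, 1, mu))^2)   # Mahalanobis^2
logg <- -((3 + 2) / 2) * log(1 + d / 3)

w  <- numeric(M)
ok <- a > 0 & b > 0                              # posterior support
w[ok] <- exp(logpost(a[ok], b[ok]) - logg[ok])
est <- sum(w * a) / sum(w)                       # posterior mean of alpha
est
```

Replacing `a` by `b`, or by any other function of (α, β), reuses the same weighted sample, which is the point made on slide [17].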

This note was uploaded on 01/29/2012 for the course STAT 6866 taught by Professor Womack during the Fall '11 term at University of Florida.
