

From our observed data we want to obtain $p(\theta, \Sigma, Y_{\text{miss}} \mid Y_{\text{obs}})$, the posterior distribution of unknown and unobserved quantities. A Gibbs sampling scheme for approximating this posterior distribution can be constructed by simply adding one step to the Gibbs sampler presented in the previous section: Given starting values $\{\Sigma^{(0)}, Y_{\text{miss}}^{(0)}\}$, we generate $\{\theta^{(s+1)}, \Sigma^{(s+1)}, Y_{\text{miss}}^{(s+1)}\}$ from $\{\theta^{(s)}, \Sigma^{(s)}, Y_{\text{miss}}^{(s)}\}$ by

1. sampling $\theta^{(s+1)}$ from $p(\theta \mid Y_{\text{obs}}, Y_{\text{miss}}^{(s)}, \Sigma^{(s)})$;
2. sampling $\Sigma^{(s+1)}$ from $p(\Sigma \mid Y_{\text{obs}}, Y_{\text{miss}}^{(s)}, \theta^{(s+1)})$;
3. sampling $Y_{\text{miss}}^{(s+1)}$ from $p(Y_{\text{miss}} \mid Y_{\text{obs}}, \theta^{(s+1)}, \Sigma^{(s+1)})$.

Note that in steps 1 and 2, the fixed value of $Y_{\text{obs}}$ combines with the current value of $Y_{\text{miss}}^{(s)}$ to form a current version of a complete data matrix $Y^{(s)}$ having no missing values. The $n$ rows of the matrix $Y^{(s)}$ can then be plugged into formulae 7.6 and 7.9 to obtain the full conditional distributions of $\theta$ and $\Sigma$. Step 3 is a bit more complicated:

$$p(Y_{\text{miss}} \mid Y_{\text{obs}}, \theta, \Sigma) \propto p(Y_{\text{miss}}, Y_{\text{obs}} \mid \theta, \Sigma) = \prod_{i=1}^{n} p(y_{i,\text{miss}}, y_{i,\text{obs}} \mid \theta, \Sigma) \propto \prod_{i=1}^{n} p(y_{i,\text{miss}} \mid y_{i,\text{obs}}, \theta, \Sigma),$$

so for each $i$ we need to sample the missing elements of the data vector conditional on the observed elements. This is made possible via the following result about multivariate normal distributions: Let $y \sim \text{multivariate normal}(\theta, \Sigma)$, let $a$ be a subset of variable indices $\{1, \ldots, p\}$ and let $b$ be the complement of $a$. For example, if $p = 4$ then perhaps $a = \{1, 2\}$ and $b = \{3, 4\}$. If you know about inverses of partitioned matrices you can show that

$$\{y_{[b]} \mid y_{[a]}, \theta, \Sigma\} \sim \text{multivariate normal}(\theta_{b|a}, \Sigma_{b|a}),$$

where

$$\theta_{b|a} = \theta_{[b]} + \Sigma_{[b,a]} (\Sigma_{[a,a]})^{-1} (y_{[a]} - \theta_{[a]}) \tag{7.10}$$

$$\Sigma_{b|a} = \Sigma_{[b,b]} - \Sigma_{[b,a]} (\Sigma_{[a,a]})^{-1} \Sigma_{[a,b]}. \tag{7.11}$$

In the above formulae, $\theta_{[b]}$ refers to the elements of $\theta$ corresponding to the indices in $b$, and $\Sigma_{[a,b]}$ refers to the matrix made up of the elements that are in rows $a$ and columns $b$ of $\Sigma$.

Let's try to gain a little bit of intuition about what is going on in Equations 7.10 and 7.11. Suppose $y$ is a sample from our population of four variables glu, bp, skin and bmi. If we have glu and bp data for someone ($a = \{1, 2\}$) but are missing skin and bmi measurements ($b = \{3, 4\}$), then we would be interested in the conditional distribution of these missing measurements $y_{[b]}$ given the observed information $y_{[a]}$. Equation 7.10 says that the conditional means of skin and bmi start off at their unconditional means $\theta_{[b]}$, but then are modified by $(y_{[a]} - \theta_{[a]})$. For example, if a person had higher than average values of glu and bp, then $(y_{[a]} - \theta_{[a]})$ would be a $2 \times 1$ vector of positive numbers.
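
Equations 7.10 and 7.11 translate fairly directly into code. The following is a minimal sketch in Python with NumPy, not taken from the text: the function name `conditional_mvn` is hypothetical, and the values of `theta`, `Sigma` and `y_a` are made-up placeholders (equal correlations of 0.3), not estimates from the glu/bp/skin/bmi data. Python indices are 0-based, so $a = \{1, 2\}$ becomes `[0, 1]`.

```python
import numpy as np

def conditional_mvn(theta, Sigma, a, y_a):
    """Return (theta_b|a, Sigma_b|a, b) following Equations 7.10 and 7.11.

    a   : 0-based indices of the observed variables
    y_a : observed values y[a]
    """
    theta = np.asarray(theta, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    a = np.asarray(a, dtype=int)
    b = np.setdiff1d(np.arange(theta.size), a)  # complement of a

    S_aa = Sigma[np.ix_(a, a)]
    S_ba = Sigma[np.ix_(b, a)]
    S_bb = Sigma[np.ix_(b, b)]

    # B = Sigma[b,a] (Sigma[a,a])^{-1}, computed with a linear solve
    # rather than an explicit inverse (Sigma is symmetric).
    B = np.linalg.solve(S_aa, S_ba.T).T

    theta_cond = theta[b] + B @ (np.asarray(y_a, dtype=float) - theta[a])  # (7.10)
    Sigma_cond = S_bb - B @ S_ba.T                                         # (7.11)
    return theta_cond, Sigma_cond, b

# Placeholder parameters (assumed for illustration only).
theta = np.array([120.0, 70.0, 30.0, 32.0])   # glu, bp, skin, bmi
sd = np.array([30.0, 12.0, 10.0, 6.0])
R = np.full((4, 4), 0.3)
np.fill_diagonal(R, 1.0)
Sigma = np.outer(sd, sd) * R

a = [0, 1]                      # glu and bp observed
y_a = np.array([150.0, 80.0])   # both above their means
theta_cond, Sigma_cond, b = conditional_mvn(theta, Sigma, a, y_a)
print(theta_cond)   # conditional means of skin and bmi
print(Sigma_cond)   # conditional covariance of skin and bmi
```

With these positively correlated placeholder values, the conditional means come out above `theta[b]` and the conditional variances below the corresponding diagonal entries of `Sigma`.
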
For our data the $2 \times 2$ matrix $\Sigma_{[b,a]} (\Sigma_{[a,a]})^{-1}$ has all positive entries, and so $\theta_{b|a} > \theta_{[b]}$. This makes sense: If all four variables are positively correlated, then if we observe higher than average values of glu and bp, we should also expect higher than average values of skin and bmi. Also note that $\Sigma_{b|a}$ is equal to the unconditional variance $\Sigma_{[b,b]}$ but with something subtracted off, suggesting that the conditional variance is less than the unconditional variance. Again, this makes sense: having information about some variables should decrease, or at least not increase, our uncertainty about the others.
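
Step 3 of the sampler described earlier, redrawing the missing entries of each row from their conditional normal distribution, might be sketched as follows. This is an illustration under assumptions rather than the text's own implementation: the function name `impute_step`, the boolean mask `observed`, the toy data matrix and the fixed values of `theta` and `Sigma` are all hypothetical. In a full sampler, `theta` and `Sigma` would be the draws from steps 1 and 2, which are not shown here.

```python
import numpy as np

def impute_step(Y, observed, theta, Sigma, rng):
    """One pass of step 3: redraw the missing entries of every row of Y.

    Y        : current complete data matrix (n x p), no NaNs
    observed : boolean mask (n x p), True where the value was observed
    """
    Y_new = np.array(Y, dtype=float, copy=True)
    observed = np.asarray(observed, dtype=bool)
    for i in range(Y_new.shape[0]):
        a = np.flatnonzero(observed[i])     # observed indices for row i
        b = np.flatnonzero(~observed[i])    # missing indices for row i
        if b.size == 0:
            continue                        # nothing to impute
        if a.size == 0:
            # Nothing observed: draw the whole row unconditionally.
            Y_new[i] = rng.multivariate_normal(theta, Sigma)
            continue
        S_aa = Sigma[np.ix_(a, a)]
        S_ba = Sigma[np.ix_(b, a)]
        B = np.linalg.solve(S_aa, S_ba.T).T               # Sigma[b,a] Sigma[a,a]^{-1}
        mean = theta[b] + B @ (Y_new[i, a] - theta[a])    # Equation 7.10
        cov = Sigma[np.ix_(b, b)] - B @ S_ba.T            # Equation 7.11
        cov = (cov + cov.T) / 2                           # guard against round-off asymmetry
        Y_new[i, b] = rng.multivariate_normal(mean, cov)
    return Y_new

# Toy data (made up); NaN marks a missing measurement.
Y = np.array([[150.0, 80.0, 35.0, 34.0],
              [110.0, np.nan, 25.0, np.nan],
              [np.nan, np.nan, 30.0, 31.0],
              [130.0, 72.0, np.nan, np.nan]])
observed = ~np.isnan(Y)

# Starting values for Y_miss: column means of the observed entries.
Y_start = np.where(observed, Y, np.nanmean(Y, axis=0))

# Placeholder parameter values standing in for draws from steps 1 and 2.
theta = np.array([120.0, 70.0, 30.0, 32.0])
sd = np.array([30.0, 12.0, 10.0, 6.0])
R = np.full((4, 4), 0.3)
np.fill_diagonal(R, 1.0)
Sigma = np.outer(sd, sd) * R

rng = np.random.default_rng(0)
Y_next = impute_step(Y_start, observed, theta, Sigma, rng)
```

Keeping the `observed` mask separate from the data matrix means the observed entries are never overwritten, so the same mask can be reused at every iteration while only the imputed entries change.
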