note10_regression

# Posterior predictive simulation analytic form of the


…knowledge of $\beta$ and $\sigma^2$ is summarized by our posterior distribution.

• Posterior predictive simulation: first draw $(\beta, \sigma^2)$ from their joint posterior distribution, then draw $\tilde{y} \sim N(\tilde{X}\beta, \sigma^2 I)$.
• Analytic form of the posterior predictive distribution: $p(\tilde{y} \mid y)$ is multivariate $t$ with location $\tilde{X}\hat{\beta}$, squared scale matrix $s^2\,(I + \tilde{X}(X^T X)^{-1}\tilde{X}^T)$, and $n - k$ degrees of freedom.

## Model checking and robustness

• Suppose one simulates many samples $\tilde{y}_1, \ldots, \tilde{y}_n$ from the posterior predictive distribution, conditional on the same covariate vectors $x_1, \ldots, x_n$ used to simulate the data.
• To judge whether a particular response value $y_i$ is consistent with the fitted model, one looks at the position of $y_i$ relative to the histogram of simulated values of $\tilde{y}_i$ from the corresponding predictive distribution.
• If $y_i$ is in the tail of the distribution, that indicates that this observation is a potential outlier.

MATH-440 Linear Regression

## Example

Measurements on breeding pairs of land-bird species were collected from 16 islands around Britain over the course of several decades. The dataset birdextinct.txt contains the following variables for each species:

• TIME: the average time of extinction of the species on the island where it appeared
• NESTING: the average number of nesting pairs
• SIZE: the size of the species (0=small or 1=large)
• STATUS: the migratory status of the species (0=migrant or 1=resident)

The objective is to fit a model that relates the time of extinction of the bird species to the covariates.

    setwd("H:/Math440")
    # read the tab-delimited data file; the first row holds the column names
    bird = read.table("birdextinct.txt", header=TRUE, sep="\t")
    attach(bird)
    hist(TIME)

The distribution of the outcome variable, TIME, is strongly right-skewed.
Let's transform it to the log scale:

    LOGTIME = log(TIME)
    hist(LOGTIME)

[Figure: Histogram of TIME (x-axis: TIME, y-axis: Frequency)]

[Figure: Histogram of LOGTIME (x-axis: LOGTIME, y-axis: Frequency)]

Let us look at the relationship between LOGTIME and the three predictor variables.

    plot(NESTING, LOGTIME)
    # label the species whose log extinction time exceeds 3
    out = (LOGTIME > 3)
    text(NESTING[out], LOGTIME[out], labels=SPECIES[out], pos=2)
    plot(ji...
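
Returning to the posterior predictive simulation described at the start of these notes, the two-step draw can be written out explicitly. The sketch below assumes the standard noninformative prior $p(\beta, \sigma^2) \propto \sigma^{-2}$. That prior is not stated in this preview; it is an assumption, chosen because it is consistent with the $n - k$ degrees of freedom and the scale matrix quoted for the analytic form. Here $\hat{\beta}$ and $s^2$ are the usual least-squares estimate and residual variance, $n$ is the number of observations, and $k$ is the number of columns of $X$.

```latex
\begin{align*}
\hat{\beta} &= (X^T X)^{-1} X^T y, \qquad
s^2 = \frac{1}{n-k}\,(y - X\hat{\beta})^T (y - X\hat{\beta}), \\
\sigma^2 \mid y &\sim \text{Inv-}\chi^2\!\left(n-k,\; s^2\right), \\
\beta \mid \sigma^2, y &\sim N\!\left(\hat{\beta},\; \sigma^2 (X^T X)^{-1}\right), \\
\tilde{y} \mid \beta, \sigma^2 &\sim N\!\left(\tilde{X}\beta,\; \sigma^2 I\right).
\end{align*}
```

Under this assumption, drawing $\sigma^2$, then $\beta$, then $\tilde{y}$ in that order yields one draw from $p(\tilde{y} \mid y)$. Integrating out $\beta$ and $\sigma^2$ gives the multivariate $t$ form with location $\tilde{X}\hat{\beta}$, squared scale matrix $s^2\,(I + \tilde{X}(X^T X)^{-1}\tilde{X}^T)$, and $n - k$ degrees of freedom.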