ledge of $\beta$ and $\sigma^2$ is summarized by our posterior distribution.
• Posterior predictive simulation: First draw $(\beta, \sigma^2)$ from their joint posterior distribution, then draw $\tilde{y} \sim N(\tilde{X}\beta, \sigma^2 I)$.
• Analytic form of the posterior predictive distribution: $p(\tilde{y} \mid y)$ is multivariate $t$ with location $\tilde{X}\hat{\beta}$, square scale matrix $s^2 (I + \tilde{X} (X^T X)^{-1} \tilde{X}^T)$, and $n - k$ degrees of freedom.
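
The two-step simulation described above can be sketched in R. This is a minimal illustration, not the course's own code: it assumes the standard noninformative prior $p(\beta, \sigma^2) \propto 1/\sigma^2$, under which $\sigma^2$ given $y$ is scaled inverse chi-square with $n - k$ degrees of freedom and scale $s^2$, and $\beta$ given $\sigma^2$ and $y$ is normal with mean $\hat{\beta}$ and covariance $\sigma^2 (X^T X)^{-1}$. The data and all variable names (`X_new`, `n_sims`, `y_new`) are hypothetical.

```r
# Sketch: posterior predictive simulation for normal linear regression,
# ASSUMING the noninformative prior p(beta, sigma^2) proportional to 1/sigma^2.
set.seed(1)

# Synthetic stand-in data (hypothetical, for illustration only)
n <- 50
X <- cbind(1, rnorm(n))                  # n x k design matrix
k <- ncol(X)
y <- drop(X %*% c(1, 2)) + rnorm(n)

# Quantities the posterior depends on
XtX_inv  <- solve(crossprod(X))                  # (X^T X)^{-1}
beta_hat <- drop(XtX_inv %*% crossprod(X, y))    # least-squares estimate
s2       <- sum((y - drop(X %*% beta_hat))^2) / (n - k)

X_new  <- cbind(1, c(-1, 0, 1))          # covariates of the new observations
n_sims <- 2000
y_new  <- matrix(NA_real_, n_sims, nrow(X_new))

R <- chol(XtX_inv)                       # upper triangular, t(R) %*% R = (X^T X)^{-1}
for (j in seq_len(n_sims)) {
  # Step 1: draw (beta, sigma^2) from the joint posterior
  sigma2 <- (n - k) * s2 / rchisq(1, df = n - k)
  beta   <- beta_hat + sqrt(sigma2) * drop(crossprod(R, rnorm(k)))
  # Step 2: draw the new responses given that draw
  y_new[j, ] <- drop(X_new %*% beta) + sqrt(sigma2) * rnorm(nrow(X_new))
}
```

Each row of `y_new` is one draw of $\tilde{y}$. As a rough sanity check, `colMeans(y_new)` should be close to `drop(X_new %*% beta_hat)`, the location of the analytic multivariate $t$ form.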

Model checking and robustness
• Suppose one simulates many samples $\tilde{y}_1, \ldots, \tilde{y}_n$ from the posterior predictive distribution conditional on the same covariate vectors $x_1, \ldots, x_n$ used to simulate the data.
• To judge if a particular response value $y_i$ is consistent with the fitted model, one looks at the position of $y_i$ relative to the histogram of simulated values of $\tilde{y}_i$ from the corresponding predictive distribution.
• If $y_i$ is in the tail of the distribution, that indicates that this observation is a potential outlier.

MATH440 Linear Regression

Example
Measurements on breeding pairs of landbird species were collected
from 16 islands around Britain over the course of several decades.
The dataset birdextinct.txt contains the following variables for
each species:
• TIME: the average time of extinction of the species on the island where it appeared
• NESTING: the average number of nesting pairs
• SIZE: the size of the species (0=small or 1=large)
• STATUS: the migratory status of the species (0=migrant or 1=resident)

The objective is to fit a model that relates the time of extinction of
the bird species to the covariates.
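
Once such a model is fitted, the tail check described under "Model checking and robustness" could be carried out roughly as follows. This is a sketch on synthetic stand-in data rather than the bird data, again assuming the noninformative prior $p(\beta, \sigma^2) \propto 1/\sigma^2$. The planted unusual response, the 0.025 cutoff, and all variable names (`y_rep`, `tail_prob`, `flagged`) are illustrative choices, not part of the original notes.

```r
# Sketch: flag observations that fall in the tails of their own
# posterior predictive distribution (synthetic data, hypothetical names).
set.seed(2)
n <- 40
x <- runif(n, 0, 10)
X <- cbind(1, x)
k <- ncol(X)
y <- 1 + 0.3 * x + rnorm(n, sd = 0.5)
y[5] <- y[5] + 5                         # plant one unusual response

XtX_inv  <- solve(crossprod(X))
beta_hat <- drop(XtX_inv %*% crossprod(X, y))
s2       <- sum((y - drop(X %*% beta_hat))^2) / (n - k)

# Simulate replicated responses at the SAME covariate vectors x_1, ..., x_n
n_sims <- 4000
R <- chol(XtX_inv)
y_rep <- matrix(NA_real_, n_sims, n)
for (j in seq_len(n_sims)) {
  sigma2 <- (n - k) * s2 / rchisq(1, df = n - k)
  beta   <- beta_hat + sqrt(sigma2) * drop(crossprod(R, rnorm(k)))
  y_rep[j, ] <- drop(X %*% beta) + sqrt(sigma2) * rnorm(n)
}

# Position of each observed y_i within the simulated values of y~_i
p_lower   <- colMeans(sweep(y_rep, 2, y, "<="))
tail_prob <- pmin(p_lower, 1 - p_lower)
flagged   <- which(tail_prob < 0.025)    # illustrative cutoff
```

Calling `hist(y_rep[, i])` and marking `y[i]` with `abline(v = y[i])` gives the per-observation picture the slide describes; `flagged` simply lists the indices whose observed value sits far in either tail.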

setwd("H:/Math440")
bird = read.table("birdextinct.txt", header=T, sep="\t")
attach(bird)
hist(TIME)

The distribution of the outcome variable, TIME, is strongly
right-skewed. Let's transform it to the log scale:

LOGTIME = log(TIME)
hist(LOGTIME)

[Figure: Histogram of TIME (x-axis: TIME, y-axis: Frequency)]

[Figure: Histogram of LOGTIME (x-axis: LOGTIME, y-axis: Frequency)]

Let us look at the relationship between LOGTIME and the three
predictor variables.

plot(NESTING, LOGTIME)
out = (LOGTIME > 3)
text(NESTING[out], LOGTIME[out], labels=SPECIES[out], pos=2)
plot(ji...
This note was uploaded on 02/22/2013 for the course MATH 440 taught by Professor Tadesse during the Spring '13 term at Georgetown.