Linear Regression
Response or dependent variable: Y
Predictors or independent variables: $X_1, X_2, \ldots, X_p$
GOALS:
Exploring $p(y \mid x)$ as a function of $x$
Understanding the mean of Y as a function of x
Mercury in Bass
High levels of mercury in fish known to cause health
problems, especially in children. Nicholas School study of
mercury concentrations in largemouth bass from the
Wacamaw and Lumber rive
Bayesian approach to simple regression
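Model: $Y_i \overset{ind}{\sim} N(\beta_0 + \beta_1 x_i, \sigma^2)$, where $i = 1, \ldots, n$
Semi-conjugate priors for $(\beta_0, \beta_1, \sigma^2)$:
$(\beta_0, \beta_1) \sim MVN_2((\mu_0, \mu_1), \Sigma_0)$
$\Sigma_0$ is a $2 \times 2$ matrix of variances and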
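As a minimal sketch of how a bivariate normal prior on the intercept and slope combines with the normal likelihood, the block below computes the full conditional distribution of the coefficients given the error variance. It uses the standard result that this conditional is again multivariate normal. The data are synthetic, and the prior mean `mu0`, prior covariance `Sigma0`, and the helper `beta_full_conditional` are hypothetical names chosen for illustration; none of them come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration, not the bass data).
n = 50
x = rng.uniform(0.0, 10.0, size=n)
sigma2 = 0.25                          # error variance, treated as known here
X = np.column_stack([np.ones(n), x])   # design matrix: intercept and slope
y = X @ np.array([1.0, 0.5]) + rng.normal(0.0, np.sqrt(sigma2), size=n)

# Hypothetical diffuse prior on (intercept, slope).
mu0 = np.zeros(2)
Sigma0 = np.diag([100.0, 100.0])


def beta_full_conditional(X, y, sigma2, mu0, Sigma0):
    """Mean and covariance of the coefficients given y and sigma2."""
    prior_precision = np.linalg.inv(Sigma0)
    post_precision = prior_precision + X.T @ X / sigma2
    Sigma_n = np.linalg.inv(post_precision)
    mu_n = Sigma_n @ (prior_precision @ mu0 + X.T @ y / sigma2)
    return mu_n, Sigma_n


mu_n, Sigma_n = beta_full_conditional(X, y, sigma2, mu0, Sigma0)

# One draw of (intercept, slope) from the full conditional.
beta_draw = rng.multivariate_normal(mu_n, Sigma_n)
```

With a prior this diffuse, `mu_n` should sit very close to the least-squares estimate; a tighter `Sigma0` would pull it toward `mu0`.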
Multivariate Normal Distribution
Our next major topics include hierarchical models and
regression (Chapter 8 and 9 in Hoff)
For Bayesian regression, we need the multivariate
normal (MVN) distribution
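A short sketch of drawing from a multivariate normal distribution with NumPy and checking the draws against the parameters. The mean vector and covariance matrix below are hypothetical values picked for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical parameters of a bivariate normal distribution.
mean = np.array([0.0, 1.0])
cov = np.array([[1.0, 0.6],
                [0.6, 2.0]])

# Each row of `draws` is one sample from MVN(mean, cov).
draws = rng.multivariate_normal(mean, cov, size=20000)

# Empirical summaries should be close to the parameters above.
sample_mean = draws.mean(axis=0)
sample_cov = np.cov(draws, rowvar=False)
```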
Bayesian Finite Population Inference
Population of N individuals.
Take a random sample (without replacement) of n
individuals.
Let $y_i$ be the value of a survey variable Y for individual
i, where i = 1
Modeling Named Storms in Atlantic
World Meteorological Organization names tropical
storms and hurricanes
Has distribution of storms changed over time? If so,
when?
Data collected on number of named sto
Convergence to Posterior Distribution
Theory proves that if a Gibbs sampler iterates enough, the
draws will be from the joint posterior distribution (called the
target or stationary distribution). Con
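To illustrate the idea, here is a toy Gibbs sampler whose target is a standard bivariate normal with correlation `rho`, a case where the full conditionals are known exactly. The target, the deliberately poor starting point, and the burn-in length of 500 are all assumptions made for this sketch, not part of the slides.

```python
import numpy as np


def gibbs_bivariate_normal(rho, n_iter, start, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = np.random.default_rng(seed)
    cond_sd = np.sqrt(1.0 - rho ** 2)   # sd of each full conditional
    draws = np.empty((n_iter, 2))
    x, y = start
    for t in range(n_iter):
        x = rng.normal(rho * y, cond_sd)   # draw x given the current y
        y = rng.normal(rho * x, cond_sd)   # draw y given the new x
        draws[t] = (x, y)
    return draws


# Start far from the bulk of the target, then discard early iterations.
draws = gibbs_bivariate_normal(rho=0.8, n_iter=20000, start=(10.0, 10.0))
kept = draws[500:]

post_mean = kept.mean(axis=0)
post_corr = np.corrcoef(kept, rowvar=False)[0, 1]
```

After the early iterations are dropped, the sample mean and correlation of `kept` should be near the target values of 0 and `rho`, even though the chain started at (10, 10).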
Monte Carlo Sampling
We have seen that Monte Carlo sampling is a useful
tool for sampling from prior and posterior distributions
By limiting attention to conjugate prior distributions, all
models have
The Pygmalion Study
Do teachers' expectations impact academic development of
children?
Researchers gave IQ test to elementary school children
They randomly picked six children and told teachers
that th
Posterior Integration
Suppose we want posterior distribution of function of $\theta$, say
$\phi = g(\theta)$
For expectation of $\phi$, we have
$E(\phi \mid Y) = \int \phi \, p(\phi \mid Y) \, d\phi = \int g(\theta) \, p(\theta \mid Y) \, d\theta$
What if we do not know how to compute the integra
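One common answer is Monte Carlo approximation: draw from the posterior, apply the function to each draw, and average. The sketch below assumes a hypothetical Beta(3, 9) posterior for a probability and takes the function of interest to be the log-odds; both choices are illustrative and not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical posterior for a probability: Beta(a, b).
a, b = 3.0, 9.0
theta = rng.beta(a, b, size=100_000)


def g(t):
    """Function of interest: the log-odds of t."""
    return np.log(t / (1.0 - t))


# Monte Carlo estimate of the posterior expectation of g(theta).
log_odds_mean = g(theta).mean()

# Sanity check on a quantity with a known answer: E(theta) = a / (a + b).
theta_mean = theta.mean()
```

The same draws can be reused for other summaries of the function, such as quantiles via `np.quantile(g(theta), [0.025, 0.975])`.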
Volcano Eruptions
What is the distribution of waiting times for volcanos?
Data on duration between eruptions (in months) of the
Mauna Loa volcano on Hawaii between 1832 and 1952.
n = 36 observations;
Teenagers and Televisions
In 1998, the New York Times and CBS News polled 1048
randomly selected 13- to 17-year-olds to ask them if they had
a television in their room.
n = 1048 sampled teenagers
y = 69
Motivating Bayesian Inference
Significance tests and confidence intervals are forms of
classical or frequentist inference.
When might classical inference be inadequate?
Suppose you flip a coin (with unknown
STA 122/290 course web page
www.stat.duke.edu/~jerry/sta122/sta122n290f11.html
Access via Sakai
Access via my home page
Access via department web pages (courses in fall 2011)
What is modern sta
Examples of poor study designs
For STA 122/290
Prof. Jerry Reiter
Problematic survey design
Literary Digest calls 1936 election wrong
Questionnaires mailed to 10 million people. Addresses
collected f