5 The normal model

Perhaps the most useful (or utilized) probability model for data analysis is the normal distribution. There are several reasons for this, one being the central limit theorem, and another being that the normal model is a simple model with separate parameters for the population mean and variance, two quantities that are often of primary interest. In this chapter we discuss some of the properties of the normal distribution, and show how to make posterior inference on the population mean and variance parameters. We also compare the sampling properties of the standard Bayesian estimator of the population mean to those of the unbiased sample mean. Lastly, we discuss the appropriateness of the normal model when the underlying data are not normally distributed.

5.1 The normal model

A random variable $Y$ is said to be normally distributed with mean $\theta$ and variance $\sigma^2 > 0$ if the density of $Y$ is given by

$$p(y \mid \theta, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2}\left(\frac{y-\theta}{\sigma}\right)^2}, \qquad -\infty < y < \infty.$$

Figure 5.1 shows normal density curves for a few values of $\theta$ and $\sigma^2$. Some important things to remember about this distribution include that

• the distribution is symmetric about $\theta$, and the mode, median and mean are all equal to $\theta$;
• about 95% of the population lies within two standard deviations of the mean (more precisely, 1.96 standard deviations);
• if $X \sim \text{normal}(\mu, \tau^2)$, $Y \sim \text{normal}(\theta, \sigma^2)$ and $X$ and $Y$ are independent, then $aX + bY \sim \text{normal}(a\mu + b\theta,\ a^2\tau^2 + b^2\sigma^2)$;
• the dnorm, rnorm, pnorm, and qnorm commands in R take the standard deviation $\sigma$ as their argument, not the variance $\sigma^2$. Be very careful about this when using R: confusing $\sigma$ with $\sigma^2$ can drastically change your results.

    > dnorm
    function (x, mean = 0, sd = 1, log = FALSE)
    .Internal(dnorm(x, mean, sd, log))

P.D. Hoff, A First Course in Bayesian Statistical Methods, Springer Texts in Statistics, DOI 10.1007/978-0-387-92407-6_5, © Springer Science+Business Media, LLC 2009

[Fig. 5.1. Some normal densities. Horizontal axis: $y$; vertical axis: $p(y \mid \theta, \sigma^2)$. Curves shown for $\theta = 2, \sigma^2 = 0.25$; $\theta = 5, \sigma^2 = 4$; and $\theta = 7, \sigma^2 = 1$.]

The importance of the normal distribution stems primarily from the central limit theorem, which says that under very general conditions, the sum (or mean) of a set of random variables is approximately normally distributed. In practice, this means that the normal sampling model will be appropriate for data that result from the additive effects of a large number of factors.

Example: women's height

A study of 1,100 English families from 1893 to 1898 gathered height data on $n = 1375$ women over the age of 18. A histogram of these data is shown in Figure 5.2. The sample mean of these data is $\bar{y} = 63.75$ and the sample standard deviation is $s = 2.62$ inches. One explanation for the variability in heights among these women is that the women were heterogeneous in terms of a number of factors controlling human growth, such as genetics, diet, disease,...
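
The earlier warning about R's parameterization can be checked against the height summaries quoted above. The following is a minimal sketch and not part of the original text: it assumes, purely for illustration, a plug-in normal model whose mean and standard deviation equal the reported sample values (63.75 and 2.62 inches), and the variable names (`ybar`, `s`, `central95`, `p_correct`, `p_wrong`) are hypothetical.

```r
# Summary statistics reported in the text for the height data (inches)
ybar <- 63.75   # sample mean
s    <- 2.62    # sample standard deviation

# Illustrative assumption: heights ~ normal(ybar, s^2).
# Range containing the central 95% of that distribution.
central95 <- qnorm(c(0.025, 0.975), mean = ybar, sd = s)

# Probability of falling within 1.96 standard deviations of the mean,
# passing the standard deviation as R expects.
p_correct <- pnorm(ybar + 1.96 * s, mean = ybar, sd = s) -
  pnorm(ybar - 1.96 * s, mean = ybar, sd = s)

# Common mistake: passing the variance s^2 in the sd argument.
p_wrong <- pnorm(ybar + 1.96 * s, mean = ybar, sd = s^2) -
  pnorm(ybar - 1.96 * s, mean = ybar, sd = s^2)

print(round(central95, 2))
print(round(c(correct = p_correct, wrong = p_wrong), 3))
```

With the standard deviation supplied correctly, the computed probability is approximately 0.95, in line with the 1.96 rule. Supplying the variance instead widens the assumed distribution (here `s^2` is larger than `s`), so the same interval appears to cover a noticeably smaller share of the population.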