# lecture6 - ISYE 2028 A and B Lecture 6 Canonical Continuous...


ISYE 2028 A and B, Lecture 6: Canonical Continuous Random Variables and some brief results

Dr. Kobi Abayomi, February 10, 2009

## 1 Introduction - The Normal Distribution, Normal Data

Remember that continuous data is data that can assume an uncountable number of values. We cannot list (in a frequency table, for example) the possible values of a continuous variable: the list would be infinitely long. If you think for just a second you can probably (that's supposed to be funny!) convince yourself that the relative frequency of any one value of a continuous variable is very low when there are many observations. For a continuous variable we only determine the relative frequency for a range of values. (Remember the histogram and how we bin values for discrete distributions.)

### 1.1 What's normal?

Let's use a toy example: say we have the discrete distribution of clown shoe sizes. Let's say this is the distribution.

| Shoe Size | Count | Relative Frequency | Cumulative Relative Frequency |
|-----------|-------|--------------------|-------------------------------|
| 17        | 1     | 1/7                | 1/7                           |
| 18        | 1     | 1/7                | 2/7                           |
| 19        | 1     | 1/7                | 3/7                           |
| 20        | 2     | 2/7                | 5/7                           |
| 22        | 1     | 1/7                | 6/7                           |
| 24        | 1     | 1/7                | 7/7                           |

Notice that these data are discrete and that the relative frequencies sum to one. Let's look at an illustration of the distribution of this data:

[Figure 1: Histogram of observed shoe size. x-axis: x (shoe size, 17 to 24); y-axis: Density (0.00 to 0.25).]

Remember that the area of the histogram sums to one when the y-axis is density ($2 \times 0.30 + 3 \times 0.14 \approx 1$). Notice how the observed data are binned. For this example, we'll use R:

```r
# The seven observed clown shoe sizes
x <- c(22, 24, 19, 17, 20, 18, 20)

# Density-scaled histogram (freq = FALSE), so the bar areas sum to one
hist(x, breaks = 7, col = "red", density = 20,
     main = "Histogram of Observed Shoe Size", freq = FALSE)

# Mean of the data and mean of the squared data
mean(x); mean(x^2)
```

The mean of our observed data, the shoe sizes, is $\mu = 20$.

What if, for some reason (indulge this example), we could only observe some of the data, never all of it at once? Say we can take only four observations at a time.
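Before moving on to subsamples, the frequency table and the "area sums to one" remark above can be checked directly in R. This is only a minimal sketch: the names `counts`, `rel_freq`, `cum_rel_freq`, `freq_table`, `h` and `total_area` are illustrative choices, not part of the original notes.

```r
# Clown shoe sizes from the example above
x <- c(22, 24, 19, 17, 20, 18, 20)

# Counts, relative frequencies and cumulative relative frequencies
counts <- table(x)
rel_freq <- counts / length(x)
cum_rel_freq <- cumsum(rel_freq)

# Assemble the table (column names are illustrative)
freq_table <- data.frame(
  shoe_size = as.numeric(names(counts)),
  count = as.vector(counts),
  rel_freq = as.vector(rel_freq),
  cum_rel_freq = as.vector(cum_rel_freq)
)
print(freq_table)

# Compute the histogram without drawing it, then add up the bar areas:
# bar height (density) times bar width (difference between breaks)
h <- hist(x, breaks = 7, plot = FALSE)
total_area <- sum(h$density * diff(h$breaks))
print(total_area)
```

The relative frequencies in `freq_table` sum to one, and so does `total_area`, whatever bin edges `hist` ends up choosing.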
For example, say we observe $x_1 = \{22, 19, 18, 20\}$. An estimate of the mean (the sample mean, $\bar{x} = n^{-1} \sum_{i=1}^{n} x_i$) based on this subsample is $\bar{x}_1 = 19.75$.

If we could not observe all of the data at once, we might want to resample, again and again, and get $x_2, x_3, x_4, \ldots$ and so on. It is reasonable to expect the mean of these resampled means to be very close to the mean of all the data, $\mu = 20$. As we take more and more resamples we can say that, almost surely, the mean of the sample means is 20.

Let's illustrate this using R:

```r
# x is the vector of clown shoe sizes defined earlier
samplesize <- 4      # Here I'm going to take a sample of size 4 from
                     # the clown shoe sizes
numsamples <- 12500  # Here I'm going to take a lot of these
                     # samples. Over and over and ...

# First a place to put all of these samples so we can compute means
# and graph the results
xmatrix <- matrix(0, nrow = numsamples, ncol = samplesize)

# A simple loop to take the sample and put it in our new data table
for (i in 1:numsamples) {
  xmatrix[i, ] <- sample(x, samplesize)
}

# A graph of the results
hist(apply(xmatrix, 1, mean), breaks = 20)
```

These resampled means, which are now continuous variables, have a distribution of their
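The claim above, that the mean of the resampled means settles at 20, can be checked both exactly and by simulation. The sketch below is an illustration rather than part of the original notes: it assumes the same sampling scheme as the loop above (`sample` draws without replacement by default), and the names `all_means`, `sim_means` and the seed value are arbitrary choices.

```r
# Clown shoe sizes and subsample size from the example above
x <- c(22, 24, 19, 17, 20, 18, 20)
samplesize <- 4

# Exact check: enumerate every subsample of size 4 (by position, so the
# two 20s count separately) and take the mean of each one
all_means <- combn(x, samplesize, FUN = mean)
length(all_means)  # choose(7, 4) = 35 subsamples
mean(all_means)    # the mean of all 35 sample means

# Simulation check: repeat the random subsampling many times
set.seed(2028)     # arbitrary seed, only for reproducibility
sim_means <- replicate(12500, mean(sample(x, samplesize)))
mean(sim_means)
```

Each shoe size appears in the same number of the 35 subsamples, so the mean of `all_means` equals the mean of `x`, which is 20. The simulated value `mean(sim_means)` should land close to 20 and moves closer as the number of resamples grows.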


## This note was uploaded on 11/08/2009 for the course ISYE 2028 taught by Professor Shim during the Spring '07 term at Georgia Tech.
