Introduction to Inferential Statistics

Statistics is based on observations of large data sets. In this reading we will make some observations about large data sets and then apply them to small data sets (samples).

DISTRIBUTION CURVE FOR LARGE DATA SETS

a. Earlier we saw that the mathematical equation for a normal distribution curve is

$$y = \frac{e^{-(x_i - \mu)^2 / 2\sigma^2}}{\sigma\sqrt{2\pi}}$$ (Equation 1)

where $\mu$ is the mean and $\sigma$ is the standard deviation.

b. The equation for the normal distribution curve can be simplified by introducing a new variable, z:

$$z = \frac{x_i - \mu}{\sigma}$$ (Equation 2)

Apart from the constant factor $1/(\sigma\sqrt{2\pi})$, which only scales the height of the curve, this makes:

$$y = e^{-z^2/2}$$ (Equation 3)

c. Figure 1 shows that the introduction of z also changes the normal distribution curve to the standard normal distribution curve, in which the x-axis is given in z-scores, or distances from the mean (which is now 0) in standard deviation units.

Figure 1. Standard normal distribution curve. Eeling, D.L., Introduction to the normal distribution, http://www.comfsm.fm/%7Edleeling/statistics/notes007.html , accessed 6/5/10.

d. Integrating the equation for the standard normal distribution curve between the limits of any two z-scores gives the area under that portion of the curve, and integrating over all z gives the area under the total curve. The ratio of these areas is then the probability of x being within that span. Expressed as percentages, these probabilities are referred to as levels of confidence. Figure 2 shows those probabilities.

Figure 2. Fractional probabilities between z values on a standard normal distribution curve. Eeling, D.L., Introduction to the normal distribution, http://www.comfsm.fm/%7Edleeling/statistics/notes007.html , accessed 6/5/10.
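
The ratio-of-areas idea can be checked numerically. The following is a minimal Python sketch, not part of the original reading: the function names (`z_score`, `curve`, `area`, `probability_between`), the trapezoidal integration, and the choice of plus or minus 10 standard deviations as a stand-in for the whole curve are all assumptions made for illustration.

```python
import math


def z_score(x, mu, sigma):
    """Distance of x from the mean mu, in units of the standard deviation sigma."""
    return (x - mu) / sigma


def curve(z):
    """Height of the curve y = exp(-z^2 / 2), with no normalizing constant."""
    return math.exp(-z * z / 2.0)


def area(z_low, z_high, steps=20_000):
    """Trapezoidal estimate of the area under curve() between two z-scores."""
    width = (z_high - z_low) / steps
    total = 0.5 * (curve(z_low) + curve(z_high))
    for k in range(1, steps):
        total += curve(z_low + k * width)
    return total * width


def probability_between(z_low, z_high):
    """Ratio of the area between two z-scores to the area under the whole curve."""
    # Assumption: +/-10 standard deviations stands in for the whole curve;
    # the area beyond that range is negligible.
    return area(z_low, z_high) / area(-10.0, 10.0)


# Percent probability of falling within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(f"within +/-{k} sigma: {100 * probability_between(-k, k):.2f}%")
```

Run as written, the loop should report roughly 68%, 95%, and 99.7% for spans of one, two, and three standard deviations on either side of the mean. A raw measurement is first converted with `z_score`; for example, with hypothetical values, `z_score(12.0, 10.0, 2.0)` gives 1.0, so a value of 12 lies one standard deviation above a mean of 10.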
