stats 363 Chapter-7 notes

CHAPTER 7 SAMPLING DISTRIBUTION

7.2 Sampling Plans and Experimental Designs

The way a sample is selected is called the sampling plan or experimental design. Simple random sampling is a commonly used sampling plan in which every sample of size n has the same chance of being selected. The four most commonly used sampling plans are given as follows.

Definition 1. If a sample of n elements is selected from a population of N elements using a plan in which each of the possible samples has an equal chance of selection, then the sampling is said to be random and the resulting sample is a simple random sample.

Example 1. Suppose we want to select a sample of size n = 2 from a population containing N = 4 objects (say, A, B, C, and D). There are six distinct samples that could be selected, as listed in the following table.

Sample    Observations in Sample
1         A, B
2         A, C
3         A, D
4         B, C
5         B, D
6         C, D

If each of these six samples has an equal chance of being selected, namely 1/6, then the resulting sample is called a simple random sample, or just a random sample.

The selection of a simple random sample can be done by using random numbers: digits generated so that the values 0 to 9 occur randomly and with equal frequency. Another method is to let a computer generate random numbers for sampling.

Definition 2. When the population consists of two or more subpopulations, called strata, a sampling plan that ensures that a simple random sample is selected from each subpopulation is called a stratified random sample.

Example 2. Suppose a public opinion poll designed to estimate the proportion of voters who favor spending more tax revenue on an improved ambulance service is to be conducted in a certain county. The county contains two cities and a rural area. The population elements of interest for the poll are all men and women of voting age who reside in the county.
A stratified random sample of adults residing in the county can be obtained by selecting a simple random sample of adults from each city and another simple random sample of adults from the rural area. In this case, the two cities and the rural area represent three strata from which simple random samples are selected.

The principal reasons for using stratified random sampling rather than simple random sampling are as follows:

1. Stratification may produce a smaller sampling error than would be produced by a simple random sample of the same size. This result is particularly true if measurements within strata are homogeneous.
2. The cost per observation in the survey may be reduced by stratification of the population elements into convenient groupings.

Definition 3. When the available sampling units are groups of elements, called clusters, a cluster sample is a simple random sample of clusters from the available clusters in the population.

Example 3. To estimate the average income per household in a large city, how should the researchers choose the sample? If they use simple random sampling, they will need a frame listing all households (elements) in the city, and this frame may be very costly or impossible to obtain. They cannot avoid this problem by using stratified random sampling, because a frame is still required for each stratum in the population. Rather than draw a simple random sample of elements, they could divide the city into regions such as blocks (or clusters of elements) and select a simple random sample of blocks from the population. This task is easily accomplished by using a frame that lists all city blocks. Then the income of every household within each sampled block could be measured.

Definition 4. A 1-in-k systematic random sample involves the random selection of one of the first k elements in an ordered population, and then the systematic selection of every kth element thereafter.
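The four sampling plans in Definitions 1 to 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary layout for strata and clusters, and the seed are all hypothetical choices, not part of the notes, and only the standard `random` module is used.

```python
import random

def simple_random_sample(population, n, rng):
    """Definition 1: every subset of size n is equally likely."""
    return rng.sample(population, n)

def stratified_random_sample(strata, sizes, rng):
    """Definition 2: one simple random sample from each stratum.

    strata maps a stratum name to its list of elements;
    sizes maps the same names to the sample size wanted there.
    """
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Definition 3: a simple random sample of whole clusters;
    every element of a chosen cluster is kept."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

def systematic_sample(population, k, rng):
    """Definition 4: random start among the first k elements,
    then every kth element thereafter."""
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(363)  # arbitrary seed, only for reproducibility

# Example 1: one of the six possible samples of size 2 from {A, B, C, D}.
print(simple_random_sample(["A", "B", "C", "D"], 2, rng))

# A 1-in-5 systematic sample from 1000 numbered items.
items = list(range(1, 1001))
picked = systematic_sample(items, 5, rng)
print(len(picked), picked[:4])
```

With an ordered list of 1000 items and k = 5, the systematic sample always contains 200 items, half of them from the first 500 positions, whatever the random start.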
A systematic sample is generally spread more uniformly over the entire population and thus may provide more information about the population than an equivalent amount of data contained in a simple random sample.

Example 4. Suppose we wish to select a 1-in-5 systematic sample of travel vouchers from a stack of N = 1000 (that is, sample n = 200 vouchers) to determine the proportion of vouchers filed correctly. A voucher is drawn at random from the first five vouchers (for instance, number 2), and every fifth voucher thereafter is included in the sample. Suppose that most of the first 500 vouchers have been correctly filed but, because of a change in clerk, the second 500 have all been incorrectly filed. Simple random sampling could accidentally select a large number (perhaps all) of the 200 vouchers from either the first or the second 500 vouchers and hence yield a very poor estimate of the true proportion of correct filing. In contrast, systematic sampling would select an equal number of vouchers from each of the two groups and would give a very accurate estimate of the fraction of vouchers correctly filed.

7.3 Statistics and Sampling Distribution

When we select a random sample from a population, the numerical descriptive measures calculated from the sample, such as the mean and the standard deviation, are called statistics. These statistics vary from one random sample to the next; that is, they are random variables. The probability distributions for statistics are called sampling distributions because, in repeated sampling, they provide this information:

* What values of the statistic can occur.
* How often each value occurs.

Definition 5. The sampling distribution of a statistic is the probability distribution for the possible values of the statistic that results when random samples of size n are repeatedly drawn from the population.

There are three ways of finding the sampling distribution of a statistic:

1. Derive the distribution mathematically using the laws of probability.
2. Use simulation to approximate the distribution. That is, draw a large number of samples of size n, calculate the value of the statistic for each sample, and tabulate the results in a relative frequency histogram. When the number of samples is large, the histogram will be very close to the theoretical sampling distribution.
3. Use statistical theorems to derive the exact or approximate sampling distribution.

Example 5. Suppose a population consists of N = 5 numbers: 3, 6, 9, 12, 15. If a random sample of size n = 3 is selected without replacement, find the sampling distribution for (a) the sample mean $\bar{x}$, (b) the sample median m.

Solution. All possible random samples of size n = 3 and their corresponding means and medians are given below.

Sample    Observations in Sample    Sample Mean    Sample Median
1         3, 6, 9                   6              6
2         3, 6, 12                  7              6
3         3, 6, 15                  8              6
4         3, 9, 12                  8              9
5         3, 9, 15                  9              9
6         3, 12, 15                 10             12
7         6, 9, 12                  9              9
8         6, 9, 15                  10             9
9         6, 12, 15                 11             12
10        9, 12, 15                 12             12

(a) Each of the 10 samples has probability 1/10, so the sampling distribution for the sample mean $\bar{x}$ is obtained by counting how often each value appears in the table. That is,

$\bar{x}$       6     7     8     9     10    11    12
$p(\bar{x})$    0.1   0.1   0.2   0.2   0.2   0.1   0.1

(b) The sampling distribution for the sample median m is given by

$P\{m = 6\} = 3/10 = 0.3$, $P\{m = 9\} = 4/10 = 0.4$, $P\{m = 12\} = 3/10 = 0.3$.

That is,

m       6     9     12
p(m)    0.3   0.4   0.3

Note. It is usually very difficult to derive sampling distributions by the method described in the preceding example. When this method is no longer feasible, we may have to use one of these methods:

* Use a simulation to approximate the sampling distribution empirically.
* Rely on statistical theorems and theoretical results.

7.4 The Central Limit Theorem

The Central Limit Theorem states that, under rather general conditions, sums and means of random samples of measurements drawn from a population tend to have an approximately normal distribution.

Consider an experiment of tossing a balanced die n times. Let $\bar{x}$ denote the mean of the numbers on the n upper faces.
If we use computer software to generate and depict the histograms of the sampling distribution of $\bar{x}$ for n = 2, n = 3, n = 4, and so on, we find that the shape of these histograms looks more and more like a normal curve as n becomes larger.

Theorem 1 (Central Limit Theorem). If random samples of n observations are drawn from a nonnormal population with finite mean $\mu$ and standard deviation $\sigma$, then, when n is large, the sampling distribution of the sample mean $\bar{x}$ is approximately normally distributed, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. The approximation becomes more accurate as n becomes large.

Example 6. Achievement test scores of all high school seniors in a certain state have mean $\mu = 60$ and variance $\sigma^2 = 64$. A random sample of n = 100 students from a large high school had a mean score of 58. Is there evidence to suggest that this high school is inferior?

Solution. Let $\bar{x}$ denote the mean of a random sample of n = 100 scores from a population with $\mu = 60$ and $\sigma^2 = 64$. We wish to calculate the probability that the sample mean $\bar{x}$ is at most 58, namely, $P\{\bar{x} \le 58\}$. By the Central Limit Theorem, it follows that

$P\{\bar{x} \le 58\} \approx P\{z \le -2.5\} = 0.0062$,

where the standardized value of the mean score 58 is calculated as

$\frac{58 - 60}{8/\sqrt{100}} = -2.5$.

Since this probability is exceedingly small, it is unlikely that a random sample of 100 students drawn from the statewide population would produce a mean score of 58 or lower. This evidence suggests that the average score for this high school is inferior.
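The die-tossing experiment and the calculation in Example 6 can be checked numerically. The following is a minimal sketch, assuming Python's standard `random` and `statistics` modules; the function name `simulate_die_means`, the number of repetitions, and the seed are hypothetical choices made for illustration, not part of the notes.

```python
import random
import statistics

def simulate_die_means(n, reps, rng):
    """Return `reps` sample means, each computed from n tosses of a balanced die."""
    return [statistics.fmean(rng.randint(1, 6) for _ in range(n)) for _ in range(reps)]

rng = random.Random(363)                  # arbitrary seed for reproducibility
die_sd = statistics.pstdev(range(1, 7))   # population sd of one toss

# Compare the simulated mean and sd of x-bar with mu = 3.5 and sigma / sqrt(n).
for n in (2, 5, 30):
    means = simulate_die_means(n, 5000, rng)
    print(n,
          round(statistics.fmean(means), 3),
          round(statistics.stdev(means), 3),
          round(die_sd / n ** 0.5, 3))

# Example 6: P{x-bar <= 58} with mu = 60, sigma = 8, n = 100.
z_example6 = (58 - 60) / (8 / 100 ** 0.5)
p_example6 = statistics.NormalDist().cdf(z_example6)
print(round(z_example6, 2), round(p_example6, 4))
```

The printed columns for each n are the simulated mean of $\bar{x}$, its simulated standard deviation, and the theoretical value $\sigma/\sqrt{n}$; the last two should agree more and more closely as the number of repetitions grows. Plotting a histogram of `means` (with any plotting library) shows the bell shape emerging as n increases.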

7.5 The Sampling Distribution of the Sample Mean

Theorem 1 (The Sampling Distribution of the Sample Mean $\bar{x}$).

* If a random sample of n measurements is selected from a population with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of the sample mean $\bar{x}$ will have mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
* If the population has a normal distribution, the sampling distribution of the sample mean $\bar{x}$ will be exactly normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
* If the population distribution is nonnormal, the sampling distribution of the sample mean $\bar{x}$ will be approximately normally distributed, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, for large samples (by the Central Limit Theorem).

Definition 6. The standard deviation of a statistic used as an estimator of a population parameter is also called the standard error of the estimator (abbreviated SE) because it refers to the precision of the estimator. Therefore, the standard deviation of $\bar{x}$, given by $\sigma/\sqrt{n}$, is referred to as the standard error of the mean, abbreviated as $SE(\bar{x})$ or just SE.

Example 7. The duration of Alzheimer's disease from the onset of symptoms until death ranges from 3 to 20 years; the average is 8 years with a standard deviation of 4 years. The administrator of a large medical center randomly selects the medical records of 30 deceased Alzheimer's patients from the medical center's database and records the average duration. Find the approximate probability that the average (a) is less than 7 years, (b) exceeds 7 years, (c) lies within 1 year of the population mean $\mu = 8$.

Solution. The standard error is

$SE(\bar{x}) = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{30}} = 0.73$.

(a) To find the probability that the average is less than 7 years, we need to calculate the standardized value of 7:

$\frac{7 - 8}{0.73} = -1.37$.

Then the desired probability is

$P\{\bar{x} < 7\} \approx P\{z < -1.37\} = 0.0853$.

(b) The probability that the average exceeds 7 years is

$P\{\bar{x} > 7\} \approx P\{z > -1.37\} = 1 - 0.0853 = 0.9147$.

(c) To find the probability that the average lies within 1 year of the population mean $\mu = 8$, we need to calculate the standardized values of 7 and 9:

$\frac{7 - 8}{0.73} = -1.37$ and $\frac{9 - 8}{0.73} = 1.37$.

Then the required probability is

$P\{7 < \bar{x} < 9\} \approx P\{-1.37 < z < 1.37\} = P\{z < 1.37\} - P\{z < -1.37\} = 0.9147 - 0.0853 = 0.8294$.

Example 8. To avoid difficulties with the Federal Trade Commission or state and local consumer protection agencies, a beverage bottler must make reasonably certain that 12-ounce bottles actually contain 12 ounces of beverage. To determine whether a bottling machine is working satisfactorily, one bottler randomly samples 30 bottles per hour and measures the amount of beverage in each bottle. The mean $\bar{x}$ of the 30 fill measurements is used to decide whether to readjust the amount of beverage delivered per bottle by the filling machine. If records show that the amount of fill per bottle is normally distributed, with a standard deviation of 0.3 ounces, and if the bottling machine is set to produce a mean fill per bottle of 12 ounces, what is the approximate probability that the sample mean $\bar{x}$ of the 30 test bottles is less than 11.9 ounces?

Solution. The standard error is

$SE(\bar{x}) = \frac{\sigma}{\sqrt{n}} = \frac{0.3}{\sqrt{30}} = 0.055$.

To find the probability that the sample mean of the 30 test bottles is less than 11.9 ounces, we need to calculate the standardized value of 11.9:

$\frac{11.9 - 12}{0.055} = -1.82$.

The required probability is then

$P\{\bar{x} < 11.9\} = P\{z < -1.82\} = 0.0344$.

Since this probability is very small, the company should not have difficulties with the Federal Trade Commission or state and local consumer protection agencies.

Example 9. An electronics firm manufactures light bulbs that have a length of life with mean 800 hours and a standard deviation of 80 hours. Find the probability that a random sample of 64 bulbs will have an average life of greater than 775 hours.

Solution.
The standard error is

$SE(\bar{x}) = \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{64}} = 10$.

To find the probability that the sample mean of the 64 bulbs is greater than 775 hours, we need to calculate the standardized value of 775:

$\frac{775 - 800}{10} = -2.5$.

The required probability is then

$P\{\bar{x} > 775\} \approx P\{z > -2.5\} = 1 - P\{z < -2.5\} = 1 - 0.0062 = 0.9938$.

HOMEWORK: pp. 273-274: 7.19, 7.24, 7.29, 7.30, 7.31, 7.33

7.6 The Sampling Distribution of the Sample Proportion

Let x be a binomial random variable with n trials and probability p of success. Here the parameter p can also be referred to as the population proportion of successes. Since x represents the number of successes in n trials, the sample proportion of successes

$\hat{p} = \frac{x}{n}$

will be used to estimate the population proportion p.

The binomial random variable x has mean $\mu = np$ and standard deviation $\sigma = \sqrt{npq}$. Since $\hat{p}$ is simply the value of x expressed as a proportion ($\hat{p} = x/n$), the sampling distribution of $\hat{p}$ is identical to the probability distribution of x, except that it has a new scale along the horizontal axis. Because of this change of scale, the mean and standard deviation of $\hat{p}$ are also rescaled (both are divided by n), so that the mean of the sampling distribution is p and its standard error is

$SE(\hat{p}) = \sqrt{\frac{pq}{n}}$, where $q = 1 - p$.

Just as we can approximate the probability distribution of the binomial random variable x with a normal distribution when the sample size n is large, we can do the same with the sampling distribution of $\hat{p}$.

Theorem 2 (Properties of the Sampling Distribution of the Sample Proportion). If a random sample of n observations is drawn from a binomial population with parameter p, then the sampling distribution of the sample proportion

$\hat{p} = \frac{x}{n}$

will have mean p and standard deviation

$SE(\hat{p}) = \sqrt{\frac{pq}{n}}$, where $q = 1 - p$.

When the sample size n is large, the sampling distribution of $\hat{p}$ can be approximated by a normal distribution. The approximation will be adequate if np > 5 and nq > 5.

Example 10.
In a survey, 500 mothers and fathers were asked about the importance of sports for boys and girls. Of the parents interviewed, 60% agree that the genders are equal and should have equal opportunities to participate in sports. Describe the sampling distribution of the sample proportion $\hat{p}$ of parents who agree that the genders are equal and should have equal opportunities.

Solution. Let p denote the population proportion of all parents in the United States who agree that the genders are equal and should have equal opportunities. The sampling distribution of $\hat{p}$ can be approximated by a normal distribution, with mean equal to p and standard error

$SE(\hat{p}) = \sqrt{\frac{pq}{n}}$, where $q = 1 - p$.

It should be noted that the sampling distribution of $\hat{p}$ is centered over its mean p. Even though we do not know the exact value of p (the sample proportion $\hat{p} = 0.60$ may be larger or smaller than p), an approximate value for the standard deviation of the sampling distribution can be found by using the sample proportion $\hat{p} = 0.60$ to approximate the unknown value of p. Thus,

$SE(\hat{p}) = \sqrt{\frac{pq}{n}} \approx \sqrt{\frac{\hat{p}\hat{q}}{n}} = \sqrt{\frac{(0.60)(0.40)}{500}} = 0.022$.

Now the probability that $\hat{p}$ will fall within $2SE(\hat{p}) = 0.044$ of p is given by

$P\{|\hat{p} - p| < 0.044\} = P\left\{\frac{|\hat{p} - p|}{SE(\hat{p})} < \frac{0.044}{SE(\hat{p})}\right\} \approx P\{|z| < 2\} = P\{-2 < z < 2\} = P\{z < 2\} - P\{z < -2\} = 0.9772 - 0.0228 = 0.9544$.

Therefore, approximately 95% of the time $\hat{p}$ will fall within $2SE(\hat{p}) = 0.044$ of the (unknown) value of p.

Example 11. Refer to Example 10. Suppose the proportion p of parents in the population is actually equal to 0.55. What is the probability of observing a sample proportion larger than or equal to the observed value $\hat{p} = 0.60$?

Solution. With n = 500 and p = 0.55, we calculate

$SE(\hat{p}) = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.55)(0.45)}{500}} = 0.0222$.
The required probability is

$P\{\hat{p} \ge 0.60\} \approx P\{z \ge 2.25\} = 1 - P\{z \le 2.25\} = 1 - 0.9878 = 0.0122$,

where the standardized value of 0.60 is

$\frac{0.60 - 0.55}{0.0222} = 2.25$.

That is, if we were to select a random sample of n = 500 observations from a population with proportion p equal to 0.55, the probability that the sample proportion $\hat{p}$ would be larger than or equal to 0.60 is only 0.0122.

HOMEWORK: pp. 279-281: 7.37, 7.41, 7.43, 7.45, 7.47
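
The calculations in Examples 10 and 11 can be reproduced as a quick check. This is a sketch only, assuming Python's standard `math` and `statistics` modules; the helper names `se_proportion`, `normal_upper_tail`, and `exact_upper_tail` are hypothetical, and the exact binomial tail is added purely for comparison with the normal approximation (it is not part of the notes).

```python
import math
from statistics import NormalDist

def se_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p * q / n)."""
    return math.sqrt(p * (1 - p) / n)

def normal_upper_tail(p_hat, p, n):
    """Normal approximation to P{sample proportion >= p_hat}."""
    z = (p_hat - p) / se_proportion(p, n)
    return 1 - NormalDist().cdf(z)

def exact_upper_tail(k_min, p, n):
    """Exact binomial probability P{x >= k_min} for x ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(k_min, n + 1))

n, p, p_hat = 500, 0.55, 0.60

# Adequacy condition from Theorem 2: np > 5 and nq > 5.
print(n * p > 5 and n * (1 - p) > 5)

# Example 10: SE estimated with p-hat = 0.60 in place of the unknown p.
print(round(se_proportion(p_hat, n), 3))

# Example 11: SE and upper-tail probability under p = 0.55.
approx = normal_upper_tail(p_hat, p, n)
exact = exact_upper_tail(300, p, n)   # p-hat >= 0.60 means x >= 300 out of 500
print(round(se_proportion(p, n), 4), approx, exact)
```

Because the code keeps the unrounded standardized value instead of the table lookup at z = 2.25, the normal-approximation result may differ from 0.0122 in the fourth decimal place.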