systematic-sampling


[...] was 90% copper. Is the standard being [...] help us answer the question.

DEFINITION 7.1 A sample obtained by randomly selecting one element from the first $k$ elements in the frame and every $k$th element thereafter is called a 1-in-$k$ systematic sample with a random start.

As in previous chapters, we present methods for estimating a population mean, total, and proportion. We also discuss appropriate bounds on the error of estimation and sample-size requirements.

Systematic sampling provides a useful alternative to simple random sampling for the following reasons:

1. Systematic sampling is easier to perform in the field and hence is less subject to selection errors by field-workers than are either simple random samples or stratified random samples, especially if a good frame is not available.
2. Systematic sampling can provide greater information per unit cost than simple random sampling can provide for populations with certain patterns in the arrangement of elements.

In general, systematic sampling involves random selection of one element from the first $k$ elements and then selection of every $k$th element thereafter. This procedure is easier to perform and usually less subject to interviewer error than is simple random sampling. For example, using simple random sampling to select a sample of shoppers on a city street corner would be difficult. The interviewer could not determine which shoppers to include in the sample because the population size $N$ would not be known until all shoppers had passed the corner. In contrast, the interviewer could take a systematic sample (say, 1 in 20 shoppers) until the required sample size was obtained. This procedure would be an easy one for even an inexperienced interviewer.

In addition to being easier to perform and less subject to interviewer error, systematic sampling frequently provides more information per unit cost than does simple random sampling.
A systematic sample is generally spread more uniformly over the entire population and thus may provide more information about the population than an equivalent amount of data contained in a simple random sample. Consider the following illustration. We wish to select a 1-in-5 systematic sample of travel vouchers from a stack of $N = 1000$ (that is, sample $n = 200$ vouchers) to determine the proportion of vouchers filed incorrectly. A voucher is drawn at random from the first five vouchers (for example, number 3), and every fifth voucher thereafter is included in the sample.

Suppose that most of the first 500 vouchers have been correctly filed, but that because of a change in clerks the second 500 have all been incorrectly filed. Simple random sampling could accidentally select a large number (perhaps all) of the 200 vouchers from either the first or the second 500 vouchers and hence yield a very poor estimate of $p$. In contrast, systematic sampling would select an equal number of vouchers from each of the two groups and would give a very accurate estimate of the proportion of vouchers incorrectly filed.

Additional examples are discussed in Section 7.3 to illustrate how to choose between systematic and simple random sampling in a given situation. Note, however, that the accuracy of estimates from systematic sampling depends on the order of the sampling units in the frame. If the incorrect vouchers are randomly dispersed among all vouchers, then the advantage of systematic sampling is lost.

Systematic sampling is very commonly used in a wide variety of contexts. The U.S. Census directs only a minimal number of questions to every resident, but it gathers much more information from a systematic sample of all residents. In the 2000 census, the long form of the census questionnaire was distributed to, approximately, a 1-in-6 systematic sample of residents. [...] by listing election districts in the United States and then systematically selecting 300 or so for a follow-up study of households.
The households, or dwellings, within a sampled district may again be selected systematically, for example by choosing the second dwelling in every other block when moving east to west.

Industrial quality control sampling plans are most often systematic in structure. An inspection plan for manufactured items moving along an assembly line may call for inspecting every 50th item. An inspection of cartons of products stored in a warehouse may suggest sampling the second carton from the left in the third row down from the top in every fifth stack. In the inspection of work done at fixed stations, the inspection plan may call for walking up and down the rows of workstations and inspecting the machinery at every tenth station. The time of day is often important in assessing quality of worker performance, and so an inspection plan may call for sampling the output of a workstation at systematically selected times throughout the day.

Auditors are frequently confronted with the problem of sampling a list of accounts to check compliance with accounting procedures or to verify dollar amounts. The most natural way to sample these lists is to choose accounts systematically.

Market researchers and opinion pollsters who sample people on the move very often employ a systematic design. Every 20th customer at a checkout counter may be [...]. Every 100th car entering [...] on various advertising policies of the park or on ticket prices. All of these samples are systematic samples.

Crop-yield estimates often result from systematic samples of fields and small plots within fields. Similarly, foresters may systematically sample field plots to estimate the proportion of diseased trees or may systematically sample the trees themselves to study growth patterns.

Thus, systematic sampling is a popular design. We next investigate the construction of these designs and the properties of resulting estimators of means, totals, and proportions.
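The travel voucher illustration from earlier in this section can be checked numerically. The sketch below is only an illustration under simplifying assumptions: the frame is hypothetical, every one of the first 500 vouchers is taken to be correct (the text says "most"), and the helper name `systematic_sample` is introduced here for convenience.

```python
import random

def systematic_sample(frame, k, start):
    """Return a 1-in-k systematic sample; start is 1-based, 1 <= start <= k."""
    if not 1 <= start <= k:
        raise ValueError("start must lie between 1 and k")
    return frame[start - 1::k]

# Hypothetical frame: 0 = filed correctly, 1 = filed incorrectly.
# Assumption: all of the first 500 are correct, all of the last 500 incorrect.
N, k = 1000, 5
n = N // k
vouchers = [0] * 500 + [1] * 500

# Estimated proportion for each of the k possible random starts.
systematic_estimates = [
    sum(systematic_sample(vouchers, k, start)) / n for start in range(1, k + 1)
]

# Simple random samples of the same size vary from draw to draw.
rng = random.Random(0)  # fixed seed, used only for reproducibility
srs_estimates = [sum(rng.sample(vouchers, n)) / n for _ in range(1000)]

print("systematic:", systematic_estimates)
print("SRS range: ", min(srs_estimates), "to", max(srs_estimates))
```

With this ordering every possible start gives the same systematic estimate, while the simple random sampling estimates spread around the true proportion. Shuffling `vouchers` before sampling removes that advantage, in line with the remark about randomly dispersed vouchers.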
7.2 How to Draw a Systematic Sample

Although simple random sampling and systematic sampling both provide useful alternatives to one another, the methods of selecting the sample data are different. A simple random sample from a population is selected by using a table of random numbers, as noted in Section 4.2. In contrast, various methods are possible in systematic sampling. The investigator can select a 1-in-3, a 1-in-5, or, in general, a 1-in-$k$ systematic sample. For example, a medical investigator is interested in obtaining information about [...] specialists prescribed a certain [...] ($N = 15{,}000$). To obtain a simple random sample of $n = 1600$ specialists, we use the methods in Section 4.2 and refer to a table of random numbers; however, this procedure requires a great deal of work. Alternatively, we could select one name (specialist) at random from the first $k = 9$ names appearing on the list and then select every ninth name thereafter until a sample of size 1600 is selected. This sample is called a 1-in-9 systematic sample.

Perhaps you wonder how $k$ is chosen in a given situation. If the population size $N$ is known, we can determine an approximate sample size $n$ for the survey (see Section 7.5) and then choose $k$ to achieve that sample size. There are $N = 15{,}000$ specialists in the population for the medical survey. Suppose the required sample size is $n = 100$. We must then choose $k$ to be 150 or less. For $k = 150$, we will obtain exactly $n = 100$ observations, whereas for $k < 150$, the sample size will be greater than 100.

In general, for a systematic sample of $n$ elements from a population of size $N$, $k$ must be less than or equal to $N/n$ (that is, $k \le N/n$). Note in the preceding illustration that $k \le 15{,}000/100$, that is, $k \le 150$.

We cannot accurately choose $k$ when the population size is unknown. We can determine an approximate sample size $n$, but we must guess the value of $k$ needed to achieve a sample of size $n$.
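When $N$ is known, the rule $k \le N/n$ and the drawing procedure can be sketched in a few lines. The function names `max_interval` and `draw_one_in_k` are hypothetical helpers introduced for this illustration, and the seed is arbitrary.

```python
import random

def max_interval(N, n):
    """Largest integer k satisfying k <= N / n."""
    return N // n

def draw_one_in_k(N, k, rng):
    """Return the 1-based frame positions of a 1-in-k systematic sample."""
    start = rng.randint(1, k)            # random start among the first k elements
    return list(range(start, N + 1, k))  # then every k-th element thereafter

rng = random.Random(7)                   # arbitrary seed, for reproducibility only
N, n = 15_000, 100

k = max_interval(N, n)
positions = draw_one_in_k(N, k, rng)
print(k, len(positions))                 # k = 150 yields exactly 100 positions

# A smaller interval gives a larger sample than required.
smaller_k_positions = draw_one_in_k(N, 120, rng)
print(len(smaller_k_positions))
```

Taking the integer part of $N/n$ is one simple way to satisfy the inequality; any smaller $k$ also works but collects more than $n$ observations.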
If too large a value of $k$ is chosen, the required sample size $n$ will not be obtained by using a 1-in-$k$ systematic sample from the population. This result presents no problem if the experimenter can return to the population and conduct another 1-in-$k$ systematic sample until the required sample size is obtained. However, in some situations, obtaining a second systematic sample is impossible. For example, conducting another 1-in-20 systematic sample of shoppers is impossible if the required sample of $n = 50$ shoppers is not obtained at the time they pass the corner.

7.3 Estimation of a Population Mean and Total

As we have repeatedly stressed, the objective of most sample surveys is to estimate one or more population parameters. We can estimate a population mean $\mu$ from a systematic sample by using the sample mean $\bar{y}$. This outcome is shown in Equation (7.1).

Estimator of the population mean $\mu$:

$$\hat{\mu} = \bar{y}_{sy} = \frac{1}{n}\sum_{i=1}^{n} y_i \qquad (7.1)$$

where the subscript $sy$ signifies that systematic sampling was used.

Estimated variance of $\bar{y}_{sy}$:

$$\hat{V}(\bar{y}_{sy}) = \frac{s^2}{n}\left(\frac{N-n}{N}\right) \qquad (7.2)$$

assuming a randomly ordered population.

You will recognize that the estimated variance of $\bar{y}_{sy}$ given in Eq. (7.2) is identical to the estimated variance of $\bar{y}$ obtained by using simple random sampling (Section 4.3). This result does not imply that the true variance of $\bar{y}_{sy}$ is the same as that of $\bar{y}$. The variance of $\bar{y}$ is given by

$$V(\bar{y}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right) \qquad (7.3)$$

Similarly, the variance of $\bar{y}_{sy}$ is given by

$$V(\bar{y}_{sy}) = \frac{\sigma^2}{n}\left[1 + (n-1)\rho\right] \qquad (7.4)$$

where $\rho$ is a measure of the correlation between pairs of elements within the same systematic sample. (This is discussed in more detail in Section 7.7.) If $\rho$ is close to 1, the elements within a sample are all quite similar with respect to the characteristic being measured, and systematic sampling will yield a higher variance of the sample mean than will simple random sampling. If $\rho$ is negative, systematic sampling may be more precise than simple random sampling. The correlation may be [...]. (Note that $\rho$ cannot be so large a negative that the variance expression becomes negative.)
For $\rho$ close to 0 and $N$ fairly large, systematic sampling is roughly equivalent to simple random sampling.

An unbiased estimate of $V(\bar{y}_{sy})$ cannot be obtained by using the data from only one systematic sample. When systematic sampling is nearly equivalent to simple random sampling, we can estimate $V(\bar{y}_{sy})$ by the estimated variance from simple random sampling. In other situations, the simple random sampling variance formula can provide a useful upper or lower bound to the true variance from systematic sampling. To provide more detail on how these approximations work, we consider here the following three types of populations: a random population, an ordered population, and a periodic population.

DEFINITION 7.2 A population is random if the elements of the population are in random order. A population is ordered if the elements of the population have values that trend upward or downward when they are listed. A population is periodic if the elements of the population have values that tend to cycle upward and downward in a regular pattern when listed.

Figures 7.1 through 7.3 provide examples of these population types.

FIGURE 7.1 Random population elements (horizontal axis: element number)
FIGURE 7.2 Ordered population elements (horizontal axis: element number)
FIGURE 7.3 Periodic population elements (horizontal axis: element number)

A random population may occur in an alphabetical listing of student grades on an exam, because there is generally no reason why students at the beginning of the alphabet should have lower or higher grades than those at the end (unless students happen to be seated alphabetically in the room). An ordered population sometimes occurs in chronological listings, such as a bank's listing of outstanding mortgage balances. The older mortgages will tend to have smaller balances than the newer ones. A periodic population may occur in the average daily sales volume for a chain of grocery stores. The population of daily sales is generally cyclical, with peak sales occurring toward the end of each week.
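To see how the ordering of the frame affects the variance of $\bar{y}_{sy}$, the sketch below builds one small artificial population of each type in Definition 7.2 and compares the exact variance of the systematic sample mean (taken over all $k$ possible starts) with the simple random sampling variance of Eq. (7.3). The populations, their size, and the function names are assumptions made for illustration; $\sigma^2$ is computed with divisor $N$, which is what the $(N-n)/(N-1)$ factor in Eq. (7.3) calls for.

```python
import math
import random
import statistics

def systematic_mean_variance(y, k):
    """Exact variance of the systematic sample mean over the k equally likely starts."""
    means = [statistics.fmean(y[start::k]) for start in range(k)]
    return statistics.pvariance(means)

def srs_mean_variance(y, n):
    """Variance of the simple random sample mean, Eq. (7.3)."""
    N = len(y)
    return statistics.pvariance(y) / n * (N - n) / (N - 1)

N, k = 100, 10           # assumed sizes; N is a multiple of k
n = N // k
rng = random.Random(1)   # arbitrary seed, for reproducibility only

ordered = [float(i) for i in range(1, N + 1)]                  # steady upward trend
shuffled = ordered[:]
rng.shuffle(shuffled)                                          # random order
periodic = [math.sin(2 * math.pi * i / k) for i in range(N)]   # cycle length equals k

for name, y in [("random", shuffled), ("ordered", ordered), ("periodic", periodic)]:
    v_sy = systematic_mean_variance(y, k)
    v_srs = srs_mean_variance(y, n)
    print(f"{name:8s} systematic: {v_sy:8.3f}   SRS: {v_srs:8.3f}")
```

The periodic case is deliberately the worst one, with the sampling interval equal to the cycle length, so every element of a sample sits at the same point of the cycle.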
A systematic sample from a random population behaves much like a simple random sample, so in that case the variance approximation using the formula from simple random sampling works well.

For an ordered population, the values in a systematic sample are generally spread more widely than those in a simple random sample, making the within-sample correlation, $\rho$, negative. Envision selecting a systematic sample for the data of Figure 7.2; each sample will have some of the smaller values as well as some of the larger values, which would not necessarily happen in a simple random sample. This implies that the systematic sampling mean will have a smaller variance than the one for simple random sampling, so that the use of the simple random sampling variance formula will produce an overestimate of the true sampling error.

For a periodic population, the effectiveness of a 1-in-$k$ sample depends on the value we choose for $k$. If we sample daily sales every Wednesday, we will probably underestimate the true average daily sales volume. Similarly, if we sample sales every Friday, we will probably overestimate the true average sales. We might sample every ninth workday to avoid consistently sampling either the low- or high-sales days. Sampling every Wednesday (or Friday) tends to produce samples that have values nearly alike and hence a positive within-sample correlation. This makes the variance of a systematic sample larger than that of a corresponding simple random sample, and the use of the simple random sampling variance formula will produce an underestimate of the true sampling error. Choosing a systematic sample that hits both the peaks and valleys of a cyclical trend will bring the method more in line with a simple random sample and allow the use of the simple random sample variance formula as a reasonable approximation.

To avoid the problem of underestimating the variation, which often occurs with systematic sampling from a periodic population, the investigator could change the random starting point several times. This procedure would reduce the possibility of choosing observations from the same relative position in a periodic population.
For example, when a 1-in-10 systematic sample is being drawn from a long list of file cards, a card is randomly selected from the first 10 cards (for example, card 2) and every tenth card thereafter. This procedure can be altered by randomly selecting a card from the first 10 (for example, card 2) and every tenth card thereafter for perhaps 15 selections to obtain the numbers

2, 12, 22, ..., 152

Another random starting point can be selected from the next ten numbers:

153, 154, 155, ..., 162

If 156 is selected, we then proceed to select every tenth number thereafter for the next 15 selections. This entire process is repeated until the desired sample size has been obtained.

The process of selecting a random starting point several times throughout the systematic sample has the effect of shuffling the elements of the population and then drawing a systematic sample. Hence, we can assume that the sample obtained is equivalent to a systematic sample drawn from a random population. The variance of $\bar{y}_{sy}$ can then be approximated by using the results from simple random sampling. Alternatives to this approach are given in Sections 7.6 and 7.7.

EXAMPLE 7.1 The federal government keeps track of various indicators on the performance of industries in the country by collecting annual data on variables such as the number of employees and payroll. The Standard Industrial Classification (SIC) system divides the manufacturing industry into 140 groups. Table 7.1 shows data on number of employees (in thousands) for 2000 and 2001 and mean annual salary (in thousands of dollars) for 2001 for a sample of 20 industrial groups. The sample was selected systematically from the list of the 140 groups appearing in the Statistical Abstract for the United States (see http://www.bls.gov/oes/2001/oessrci.htm).

(a) Estimate the mean number of employees per manufacturing SIC group, and find a margin of error for your estimate.
(b) Estimate the mean loss of employees between 2000 and 2001 per SIC manufacturing group, and find a margin of error for your estimate.

SOLUTION Because all statistical analyses should begin with a plot of the data, let's first look at the plots of employees ordered by sample number (Figure 7.4) and loss of employees by sample number (Figure 7.5). There is little in the way of a pronounced pattern here except for the fact that the larger industries tend to come toward the end of the list. In fact, the SIC list does have the more lucrative electronics, transportation, and medical equipment manufacturing industries close to the end of the list. This is a good situation for systematic sampling, because a simple random sample could have missed the bottom end of the list completely. The pattern in loss of employee data across the sampled values is more balanced, with some large losses coming at both ends of the list. Again, this could be advantageous for systematic sampling because it seems to cover a broad range of loss values. (A simple random sample could have sampled all industries from the middle of the list.)
TABLE 7.1 Employee and salary data for a sample of manufacturing industries

Sample | SIC | Description | 2000 employees (thousands) | 2001 employees (thousands) | 2001 mean salary (thousands)
1 | 204 | Grain mill products | 122.4 | 122.2 | 34.9
2 | 212 | Cigars | 2.9 | 3.2 | 26.9
3 | 225 | Knitting mills | 120.1 | 98.6 | 25.0
4 | 233 | Women's, misses', and juniors' outerwear | 169.9 | 137.3 | 23.0
5 | 241 | Logging | 78.2 | 73.6 | 29.8
6 | 252 | Office furniture | 80.4 | 69.2 | 32.5
7 | 265 | Paperboard containers and boxes | 219.4 | 207.2 | 32.8
8 | 276 | Manifold business forms | 42.0 | 36.5 | 33.5
9 | 284 | Soap, detergents, and cleaning preparations; perfumes, cosmetics, and other toilet preparations | 156.0 | 149.2 | 37.8
10 | 299 | Miscellaneous products of petroleum and coal | 13.2 | 14.1 | 41.9
11 | 313 | Boot and shoe cut stock and findings | 1.1 | 0.8 | 26.1
12 | 322 | Glass and glassware, pressed or blown | 67.6 | 60.0 | 32.9
13 | 329 | Abrasive, asbestos, and miscellaneous | 74.0 | 67.1 | 34.4
14 | 339 | Miscellaneous primary metal products | 26.8 | 25.4 | 35.7
15 | 347 | Coating, engraving, and allied services | 149.6 | 128.5 | 29.5
16 | 355 | Special industry machinery | 170.9 | 146.4 | 42.1
17 | 363 | Household appliances | 106.3 | 104.8 | 30.6
18 | 372 | Aircraft and parts | 466.6 | 450.5 | 49.5
19 | 382 | Laboratory apparatus and analytical, optical, measuring, and controlling instruments | 311.4 | 282.4 | 46.1
20 | 394 | Dolls, toys, games and sporting and athletic | 101.0 | 90.7 | 31.2

Variable | n | Mean | Median | Standard deviation
2001 employees | 20 | 113.4 | 94.6 | 105.6
2000-2001 employees | 20 | 10.61 | 7.25 | 10.29

From the statistical summaries given in Table 7.1 and using the standard formulas for simple random sampling, the analysis for the mean number of employees proceeds as follows:

$$\bar{y}_{sy} = 113.4$$

$$\hat{V}(\bar{y}_{sy}) = \left(\frac{140-20}{140}\right)\left(\frac{1}{20}\right)(105.6)^2$$

$$2\sqrt{\hat{V}(\bar{y}_{sy})} = 2\sqrt{\left(\frac{140-20}{140}\right)\left(\frac{1}{20}\right)(105.6)^2} = 43.72$$

FIGURE 7.4 Employees for 2001 by sample number (vertical axis: 2001 employees; horizontal axis: sample number)
FIGURE 7.5 Loss of employees by sample number, 2000-2001 (vertical axis: loss; horizontal axis: sample number)

Thus, the estimated mean number of employees per industry is approximately 113.4 thousand, give or take approximately 44 thousand. Similar calculations on the loss of employees yield an estimated mean of 10.61 thousand with a margin of error of approximately 4.26 thousand. This is a fairly large loss of employees from manufacturing in 1 year, but the margin of error is also large due to the rather small sample and the large amount of variability in the employee data.

Recall that estimation of a population total requires knowledge of the total number of elements $N$ in the population when we are using the procedures in Chapters 4 and 5. For example, we use

$$\hat{\tau} = N\bar{y}$$

as an estimator of $\tau$ from simple random sampling. Similarly, we need to know $N$ to estimate $\tau$ when we are using systematic sampling, as expressed in Eqs. (7.5) and (7.6).

Estimator of the population total $\tau$:

$$\hat{\tau} = N\bar{y}_{sy} \qquad (7.5)$$

Estimated variance of $\hat{\tau}$:

$$\hat{V}(N\bar{y}_{sy}) = N^2\hat{V}(\bar{y}_{sy}) = N^2\left(\frac{s^2}{n}\right)\left(\frac{N-n}{N}\right) \qquad (7.6)$$

assuming a randomly ordered population.

Note that the results presented in Eqs. (7.5) and (7.6) are identical to those presented for estimating a population total under simple random sampling. This result does not imply that the true variance of $N\bar{y}_{sy}$ is the same as the variance of $N\bar{y}$. Again, we cannot obtain an unbiased estimator of $V(N\bar{y}_{sy})$ from the data in a single systematic sample. However, in certain circumstances, as noted earlier, systematic sampling is equivalent to simple random sampling, and we can use the result presented in Section 4.3.

EXAMPLE 7.2 Returning to the data from the systematic sample of 20 industry groups from the population of 140, shown in Example 7.1, it is now of interest to estimate the total number of employees lost by the manufacturing segment of U.S. industry between 2000 and 2001. From the data provided, estimate this total and find a bound for the error of estimation.
SOLUTION The estimated mean loss was 10.61 thousand with a margin of error of approximately 4.26 thousand. The estimate of the total simply multiplies these quantities by $N = 140$. Thus, the estimated total number of employees lost is 1485 thousand with a bound on the error of estimation amounting to 596 thousand. This bound is quite large. Again, we are attempting to estimate a total from highly variable data with a small sample; the precision of the result is not great in this case. To achieve greater precision, the sample size should be increased or the sampling design changed, or both.

If stratifying the population is advantageous, systematic sampling can be used within each stratum in place of simple random sampling. Using the estimator in Eq. (7.1) with its estimated variance in Eq. (7.2) within each stratum, the resulting estimator of the population mean will look similar to Eq. (5.1), with an estimated variance given by Eq. (5.2). Such a situation might arise if we were to stratify an industry by plants and then take a systematic sample of the records within each plant to estimate average accounts receivable, time lost to accidents, and so on.

7.4 Estimation of a Population Proportion

An investigator frequently wishes to use data from a systematic sample to estimate a population proportion. For example, to determine the proportion of registered voters in favor of an upcoming bond issue, the investigator might use a 1-in-$k$ systematic sample from the voter registration list.

The estimator of the population proportion $p$ obtained from systematic sampling is denoted by $\hat{p}_{sy}$. As in simple random sampling (Section 4.5), the properties of $\hat{p}_{sy}$ parallel those of the sample mean $\bar{y}_{sy}$ if the response measurements are defined as follows. Let $y_i = 0$ if the $i$th element sampled does not possess the specified characteristic and $y_i = 1$ if it does.
The estimator $\hat{p}_{sy}$ is then the average of the 0 and 1 values from the sample.

Estimator of the population proportion $p$:

$$\hat{p}_{sy} = \bar{y}_{sy} = \frac{1}{n}\sum_{i=1}^{n} y_i \qquad (7.7)$$

Estimated variance of $\hat{p}_{sy}$:

$$\hat{V}(\hat{p}_{sy}) = \frac{\hat{p}_{sy}\hat{q}_{sy}}{n-1}\left(\frac{N-n}{N}\right) \qquad (7.8)$$

where $\hat{q}_{sy} = 1 - \hat{p}_{sy}$, assuming a randomly ordered population.

The fpc, $(N-n)/N$, in Eq. (7.8) can be ignored if the population size $N$ is unknown but can be assumed large relative to $n$. Again, note that the estimated variance of $\hat{p}_{sy}$ is identical to the estimated variance of $\hat{p}$ using simple random sampling (Section 4.5). This result does not imply that the corresponding population variances are equal; however, if $N$ is large and if the observations within a systematic sample are unrelated (that is, $\rho = 0$), the two population variances will be equal.

EXAMPLE 7.3 A 1-in-6 systematic sample is obtained from a voter registration list to estimate the proportion of voters in favor of the proposed bond issue. Several different random starting points are used to ensure that the results of the sample are not affected by periodic variation in the population. The coded results of this preelection survey are as shown in the accompanying table. Estimate $p$, the proportion of the 5775 registered voters in favor of the proposed bond issue ($N = 5775$). Place a bound on the error of estimation.

Voter | Response
4 | 1
10 | 0
16 | 1
... | ...
5760 | 0
5766 | 0
5772 | 1
Total | $\sum_{i=1}^{962} y_i = 652$

SOLUTION The sample proportion is given by

$$\hat{p}_{sy} = \frac{\sum_{i=1}^{962} y_i}{962} = \frac{652}{962} = 0.678$$

Because $N$ is large and several random starting points were chosen in drawing the systematic sample, we can assume that

$$\hat{V}(\hat{p}_{sy}) = \frac{\hat{p}_{sy}\hat{q}_{sy}}{n-1}\left(\frac{N-n}{N}\right)$$

provides a good estimate of $V(\hat{p}_{sy})$. The bound on the error of estimation is

$$2\sqrt{\hat{V}(\hat{p}_{sy})} = 2\sqrt{\frac{\hat{p}_{sy}\hat{q}_{sy}}{n-1}\left(\frac{N-n}{N}\right)} = 2\sqrt{\frac{(0.678)(0.322)}{961}\left(\frac{5775-962}{5775}\right)} = 0.028$$

Thus we estimate that 0.678 (67.8%) of the registered voters favor the proposed bond issue. We are relatively confident that the error of estimation is less than 0.028 (2.8%).

7.5 Selecting the Sample Size

Now let us determine the number of observations necessary to estimate $\mu$ to within $B$ units.
The required sample size is found by solving the following equation for $n$:

$$2\sqrt{V(\bar{y}_{sy})} = B \qquad (7.9)$$

The solution to Eq. (7.9) involves both $\sigma^2$ and $\rho$, which must be known (at least approximately) in order to solve for $n$. Although these parameters sometimes can be estimated if data from a prior survey are available, we do not discuss this method in this book. Instead, we use the formula for $n$ for simple random sampling. This formula could give an extra-large sample for ordered populations and too small a sample for periodic populations. As noted earlier, the variances of $\bar{y}_{sy}$ and $\bar{y}$ are equivalent if the population is random.

Sample size required to estimate $\mu$ with a bound $B$ on the error of estimation:

$$n = \frac{N\sigma^2}{(N-1)D + \sigma^2} \qquad (7.10)$$

where

$$D = \frac{B^2}{4}$$

EXAMPLE 7.4 The management of a large utility company is interested in the average amount of time delinquent bills are overdue. A systematic sample will be drawn from an alphabetical list of $N = 2500$ overdue customer accounts. In a similar survey conducted the previous year, the sample variance was found to be $s^2 = 100$ days. Determine the sample size required to estimate $\mu$, the average amount of time utility bills are overdue, with a bound on the error of estimation of $B = 2$ days.

SOLUTION A reasonable assumption is that the population is random; hence, $\rho \approx 0$. Then we can use Eq. (7.10) to find the approximate sample size. Replacing $\sigma^2$ by $s^2$ and setting

$$D = \frac{B^2}{4} = \frac{4}{4} = 1$$

we have

$$n = \frac{N\sigma^2}{(N-1)D + \sigma^2} = \frac{2500(100)}{2499(1) + 100} = 96.19$$

Thus, management must sample approximately 97 accounts to estimate the average amount of time delinquent bills are overdue to within two days.

To determine the sample size required to estimate $\tau$ with a bound on the error of estimation of magnitude $B$, we use the corresponding method presented in Section 4.4.

The sample size required to estimate $p$ to within $B$ units is found by using the sample size formula for estimating $p$ under simple random sampling.
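The arithmetic in Eq. (7.10) is easy to script. The sketch below is one possible rendering, with hypothetical function names; the second function applies the same simple random sampling form to a proportion, with $pq$ taking the place of $\sigma^2$. Both assume that the simple random sampling approximation is acceptable for the frame at hand.

```python
import math

def sample_size_mean(N, sigma2, B):
    """Eq. (7.10): n = N*sigma^2 / ((N - 1)*D + sigma^2), with D = B^2 / 4."""
    D = B ** 2 / 4
    return N * sigma2 / ((N - 1) * D + sigma2)

def sample_size_proportion(N, p, B):
    """Same form for a proportion: sigma^2 is replaced by p*q, with q = 1 - p."""
    q = 1 - p
    D = B ** 2 / 4
    return N * p * q / ((N - 1) * D + p * q)

# Utility-company illustration: N = 2500, s^2 = 100, B = 2 days.
n_mean = sample_size_mean(2500, 100, 2)
print(round(n_mean, 2), math.ceil(n_mean))   # 96.19 97
```

Rounding up with `math.ceil` is the usual conservative choice, since a fractional sample size cannot be collected and rounding down would miss the bound.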
Sample size required to estimate $p$ with a bound $B$ on the error of estimation:

$$n = \frac{Npq}{(N-1)D + pq} \qquad (7.11)$$

where

$$q = 1 - p \quad \text{and} \quad D = \frac{B^2}{4}$$

In a practical situation, we do not know $p$. We can find an approximate sample size by replacing $p$ with an estimated value. If no prior information is available to estimate $p$, we can obtain a conservative sample size by setting $p = 0.5$.

EXAMPLE 7.5 An advertising firm is starting a promotional campaign for a new product. The firm wants to sample potential customers in a small community to determine customer acceptance. To eliminate some of the costs associated with personal interviews, the investigators decide to run a systematic sample from $N = 5000$ names listed in a community registry and collect the data via telephone interviews. Determine the sample size required to estimate $p$, the proportion of people who consider the product "acceptable," with a bound on the error of estimation of magnitude $B = 0.03$ (that is, 3%).

SOLUTION The required sample size can be found by using Eq. (7.11). Although no previous data are available on this new product, we can still find an approximate sample size. Set $p = 0.5$ in Eq. (7.11) and

$$D = \frac{B^2}{4} = \frac{(0.03)^2}{4} = 0.000225$$

Then the required sample size is

$$n = \frac{Npq}{(N-1)D + pq} = \frac{5000(0.5)(0.5)}{4999(0.000225) + (0.5)(0.5)} = 909.240$$

Hence, the firm must interview 910 people to determine consumer acceptance to within 3%.

7.6 Repeated Systematic Sampling

We have stated in Section 7.3 that we cannot estimate the variance of $\bar{y}_{sy}$ from information contained in a single systematic sample unless the systematic sampling generates, for all practical purposes, a random sample. When this result occurs, we can use the random sampling estimation procedures outlined in Section 4.3. However, in most cases, systematic sampling is not equivalent to simple random sampling. An alternate method must be used to estimate $V(\bar{y}_{sy})$. Repeated systematic sampling is one such method.
As the name implies, repeated systematic sampling requires the selection of more than one systematic sample. For example, ten 1-in-50 systematic samples, each containing six measurements, could be acquired in approximately the same time as one 1-in-5 systematic sample containing 60 measurements. Both procedures yield 60 measurements for estimating the population mean $\mu$, but the repeated sampling procedure allows us to estimate $V(\bar{y}_{sy})$ by using the square of the deviations of the $n_s = 10$ individual sample means about their mean. The average of the ten sample means, $\hat{\mu}$, will estimate the population mean $\mu$.

To select $n_s$ repeated systematic samples, we must space the elements of each sample further apart. Thus, ten 1-in-50 samples ($n_s = 10$, $k' = 50$) of six measurements each contain the same number of measurements as does a single 1-in-5 sample ($k = 5$) containing $n = 60$ measurements. The starting point for each of the $n_s$ systematic samples is randomly selected from the first $k'$ elements. The remaining elements in each sample are acquired by adding $k'$, $2k'$, and so forth to the starting point until the total number per sample, $n/n_s$, is obtained.

A population consists of $N = 960$ elements, which we can number consecutively. To select a systematic sample of size $n = 60$, we choose $k = N/n = 16$ and a random number between 1 and 16 as a starting point. What procedure do we follow to select ten repeated systematic samples in place of the one systematic sample? First, we choose $k' = 10k = 10(16) = 160$. Next, we select ten random numbers between 1 and 160. Finally, the constant 160 is added to each of these random starting points to obtain ten numbers between 161 and 320; the process of adding the constant is continued until ten samples of size 6 are obtained.
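The selection procedure just described can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name is hypothetical, the seed is arbitrary, and the starting points are drawn without replacement so that no two samples coincide, which is an assumption added here.

```python
import random

def repeated_systematic_samples(N, n, n_s, rng):
    """Select n_s systematic samples of n / n_s elements each (1-based positions)."""
    k = N // n                 # interval for a single systematic sample
    k_prime = k * n_s          # wider interval used for each repeated sample
    per_sample = n // n_s
    # Assumption: distinct random starts among the first k' elements.
    starts = sorted(rng.sample(range(1, k_prime + 1), n_s))
    return [[start + j * k_prime for j in range(per_sample)] for start in starts]

rng = random.Random(3)         # arbitrary seed, for reproducibility only
samples = repeated_systematic_samples(N=960, n=60, n_s=10, rng=rng)
for sample in samples:
    print(sample)
```

Each printed row is one systematic sample: a start between 1 and 160 followed by five further positions spaced 160 apart, for 60 positions in total.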
A random selection of ten integers between 1 and 160 gives the following:

73, 42, 81, 145, 6, 21, 86, 17, 112, 102

These numbers form the random starting points for ten systematic samples, as shown in Table 7.2. The second element in each sample is found by adding 160 to the first, the third by adding 160 to the second, and so forth.

TABLE 7.2 Selection of repeated systematic samples

Random starting point | Second element in sample | Third element in sample | ... | Sixth element in sample
6 | 166 | 326 | ... | 806
17 | 177 | 337 | ... | 817
21 | 181 | 341 | ... | 821
42 | 202 | 362 | ... | 842
73 | 233 | 393 | ... | 873
81 | 241 | 401 | ... | 881
86 | 246 | 406 | ... | 886
102 | 262 | 422 | ... | 902
112 | 272 | 432 | ... | 912
145 | 305 | 465 | ... | 945

We frequently select $n_s$ to be at least 10 to allow us to obtain enough sample means to acquire a satisfactory estimate of $V(\hat{\mu})$. We choose $k'$ to give the same number of measurements as would be obtained in a single 1-in-$k$ systematic sample; thus,

$$k' = k n_s$$

The formulas for estimating $\mu$ from $n_s$ systematic samples are shown in Eqs. (7.12) and (7.13).

Estimator of the population mean $\mu$ using $n_s$ 1-in-$k'$ systematic samples:

$$\hat{\mu} = \frac{1}{n_s}\sum_{i=1}^{n_s} \bar{y}_i \qquad (7.12)$$

where $\bar{y}_i$ represents the mean of the $i$th systematic sample.

Estimated variance of $\hat{\mu}$:

$$\hat{V}(\hat{\mu}) = \left(\frac{N-n}{N}\right)\frac{s_{\bar{y}}^2}{n_s} \qquad (7.13)$$

where

$$s_{\bar{y}}^2 = \frac{\sum_{i=1}^{n_s}(\bar{y}_i - \hat{\mu})^2}{n_s - 1}$$

We can also use repeated systematic sampling to estimate a population total $\tau$, if $N$ is known. The necessary formulas are given in Eqs. (7.14) and (7.15).

Estimator of the population total $\tau$ using $n_s$ 1-in-$k'$ systematic samples:

$$\hat{\tau} = N\hat{\mu} = \frac{N}{n_s}\sum_{i=1}^{n_s} \bar{y}_i \qquad (7.14)$$

Estimated variance of $\hat{\tau}$:

$$\hat{V}(\hat{\tau}) = N^2\hat{V}(\hat{\mu}) = N^2\left(\frac{N-n}{N}\right)\frac{s_{\bar{y}}^2}{n_s} \qquad (7.15)$$

EXAMPLE 7.6 A state park charges admission by carload rather than by person, and a park official wants to estimate the average number of people per car for a particular summer holiday. She knows from past experience that there should be approximately 400 cars entering the park, and she wants to sample 80 cars. To obtain an estimate of the variance, she uses repeated systematic sampling with ten samples of eight cars each. Using the data given in Table 7.3, estimate the average number of people per car and place a bound on the error of estimation.

TABLE 7.3 Data on number of people per car*

Random starting point | Second element | Third element | Fourth element | Fifth element | Sixth element | Seventh element | Eighth element | $\bar{y}_i$
2(3) | 52(4) | 102(5) | 152(3) | 202(6) | 252(1) | 302(4) | 352(4) | 3.75
5(5) | 55(3) | 105(4) | 155(2) | 205(4) | 255(2) | 305(3) | 355(4) | 3.38
7(2) | 57(4) | 107(6) | 157(2) | 207(3) | 257(2) | 307(1) | 357(3) | 2.88
13(6) | 63(4) | 113(6) | 163(7) | 213(2) | 263(3) | 313(2) | 363(7) | 4.62
26(4) | 76(5) | 126(7) | 176(4) | 226(2) | 276(6) | 326(2) | 376(6) | 4.50
31(7) | 81(6) | 131(4) | 181(4) | 231(3) | 281(6) | 331(7) | 381(5) | 5.25
35(3) | 85(3) | 135(2) | 185(3) | 235(6) | 285(5) | 335(6) | 385(8) | 4.50
40(2) | 90(6) | 140(2) | 190(5) | 240(5) | 290(4) | 340(4) | 390(5) | 4.12
45(2) | 95(6) | 145(3) | 195(6) | 245(4) | 295(4) | 345(5) | 395(4) | 4.25
46(6) | 96(5) | 146(4) | 196(6) | 246(3) | 296(3) | 346(5) | 396(3) | 4.38

*The responses $y_i$ are in parentheses.

SOLUTION For one systematic sample,

$$k = \frac{N}{n} = \frac{400}{80} = 5$$

Hence, for $n_s = 10$ samples,

$$k' = 10k = 10(5) = 50$$

The following ten random numbers between 1 and 50 are drawn:

13, 35, 2, 40, 26, 7, 31, 45, 5, 46

Cars with these numbers form the random starting points for the systematic samples.

For Table 7.3, the quantity $\bar{y}_1$ is the average for the first row, $\bar{y}_2$ is the average for the second row, and so forth. The estimate of $\mu$ is

$$\hat{\mu} = \frac{1}{n_s}\sum_{i=1}^{n_s} \bar{y}_i = \frac{1}{10}(3.75 + 3.38 + \cdots + 4.38) = 4.16$$

with $s_{\bar{y}} = 0.675$. Thus, the estimated standard error of $\hat{\mu}$ is

$$\sqrt{\hat{V}(\hat{\mu})} = \sqrt{\frac{N-n}{N}}\left(\frac{s_{\bar{y}}}{\sqrt{n_s}}\right) = \sqrt{\frac{400-80}{400}}\left(\frac{0.675}{\sqrt{10}}\right) \approx 0.19$$

Therefore our best estimate of the mean number of people per car is 4.16 plus or minus approximately 0.38.

This note was uploaded on 06/06/2011 for the course STAT 4260 taught by Professor Staff during the Spring '11 term at UGA.
