
Statistics allows us to look at our data in different ways and make objective and intelligent decisions regarding their quality and use.

CHAPTER 3
STATISTICAL EVALUATION OF DATA

We have shown that the magnitude of the indeterminate error associated with an individual measurement is determined by a chance combination of tiny individual errors, each of which may be positive or negative. Because chance is involved in this type of error, we can use the laws of statistics to extract information from experimental data.[1] In this chapter we describe several important statistical procedures and show how they are used to estimate the magnitude of the indeterminate error in an analysis.

3A THE STATISTICAL TREATMENT OF INDETERMINATE ERRORS

Statistics is the mathematical science that deals with chance variations. We must emphasize at the outset that statistics only reveals information that is already present in a data set; no new information is created by statistics. Statistical treatment of a data set does, however, allow us to make objective judgments concerning the validity of results that are otherwise difficult to make.

3A-1 The Population and the Sample

In order to use statistics to treat our data, we must assume that the few replicate experimental results gathered in the laboratory are a tiny but representative fraction of an infinite number of results that could be collected if we had infinite time. Statisticians call this small set of data a sample and view it as a subset of a population, or universe, of data that in principle exists. For example, the data in Table 2-3 make up a statistical sample of an infinite population of pipet-calibration measurements that can be imagined (but not performed).

The laws of statistics apply strictly to a population of data only. To use these laws, we must assume that the handful of data that make up the typical sample truly represents the infinite population of results. Unfortunately, there is no guarantee that this assumption is valid. As a result, statistical estimates about the magnitude of indeterminate errors are themselves subject to uncertainty and therefore can only be made in terms of probabilities.

Note: Do not confuse the statistical sample with the analytical sample. Four analytical samples analyzed in the laboratory represent a single statistical sample. This is an unfortunate duplication of the term "sample" that should cause no trouble once you are aware of its two meanings.

[1] References on statistical applications include R. L. Anderson, Practical Statistics for Analytical Chemists. New York: Van Nostrand Reinhold, 1987; R. Caulcutt and R. Boddy, Statistics for Analytical Chemists. New York: Chapman and Hall, 1983; J. Mandel, in Treatise on Analytical Chemistry, 2nd ed., I. M. Kolthoff and P. J. Elving, Eds., Part I, Vol. 1, Chapter 5. New York: Wiley, 1978.

The Population Mean (μ) and the Sample Mean (x̄)

We will find it useful to differentiate between the sample mean and the population mean. The sample mean is the mean of a limited sample drawn from a population of data; it is defined by Equation 2-1 when N is a small number. The population mean, in contrast, is the true mean for the population; it is also defined by Equation 2-1 when N approaches infinity. If the data are free of determinate error, the population mean is also the true value. To emphasize the difference between the two means, the sample mean is symbolized by x̄ and the population mean by μ. More often than not, particularly when N is small, x̄ differs from μ because a small sample of data does not exactly represent its population.

Note: Sample mean x̄ = (Σxᵢ)/N when N is small; population mean μ = (Σxᵢ)/N as N → ∞. In the absence of determinate error, the population mean μ is the true value of a measured quantity.
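As a quick numerical illustration of this point, the short sketch below (not part of the original text; the population parameters and seed are invented) draws statistical samples of increasing size from a simulated population and prints how far each sample mean x̄ falls from the population mean μ.

```python
# Minimal sketch: x-bar scatters around mu for small N and settles down as N grows.
# The "population" here is simulated; mu and sigma are invented for illustration.
import random
from statistics import mean

random.seed(1)
mu, sigma = 5.000, 0.006          # hypothetical population mean and spread, mL

for n in (3, 5, 50, 5000):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = mean(sample)
    print(f"N = {n:4d}: x-bar = {xbar:.4f} mL   (mu = {mu:.4f} mL)")
```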
The Sample Standard Deviation (s) and the Population Standard Deviation (σ)

We must also differentiate between the sample standard deviation and the population standard deviation. The sample standard deviation s was defined in Equation 2-2; that is,

    s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}    (3-1)

In contrast, the population standard deviation σ, which is the true standard deviation, is given by

    \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}    (3-2)

Note that Equation 3-2 differs from Equation 3-1 in two ways. First, the population mean μ appears in the numerator of Equation 3-2 in place of the sample mean x̄. Second, N replaces the number of degrees of freedom (N − 1) that appears in Equation 3-1.

The reason the number of degrees of freedom must be used when N is small is as follows. When σ is unknown, two quantities must be extracted from a set of data: x̄ and s. One degree of freedom is used to establish x̄ because, with their signs retained, the individual deviations must sum to zero. Thus, when N − 1 deviations have been computed, the final one is known. Consequently, only N − 1 deviations provide an independent measure of the precision of the set.

Note: When N → ∞, x̄ → μ and s → σ.

Figure 3-1 Normal error curves. The standard deviation for curve B is twice that for curve A, that is, σ_B = 2σ_A. (a) The abscissa is the deviation from the mean in the units of measurement. (b) The abscissa is the deviation from the mean in units of σ; on this scale the two curves A and B are identical.

3A-2 Properties of the Normal Error Curve

Figure 3-1a shows two Gaussian curves in which the relative frequency of occurrence of various deviations from the mean is plotted as a function of the deviation from the mean (x − μ). The two curves are for two populations of data that differ only in standard deviation; the standard deviation for the population yielding the broader but lower curve (B) is twice that for the population yielding curve A. Figure 3-1b shows another type of normal error curve in which the abscissa is a new variable, z, defined as

    z = \frac{x - \mu}{\sigma}    (3-3)

Feature 3-1
WHY USE THE NUMBER OF DEGREES OF FREEDOM INSTEAD OF N?

The effect of using the number of degrees of freedom for calculating the standard deviation can be demonstrated by dividing the data in Table 2-3 into 10 samples of 5 data each, 5 samples of 10 data each, and 2 samples of 25 data each. When σ and s are calculated for each sample using Equations 3-2 and 3-1, the results are:

    Number and Size of Samples    Mean σ of Samples (Equation 3-2)    Mean s of Samples (Equation 3-1)
    10 samples of 5               0.0053                              0.0059
    5 samples of 10               0.0058                              0.0061
    2 samples of 25               0.0059                              0.0061
    1 sample of 50                0.0060                              0.0060

A negative bias accompanies application of Equation 3-2 to small sets of data; this bias is reflected in the data of column 2. Note (in column 3) that the bias disappears when s is calculated using Equation 3-1.
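A rough way to reproduce the Feature 3-1 experiment is sketched below. The Table 2-3 values are not reprinted in this preview, so the sketch uses simulated calibration data with invented parameters; the point is only the comparison of the divide-by-N and divide-by-(N − 1) forms.

```python
# Sketch of the Feature 3-1 comparison on simulated data: split 50 results into
# subsets, compute each subset's standard deviation both ways, and average.
import random
from statistics import mean, pstdev, stdev

random.seed(0)
data = [random.gauss(9.982, 0.006) for _ in range(50)]   # simulated stand-in for Table 2-3

for size in (5, 10, 25, 50):
    subsets = [data[i:i + size] for i in range(0, 50, size)]
    mean_sigma = mean(pstdev(s) for s in subsets)   # divide by N (Equation 3-2 form)
    mean_s = mean(stdev(s) for s in subsets)        # divide by N - 1 (Equation 3-1 form)
    print(f"{len(subsets):2d} subset(s) of {size:2d}: "
          f"mean divide-by-N = {mean_sigma:.4f}, mean s = {mean_s:.4f}")
```

The divide-by-N column comes out systematically low for the small subsets, which is exactly the negative bias the feature describes.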
Note that z is the deviation from the mean expressed in units of standard deviation. That is, when x − μ = σ, z is equal to one standard deviation; when x − μ = 2σ, z is equal to two standard deviations; and so forth. Since z is the deviation from the mean in standard-deviation units, a plot of relative frequency versus this parameter yields a single Gaussian curve that describes all populations of data regardless of standard deviation. Thus, Figure 3-1b is the normal error curve for both sets of data used to plot curves A and B in Figure 3-1a.

The normal error curve has several general properties. (1) The mean occurs at the central point of maximum frequency. (2) There is a symmetrical distribution of positive and negative deviations about the maximum. (3) There is an exponential decrease in frequency as the magnitude of the deviations increases. Thus, small indeterminate uncertainties are observed much more often than very large ones.

Areas Under a Normal Error Curve

It can be shown that 68.3% of the area beneath any normal error curve lies within one standard deviation (±1σ) of the mean μ. Thus, 68.3% of the data making up the population lie within these bounds. Furthermore, approximately 95.5% of all data are within ±2σ of the mean and 99.7% within ±3σ. The vertical dashed lines in Figure 3-1b show the areas bounded by ±1σ, ±2σ, and ±3σ.

Because of area relationships such as these, the standard deviation of a population of data is a useful predictive tool. For example, we can say that the chances are 68.3 in 100 that the indeterminate uncertainty of any single measurement in a normal distribution is no more than ±1σ. Similarly, the chances are 95.5 in 100 that the error is less than ±2σ, and so forth.

Standard Error of a Mean

The figures on percentage distribution just quoted refer to the probable error for a single measurement. If a series of samples, each containing N data, is taken randomly from a population of data, the mean of each set will show less and less scatter as N increases. The standard deviation of each mean is known as the standard error of the mean and is given the symbol σ_m. It can be shown that the standard error is inversely proportional to the square root of the number of data N used to calculate the mean:

    \sigma_m = \frac{\sigma}{\sqrt{N}}    (3-4)

where σ is defined by Equation 3-2. An analogous equation can be written for a sample standard deviation:

    s_m = \frac{s}{\sqrt{N}}    (3-5)

3A-3 Properties of the Standard Deviation

Effect of N on the Reliability of s

Uncertainty in the calculated value of s decreases as N in Equation 3-1 increases. Figure 3-2 shows the relative error in s as a function of N; when N is greater than about 20, s and σ can be assumed to be identical for all practical purposes.

Note: When N > 20, s ≅ σ.

Figure 3-2 Relative error in s as a function of N.

For example, if the 50 measurements in Table 2-3 are divided into ten subgroups of 5 measurements each, the value of s varies widely from one subgroup to another (0.0023 to 0.0079 mL) even though the average of the computed values of s is that of the entire set (0.0056 mL). In contrast, the computed values of s for two subsets of 25 measurements each are nearly identical (0.0054 and 0.0058 mL).
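The area percentages quoted above and the 1/√N shrinkage of the standard error (Equations 3-4 and 3-5) can be checked numerically with a few lines of standard-library Python. This is a sketch added for illustration; the σ value used is arbitrary.

```python
# Check the +/-1, 2, 3 sigma area fractions of the normal curve and the
# 1/sqrt(N) behavior of the standard error of the mean.
from math import erf, sqrt

def fraction_within(z):
    # Area under the standard normal curve between -z and +z.
    return erf(z / sqrt(2))

for z in (1, 2, 3):
    print(f"within +/-{z} sigma: {100 * fraction_within(z):.1f}% of the population")

sigma = 0.10                                   # arbitrary illustrative value
for n in (1, 2, 4, 16):
    print(f"N = {n:2d}: sigma_m = sigma/sqrt(N) = {sigma / sqrt(n):.3f}")
```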
The rapid improvement in the reliability of s as N increases makes it feasible to obtain a good approximation of σ when the method of measurement is not excessively time-consuming and when an adequate supply of sample is available. For example, if the pH of numerous solutions is to be measured in the course of an investigation, it is useful to evaluate s in a series of preliminary experiments. This measurement is simple, requiring only that a pair of rinsed and dried electrodes be immersed in the test solution; the voltage between the electrodes is proportional to pH. To determine s, 20 to 30 portions of a buffer solution of fixed pH can be measured, with all steps of the procedure being followed exactly. Normally, it is safe to assume that the indeterminate error in this test is the same as that in subsequent measurements. The value of s calculated from Equation 3-1 is thus a valid and accurate measure of the theoretical σ.

Pooling Data to Improve the Reliability of s

The foregoing procedure is not always practical for analyses that are time-consuming. In this situation, data from a series of samples accumulated over time can be pooled to provide an estimate of s that is superior to the value for any individual subset. Again, we must assume the same sources of indeterminate error in all the samples. This assumption is usually valid if the samples have similar compositions and have been analyzed in exactly the same way.

To obtain a pooled estimate of the standard deviation, s_pooled, deviations from the mean for each subset are squared; the squares for all subsets are then summed and divided by an appropriate number of degrees of freedom, as shown in Equation 3-6. The pooled s is obtained by extracting the square root of the quotient. One degree of freedom is lost for each subset. Thus, the number of degrees of freedom for the pooled s is equal to the total number of measurements minus the number of subsets:

    s_{\text{pooled}} = \sqrt{\frac{\sum_{i=1}^{N_1}(x_i - \bar{x}_1)^2 + \sum_{j=1}^{N_2}(x_j - \bar{x}_2)^2 + \cdots}{N_1 + N_2 + \cdots - N_s}}    (3-6)

where N₁ is the number of data in set 1, N₂ is the number in set 2, and so forth. The term N_s is the number of data sets that are being pooled.

Example 3-1

The mercury content in samples of seven fish taken from the Sacramento River was determined by a method based upon the absorption of radiation by gaseous elemental mercury. Calculate a pooled estimate of the standard deviation for the method, based upon the first three columns of data:

    Specimen    Number of Samples Measured    Hg Content, ppm                       Mean, ppm Hg    Sum of Squares of Deviations from Mean
    1           3                             1.80, 1.58, 1.64                      1.673           0.0258
    2           4                             0.96, 0.98, 1.02, 1.10                1.015           0.0115
    3           2                             3.13, 3.35                            3.240           0.0242
    4           6                             2.06, 1.93, 2.12, 2.16, 1.89, 1.95    2.018           0.0611
    5           4                             0.57, 0.58, 0.64, 0.49                0.570           0.0114
    6           5                             2.35, 2.44, 2.70, 2.48, 2.44          2.482           0.0685
    7           4                             1.11, 1.15, 1.22, 1.04                1.130           0.0170
                N = 28                                                                              Sum of squares = 0.2196

The values in the last two columns for specimen 1 were computed as follows:

    x_i     |x_i − x̄|    (x_i − x̄)²
    1.80    0.127        0.0161
    1.58    0.093        0.0086
    1.64    0.033        0.0011
    5.02                 Sum of squares = 0.0258

The other data in columns 4 and 5 were obtained similarly. Then

    s_pooled = √[(0.0258 + 0.0115 + 0.0242 + 0.0611 + 0.0114 + 0.0685 + 0.0170)/(28 − 7)] = 0.10 ppm Hg

Note that one degree of freedom is lost for each of the seven samples. Because more than 20 degrees of freedom remain, however, the computed value of s can be considered a good approximation of σ; that is, s → σ = 0.10 ppm Hg.
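Equation 3-6 is straightforward to implement. The sketch below, added for illustration, applies it to the Example 3-1 mercury data and reproduces the 0.10 ppm Hg result; the function name is a choice of this sketch.

```python
# Pooled standard deviation (Equation 3-6) applied to the Example 3-1 data.
from math import sqrt

def pooled_std(*data_sets):
    """Square root of the summed squared deviations of each subset from its
    own mean, divided by (total number of data - number of subsets)."""
    ss = sum(sum((x - sum(d) / len(d)) ** 2 for x in d) for d in data_sets)
    dof = sum(len(d) for d in data_sets) - len(data_sets)
    return sqrt(ss / dof)

fish = [
    [1.80, 1.58, 1.64],
    [0.96, 0.98, 1.02, 1.10],
    [3.13, 3.35],
    [2.06, 1.93, 2.12, 2.16, 1.89, 1.95],
    [0.57, 0.58, 0.64, 0.49],
    [2.35, 2.44, 2.70, 2.48, 2.44],
    [1.11, 1.15, 1.22, 1.04],
]
print(f"s_pooled = {pooled_std(*fish):.2f} ppm Hg")   # about 0.10 ppm Hg
```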
3B THE USES OF STATISTICS

Experimentalists use statistical calculations to sharpen their judgment concerning the effects of indeterminate errors. The most common applications of statistics to analytical chemistry include:

1. Defining the interval around the mean of a set within which the population mean can be expected to be found with a given probability.
2. Determining the number of replicate measurements required to ensure (at a given probability) that an experimental mean falls within a predetermined interval around the population mean.
3. Deciding whether an outlying value in a set of replicate results should be retained or rejected in calculating the mean for the set.
4. Estimating the probability that two samples analyzed by the same method are significantly different in composition, that is, whether a difference in experimental results is likely to be a consequence of indeterminate error or a real composition difference.
5. Estimating the probability that there is a difference in precision between two sets of data obtained by different workers or by different methods.
6. Defining and estimating detection limits.
7. Treating calibration data.

We will examine each of these applications in the sections that follow.

3B-1 Confidence Limits

The exact value of the mean μ for a population of data can never be determined exactly because such a determination requires an infinite number of measurements. Statistical theory does, however, allow us to set limits around an experimentally determined mean x̄ within which the true mean μ lies with a given degree of probability. These limits are called confidence limits, and the interval they define is known as the confidence interval.

Note: Confidence limits define an interval around x̄ that probably contains μ. The confidence level is this probability expressed as a percent; the confidence limits are the values above and below a measurement that bound its confidence interval.

The size of the confidence interval, which is derived from the sample standard deviation, depends on the certainty with which s is known. If there is reason to believe that s is a good approximation of σ, then the confidence interval can be significantly narrower than if the estimate of s is based upon only two or three measurements.

The Confidence Interval When s Is a Good Approximation of σ

Figure 3-3 shows a series of five normal error curves. In each, the relative frequency is plotted as a function of the quantity z (Equation 3-3), which is the deviation from the mean in units of the population standard deviation. The shaded area in each plot lies between the values of −z and +z indicated to the left and right of the curves; the number within the shaded area is the percentage of the total area under the curve that is included between these z values. For example, as shown in the top curve, 50% of the area under any Gaussian curve is located between −0.67σ and +0.67σ. Proceeding downward, we see that 80% of the total area lies between −1.29σ and +1.29σ, and 90% lies between −1.64σ and +1.64σ.

Relationships such as these allow us to define a range of values around a measurement within which the true mean is likely to lie with a certain probability. For example, we may assume that 90 times out of 100 the true mean, μ, will be within ±1.64σ of any measurement that we make. Here, the confidence level is 90% and the confidence interval is ±zσ = ±1.64σ.
Figure 3-3 Areas under a Gaussian curve for various values of ±z.

We find a general expression for the confidence limits (CL) of a single measurement by rearranging Equation 3-3; remember that z can take positive or negative values. Thus,

    \text{CL for } \mu = x \pm z\sigma    (3-7)

Values for z at various confidence levels are found in Table 3-1.

Example 3-2

Calculate the 50% and 95% confidence limits for the first entry (1.80 ppm Hg) in Example 3-1.

In that example, we calculated s to be 0.10 ppm Hg and had sufficient data to assume s → σ. From Table 3-1, we see that z = 0.67 and 1.96 for the two confidence levels. Thus, from Equation 3-7,

    50% CL for μ = 1.80 ± 0.67 × 0.10 = 1.80 ± 0.07
    95% CL for μ = 1.80 ± 1.96 × 0.10 = 1.80 ± 0.20

From these calculations, we conclude that the chances are 50 in 100 that μ, the population mean (and, in the absence of determinate error, the true value), lies in the interval between 1.73 and 1.87 ppm Hg. Furthermore, there is a 95% chance that μ lies in the interval between 1.60 and 2.00 ppm Hg.

Equation 3-7 applies to the result of a single measurement. Application of Equation 3-4 shows that the confidence interval decreases by a factor of √N for the average of N replicate measurements. Thus, a more general form of Equation 3-7 is

    \text{CL for } \mu = \bar{x} \pm \frac{z\sigma}{\sqrt{N}}    (3-8)

Example 3-3

Calculate the 50% and 95% confidence limits for the mean value (1.67 ppm Hg) for specimen 1 in Example 3-1.

Again, s → σ = 0.10. For the three measurements,

    50% CL = 1.67 ± (0.67 × 0.10)/√3 = 1.67 ± 0.04
    95% CL = 1.67 ± (1.96 × 0.10)/√3 = 1.67 ± 0.11

Thus, the chances are 50 in 100 that the population mean is located in the interval between 1.63 and 1.71 ppm Hg and 95 in 100 that it lies between 1.56 and 1.78 ppm.

Example 3-4

How many replicate measurements of specimen 1 in Example 3-1 are needed to decrease the 95% confidence interval to ±0.07 ppm Hg?

The pooled value is a good estimate of σ. For a confidence interval of ±0.07 ppm Hg, substitution into Equation 3-8 leads to

    ±0.07 = ±zσ/√N = ±(1.96 × 0.10)/√N
    √N = (1.96 × 0.10)/0.07 = 2.80
    N = (2.80)² = 7.8

We conclude that eight measurements would provide a slightly better than 95% chance of the population mean lying within ±0.07 ppm of the experimental mean.

Equation 3-8 tells us that the confidence interval for an analysis can be halved by carrying out four measurements; sixteen measurements will narrow the interval by a factor of 4, and so on. We rapidly reach a point of diminishing returns in acquiring additional data. Ordinarily we take advantage of the relatively large gain attained by averaging two to four measurements but can seldom afford the time required for further increases in confidence.

It is essential to keep in mind at all times that confidence intervals based on Equation 3-8 apply only in the absence of determinate errors and only if we can assume that s ≈ σ.
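The arithmetic of Examples 3-2 through 3-4 can be summarized in a few lines. The sketch below is an added illustration; it assumes σ = 0.10 ppm Hg and takes the z values from Table 3-1.

```python
# Confidence limits with a known sigma (Equations 3-7 and 3-8), plus the
# number of replicates needed for a target interval (Example 3-4).
from math import sqrt, ceil

sigma = 0.10                            # ppm Hg, from Example 3-1
z = {50: 0.67, 95: 1.96}                # Table 3-1

for level, zval in z.items():
    print(f"{level}% CL, single result 1.80: +/-{zval * sigma:.2f} ppm")
    print(f"{level}% CL, mean of 3 (1.67):   +/-{zval * sigma / sqrt(3):.2f} ppm")

target = 0.07                           # desired half-width, ppm Hg
n_needed = ceil((z[95] * sigma / target) ** 2)
print(f"N needed for +/-{target} ppm at 95% confidence: {n_needed}")   # 8
```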
The Confidence Limits When σ Is Unknown

Often we are faced with limitations in time or in the amount of available sample that prevent us from accurately estimating σ. Here, a single set of replicate measurements must provide not only a mean but also an estimate of precision. As indicated earlier, s calculated from a small set of data may be quite uncertain; thus, confidence limits are necessarily broader when a good estimate of σ is not available.

To account for the variability of s, we use the important statistical parameter t, which is defined in exactly the same way as z (Equation 3-3) except that s is substituted for σ:

    t = \frac{x - \mu}{s}    (3-9)

Like z in Equation 3-3, t depends on the desired confidence level. It also depends on the number of degrees of freedom in the calculation of s.

Table 3-1 CONFIDENCE LEVELS FOR VARIOUS VALUES OF z

    Confidence Level, %    z
    50                     0.67
    68                     1.00
    80                     1.29
    90                     1.64
    95                     1.96
    96                     2.00
    99                     2.58
    99.7                   3.00
    99.9                   3.29

    Number of Measurements Averaged, N    Relative Size of Confidence Interval
    1                                     1.00
    2                                     0.71
    3                                     0.58
    4                                     0.50
    5                                     0.45
    6                                     0.41
    10                                    0.32

Note: The t statistic is often called Student's t. "Student" was the name used by W. S. Gosset when he wrote the classic paper on t that appeared in Biometrika, 1908, 6, 1. Gosset was employed by the Guinness Brewery to analyze statistically the results of determinations of the alcohol content of their products. As a result of this work, he discovered the now-famous statistical treatment of small sets of data. To avoid the disclosure of any trade secrets of his employer, Gosset published the paper under the name Student.

Note: As N − 1 → ∞, t → z.

Table 3-2 VALUES OF t FOR VARIOUS LEVELS OF PROBABILITY

    Degrees of Freedom    80%     90%     95%     99%     99.9%
    1                     3.08    6.31    12.7    63.7    637
    2                     1.89    2.92    4.30    9.92    31.6
    3                     1.64    2.35    3.18    5.84    12.9
    4                     1.53    2.13    2.78    4.60    8.60
    5                     1.48    2.02    2.57    4.03    6.86
    6                     1.44    1.94    2.45    3.71    5.96
    7                     1.42    1.90    2.36    3.50    5.40
    8                     1.40    1.86    2.31    3.36    5.04
    9                     1.38    1.83    2.26    3.25    4.78
    10                    1.37    1.81    2.23    3.17    4.59
    11                    1.36    1.80    2.20    3.11    4.44
    12                    1.36    1.78    2.18    3.06    4.32
    13                    1.35    1.77    2.16    3.01    4.22
    14                    1.34    1.76    2.14    2.98    4.14
    ∞                     1.29    1.64    1.96    2.58    3.29

Table 3-2 provides values of t for a few degrees of freedom; more extensive tables are found in various mathematical and statistical handbooks. Note that t → z (Table 3-1) as the number of degrees of freedom becomes infinite.

The confidence limits for the mean x̄ of N replicate measurements can be derived from t by an equation similar to Equation 3-8:

    \text{CL for } \mu = \bar{x} \pm \frac{ts}{\sqrt{N}}    (3-10)

Example 3-5

A chemist obtained the following data for the alcohol content of a sample of blood: % C2H5OH = 0.084, 0.089, and 0.079. Calculate the 95% confidence limits for the mean assuming (a) no additional knowledge about the precision of the method and (b) that s → σ = 0.0050% C2H5OH on the basis of previous experience.

(a) Σxᵢ = 0.084 + 0.089 + 0.079 = 0.252
    Σxᵢ² = 0.007056 + 0.007921 + 0.006241 = 0.021218
    s = √{[0.021218 − (0.252)²/3]/(3 − 1)} = 0.0050% C2H5OH

Here, x̄ = 0.252/3 = 0.084. Table 3-2 indicates that t = 4.30 for two degrees of freedom and 95% confidence. Thus,

    95% CL = x̄ ± ts/√N = 0.084 ± (4.30 × 0.0050)/√3 = 0.084 ± 0.012% C2H5OH

(b) Because a good value of σ is available,

    95% CL = x̄ ± zσ/√N = 0.084 ± (1.96 × 0.0050)/√3 = 0.084 ± 0.006% C2H5OH

Note that sure knowledge of σ significantly decreases the confidence interval.
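A minimal sketch of Equation 3-10 applied to the blood-alcohol data of Example 3-5 follows; it is an added illustration, with the t value read from Table 3-2 rather than computed, so only the standard library is needed.

```python
# Confidence limits from a small data set (Equation 3-10, Example 3-5).
from statistics import mean, stdev
from math import sqrt

data = [0.084, 0.089, 0.079]        # % C2H5OH
xbar, s, n = mean(data), stdev(data), len(data)
t_95 = 4.30                         # Table 3-2, N - 1 = 2 degrees of freedom, 95%

print(f"x-bar = {xbar:.3f}, s = {s:.4f}")
print(f"95% CL = {xbar:.3f} +/- {t_95 * s / sqrt(n):.3f} % C2H5OH")      # about +/-0.012

# Part (b): sigma known, so z = 1.96 replaces t and sigma replaces s.
sigma = 0.0050
print(f"95% CL (sigma known) = {xbar:.3f} +/- {1.96 * sigma / sqrt(n):.3f} % C2H5OH")
```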
3B-2 Rejection of Outliers

When a set of data contains an outlying result that appears to differ excessively from the average, a decision must be made whether to retain or reject the result.[2] The choice of criterion for the rejection of a suspected result has its perils. If we set a stringent standard that makes the rejection of a questionable measurement difficult, we run the risk of retaining results that are spurious and have an inordinate effect on the average of the data. If we set lenient limits on precision and thereby make the rejection of a result easy, we are likely to discard measurements that rightfully belong in the set and thus introduce a bias into the data. It is an unfortunate fact that no universal rule can be invoked to settle the question of retention or rejection.

Note: Outliers are the result of gross errors (Section 2C).

Statistical Tests

Several statistical procedures, such as the Q test and the T_n test, have been developed to provide criteria for the rejection or retention of outliers. Such tests assume that the distribution of the population data is normal, or Gaussian. Unfortunately, this condition cannot be proved or disproved for samples consisting of many fewer than 50 results. Consequently, statistical rules, which are perfectly reliable for normal distributions of data, should be used with extreme caution when applied to samples containing only a few data. J. Mandel, in discussing the treatment of small sets of data, writes, "Those who believe that they can discard observations with statistical sanction by using statistical rules for the rejection of outliers are simply deluding themselves."[3] Thus, statistical tests for rejection should be used only as aids to common sense when small samples are involved.

The Q Test

The Q test is a simple, widely used statistical test.[4] In this test, the absolute value of the difference between the questionable result x_q and its nearest neighbor x_n is divided by the spread w of the entire set to give the quantity Q_exp:

    Q_{\text{exp}} = \frac{|x_q - x_n|}{w}

This ratio is then compared with rejection values Q_crit found in Table 3-3. If Q_exp is greater than Q_crit, the questionable result can be rejected with the indicated degree of confidence.

Table 3-3 CRITICAL VALUES FOR REJECTION QUOTIENT Q*
Q_crit (reject if Q_exp > Q_crit)

    Number of Observations    90% Confidence    95% Confidence    99% Confidence
    3                         0.941             0.970             0.994
    4                         0.765             0.829             0.926
    5                         0.642             0.710             0.821
    6                         0.560             0.625             0.740
    7                         0.507             0.568             0.680
    8                         0.468             0.526             0.634
    9                         0.437             0.493             0.598
    10                        0.412             0.466             0.568

*Reproduced from D. B. Rorabacher, Anal. Chem., 1991, 63, 139. By courtesy of the American Chemical Society.

Example 3-6

The analysis of a calcite sample yielded CaO percentages of 55.95, 56.00, 56.04, 56.08, and 56.23. The last value appears anomalous; should it be retained or rejected?

The difference between 56.23 and 56.08 is 0.15%. The spread (56.23 − 55.95) is 0.28%. Thus,

    Q_exp = 0.15/0.28 = 0.54

For five measurements, Q_crit at the 90% confidence level is 0.642. Because 0.54 < 0.642, we must retain the value 56.23.

[2] J. Mandel, in Treatise on Analytical Chemistry, 2nd ed., I. M. Kolthoff and P. J. Elving, Eds., Part I, Vol. 1, pp. 282-289. New York: Wiley, 1978.
[3] J. Mandel, in Treatise on Analytical Chemistry, 2nd ed., I. M. Kolthoff and P. J. Elving, Eds., Part I, Vol. 1, p. 282. New York: Wiley, 1978.
[4] R. B. Dean and W. J. Dixon, Anal. Chem., 1951, 23, 636.
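A minimal implementation of the Q test, applied to the calcite data of Example 3-6, is sketched below. The helper function name and the choice to test whichever end point is farther from its neighbor are decisions of this sketch, not part of the original text.

```python
# Q test (Example 3-6): gap to nearest neighbor divided by the spread.
def q_exp(values):
    v = sorted(values)
    spread = v[-1] - v[0]
    gap = max(v[1] - v[0], v[-1] - v[-2])   # suspect value is at one end
    return gap / spread

data = [55.95, 56.00, 56.04, 56.08, 56.23]   # % CaO
q = q_exp(data)
q_crit_90 = 0.642                            # Table 3-3, N = 5, 90% confidence
print(f"Q_exp = {q:.2f} vs Q_crit = {q_crit_90} ->",
      "reject" if q > q_crit_90 else "retain")
```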
The T_n Test

In the American Society for Testing and Materials (ASTM) T_n test, the quantity T_n serves as the rejection criterion,[5] where

    T_n = \frac{|x_q - \bar{x}|}{s}

Here, x_q is the questionable result, and x̄ and s are the mean and standard deviation of the entire set, including the questionable result. Rejection is indicated if the calculated T_n is greater than the critical values found in Table 3-4.

Table 3-4 CRITICAL VALUES FOR REJECTION QUOTIENT T_n*

    Number of Observations    95% Confidence    97.5% Confidence    99% Confidence
    3                         1.15              1.15                1.15
    4                         1.46              1.48                1.49
    5                         1.67              1.71                1.75
    6                         1.82              1.89                1.94
    7                         1.94              2.02                2.10
    8                         2.03              2.13                2.22
    9                         2.11              2.21                2.32
    10                        2.18              2.29                2.41

*Adapted from J. Mandel, in Treatise on Analytical Chemistry, 2nd ed., I. M. Kolthoff and P. J. Elving, Eds., Part I, Vol. 1, p. 284. New York: Wiley, 1978. With permission of John Wiley & Sons, Inc.

Note: Use caution when rejecting data for any reason.

Example 3-7

Apply the T_n test to the data in Example 3-6.

    Σxᵢ = 55.95 + 56.00 + 56.04 + 56.08 + 56.23 = 280.30
    Σxᵢ² = (55.95)² + (56.00)² + (56.04)² + (56.08)² + (56.23)² = 15713.6634
    x̄ = 280.30/5 = 56.06
    s = √{[15713.6634 − (280.30)²/5]/(5 − 1)} = 0.107
    T_n = |56.23 − 56.06|/0.107 = 1.59

Table 3-4 indicates that the critical value of T_n for five measurements is greater than the experimental value at all confidence levels. Therefore, retention is also indicated by this test.

[5] For further discussion of this test, see J. Mandel, in Treatise on Analytical Chemistry, 2nd ed., I. M. Kolthoff and P. J. Elving, Eds., Part I, Vol. 1, pp. 283-285. New York: Wiley, 1978.
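The T_n criterion is equally short to code. The sketch below, added for illustration, reuses the calcite data of Example 3-7; the critical value is read from Table 3-4.

```python
# T_n test (Example 3-7): distance of the suspect value from the mean, in
# units of the standard deviation of the whole set.
from statistics import mean, stdev

def t_n(values, suspect):
    # Mean and s include the questionable result, as the test requires.
    return abs(suspect - mean(values)) / stdev(values)

data = [55.95, 56.00, 56.04, 56.08, 56.23]   # % CaO
tn = t_n(data, 56.23)
tn_crit_95 = 1.67                            # Table 3-4, N = 5, 95% confidence
print(f"T_n = {tn:.2f} vs critical {tn_crit_95} ->",
      "reject" if tn > tn_crit_95 else "retain")
```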
Recommendations for the Treatment of Outliers

Recommendations for the treatment of a small set of results that contains a suspect value follow:

1. Reexamine carefully all data relating to the outlying result to see if a gross error could have affected its value. This recommendation demands a properly kept laboratory notebook containing careful notations of all observations.
2. If possible, estimate the precision that can be reasonably expected from the procedure to be sure that the outlying result actually is questionable.
3. Repeat the analysis if sufficient sample and time are available. Agreement between the newly acquired data and those data of the original set that appear to be valid will lend weight to the notion that the outlying result should be rejected. Furthermore, if retention is still indicated, the questionable result will have a relatively small effect on the mean of the larger set of data.
4. If more data cannot be secured, apply the Q test or the T_n test to the existing set to see if the doubtful result should be retained or rejected on statistical grounds.
5. If the statistical test indicates retention, consider reporting the median of the set rather than the mean. The median has the great virtue of allowing inclusion of all data in a set without undue influence from an outlying value. In addition, the median of a normally distributed set containing three measurements provides a better estimate of the correct value than the mean of the set after the outlying value has been discarded.

The blind application of statistical tests to retain or reject a suspect measurement in a small set of data is not likely to be much more fruitful than an arbitrary decision. The application of good judgment based on broad experience with an analytical method is usually a sounder approach. In the end, the only valid reason for rejecting a result from a small set of data is the sure knowledge that a mistake was made in the measurement process. Without this knowledge, a cautious approach to the rejection of an outlier is wise.

3B-3 Statistical Aids to Hypothesis Testing

Much of scientific and engineering endeavor is based upon hypothesis testing. Thus, in order to explain an observation, a hypothetical model is advanced and tested experimentally to determine its validity. If the results from these experiments do not support the model, we reject it and seek a new hypothesis. If agreement is found, the hypothetical model serves as the basis for further experiments. When the hypothesis is supported by sufficient experimental data, it becomes recognized as a useful theory until such time as data are obtained that refute it.

Experimental results seldom agree exactly with those predicted from a theoretical model. Consequently, scientists and engineers frequently must judge whether a numerical difference is a manifestation of the indeterminate errors inevitable in all measurements. Certain statistical tests are useful in sharpening these judgments.

Tests of this kind make use of a null hypothesis, which assumes that the numerical quantities being compared are, in fact, the same. The probability of the observed differences appearing as a result of indeterminate error is then computed from statistical theory. Usually, if the observed difference is greater than or equal to the difference that would occur 5 times in 100 (the 5% probability level), the null hypothesis is considered questionable and the difference is judged to be significant. Other probability levels, such as 1 in 100 or 10 in 100, may also be adopted, depending upon the certainty desired in the judgment.

Note: In statistics, a null hypothesis postulates that two observed quantities are the same. The notation used to indicate the 5% probability level is P = 0.05.

The kinds of testing that chemists use most often include the comparison of (1) the mean from an analysis, x̄, with what is believed to be the true value, μ; (2) the means x̄₁ and x̄₂ from two sets of analyses; (3) the standard deviations s₁ and s₂ or σ₁ and σ₂ from two sets of measurements; and (4) the standard deviation s of a small set of data with the standard deviation σ of a larger set of measurements. The sections that follow consider some of the methods for making these comparisons.

Comparison of an Experimental Mean with a True Value

A common way of testing for determinate errors in an analytical method is to use the method to analyze a sample whose composition is accurately known. It is likely that the experimental mean x̄ will differ from the accepted value μ; the judgment must then be made whether this difference is the consequence of indeterminate error or determinate error.

In treating this type of problem statistically, the difference x̄ − μ is compared with the difference that could be caused by indeterminate error. If the observed difference is less than that computed for a chosen probability level, the null hypothesis that x̄ and μ are the same cannot be rejected; that is, no significant determinate error has been demonstrated. It is important to realize, however, that this statement does not say that there is no determinate error; it says only that whatever determinate error is present is so small that it cannot be detected. If x̄ − μ is significantly larger than the critical value, we may assume that the difference is real and that the determinate error is significant.

The critical value for the rejection of the null hypothesis is calculated by rewriting Equation 3-10 in the form

    \bar{x} - \mu = \pm \frac{ts}{\sqrt{N}}    (3-12)

where N is the number of replicate measurements employed in the test. If a good estimate of σ is available, Equation 3-12 can be modified by replacing t with z and s with σ.
Example 3-8

A new procedure for the rapid determination of sulfur in kerosenes was tested on a sample known from its method of preparation to contain 0.123% S. The results were % S = 0.112, 0.118, 0.115, and 0.119. Do the data indicate that there is a determinate error in the method?

    Σxᵢ = 0.112 + 0.118 + 0.115 + 0.119 = 0.464
    x̄ = 0.464/4 = 0.116% S
    x̄ − μ = 0.116 − 0.123 = −0.007% S
    Σxᵢ² = 0.012544 + 0.013924 + 0.013225 + 0.014161 = 0.053854
    s = √{[0.053854 − (0.464)²/4]/(4 − 1)} = √(0.000030/3) = 0.0032

From Table 3-2, we find that at the 95% confidence level, t has a value of 3.18 for three degrees of freedom. Thus,

    ±ts/√N = ±(3.18 × 0.0032)/√4 = ±0.0051

An experimental mean can be expected to deviate by ±0.0051 or greater no more frequently than 5 times in 100. Thus, if we conclude that x̄ − μ = −0.007 is a significant difference and that a determinate error is present, we will, on the average, be wrong fewer than 5 times in 100.

If we make a similar calculation employing the value of t at the 99% confidence level, ±ts/√N assumes a value of ±0.0093. Thus, if we insist upon being wrong no more often than 1 time in 100, we must conclude that no difference between the results has been demonstrated. Note that this statement is different from saying that there is no determinate error.

Note: If it were confirmed by further experiments that the method always gave low results, we would say that the method has a negative bias.

Note: Even if a mean value is shown to be equal to the true value at a given confidence level, we cannot conclude that there is no determinate error in the data.
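The decision rule of Example 3-8 can be written compactly, as sketched below for illustration; the 95% t value for three degrees of freedom is taken from Table 3-2.

```python
# Compare an experimental mean with a known value (Equation 3-12, Example 3-8).
from statistics import mean, stdev
from math import sqrt

data = [0.112, 0.118, 0.115, 0.119]    # % S found
mu = 0.123                             # % S known to be present
xbar, s, n = mean(data), stdev(data), len(data)
t_95 = 3.18                            # Table 3-2, 3 degrees of freedom, 95%

critical = t_95 * s / sqrt(n)
print(f"x-bar - mu = {xbar - mu:+.4f}, critical = +/-{critical:.4f}")
if abs(xbar - mu) > critical:
    print("Difference is significant at 95%: a determinate error is indicated.")
else:
    print("No determinate error demonstrated at 95% confidence.")
```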
Comparison of Two Experimental Means

The results of chemical analyses are frequently used to determine whether two materials are identical. Here, the chemist must judge whether a difference in the means of two sets of measurements is real and constitutes evidence that the samples are different, or whether the discrepancy is simply a consequence of indeterminate errors in the two sets.

To illustrate, let us assume that N₁ replicate analyses of material 1 yielded a mean value of x̄₁ and that N₂ analyses of material 2 obtained by the same method gave a mean of x̄₂. If the data were collected in an identical way, it is usually safe to assume that the standard deviations of the two sets of measurements are the same, and we modify Equation 3-12 to take into account that one set of results is being compared with a second rather than with the true mean of the data, μ.

In this case, as with the previous one, we invoke the null hypothesis that the samples are identical and that the observed difference in the results, (x̄₁ − x̄₂), is the result of indeterminate errors. To test this hypothesis statistically, we modify Equation 3-12 in the following way. First, we substitute x̄₂ for μ, making the left side of the equation the numerical difference between the two means, x̄₁ − x̄₂. Since we know from Equation 3-5 that the standard deviation of the mean x̄₁ is

    s_{m1} = \frac{s_1}{\sqrt{N_1}}

and likewise for x̄₂,

    s_{m2} = \frac{s_2}{\sqrt{N_2}}

the variance s_d² of the difference (d = x̄₁ − x̄₂) between the means is given by

    s_d^2 = s_{m1}^2 + s_{m2}^2

Substituting the values for s_m1 and s_m2 into this equation and assuming that the pooled standard deviation s_pooled is a good estimate of both s₁ and s₂, we have

    s_d^2 = \frac{s_{\text{pooled}}^2}{N_1} + \frac{s_{\text{pooled}}^2}{N_2} = s_{\text{pooled}}^2 \, \frac{N_1 + N_2}{N_1 N_2}

and

    s_d = s_{\text{pooled}} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}

Substituting this expression into Equation 3-12 (and also x̄₂ for μ), we find that

    \bar{x}_1 - \bar{x}_2 = \pm\, t\, s_{\text{pooled}} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}    (3-13)

The numerical value for the term on the right is computed using t for the particular confidence level desired; the number of degrees of freedom for finding t in Table 3-2 is N₁ + N₂ − 2. If the experimental difference x̄₁ − x̄₂ is smaller than the computed value, the null hypothesis is not rejected and no significant difference between the means has been demonstrated. An experimental difference greater than the value computed from t indicates that there is a significant difference between the means. If a good estimate of σ is available, Equation 3-13 can be modified by inserting z for t and σ for s.
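A sketch of Equation 3-13 follows, added for illustration. For concreteness it uses the numbers of Example 3-10, the wine comparison presented below; only the pooled s, the two means, and the t value from Table 3-2 are needed.

```python
# Compare two experimental means (Equation 3-13) with a pooled standard deviation.
from math import sqrt

n1, n2 = 6, 4
xbar1, xbar2 = 12.61, 12.53            # % ethanol, Example 3-10
s_pooled = 0.070
t_95 = 2.31                            # Table 3-2, N1 + N2 - 2 = 8 degrees of freedom

critical = t_95 * s_pooled * sqrt((n1 + n2) / (n1 * n2))
diff = xbar1 - xbar2
print(f"difference = {diff:.2f}%, critical = +/-{critical:.2f}%")
print("significant difference" if abs(diff) > critical
      else "no difference demonstrated at 95% confidence")
```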
To establish with a reasonable probability that the two wines are fr0m the same source would require extensive testing of other characteristics, such as taste, color, odor. and refractive 38 The Uses of Statistics 53 index, as well as tartaric acid, sugar, and trace element content. If no significant differences are revealed by all of these tests and by others, then it might be possible to judge the two wines as having a common origin. In contrast, the finding of one significant difference in any test would clearly show that the two wines are different. Thus, the establish- ment of a significant difference by a single test is much more revealing than the establishment of an absence of difference. Estimation of Detection Limits Equation 343 is useful for estimating the detection limit for a measure- ment. Here, the standard deviation from several blank determinations is computed. The minimum detectable quantity Axmm is N1+N2 flxmin : iI _' it: > 15b N N: 1 {3—14) where the subscript (9 refers to the blank determination. Example 3—11 A method for the analysis of DDT gave the following results when applied to pesticide—free foliage samples: ,ug DDT = 0.2, —0.5, —0.2, 1.0, 0.8, -0.6, 0.4, 1.2. Calculate the DDT-detection limit (at the 99% confidence level) of the method for (a) a single analysis and (b) the mean of five analyses. Here we find Ex; = 0.2 — 0.5 -— 0.2 +1.0 + 0.8 — 0.6 + 0.4 +1.2 = 2.3 Ex? = 0.04 + 0.25 + 0.04 + 1.0 + 0.64 + 0.36 + 0.16 + 1.44 = 3.93 3.93 — (2.3)2i’8 3.26875 Sb= Tl—‘= 7 =0-68ng (a) For a single analysis, N. = 1 and the number of degrees of freedom is l + 8 — 2 = T. From Table 3—2, we find I = 3.50, and so 1 + 8 mm," > 3.50 x 0.68 01 X 8 > 2.5 ptg DDT Thus, 99 times out of 100, a result greater than 2.5 ,ug DDT indicates the presence of the pesticide on the plant. it?) Here, N] = 5, and the number of degrees of freedom is 11. Therefore, I = 3.11, and s + s Axum, > 3.11 X 0.68 5 X 8 > 1.2 ag DDT I. ...