ANOVA 13 (Comparison of Three or More Means)

Previously we looked at 2-sample Z and 2-sample T tests, which can be used to test whether two samples came from populations with the same mean. In this chapter, we will learn a test to check whether three or more samples are from populations with the same mean.

As an example we will use the following data. Assume the treatments are three antiseptics and the observed values are bacterial counts.

Treatment                          1         2         3
observed values                    20        24        35
                                   32        30        22
                                   27        20        28
                                   25        22        27
mean                               26        24        28
s_x                                4.9666    4.3205    5.3541
$SS = \sum (x - \bar{x})^2$        74        56        86

The ANOVA (ANalysis Of VAriance) test may seem a little surprising at first, since the statistic does not explicitly contain any means; rather, it compares two estimates of variation and thus has an F distribution. This is the same distribution we used when testing if two samples came from populations with the same variance (the 2-sample F test). The F statistic will be equivalent to the following:

\[ F = \frac{\text{sample variance based on variation among the sample means}}{\text{sample variance based on pooling the sample variances}} = \frac{n s_{\bar{x}}^2}{s_p^2} \]

where n is the number of observations in each sample (here n = 4).

The following is a brief description of the reasoning used in forming this ratio.

Assuming all three samples come from a single population with variance $\sigma^2$, we have two ways to estimate $\sigma^2$.

1) Average of the three sample variances.

\[ s_p^2 = \frac{s_1^2 + s_2^2 + s_3^2}{3} \]

$s_p^2$ is called the within sample or pooled estimate of $\sigma^2$.

2) Use the sample variance of the three sample means. We will assume $\bar{x}$ is distributed as $N(\mu, \sigma/\sqrt{n})$, so the variance of $\bar{x}$ is given by:

\[ \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \]

Multiplying both sides of this equation by n yields:

\[ n \sigma_{\bar{x}}^2 = \sigma^2 \]

The population variance of $\bar{x}$ can be approximated by the sample variance of the $\bar{x}$'s, which is:

\[ s_{\bar{x}}^2 = \frac{(\bar{x}_1 - \bar{\bar{x}})^2 + (\bar{x}_2 - \bar{\bar{x}})^2 + (\bar{x}_3 - \bar{\bar{x}})^2}{3 - 1} \quad \text{with} \quad \bar{\bar{x}} = \frac{\bar{x}_1 + \bar{x}_2 + \bar{x}_3}{3} \]

Thus $n s_{\bar{x}}^2 \approx \sigma^2$.
This second estimate is called the between sample estimate of $\sigma^2$.

When we take the samples from the same population, both the within and between sample estimates are unbiased estimates of $\sigma^2$ and should be approximately equal to each other; thus their ratio, $n s_{\bar{x}}^2 / s_p^2$, should be close to one.

For the above data $\bar{\bar{x}} = 26$, and

\[ s_{\bar{x}}^2 = \frac{(24-26)^2 + (26-26)^2 + (28-26)^2}{2} = 4 \]

\[ s_p^2 = \frac{(74 + 56 + 86)/3}{3} = \frac{216}{9} = 24 \]

Thus the ratio

\[ F = \frac{n s_{\bar{x}}^2}{s_p^2} = \frac{4 \cdot 4}{24} = \frac{2}{3} \]

This F statistic is not statistically significant at either the 1% or 5% level.

Pvalue = Fcdf(2/3,100,2,9) = .5370

These calculations can be put into a table as follows:

Source of variation    Degrees of Freedom    Sum of Squares    Mean Square    F      Pval
Treatments             2                     32                16             2/3    .5370
Error                  9                     216               24
Total                  11                    248

Pval .5370 = $P(F \ge 2/3)$ with 2 and 9 df

Requirements of the ANOVA test
1) samples are random and independent
2) sampled populations are normally distributed
3) population variances are equal

Practice ANOVA Question

A department store manager wants to compare the sales of three clerks, so she finds the number of sales that each clerk has on four consecutive days.

Clerk              A     B     C
Number of Sales    44    50    53
                   55    53    54
                   48    54    57
                   45    55    56

Perform an Analysis of Variance on this data to test:

Ho: $\mu_A = \mu_B = \mu_C$
Ha: the means are not all equal

Use $\alpha = .05$. State if you accept or reject Ho.

Source of variation    Degrees of Freedom    Sum of Squares    Mean Square    F        Pval
Treatments             2                     104               52             4.776    .0386
Error                  9                     98                10.89
Total                  11                    202
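
As a cross-check on the antiseptic example, the ratio of the between sample estimate to the within sample estimate can be computed in a few lines of Python. This is only a minimal sketch under stated assumptions, not part of the original notes: the function names `one_way_anova` and `upper_tail_p_df2` are hypothetical, all samples are assumed to have the same size n, and the closed-form tail area $P(F \ge f) = (1 + 2f/d)^{-d/2}$ holds only when the numerator has 2 degrees of freedom (that is, exactly three samples). The observations are assigned to treatments in the way that is consistent with the listed means (26, 24, 28) and sums of squares (74, 56, 86).

```python
from statistics import mean, variance


def one_way_anova(groups):
    """Return F = n * s_xbar^2 / s_p^2 for equal-size samples (hypothetical helper)."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes every sample has the same size")

    # Within sample (pooled) estimate: average of the sample variances.
    s2_pooled = mean(variance(g) for g in groups)

    # Between sample estimate: n times the sample variance of the sample means.
    sample_means = [mean(g) for g in groups]
    ms_between = n * variance(sample_means)

    return {
        "df_between": k - 1,
        "df_within": k * (n - 1),
        "ms_between": ms_between,
        "ms_within": s2_pooled,
        "F": ms_between / s2_pooled,
    }


def upper_tail_p_df2(f, df_within):
    """P(F >= f) for an F distribution with 2 numerator df (three samples only)."""
    return (1 + 2 * f / df_within) ** (-df_within / 2)


# Antiseptic data: one list of bacterial counts per treatment.
antiseptics = [
    [20, 32, 27, 25],  # treatment 1, mean 26
    [24, 30, 20, 22],  # treatment 2, mean 24
    [35, 22, 28, 27],  # treatment 3, mean 28
]

result = one_way_anova(antiseptics)
p_value = upper_tail_p_df2(result["F"], result["df_within"])
print(f"F = {result['F']:.4f}, p-value = {p_value:.4f}")
```

The same two functions can be run on the clerk data to check the mean squares, F and Pval in the practice table.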
