EXST7005 Fall2010 17a ANOVA

Statistical Methods I (EXST 7005) — James P. Geaghan, Copyright 2010

Analysis of Variance (ANOVA)

R. A. Fisher resolved a problem that had existed for some time. The hypothesis to be tested is H0: μ1 = μ2 = μ3 = ... = μk versus the alternative H1: some μi is different. Conceptually, we have k separate (and independent) samples, each giving a mean, and we want to know whether they could all have come from the same population, or whether it is more likely that at least one came from a different population.

One way to do this is a series of t-tests. To test among 3 means we do 3 tests: 1 versus 2, 1 versus 3, and 2 versus 3. For 4 means there are 6 tests: 1–2, 1–3, 1–4, 2–3, 2–4, and 3–4. For 5 means there are 10 tests, and in general k(k−1)/2 tests for k means.

This technique is unwieldy, and it has other problems. When we do the first test there is an α chance of error, and each additional test adds another α chance of error. So if you do 3 or 6 or 10 tests, the chance of error on each individual test is α, but the chance of error for the experiment as a whole — all tests taken together — is much higher than α. Bonferroni showed that the overall chance of error is NO MORE than Σαi. So if we do 3 tests, each with a 5% chance of error, the overall probability of error is no greater than 15%; it is no greater than 30% for 6 tests, 50% for 10 tests, etc. Of course this is an upper bound. Other calculations are probably more realistic, such as

    α′ = 1 − (1 − α)^(k−1)

used by Duncan, or

    α′ = 1 − (1 − α)^(k/2)

from the Student–Newman–Keuls calculation (where k is the number of groups to be tested, α is the error rate for each test, and α′ is the error rate for the collection of tests). The table below gives some probabilities of error calculated by the Bonferroni, Duncan, and Student–Newman–Keuls formulas for tests done at α = 0.05.
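As a quick check, these error-rate formulas can be evaluated directly. This is a minimal sketch (not part of the original notes) that reproduces the values in the table that follows:

```python
# Familywise error-rate approximations for c = k(k-1)/2 pairwise tests,
# each run at alpha = 0.05, for k group means.
from math import comb

alpha = 0.05
for k in (2, 3, 4, 5, 6, 7, 10, 50):
    c = comb(k, 2)                        # number of pairwise tests
    bonferroni = c * alpha                # upper bound (can exceed 1)
    duncan = 1 - (1 - alpha) ** (k - 1)   # Duncan's calculation
    snk = 1 - (1 - alpha) ** (k / 2)      # Student-Newman-Keuls calculation
    print(f"{k:3d} {c:5d} {bonferroni:8.4f} {duncan:7.4f} {snk:7.4f}")
```

Note that the Bonferroni value is only a bound, which is why it can exceed 1 for many tests, while the other two formulas always stay below 1.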
Number     Pairwise    (1−α)^c       Bonferroni        Duncan          Student–Newman–Keuls
of means   tests (c)   (no error)    upper bound, cα   1−(1−α)^(k−1)   1−(1−α)^(k/2)
    2          1         0.95            0.05             0.0500            0.0500
    3          3         0.86            0.15             0.0975            0.0741
    4          6         0.74            0.30             0.1426            0.0975
    5         10         0.60            0.50             0.1855            0.1204
    6         15         0.46            0.75             0.2262            0.1426
    7         21         0.34            1.05             0.2649            0.1643
   10         45         0.10            2.25             0.3698            0.2262
   50       1225         0.00           61.25             0.9190            0.7226

The bottom line: splitting an experiment into a number of smaller tests is generally a poor idea. This applies at higher levels as well (i.e., splitting big ANOVAs into little ones).

The solution: we need ONE test that will give us an accurate test at the desired α level.

The concept

We are familiar with the variance,

    S² = SS / d.f. = Σ(Yi − Ȳ)² / (n − 1)    (sum over i = 1, ..., n)

We are familiar with the pooled variance,

    Sp² = (SS1 + SS2) / [(n1 − 1) + (n2 − 1)] = (γ1·S1² + γ2·S2²) / (γ1 + γ2)

where γ denotes degrees of freedom. We are familiar with the variance of the means. But we never get "multiple" estimates of the mean and calculate a variance from those; the calculation we use to get the variance of the means comes from statistical theory: SȲ² = S²/n. Could we actually get multiple estimates of the means, calculate a sum of squared deviations of the various means from an overall mean, and get a variance of the means from that? Yes, we could, and the formula

    SȲ² = Σ(Ȳi. − Ȳ..)² / (k − 1)    (sum over i = 1, ..., k)

should give the same value.

Suppose we have some values from a number of different samples, perhaps taken at different sites. The values would be Yij, where the sites are i = 1, 2, ..., k, and the observations within the sites are j = 1, 2, 3, ..., ni. For each site we calculate a value of the mean. We then take the various means (k different means) and calculate a variance among those. This would also give the "variance of the means".

The LOGIC

Remember, we want to test H0: μ1 = μ2 = μ3 = ...
= μk. We have a bunch of means, and we want to know if they were drawn from the same population or from different populations. We also have a bunch of samples, each with its own variance (S²). If we can assume homogeneous variance (all variances equal), then we can POOL the multiple estimates of variance. So, to start, we take the variances from each of the groups and pool them into one new and improved estimate of variance. This will be the very best estimate of variance we will get (if the assumption is met). For five groups,

    Sp² = (SS1 + SS2 + SS3 + SS4 + SS5) / [(n1 − 1) + (n2 − 1) + (n3 − 1) + (n4 − 1) + (n5 − 1)]

Now think about the means. If the NULL HYPOTHESIS IS TRUE, then we could calculate the variance of the means from the multiple means. This would estimate SȲ², the variance of the means. We would take the deviations of each Ȳi. from the overall mean, Ȳ.., and get a variance from that.

[Figure: deviations of the means of groups A–E from the overall mean]

If the null hypothesis is true, the means should be pretty close to the overall mean. They won't be exactly equal to the overall mean because of random sampling variation in the individual observations.

[Figure: group means A–E scattered closely around the overall mean]

However, if the null hypothesis is false, then some mean will be different! At least one, maybe several.

[Figure: at least one group mean departing noticeably from the overall mean]

So we take the sum of squared deviations, divide by the degrees of freedom, and get an estimate of the variance of the means,

    SȲ² = Σ(Ȳi. − Ȳ..)² / (k − 1)    (sum over i = 1, ..., k)

But this does not estimate the variance exactly; it estimates the variance of the means, that is, the variance divided by the sample size, where the sample size is the number of observations in each mean:

    SȲ² = Σ(Ȳi. − Ȳ..)² / (k − 1) = S² / n

In order to estimate the variance we must multiply this estimate by n, the sample size,

    n·SȲ² = n·(S²/n) = S²

giving a second estimate of the variance. This is obviously easier if each sample size is the same (i.e.
the experiment is balanced). We will usually use the calculations for a balanced design, but the analysis can readily be done if the data are not balanced; it is just a little more complicated.

The Solution

So what have we got? One variance estimate that is pooled across all of the samples, because the variances are equal (an assumption, sometimes testable). This is the best estimate of random error. And another variance that should estimate the same quantity IF the null hypothesis is TRUE. The second variance (from the means) may not be the same if the null hypothesis is false, depending on how great the departure from the null hypothesis is. Not only will the second variance not be the same, IT WILL BE LARGER! Why? Because when we test means for equality we do not consider rejecting when the means are too similar, only when they are too different, and large differences among means yield large deviations, which produce an overly large variance. So this will be a one-tailed test.

And how do we go about testing these two variances for equality? Testing for equality of variances requires an F-test, of course.

If H0: μ1 = μ2 = μ3 = ... = μk is true, then Sp² = nSȲ² (in expectation).
If H1: some μi is different, then Sp² < nSȲ².

For a one-tailed F-test we put the ONE WE EXPECT TO BE LARGER IN THE NUMERATOR:

    F = nSȲ² / Sp²

And that is Analysis of Variance. We are actually testing means, but we are doing it by turning them into variances: one pooled variance from within the groups, called the "pooled within variance", and one variance from between or among groups, called the "variance among groups" or "between group variance". If the variances are not significantly different as judged by the F-test, then we cannot reject the null hypothesis. It is possible, as usual, that we make a Type II error with some unknown probability (β).
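The two variance estimates and the F ratio above can be sketched numerically. This is a minimal pure-Python illustration using made-up data (three groups of four observations each), not an example from the original notes:

```python
# One-way ANOVA "by hand" for a balanced design, following the text:
# Sp^2 pools the within-group variances; n * S_Ybar^2 estimates the same
# variance from the spread of the group means IF the null hypothesis is true.
groups = [
    [12.0, 14.0, 11.0, 13.0],   # group A (hypothetical data)
    [15.0, 17.0, 16.0, 14.0],   # group B
    [11.0, 10.0, 12.0, 13.0],   # group C
]
k = len(groups)                  # number of groups
n = len(groups[0])               # observations per group (balanced)

means = [sum(g) / n for g in groups]
grand = sum(means) / k           # overall mean (balanced case)

# Pooled within-group variance: Sp^2 = (SS1 + ... + SSk) / [k(n-1)]
ss_within = sum(sum((y - m) ** 2 for y in g) for g, m in zip(groups, means))
sp2 = ss_within / (k * (n - 1))

# Variance of the means, with k-1 degrees of freedom, scaled up by n
s_ybar2 = sum((m - grand) ** 2 for m in means) / (k - 1)
f = (n * s_ybar2) / sp2          # expected-larger estimate in the numerator

print(round(f, 3))               # → 10.4
```

Here F = 10.4 with k − 1 = 2 and k(n − 1) = 9 degrees of freedom; compared against the tabled F at α = 0.05 (roughly 4.26), the null hypothesis of equal means would be rejected for these made-up data.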
If the variances are judged not to be the same, then the null hypothesis is probably not true. Of course we may have made a Type I error, with a known probability of α. The calculations come later; this is the basic idea.

R. A. Fisher

Ronald Aylmer Fisher is sometimes called the father of modern statistics. Some of his major contributions include the development of the basics of design of experiments and of Analysis of Variance. Born in London in 1890, he had very poor eyesight that prevented him from studying by electric light; he had to learn by having things read out to him. He developed the ability to view problems geometrically and to work mathematical equations in his head. In 1909 he won a scholarship to Cambridge.

He left an academic position teaching mathematics for a position at Rothamsted Agricultural Experiment Station. In this environment he developed many applied analyses for testing experimental hypotheses (Analysis of Variance, circa 1918) and provided much of the foundation for modern statistics. We will see other analyses (in addition to ANOVA) developed by Fisher. Some other contributions by Fisher include the first use of the term "null hypothesis", the development of the F distribution and of the Least Significant Difference, maximum likelihood estimation, and early contributions to nonparametric statistics.

Terminology used in Analysis of Variance

Treatment – different experimental populations that are contained in an experiment and undergo some application or manipulation by the experimenter
Control or check – a "treatment" that receives no experimental manipulation
Experimental unit – the unit to which a treatment is applied
Sampling unit – the unit that is sampled or measured

The linear model is given by Yij = μi + εij, or equivalently Yij = μ + τi + εij, where τi = (μi − μ.) is estimated by τ̂i = (Ȳi. − Ȳ..
). The calculation of the treatment Sum of Squares is a sum of the squared treatment effects,

    SSTreatments = n·Σ(Ȳi. − Ȳ..)²    (sum over i = 1, ..., t)

The treatment Mean Square is the sum of squared effects divided by the degrees of freedom. A variance?

    MSTreatments = n·Σ(Ȳi. − Ȳ..)² / (t − 1) = n·Στ̂i² / (t − 1)

A random treatment effect estimates a variance component. In order for treatments to be random, they should be a random selection from a large (theoretically infinite) number of treatments. Inferences developed from random treatments apply to all the possible treatment levels.

Examples of random effects

The error terms in an experiment are always random; they represent random variation. This variation comes from the experimental unit and sometimes the sampling unit.
Comparing production among rice varieties, where the varieties tested represent a random sample from the world's rice varieties.
Estimating the alcohol content of beer, where the beers tested are randomly sampled from all the beers in the population of interest (world, national).
Oxygen levels in bayous, where randomly selected bayous represent all bayous in the state.

A treatment is FIXED if all possible levels, or all levels of interest, are included in the experiment. The treatment levels are selected by the investigator and are probably not chosen from a very large number of possible values. A fixed treatment estimates the sum of squared fixed effects for the treatments being investigated. This is NOT a variance, but the calculation is the same:

    Στi² / (t − 1)    (sum over i = 1, ..., t)

Examples of fixed effects
• An experiment that includes all of the 7 rice varieties commonly grown in Louisiana
• Beers limited to the 5 micro-breweries in Anchorage, Alaska

Some treatments are common and typically fixed, for example indicator variables that include all possible levels of a treatment.
• Sex (male, female)
• Class (Freshman, Sophomore, Junior, Senior)
• Ordinal treatments, where data are categorized as large, medium and small, or as deep and shallow, or as a high level and a low level
• Before and after
• Treatments with a control group

If the treatments are fixed, then inferences are limited to the treatments included in the experiment. Comparing the individual treatment levels is often of interest, since they were specifically chosen by the investigator. If the treatments are random, inferences are made to the whole population from which the sample was taken; individual "treatment levels" are usually not of interest.

Some treatments can be either fixed or random:
Years – do they represent random variation, or do we categorize them as "wet and dry"?
Months – randomly selected, or do they represent seasons?
Sites or locations – randomly chosen, or selected for certain characteristics?

Example

A new insulin preparation is being compared to an older standard and a saline control. Ten rabbits are administered the preparations and blood sugar is measured after 20 minutes. What are the treatments? Experimental units? Sampling units? Are the treatments fixed or random?

CRD – the Completely Randomized Design

The analysis of variance we have seen is called the Completely Randomized Design (CRD) because the treatments are assigned to the experimental units completely at random. The analysis is also called a "one-way analysis of variance". Later we will discuss the Randomized Block Design (RBD).
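A CRD layout can be generated by complete randomization of treatment labels over the experimental units. The sketch below uses the three preparations from the insulin example, assuming for illustration ten rabbits per preparation; the function name and seed are illustrative, not from the notes:

```python
# Completely randomized design: assign t treatments, each replicated `reps`
# times, to t*reps experimental units entirely at random.
import random

def crd_layout(treatments, reps, seed=None):
    """Return {unit number: treatment} for units 0 .. t*reps - 1,
    with each treatment appearing exactly `reps` times."""
    rng = random.Random(seed)
    labels = [t for t in treatments for _ in range(reps)]
    rng.shuffle(labels)          # complete randomization over all units
    return dict(enumerate(labels))

layout = crd_layout(["new insulin", "old standard", "saline"], reps=10, seed=1)
for unit in range(3):            # show the first few assignments
    print(unit, layout[unit])
```

Because every unit is equally likely to receive every treatment, no blocking or other restriction on the randomization is imposed; that restriction is exactly what the Randomized Block Design adds later.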

This note was uploaded on 12/29/2011 for the course EXST 7005, taught by Professor J. Geaghan at LSU.
