STAT E50 - Introduction to Statistics
One-Way Analysis of Variance

To compare two or more groups:
• To test two proportions, use a z-test
• To test more than two proportions, use a χ² test
• To test two means, use a t-test
• To test more than two means, use Analysis of Variance (ANOVA)

In Analysis of Variance, we test for the equality of the means of several levels of a variable. The technique is to compare the variation between the levels with the variation within each level. (The levels of the variable are also referred to as groups, or treatments.) If the variation due to the level (variation between levels) is significantly larger than the variation within each level, then we can conclude that the means of the levels are not all the same.

We will test the hypothesis

$H_0: \mu_1 = \mu_2 = \dots = \mu_k$ vs. $H_a$: the means are not all equal

using the ratio

(differences among means) / (variability within the groups)

When the numerator is large compared to the denominator, we will reject the null hypothesis.
• The numerator measures the variation between groups; this is called the Treatment Mean Square, or $MS_T$
• The denominator measures the variation within groups; this is called the Error Mean Square, or $MS_E$
• The test statistic is $F = MS_T / MS_E$; reject $H_0$ when F is large.
• $MS_T$ has k - 1 degrees of freedom, where k = the number of groups
• $MS_E$ has k(n - 1) degrees of freedom, where n = the number of observations from each group (this form assumes every group has the same n; in general the error degrees of freedom are N - k, where N is the total number of observations)

Assumptions and Conditions
First plot the data in side-by-side boxplots. Look for outliers, similar spreads, similar centers.

Independence Assumption
The groups must be independent of each other. The data within each group must be independent data drawn at random from a homogeneous population, or generated by a randomized comparative experiment.
Randomization Condition: Were the data collected randomly?
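As a rough illustration of the F ratio introduced earlier in these notes, the following Python sketch computes the Treatment Mean Square, the Error Mean Square, and F for a balanced design (the same n in every group). The data values and variable names are made up for illustration and do not come from the course.

```python
from statistics import mean, variance

# Hypothetical data: k = 3 groups with n = 3 observations each (made-up values).
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [6.0, 7.0, 8.0],
]

k = len(groups)       # number of groups
n = len(groups[0])    # observations per group (balanced design assumed)

group_means = [mean(g) for g in groups]

# Variation between groups: n times the sample variance of the group means.
ms_t = n * variance(group_means)

# Variation within groups: the average of the group sample variances
# (this equals the pooled variance only because every group has the same n).
ms_e = mean([variance(g) for g in groups])

f_stat = ms_t / ms_e
df_t = k - 1          # degrees of freedom for MS_T
df_e = k * (n - 1)    # degrees of freedom for MS_E

print(f"F = {f_stat:.2f} on {df_t} and {df_e} df")  # prints: F = 21.00 on 2 and 6 df
```

Here the group means (2, 3, 7) are far apart relative to the spread inside each group, so F is large. Deciding whether it is large enough still requires comparing it with an F distribution on 2 and 6 degrees of freedom.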
Equal Variance Assumption
We need the variances of the treatment groups to be equal, so that we can find a pooled variance.
Similar Variance Condition:
• Compare the spreads of the groups in the side-by-side boxplots
• Are the spreads changing systematically with the centers?
• Plot the residuals vs. the predicted values to be sure you don't have larger residuals for larger values

Normal Population Assumption
Nearly Normal Condition:
• Compare the boxplots for skewness
• Pool the residuals and create a histogram or NPP

The ANOVA table:

Source    df      SS      MS                    F                  p
Factor    df_T    SS_T    MS_T = SS_T / df_T    F = MS_T / MS_E
Error     df_E    SS_E    MS_E = SS_E / df_E
Total

Suppose that these are the ages of people attending a large family reunion....
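The entries of the ANOVA table in these notes can be filled in from sums of squares. The sketch below is one possible way to do that in plain Python, and it also collects the pooled residuals used for the Nearly Normal Condition. The function name `one_way_anova`, the helper `fmt`, and the data are hypothetical; the numbers are made up and are not the family reunion ages. The p column is left out because it needs an F distribution (for example, `scipy.stats.f.sf(F, df_T, df_E)`, assuming SciPy is installed).

```python
def one_way_anova(groups):
    """Return (table_rows, residuals) for a dict mapping group label -> observations."""
    all_obs = [x for obs in groups.values() for x in obs]
    grand_mean = sum(all_obs) / len(all_obs)

    ss_t = 0.0       # treatment (between-group) sum of squares
    ss_e = 0.0       # error (within-group) sum of squares
    residuals = []   # observation minus its own group mean, pooled over groups
    for obs in groups.values():
        group_mean = sum(obs) / len(obs)
        ss_t += len(obs) * (group_mean - grand_mean) ** 2
        for x in obs:
            residuals.append(x - group_mean)
            ss_e += (x - group_mean) ** 2

    df_t = len(groups) - 1
    df_e = len(all_obs) - len(groups)   # equals k(n - 1) when all groups have size n
    ms_t = ss_t / df_t
    ms_e = ss_e / df_e

    table = [
        ("Factor", df_t, ss_t, ms_t, ms_t / ms_e),
        ("Error", df_e, ss_e, ms_e, None),
        ("Total", df_t + df_e, ss_t + ss_e, None, None),
    ]
    return table, residuals


def fmt(value):
    """Format a table cell, leaving empty cells blank."""
    return "" if value is None else f"{value:.2f}"


# Hypothetical data (made-up values), three groups of three observations.
data = {"A": [3, 5, 4], "B": [6, 8, 7], "C": [9, 11, 10]}
table, residuals = one_way_anova(data)

print(f"{'Source':<8}{'df':>4}{'SS':>8}{'MS':>8}{'F':>8}")
for source, df, ss, ms, f in table:
    print(f"{source:<8}{df:>4}{fmt(ss):>8}{fmt(ms):>8}{fmt(f):>8}")
```

For these made-up values the Factor row has SS = 54 on 2 df, the Error row has SS = 6 on 6 df, and F = 27. The `residuals` list is what would be plotted in a histogram or normal probability plot.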
This note was uploaded on 05/20/2008 for the course STAT 50 taught by Professor Weinstein during the Spring '08 term at Harvard.
