class 1-ANOVA- PPT - BABS 540 Data Utilization Lecture 1...


BABS 540: Data Utilization
Lecture 1: ANOVA

What is Statistics?
“The science of collecting, organizing, and interpreting data”
• Moore, McCabe, Duckworth, and Alwan
“...a way of reasoning, along with a collection of tools and methods, designed to help us understand the world.”
• Sharpe, De Veaux, and Velleman
 
What are statistics?

Where do we use statistics?
• Marketing
• Finance
• Accounting
• Supply Chain
• Human Resources
• … pretty much every area of business.

Statistics
One of the main things we do in statistics is: Use a sample to infer something about a population.

Statistics
“... the most important science in the whole world: for upon it depends the practical application of every other science and of every art; the one science essential to all political and social administration, all education, all organization based upon experience, for it only gives the results of our experience.”
• Florence Nightingale

Variation or Uncertainty
The world is full of uncertainty. And of variation.
How can we determine when there are real differences?
Or real patterns?
Or when we are simply observing natural variation?

Three main points of BABS 540
1. Statistics is (are) everywhere.
2. Understanding data is vital to making good decisions in business (and life).
3. Working with data requires being able to match the right tool to the job.

What does Mean Mean?
Measures of central tendency:
• mean
• median
• mode
Source: economix.blogs.nytimes.com

WWII Bombers

Administrivia
• Expectations
• Syllabus
• Questions

Dead Fish
• Dead fish respond to human emotions (p < 0.001).

Neural correlates of interspecies perspective taking in the post-mortem Atlantic Salmon: An argument for multiple comparisons correction
Craig M. Bennett (1), Abigail A. Baird (2), Michael B. Miller (1), and George L. Wolford (3)
(1) Psychology Department, University of California Santa Barbara, Santa Barbara, CA; (2) Department of Psychology, Vassar College, Poughkeepsie, NY; (3) Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH

INTRODUCTION
With the extreme dimensionality of functional neuroimaging data comes extreme risk for false positives. Across the 130,000 voxels in a typical fMRI volume the probability of a false positive is almost certain. Correction for multiple comparisons should be completed with these datasets, but is often ignored by investigators. To illustrate the magnitude of the problem we carried out a real experiment that demonstrates the danger of not correcting for chance properly.

METHODS
Subject. One mature Atlantic Salmon (Salmo salar) participated in the fMRI study. The salmon was approximately 18 inches long, weighed 3.8 lbs, and was not alive at the time of scanning.
Task. The task administered to the salmon involved completing an open-ended mentalizing task. The salmon was shown a series of photographs depicting human individuals in social situations with a specified emotional valence. The salmon was asked to determine what emotion the individual in the photo must have been experiencing.
Design. Stimuli were presented in a block design with each photo presented for 10 seconds followed by 12 seconds of rest. A total of 15 photos were displayed. Total scan time was 5.5 minutes.
Preprocessing. Image processing was completed using SPM2. Preprocessing steps for the functional imaging data included a 6-parameter rigid-body affine realignment of the fMRI timeseries, coregistration of the data to a T1-weighted anatomical image, and 8 mm full-width at half-maximum (FWHM) Gaussian smoothing.
Analysis. Voxelwise statistics on the salmon data were calculated through an ordinary least-squares estimation of the general linear model (GLM). Predictors of the hemodynamic response were modeled by a boxcar function convolved with a canonical hemodynamic response. A temporal high pass filter of 128 seconds was …

GLM RESULTS
A t-contrast was used to test for regions with significant BOLD signal change during the photo condition compared to rest. The parameters for this comparison were t(131) > 3.15, p(uncorrected) < 0.001, 3 voxel extent threshold.
Several active voxels were discovered in a cluster located within the salmon’s brain cavity (Figure 1, see above). The size of this cluster was 81 mm³ with a cluster-level significance of p = 0.001. Due to the coarse resolution of the echo-planar image acquisition and the relatively small size of the salmon brain further discrimination between brain regions could not be completed. Out of a search volume of 8064 voxels a total of 16 voxels were significant.
Identical t-contrasts controlling the false discovery rate (FDR) and familywise error rate (FWER) were completed. These contrasts indicated no active voxels, even at relaxed statistical thresholds (p = 0.25).

Evidence
What is “evidence”?
How much evidence is enough evidence?
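Looking back at the salmon poster, a little arithmetic shows why uncorrected thresholds are risky. The sketch below uses the poster's search volume (8064 voxels) and uncorrected threshold (p < 0.001). It assumes the voxel tests are independent, which smoothed fMRI data are not, and it ignores the 3-voxel extent threshold, so treat the numbers as rough. The Bonferroni cutoff is shown only as one familiar way to control the familywise error rate; the poster does not say which correction method was used, and the function names are illustrative.

```python
# Rough sketch: false positives when many tests are run at one threshold.
# Assumption (not from the poster): every voxel test is independent.

def expected_false_positives(n_tests, alpha):
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha

def prob_at_least_one_false_positive(n_tests, alpha):
    """Familywise error rate for n_tests independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(n_tests, family_alpha=0.05):
    """Per-test cutoff that keeps the familywise error rate at family_alpha."""
    return family_alpha / n_tests

n_voxels = 8064   # search volume reported in the poster
alpha = 0.001     # uncorrected per-voxel threshold reported in the poster

expected = expected_false_positives(n_voxels, alpha)
fwer = prob_at_least_one_false_positive(n_voxels, alpha)
cutoff = bonferroni_threshold(n_voxels)

print(f"Expected false positives: {expected:.2f}")
print(f"P(at least one false positive): {fwer:.4f}")
print(f"Bonferroni per-voxel cutoff: {cutoff:.2e}")
```

Under these assumptions roughly eight voxels would be expected to pass the threshold by chance alone, and at least one false positive is close to certain.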
 
 
 
 
 
 
What does “inference” mean?
What is a hypothesis test?

Simple Review

Example #1
Bottling Line A: Is the volume of whisky different from the intended 750 ml?
Questions:
• What hypothesis test should we use?
• What is the null hypothesis?
• After conducting this test, what is the conclusion?

Example #2
Are Lines A and B different?
Questions:
• What hypothesis test should we use?
• What is the null hypothesis?
• After conducting this test, what is the conclusion?

Is There a 3-Sample t-Test?
Are Lines A, B, and C different?
Questions:
• What hypothesis test should we use?
• What is the null hypothesis?
• After conducting this test, what is the conclusion?

ANOVA
ANOVA = ANalysis Of VAriance
• Intended for comparing three or more groups in a single-factor experiment.
• Can also (cautiously) be used for observational data.
• Used to answer the question: Is at least one of the means different?
H0: µ1 = µ2 = … = µk
HA: At least one mean is different.
ANOVA works by calculating Mean Square Treatment divided by Mean Square Error, then looking this ratio up in the appropriate F-distribution. (See Excel output)

ANOVA Assumptions
✓ Independence Assumption: The groups must be independent of each other.
  ➡ Randomization?
✓ Equal Variance Assumption: The true variances of the groups are equal.
  ➡ Check the box plots to see if the spreads are close enough.
✓ Normal Population Assumption: The residuals are “nearly Normal”… but, thanks to the CLT, the larger the samples the less this matters.
• Note that the residual for each value is the distance from that value to the group average (+ or –).

Marketing Example
Average dollars spent on schamazon.com under each of three possible themes.
[Figure: themes A, B, and C for the Schamazon.com page, each showing Toys, Tools, Trucks]

Healthcare Example
What examples can you think of in healthcare?
Other examples where ANOVA might be used?

ANOVA Beyond BABS 540
• Two-way ANOVA.
• Data transformations.
• Applications.
...
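The ANOVA slide describes the F ratio as Mean Square Treatment divided by Mean Square Error. As a minimal sketch of that calculation, the code below works through three bottling lines by hand. The fill volumes are made-up illustrative numbers, not course data, and the function name `one_way_anova_f` is hypothetical. The lecture refers to Excel output; this only mirrors the same arithmetic.

```python
from statistics import fmean

def one_way_anova_f(groups):
    """Return (F, df_treatment, df_error) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])

    # Between-group variation: how far each group mean sits from the grand mean.
    ss_treatment = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: squared residuals from each group's own mean.
    ss_error = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    df_treatment = k - 1
    df_error = n_total - k
    ms_treatment = ss_treatment / df_treatment   # Mean Square Treatment
    ms_error = ss_error / df_error               # Mean Square Error
    return ms_treatment / ms_error, df_treatment, df_error

# Hypothetical fill volumes in ml for Lines A, B, and C (illustrative only).
line_a = [749, 750, 751]
line_b = [751, 752, 753]
line_c = [748, 749, 750]

f_stat, df1, df2 = one_way_anova_f([line_a, line_b, line_c])
print(f"F = {f_stat:.2f} on ({df1}, {df2}) degrees of freedom")
```

The F value is then looked up in the F-distribution with those two degrees of freedom to get a p-value. If SciPy is available, `scipy.stats.f_oneway(line_a, line_b, line_c)` should report the same F statistic together with the p-value.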