Discovering Statistics Using SPSS: Chapter 8 Answers

Task 1

Imagine that I was interested in how different teaching methods affected students' knowledge. I noticed that some lecturers were aloof and arrogant in their teaching style and humiliated anyone who asked them a question, while others were encouraging and supportive of questions and comments. I took three statistics courses where I taught the same material. For one group of students I wandered around with a large cane and beat anyone who asked daft questions or got questions wrong (punish). In the second group I used my normal teaching style, which is to encourage students to discuss things that they find difficult and to give anyone working hard a nice sweet (reward). The final group I remained indifferent to, and neither punished nor rewarded their efforts (indifferent). As the dependent measure I took the students' exam marks (percentage). Based on theories of operant conditioning, we expect punishment to be a very unsuccessful way of reinforcing learning, but we expect reward to be very successful. Therefore, one prediction is that reward will produce the best learning. A second hypothesis is that punishment should actually retard learning such that it is worse than an indifferent approach. The data are in the file Teach.sav. Carry out a one-way ANOVA and use planned comparisons to test the hypotheses that (1) reward results in better exam results than either punishment or indifference, and (2) indifference will lead to significantly better exam results than punishment.

SPSS Output

Descriptives: Exam Mark

Group        N   Mean     Std. Deviation  Std. Error  95% CI Lower  95% CI Upper  Minimum  Maximum
Punish       10  50.0000  4.13656         1.30809     47.0409       52.9591       45.00    57.00
Indifferent  10  56.0000  7.10243         2.24598     50.9192       61.0808       46.00    67.00
Reward       10  65.4000  4.29987         1.35974     62.3241       68.4759       58.00    71.00
Total        30  57.1333  8.26181         1.50839     54.0483       60.2183       45.00    71.00

This output shows the table of descriptive statistics from the one-way ANOVA; we're told the means, standard deviations, and standard errors of the means for each experimental condition. The means should correspond to those plotted in the graph. These diagnostics are important for interpretation later on. It looks as though marks are highest after reward and lowest after punishment.

Test of Homogeneity of Variances: Exam Mark

Levene Statistic  df1  df2  Sig.
2.569             2    27   .095

The next part of the output reports a test of the assumption of homogeneity of variance (Levene's test). For these data, the assumption of homogeneity of variance has been met, because our significance value is 0.095, which is bigger than the criterion of 0.05.

ANOVA: Exam Mark

Source          Sum of Squares  df  Mean Square  F       Sig.
Between Groups  1205.067        2   602.533      21.008  .000
Within Groups   774.400         27  28.681
Total           1979.467        29

The main ANOVA summary table shows us that, because the observed significance value is less than 0.05, there was a significant effect of teaching style on exam marks. However, at this stage we still do not know exactly what the effect of teaching style was (we don't know which groups differed).

Robust Tests of Equality of Means: Exam Mark

                Statistic(a)  df1  df2     Sig.
Welch           32.235        2    17.336  .000
Brown-Forsythe  21.008        2    20.959  .000

a. Asymptotically F distributed.

This table shows the Welch and Brown-Forsythe Fs, but we can ignore these because the homogeneity of variance assumption was met.
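The analysis above was run in SPSS. Purely as an illustrative aside (not part of the book's answer), the same two steps, Levene's test and the one-way F, could be reproduced in Python with scipy along these lines; the scores below are hypothetical placeholders rather than the actual contents of Teach.sav:

```python
# Minimal sketch of a one-way ANOVA with Levene's test, assuming made-up data
# standing in for Teach.sav (three groups of 10 exam marks).
import numpy as np
from scipy import stats

punish = np.array([45, 48, 50, 52, 53, 47, 49, 51, 55, 50])       # placeholder scores
indifferent = np.array([46, 52, 55, 58, 60, 50, 54, 62, 56, 67])  # placeholder scores
reward = np.array([58, 62, 64, 66, 68, 63, 65, 67, 70, 71])       # placeholder scores

# Levene's test for homogeneity of variance; SPSS centres on the mean,
# whereas scipy's default is the median, hence center='mean'.
levene_stat, levene_p = stats.levene(punish, indifferent, reward, center='mean')
print(f"Levene: F = {levene_stat:.3f}, p = {levene_p:.3f}")

# One-way independent ANOVA
f_stat, p_value = stats.f_oneway(punish, indifferent, reward)
print(f"ANOVA: F(2, 27) = {f_stat:.3f}, p = {p_value:.3f}")
```

With the real Teach.sav scores, these two calls should reproduce the Levene statistic of 2.57 and F(2, 27) = 21.01 reported in the SPSS output.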
Contrast Coefficients

Contrast  Punish  Indifferent  Reward
1         1       1            -2
2         1       -1           0

Because there were specific hypotheses, I specified some contrasts. This table shows the codes I used. The first contrast compares reward (coded with -2) against punishment and indifference (both coded with 1). The second contrast compares punishment (coded with 1) against indifference (coded with -1). Note that the codes for each contrast sum to zero, and that in contrast 2 reward has been coded with a 0 because it is excluded from that contrast.

Contrast Tests: Exam Mark

                                 Contrast  Value of Contrast  Std. Error  t       df      Sig. (2-tailed)
Assume equal variances           1         -24.8000           4.14836     -5.978  27      .000
                                 2         -6.0000            2.39506     -2.505  27      .019
Does not assume equal variances  1         -24.8000           3.76180     -6.593  21.696  .000
                                 2         -6.0000            2.59915     -2.308  14.476  .036

This table shows the significance of the two contrasts specified above. Because homogeneity of variance was met, we can ignore the part of the table labelled "does not assume equal variances". The t-test for the first contrast tells us that reward was significantly different from punishment and indifference (it is significantly different because the value in the column labelled Sig. is less than 0.05). Looking at the means, this tells us that the average mark after reward was significantly higher than the average mark for punishment and indifference combined. The second contrast (and the descriptive statistics) tells us that the marks after punishment were significantly lower than after indifference (again, significantly different because the value in the column labelled Sig. is less than 0.05). As such, we could conclude that reward produces significantly better exam grades than punishment and indifference, and that punishment produces significantly worse exam marks than indifference. So lecturers should reward their students, not punish them!

Calculating the Effect Size

The output provides us with three measures of variance: the between-group effect (SS_M), the within-group effect (SS_R) and the total amount of variance in the data (SS_T). We can use the corresponding mean squares to calculate omega squared (ω²):

ω² = (MS_M − MS_R) / (MS_M + (n − 1) × MS_R)
   = (602.53 − 28.68) / (602.53 + ((10 − 1) × 28.68))
   = 573.85 / (602.53 + 258.12)
   = 0.67

ω = √0.67 = 0.82

For the contrasts, the effect sizes will be:

r_contrast = √(t² / (t² + df))

r_contrast1 = √((−5.978)² / ((−5.978)² + 27)) = 0.75

If you think back to our benchmarks for effect sizes, this represents a huge effect (it is well above 0.5, the threshold for a large effect). Therefore, as well as being statistically significant, this effect is large and so represents a substantive finding. For contrast 2 we get:

r_contrast2 = √((−2.505)² / ((−2.505)² + 27)) = 0.43

This too is a substantive finding and represents a medium to large effect size.

Interpreting and Writing the Result

The correct way to report the main finding would be:

All significant values are reported at p < .05. There was a significant effect of teaching style on exam marks, F(2, 27) = 21.01, ω = .82. Planned contrasts revealed that reward produced significantly better exam grades than punishment and indifference, t(27) = -5.98, r = .75, and that punishment produced significantly worse exam marks than indifference, t(27) = -2.51, r = .43.
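As a quick check of the arithmetic above (again an illustrative aside rather than the book's own code), the effect sizes can be computed directly from the summary statistics in the output; the helper name r_contrast is mine:

```python
# Effect-size arithmetic for Task 1 from the SPSS summary statistics.
import math

# Omega squared from the ANOVA mean squares (n = scores per group)
ms_m, ms_r, n = 602.53, 28.68, 10
omega_sq = (ms_m - ms_r) / (ms_m + (n - 1) * ms_r)
omega = math.sqrt(omega_sq)
print(f"omega^2 = {omega_sq:.2f}, omega = {omega:.2f}")   # 0.67, 0.82

# r for each planned contrast, from its t statistic and degrees of freedom
def r_contrast(t, df):
    return math.sqrt(t**2 / (t**2 + df))

print(f"r_contrast1 = {r_contrast(-5.978, 27):.2f}")  # 0.75
print(f"r_contrast2 = {r_contrast(-2.505, 27):.2f}")  # 0.43
```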
Task 2

In Chapter 11 (Section 11.4) there are some data looking at whether eating soya meals reduces your sperm count. Have a look at that section, access the data for that example, but analyse them with ANOVA. What is the difference between what you find and what is found in Section 11.4? Why do you think this difference has arisen?

SPSS Output

Descriptives: Sperm Count (Millions)

Group                  N   Mean    Std. Deviation  Std. Error  95% CI Lower  95% CI Upper  Minimum  Maximum
No Soya Meals          20  4.9868  5.08437         1.13690     2.6072        7.3663        .35      21.08
1 Soya Meal Per Week   20  4.6052  4.67263         1.04483     2.4184        6.7921        .33      18.47
4 Soya Meals Per Week  20  4.1101  4.40991         .98609      2.0462        6.1740        .40      18.21
7 Soya Meals Per Week  20  1.6530  1.10865         .24790      1.1341        2.1719        .31      4.11
Total                  80  3.8388  4.26048         .47634      2.8906        4.7869        .31      21.08

This output shows the table of descriptive statistics from the one-way ANOVA. It looks as though, as soya intake increases, sperm counts do indeed decrease.

Test of Homogeneity of Variances: Sperm Count (Millions)

Levene Statistic  df1  df2  Sig.
5.117             3    76   .003

The next part of the output reports a test of the assumption of homogeneity of variance (Levene's test). For these data, the assumption of homogeneity of variance has been broken, because our significance value is 0.003, which is smaller than the criterion of 0.05. In fact, these data also violate the assumption of normality (see the chapter on nonparametric statistics).

ANOVA: Sperm Count (Millions)

Source          Sum of Squares  df  Mean Square  F      Sig.
Between Groups  135.130         3   45.043       2.636  .056
Within Groups   1298.853        76  17.090
Total           1433.983        79

The main ANOVA summary table shows us that, because the observed significance value is greater than 0.05, there was no significant effect of soya intake on men's sperm count. This is strange because, if you read the chapter on nonparametric statistics from which this example came, the Kruskal-Wallis test produced a significant result! The reason for this difference is that the data violate the assumptions of normality and homogeneity of variance. As I mention in the chapter on nonparametric statistics, although parametric tests have more power to detect effects when their assumptions are met, when their assumptions are violated nonparametric tests have more power. This example was arranged to prove this point: because the parametric assumptions are violated, the nonparametric test produced a significant result and the parametric test did not, because in these circumstances the nonparametric test has the greater power.

Robust Tests of Equality of Means: Sperm Count (Millions)

                Statistic(a)  df1  df2     Sig.
Welch           6.284         3    34.657  .002
Brown-Forsythe  2.636         3    58.236  .058

a. Asymptotically F distributed.

This table shows the Welch and Brown-Forsythe Fs. Note that the Welch test agrees with the nonparametric test in that the significance of F is below the 0.05 threshold, whereas the Brown-Forsythe F is non-significant (it is just above the threshold). This illustrates the relative superiority of the Welch procedure. However, in these circumstances, because normality and homogeneity of variance have been violated, we would use a nonparametric test anyway!
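The parametric-versus-nonparametric contrast this task turns on can also be illustrated outside SPSS. The sketch below is a hedged aside on simulated, skewed data (not the book's data file), simply showing the two tests run side by side; with the real data the ANOVA is non-significant while Kruskal-Wallis is significant, as discussed above:

```python
# Run both the parametric one-way ANOVA and the nonparametric Kruskal-Wallis
# test on the same four groups of skewed, heteroscedastic placeholder data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
no_soya = rng.lognormal(mean=1.0, sigma=1.0, size=20)
one_meal = rng.lognormal(mean=0.9, sigma=1.0, size=20)
four_meals = rng.lognormal(mean=0.8, sigma=1.0, size=20)
seven_meals = rng.lognormal(mean=0.2, sigma=0.6, size=20)

f_stat, p_anova = stats.f_oneway(no_soya, one_meal, four_meals, seven_meals)
h_stat, p_kw = stats.kruskal(no_soya, one_meal, four_meals, seven_meals)

print(f"ANOVA:          F = {f_stat:.3f}, p = {p_anova:.3f}")
print(f"Kruskal-Wallis: H = {h_stat:.3f}, p = {p_kw:.3f}")
# With skewed, unequal-variance data like this the two tests can disagree,
# which is the point the chapter makes about power under violated assumptions.
```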
Task 3

Students (and lecturers for that matter) love their mobile phones, which is rather worrying given some recent controversy about links between mobile phone use and brain tumours. The basic idea is that mobile phones emit microwaves, and so holding one next to your brain for large parts of the day is a bit like sticking your brain in a microwave oven and selecting the 'cook until well done' setting. If we wanted to test this experimentally, we could get six groups of people and strap a mobile phone to their heads (one that they cannot remove). Then, by remote control, we turn the phones on for a certain amount of time each day. After six months, we measure the size of any tumour (in mm³) close to the site of the phone antenna (just behind the ear). The six groups experienced 0, 1, 2, 3, 4 or 5 hours per day of phone microwaves for six months. The data are in Tumour.sav. (From Field & Hole, 2003, so there is a very detailed answer in there.)

SPSS Output

The error bar chart of the mobile phone data shows the mean size of brain tumour in each condition, and the 'I' shapes show the confidence interval of these means. Note that in the control group (0 hours) the mean size of the tumour is virtually zero (we wouldn't actually expect them to have a tumour) and the error bar shows that there was very little variance across samples. We'll see later that this is problematic for the analysis.

Descriptives: Size of Tumour (mm³)

Hours Per Day  N    Mean    Std. Deviation  Std. Error  95% CI Lower  95% CI Upper  Minimum  Maximum
0              20   .0175   .01213          .00271      .0119         .0232         .00      .04
1              20   .5149   .28419          .06355      .3819         .6479         .00      .94
2              20   1.2614  .49218          .11005      1.0310        1.4917        .48      2.34
3              20   3.0216  .76556          .17118      2.6633        3.3799        1.77     4.31
4              20   4.8878  .69625          .15569      4.5619        5.2137        3.04     6.05
5              20   4.7306  .78163          .17478      4.3648        5.0964        2.70     6.14
Total          120  2.4056  2.02662         .18500      2.0393        2.7720        .00      6.14

This output shows the table of descriptive statistics from the one-way ANOVA; we're told the means, standard deviations, and standard errors of the means for each experimental condition. The means should correspond to those plotted in the graph. These diagnostics are important for interpretation later on.

Test of Homogeneity of Variances: Size of Tumour (mm³)

Levene Statistic  df1  df2  Sig.
10.245            5    114  .000

The next part of the output reports a test of this assumption, Levene's test. For these data, the assumption of homogeneity of variance has been violated, because our significance value is 0.000, which is considerably smaller than the criterion of 0.05. In these situations, we have to try to correct the problem, and we can either transform the data or choose the Welch F.

ANOVA: Size of Tumour (mm³)

Source          Sum of Squares  df   Mean Square  F        Sig.
Between Groups  450.664         5    90.133       269.733  .000
Within Groups   38.094          114  .334
Total           488.758         119

The main ANOVA summary table shows us that, because the observed significance value is less than 0.05, there was a significant effect of mobile phones on the size of tumour. However, at this stage we still do not know exactly what the effect of the phones was (we don't know which groups differed).

Robust Tests of Equality of Means: Size of Tumour (mm³)

                Statistic(a)  df1  df2     Sig.
Welch           414.926       5    44.390  .000
Brown-Forsythe  269.733       5    75.104  .000

a. Asymptotically F distributed.

This table shows the Welch and Brown-Forsythe Fs, which are useful because homogeneity of variance was violated. Luckily our conclusions remain the same: both Fs have significance values less than 0.05.
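For readers working outside SPSS, Welch's F can be computed directly from raw scores. The sketch below is an assumed implementation of the standard Welch (1951) formula; the function name welch_anova and the placeholder group data are mine, not from the book:

```python
# Welch's heteroscedasticity-robust one-way ANOVA, as reported by SPSS in the
# "Robust Tests of Equality of Means" table.
import numpy as np
from scipy import stats

def welch_anova(*groups):
    """Return Welch's F, its degrees of freedom and p-value."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])

    w = n / v                        # weights: n_i / s_i^2
    mw = np.sum(w * m) / np.sum(w)   # weighted grand mean
    a = np.sum(w * (m - mw) ** 2) / (k - 1)
    tmp = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    b = 1 + (2 * (k - 2) / (k ** 2 - 1)) * tmp
    f_val = a / b
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * tmp)
    p = stats.f.sf(f_val, df1, df2)
    return f_val, df1, df2, p

# Placeholder tumour sizes loosely mimicking the six groups above
rng = np.random.default_rng(1)
groups = [rng.normal(loc=mu, scale=sd, size=20)
          for mu, sd in [(0.02, 0.01), (0.5, 0.3), (1.3, 0.5),
                         (3.0, 0.8), (4.9, 0.7), (4.7, 0.8)]]
f_val, df1, df2, p = welch_anova(*groups)
print(f"Welch F({df1}, {df2:.1f}) = {f_val:.2f}, p = {p:.4f}")
```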
Multiple Comparisons (Games-Howell)
Dependent Variable: Size of Tumour (mm³)

(I) Hours  (J) Hours  Mean Difference (I-J)  Std. Error  Sig.   95% CI Lower  95% CI Upper
0          1          -.4973*                .18280      .000   -.6982        -.2964
0          2          -1.2438*               .18280      .000   -1.5916       -.8960
0          3          -3.0040*               .18280      .000   -3.5450       -2.4631
0          4          -4.8702*               .18280      .000   -5.3622       -4.3783
0          5          -4.7130*               .18280      .000   -5.2653       -4.1608
1          0          .4973*                 .18280      .000   .2964         .6982
1          2          -.7465*                .18280      .000   -1.1327       -.3603
1          3          -2.5067*               .18280      .000   -3.0710       -1.9424
1          4          -4.3729*               .18280      .000   -4.8909       -3.8549
1          5          -4.2157*               .18280      .000   -4.7908       -3.6406
2          0          1.2438*                .18280      .000   .8960         1.5916
2          1          .7465*                 .18280      .000   .3603         1.1327
2          3          -1.7602*               .18280      .000   -2.3762       -1.1443
2          4          -3.6264*               .18280      .000   -4.2017       -3.0512
2          5          -3.4692*               .18280      .000   -4.0949       -2.8436
3          0          3.0040*                .18280      .000   2.4631        3.5450
3          1          2.5067*                .18280      .000   1.9424        3.0710
3          2          1.7602*                .18280      .000   1.1443        2.3762
3          4          -1.8662*               .18280      .000   -2.5607       -1.1717
3          5          -1.7090*               .18280      .000   -2.4429       -.9751
4          0          4.8702*                .18280      .000   4.3783        5.3622
4          1          4.3729*                .18280      .000   3.8549        4.8909
4          2          3.6264*                .18280      .000   3.0512        4.2017
4          3          1.8662*                .18280      .000   1.1717        2.5607
4          5          .1572                  .18280      .984   -.5455        .8599
5          0          4.7130*                .18280      .000   4.1608        5.2653
5          1          4.2157*                .18280      .000   3.6406        4.7908
5          2          3.4692*                .18280      .000   2.8436        4.0949
5          3          1.7090*                .18280      .000   .9751         2.4429
5          4          -.1572                 .18280      .984   -.8599        .5455

*. The mean difference is significant at the .05 level.

Because there were no specific hypotheses, I just carried out post hoc tests and stuck to my favourite Games-Howell procedure (because variances were unequal). It is clear from the table that each group of participants is compared to all of the remaining groups. First, the control group (0 hours) is compared to the 1-hour, 2-hour, 3-hour, 4-hour and 5-hour groups, and reveals a significant difference in all cases (all the values in the column labelled Sig. are less than 0.05). In the next part of the table, the 1-hour group is compared to all other groups. Again all comparisons are significant (all the values in the column labelled Sig. are less than 0.05). In fact, all of the comparisons appear to be highly significant except the comparison between the 4-hour and 5-hour groups, which is non-significant because the value in the column labelled Sig. is bigger than 0.05.

Calculating the Effect Size

The output provides us with three measures of variance: the between-group effect (SS_M), the within-group effect (SS_R) and the total amount of variance in the data (SS_T). We can use the corresponding mean squares to calculate omega squared (ω²):

ω² = (MS_M − MS_R) / (MS_M + (n − 1) × MS_R)
   = (90.13 − 0.33) / (90.13 + ((20 − 1) × 0.33))
   = 89.80 / (90.13 + 6.27)
   = 0.93

ω = √0.93 = 0.96

Interpreting and Writing the Result

We could report the main finding as:

• Levene's test indicated that the assumption of homogeneity of variance had been violated, F(5, 114) = 10.25, p < .001. Transforming the data did not rectify this problem and so F-tests are reported nevertheless. The results show that using a mobile phone significantly affected the size of brain tumour found in participants, F(5, 114) = 269.73, p < .001, r = .96. The effect size indicated that the effect of phone use on tumour size was substantial.

The next thing that needs to be reported is the set of post hoc comparisons.
It is customary just to summarise these tests in very general terms, like this:

• Games-Howell post hoc tests revealed significant differences between all groups (p < .001 for all tests) except between the 4-hour and 5-hour groups (ns).

If you do want to report the results for each post hoc test individually, then at least include the 95% confidence intervals, as these tell us more than just the significance value. In an example like this, though, where there are many tests, it might be as well to summarise the confidence intervals in a table:

Comparison (Hours Per Day)  Sig.    95% CI Lower  95% CI Upper
0 vs 1                      < .001  -.6982        -.2964
0 vs 2                      < .001  -1.5916       -.8960
0 vs 3                      < .001  -3.5450       -2.4631
0 vs 4                      < .001  -5.3622       -4.3783
0 vs 5                      < .001  -5.2653       -4.1608
1 vs 2                      < .001  -1.1327       -.3603
1 vs 3                      < .001  -3.0710       -1.9424
1 vs 4                      < .001  -4.8909       -3.8549
1 vs 5                      < .001  -4.7908       -3.6406
2 vs 3                      < .001  -2.3762       -1.1443
2 vs 4                      < .001  -4.2017       -3.0512
2 vs 5                      < .001  -4.0949       -2.8436
3 vs 4                      < .001  -2.5607       -1.1717
3 vs 5                      < .001  -2.4429       -.9751
4 vs 5                      = .984  -.5455        .8599
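Finally, as an illustrative aside, a single Games-Howell comparison can be approximated from the summary statistics using the studentized range distribution. This is my own sketch (it is not SPSS's algorithm and it assumes SciPy 1.7 or later for studentized_range); applied to the 4-hour versus 5-hour groups it should give a p-value in the same non-significant region as the .984 reported above:

```python
# One Games-Howell pairwise comparison from group means, SDs and sizes.
import math
from scipy.stats import studentized_range  # requires SciPy >= 1.7 (assumption)

def games_howell_pair(m1, s1, n1, m2, s2, n2, k):
    """Approximate p-value for one Games-Howell comparison among k groups."""
    se_sq1, se_sq2 = s1**2 / n1, s2**2 / n2
    q = (m1 - m2) / math.sqrt((se_sq1 + se_sq2) / 2)
    # Welch-Satterthwaite degrees of freedom for this pair
    df = (se_sq1 + se_sq2) ** 2 / (se_sq1**2 / (n1 - 1) + se_sq2**2 / (n2 - 1))
    p = studentized_range.sf(abs(q), k, df)
    return q, df, p

# 4-hour vs 5-hour groups (means and SDs from the Descriptives table, k = 6)
q, df, p = games_howell_pair(4.8878, 0.69625, 20, 4.7306, 0.78163, 20, k=6)
print(f"q = {q:.2f}, df = {df:.1f}, p = {p:.3f}")   # clearly non-significant
```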