Discovering Statistics Using SPSS: Chapter 4
Chapter 4: Answers
Task 1
A student was interested in whether there was a positive relationship between the time spent
doing an essay and the mark received. He got 45 of his friends and timed how long they spent
writing an essay (hours) and the percentage they got in the essay (essay). He also
translated these grades into their degree classifications (grade): first, upper second, lower
second and third class. Using the data in the file EssayMarks.sav, find out what the
relationship was between the time spent doing an essay and the eventual mark in terms of
percentage and degree class (draw a scatterplot too!).
We’re interested in looking at the relationship between hours spent on an essay and the grade
obtained. We could simply do a scatterplot of hours spent on the essay (x-axis) and essay
mark (y-axis). I’ve also chosen to highlight the degree classification grades using different
symbols (just place the variable grade in the style box). The resulting scatterplot should
look like this:

[Scatterplot: Essay Mark (%) on the y-axis against Hours Spent on Essay on the x-axis. Points are marked by Grade (First Class, Upper Second Class, Lower Second Class, Third Class). Linear Regression line: Essay Mark (%) = 57.93 + 0.66 * hours, R-Square = 0.07.]

Next, we should check whether the data are parametric using the explore menu (see Chapter
3). The resulting table is as follows:

Dr. Andy Field Page 1 4/21/2003

Tests of Normality

                        Kolmogorov-Smirnov(a)          Shapiro-Wilk
                        Statistic   df    Sig.         Statistic   df    Sig.
Essay Mark (%)          .111        45    .200*        .977        45    .493
Hours Spent on Essay    .091        45    .200*        .981        45    .662

*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

The K-S and Shapiro-Wilk statistics are both non-significant (Sig. is > .05 in all cases) for both
variables, which suggests that neither distribution deviates significantly from normality. As such we can use Pearson’s
correlation coefficient. The result is:
Correlations

                                              Essay Mark (%)   Hours Spent on Essay
Essay Mark (%)         Pearson Correlation    1                .267*
                       Sig. (1-tailed)        .                .038
                       N                      45               45
Hours Spent on Essay   Pearson Correlation    .267*            1
                       Sig. (1-tailed)        .038             .
                       N                      45               45

*. Correlation is significant at the 0.05 level (1-tailed).

I chose a 1-tailed test because a specific prediction was made: there would be a positive
relationship, that is, the more time you spend on your essay, the better mark you’ll get. This
hypothesis is supported because Pearson’s r = .27 (a medium effect size), p < .05, is
significant.
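
As a rough check on these figures, the significance of r can be recomputed from r and N alone, using t = r√(N − 2)/√(1 − r²) on N − 2 degrees of freedom. The sketch below is not part of the original answer and is not how SPSS does it internally; the function names are invented for illustration, and the p-value comes from a simple numerical integration of the t density.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(r, n, steps=2000):
    """Return (t, p) for a Pearson r from n pairs, testing H1: r > 0.

    Hypothetical helper: the p-value is the upper-tail area of the
    t distribution, found with Simpson's rule (steps must be even).
    """
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r * r)
    b = abs(t)
    h = b / steps
    # Simpson's rule for the area under the density between 0 and |t|.
    area = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    upper_tail = 0.5 - area
    p = upper_tail if t >= 0 else 1 - upper_tail
    return t, p


# Values taken from the Correlations table above.
t, p = one_tailed_p(0.267, 45)
r_squared = 0.267 ** 2
print(f"t(43) = {t:.2f}, one-tailed p = {p:.3f}, r^2 = {r_squared:.2f}")
```

With r = .267 and N = 45 this should give a t of roughly 1.82 and a one-tailed p close to the .038 that SPSS reports, and r² of about .07 agrees with the R-Square shown on the scatterplot.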
The second part of the question asks us to do the same analysis but when the percentages are
recoded into degree classifications. The degree classifications are ordinal data (not interval):
they are ordered categories, so we shouldn’t use Pearson’s correlation coefficient, but Spearman’s rho and
Kendall’s tau instead:
Correlations

                                                                    Hours Spent on Essay   Grade
Kendall's tau_b   Hours Spent on Essay   Correlation Coefficient    1.000                  -.158
                                         Sig. (1-tailed)            .                      .089
                                         N                          45                     45
                  Grade                  Correlation Coefficient    -.158                  1.000
                                         Sig. (1-tailed)            .089                   .
                                         N                          45                     45
Spearman's rho    Hours Spent on Essay   Correlation Coefficient    1.000                  -.193
                                         Sig. (1-tailed)            .                      .102
                                         N                          45                     45
                  Grade                  Correlation Coefficient    -.193                  1.000
                                         Sig. (1-tailed)            .102                   .
                                         N                          45                     45

In both cases the correlation is non-significant. There was no significant relationship between
degree grade classification for an essay and the time spent doing it, ρ = –.19, ns, and τ = –.16, ns.
Note that the direction of the relationship has reversed. This has happened because
the essay marks were recoded as 1 (first), 2 (upper second), 3 (lower second), and 4 (third)
and so high grades were represented by low numbers!
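
To see why the coding alone flips the sign, here is a minimal sketch of Spearman's rho (Pearson's r on average ranks) and Kendall's tau-b in plain Python. The eight data points are invented for illustration and are not taken from EssayMarks.sav, and the function names are my own; SPSS's routines will differ in detail.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b, which adjusts the denominator for tied pairs."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from both terms
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom


# Hypothetical data: hours spent, and grade coded 1 (first) to 4 (third).
hours = [2, 4, 5, 7, 9, 10, 12, 14]
grade_code = [4, 3, 3, 2, 3, 2, 1, 1]
rho, tau = spearman_rho(hours, grade_code), kendall_tau_b(hours, grade_code)

# Recode so that high numbers mean high grades (4 = first, 1 = third).
reversed_code = [5 - g for g in grade_code]
rho_rev = spearman_rho(hours, reversed_code)
tau_rev = kendall_tau_b(hours, reversed_code)
print(f"coded 1 = first: rho = {rho:.2f}, tau = {tau:.2f}")
print(f"coded 4 = first: rho = {rho_rev:.2f}, tau = {tau_rev:.2f}")
```

In this made-up sample, more hours go with better grades, yet both coefficients come out negative under the 1 = first coding. Reversing the codes changes only the sign, not the size, of either coefficient.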
This illustrates one of the benefits of NOT taking continuous data (like percentages) and
transforming them into categorical data: when you do, you lose information and often
statistical power!

Task 2
Using the ChickFlick.sav data from Chapter 3, is there a relationship between gender and
arousal? Using the same data, is there a relationship between the film watched and arousal?
Now, both gender and the film watched are categorical variables with two categories.
Therefore, we need to look at this relationship using a point-biserial correlation. The resulting
tables are as follows:
Correlations

                                  Gender    Arousal
Gender    Pearson Correlation     1         -.180
          Sig. (2-tailed)         .         .266
          N                       40        40
Arousal   Pearson Correlation     -.180     1
          Sig. (2-tailed)         .266      .
          N                       40        40

Correlations

                                  Film      Arousal
Film      Pearson Correlation     1         .638**
          Sig. (2-tailed)         .         .000
          N                       40        40
Arousal   Pearson Correlation     .638**    1
          Sig. (2-tailed)         .000      .
          N                       40        40

**. Correlation is significant at the 0.01 level (2-tailed).

In both cases I used a 2-tailed test because no prediction was made. As you can see there was
no significant relationship between gender and arousal, rpb = –.18, ns. However, there was a
significant relationship between the film watched and arousal, rpb = .64, p < .001. Looking at
how the groups were coded, you should see that Bridget Jones’ Diary had a code of 1, and
Memento had a code of 2; therefore, this result reflects the fact that as film goes up (changes
from 1 to 2) arousal goes up. Put another way, as the film changes from Bridget Jones’ Diary
to Memento, arousal increases. So, Memento gave rise to the greater arousal levels.
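
The point-biserial correlation is simply Pearson's r with one variable coded as two numbers, so its sign depends entirely on which group gets the higher code. The sketch below illustrates this with invented scores (not the ChickFlick.sav data) and hypothetical function names; it also checks the result against the group-means formula rpb = (M_high − M_low)/s × √(n_low × n_high / n²), where s is the standard deviation computed with n in the denominator.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def point_biserial(codes, scores):
    """Point-biserial r from group means; codes must take exactly two values."""
    low, high = sorted(set(codes))
    g_low = [s for c, s in zip(codes, scores) if c == low]
    g_high = [s for c, s in zip(codes, scores) if c == high]
    n, n_low, n_high = len(scores), len(g_low), len(g_high)
    mean_all = sum(scores) / n
    # Standard deviation with n (not n - 1) in the denominator.
    sd_n = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_diff = sum(g_high) / n_high - sum(g_low) / n_low
    return mean_diff / sd_n * math.sqrt(n_low * n_high / n ** 2)


# Hypothetical data: film coded 1 or 2, with made-up arousal scores.
film = [1, 1, 1, 1, 2, 2, 2, 2]
arousal = [12, 15, 14, 18, 22, 27, 25, 30]

r_pearson = pearson_r(film, arousal)
r_formula = point_biserial(film, arousal)

# Swapping the codes (1 becomes 2 and vice versa) flips the sign only.
swapped = [3 - c for c in film]
r_swapped = pearson_r(swapped, arousal)
print(f"Pearson on codes: {r_pearson:.3f}, group-means formula: {r_formula:.3f}")
print(f"After swapping the codes: {r_swapped:.3f}")
```

A positive coefficient here means the group with the higher code has the higher mean, which is why the coding has to be checked before the direction of the relationship is interpreted.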