Chapter 5: Investigating the Difference in Scores
Outline:
• Introduction
• Why Use Tests of Difference
• Dependent and Independent Variables
• Null Hypothesis
– One & Two Tailed Tests of Significance
• Types of Errors
– Type I and Type II Error
• Standard Error of Mean
• Standard Error of the Difference of the Means
• Assumptions When Testing for Difference
• Types of t-Tests
– Independent & Dependent Groups
• Analysis of Variance
– One Way ANOVA, Repeated Measures ANOVA, Post Hoc Tests
• Selecting the Test

Objectives:
1. Explain how you could use tests of difference.
2. Define Type I and Type II errors and how they
relate to the level of significance.
3. List the assumptions that are required to test
for differences between means.
4. Understand when to use and how to interpret the
t-test for independent and dependent groups.
5. Understand when to use and how to interpret
analysis of variance for independent groups
and analysis of variance for repeated measures.

Introduction
• Questions?
• Is there a difference between groups?
– males vs. females
– athletes vs. non-athletes
– treatment method of an injury: ice vs. ultrasound
– flexibility programs: static stretching vs. proprioceptive neuromuscular facilitation (PNF) stretching
– interval vs. continuous training programs

Why use tests of difference?
1. Improve your understanding and interpretation of research.
2. Allow for the evaluation of the effects of a cause or treatment.
3. In experimental research, allow for the development of cause-and-effect relationships.

Dependent and Independent Variables
• Dependent variable:
– What you are measuring to determine if it changes.
• Independent variable:
– What you are doing, controlling or manipulating that might cause change to the dependent variable.

Check your Understanding
• Identify the dependent and independent variables:
– The effect of background music on endurance
performance during cycle ergometry to fatigue.
– Direct and indirect coaching styles on high school soccer
players’ decision making skills during competition.
– The effect of ankle bracing on peak mediolateral ground
reaction force during cutting maneuvers in collegiate
male basketball players.
– Arm versus combined leg & arm exercise: Blood pressure
responses and ratings of perceived exertion at the same
indirectly determined heart rate.

The Null Hypothesis
• Hypothesis:
– A prediction about what will happen. A scientific hunch: the expected outcome.
• Null hypothesis:
– There is no difference between groups (i.e., means are equal; H0: X1 = X2).
• Alternative hypothesis:
– There is a difference between groups (means are not equal; Ha: X1 ≠ X2).
• Directional hypothesis: the difference is expected to occur in a certain direction (X1 > X2 or X1 < X2).
– Examples: minutes of exercise and weight loss; high intensity vs. low intensity.

One & Two Tailed Levels of Significance
• Tables will list:
• t value for accepting or rejecting the hypothesis
(usually pre-selected at the 0.05 or 0.01 level).
• t value – when the calculated value is > the table
value, we can say "the difference between the means is significant": it is a real difference.
• Degrees of Freedom (df) - determined by the
sample size (N – 1; where N = sample size).
• When using a t-test (with two independent groups):
– df = (N1 - 1) + (N2 - 1) or N1 + N2 - 2

Tables of critical values of t:
– Berg & Latin (1994), Table D, page 226: Critical Values of t.
– Text, Appendix B, page 276: Critical Values of t (Two-Tailed Test).
– With large samples, df will be higher and the critical t-values slightly smaller, so a smaller difference between means will be statistically significant.

Illustration of 0.05 Level of Significance
[Figure: a scale from 0 % to 100 % split into two regions: "Is a difference" (p < 0.05; Ha: X1 ≠ X2) and "Accept that there is no difference" (p > 0.05; H0: X1 = X2, no difference in means).]
• Significant – differences between the scores are real at the alpha level identified.
– 5 chances in 100 that you are in error.
• Nonsignificant – there is no real difference between the scores.

One-tailed or Two-tailed - Which Version Should You Use?
• One-tailed – when you believe the mean difference will occur in one direction.
– Strength training: pre and post training
• Two-tailed – when the difference can be in either direction.
– Strength: comparing two groups with different strength programs (not sure which is a better program)
• Take many samples from a population: the means of the samples would create a normal distribution.
[Figure: normal curves for the two versions. One Tailed: a "Do Not Reject Null Hypothesis" region with a single "Reject Null Hypothesis" tail. Two Tailed: a central "Do Not Reject Null Hypothesis" region with a "Reject Null Hypothesis" tail at each end.]

Types of Errors
• Type I Error –
• Reject a null hypothesis (X1 = X2) when it is true.
• Conclude that there is a difference when there
really isn’t a difference.
• Occurs more often when significance = 0.05
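The link between the level of significance and the Type I error rate can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not part of the original slides: both groups are drawn from one hypothetical population (mean 45, SD 8, N = 25 per group), so the null hypothesis is true by construction and every rejection is a Type I error. The t ratio is the independent-groups ratio covered later in this chapter; the function names are hypothetical, and the two-tailed critical values for df = 48 (2.011 at 0.05, 2.682 at 0.01) are assumed from a standard t table.

```python
import math
import random


def t_independent(a, b):
    """t ratio for two independent groups: mean difference / SE of the difference."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    se_diff = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / se_diff


def type_i_rate(critical_t, trials=2000, n=25, seed=42):
    """Share of trials that reject H0 although both groups share one population."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Hypothetical population values (mean 45, SD 8), for illustration only.
        group_1 = [rng.gauss(45, 8) for _ in range(n)]
        group_2 = [rng.gauss(45, 8) for _ in range(n)]
        if abs(t_independent(group_1, group_2)) > critical_t:
            rejections += 1
    return rejections / trials


# Two-tailed critical values for df = 48 (assumed from a standard t table).
rate_05 = type_i_rate(critical_t=2.011)
rate_01 = type_i_rate(critical_t=2.682)
print(f"Type I error rate at the 0.05 level: {rate_05:.3f}")
print(f"Type I error rate at the 0.01 level: {rate_01:.3f}")
```

Because the same seed is used for both calls, the two rates are computed on identical samples. The rate at 0.05 should land near 5 in 100, and the rate at 0.01 should be lower.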
• Type II Error –
• Accept the null hypothesis when it is false.
• Conclude that there was no difference when
there was a real difference.
• Occurs more often when significance = 0.01

Level of Significance and Type of Error

                         True state in population
Your decision            H0 is true (H1 is false)    H0 is false (H1 is true)
Reject H0 (accept H1)    Type I Error (alpha)        Correct decision
Accept H0 (reject H1)    Correct decision            Type II Error (beta)

• Can reduce your chance of making a Type I error by using a more stringent (smaller) level of significance, e.g., 0.01 or 0.001.
• Best way of reducing a Type II error is to increase the sample size.

Standard Error of Mean (S.E.M.)
• Represents the standard deviation of the sampling distribution of the mean.
• Formula: $\mathrm{SEM} = \dfrac{s}{\sqrt{N}}$
• Where: s = standard deviation of sample; N = number of scores.
• ± 1 SEM = range interpreted as the limits of the 68% confidence interval for the mean.
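The SEM formula and the ± 1 SEM range can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the inputs (SD = 10, N = 25, sample mean = 10) are made-up values chosen so that the SEM comes out to 2.

```python
import math


def standard_error_of_mean(sd, n):
    """SEM = s / sqrt(N)."""
    return sd / math.sqrt(n)


def sem_interval(mean, sem):
    """Limits of mean +/- 1 SEM (the 68% range)."""
    return mean - sem, mean + sem


# Hypothetical values: SD = 10 from N = 25 scores, sample mean = 10.
sem = standard_error_of_mean(sd=10, n=25)
low, high = sem_interval(mean=10, sem=sem)
print(f"SEM = {sem}, 68% range = {low} to {high}")
```

With these inputs the SEM is 2 and the range runs from 8 to 12.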
• Example: Mean = 10, SEM = 2. 68% of the time we would find a mean between 8 and 12.

Standard Error of the Difference between Means
• Represents the standard deviation of all the observed differences between pairs of sample means.
• Estimate of the expected difference between two sample means randomly drawn from the same population.
• Formula: $s_{\bar{x}_1-\bar{x}_2} = \sqrt{\mathrm{SEM}_1^2 + \mathrm{SEM}_2^2}$
• Where: $s_{\bar{x}_1-\bar{x}_2}$ = standard error of the difference between the means; SEM1 = SEM of sample 1; SEM2 = SEM of sample 2.

Assumptions When Testing for Differences (t-Test & ANOVA)
1. Data are drawn from a normally distributed population.
2. Data represent random samples from populations.
3. Variance in each group is similar.

• t ratio = Variance between Groups ÷ Variance within Groups
– Variance between groups = difference between the means. Bigger difference between means = larger t ratio.
– Variance within groups = spread of scores within the group. Bigger variance (spread of scores) = smaller t ratio.

Types of t Tests
1. Independent t Test
• Do two sample (group) means differ from each other?
• Most often used t test.
• Example: two groups, each assessed on the Leger 20 m Shuttle Run ("Beep" test). The predicted VO2 max values in ml/kg/min are given below.

Group 1: 42, 50, 57, 45, 56, 69, 45, 43, 46, 51, 61, 55, 40, 47, 59, 47, 46, 30, 48, 53, 40, 40, 64, 48, 41
Group 2: 51, 43, 43, 37, 63, 53, 44, 44, 38, 44, 37, 51, 30, 44, 39, 34, 48, 50, 30, 37, 39, 47, 55, 49, 43

          Group 1   Group 2
Mean      48.92     43.72
Low       30        30
High      69        63
SD        8.78      7.83
Range     39        33

Calculation Steps: Independent t Test
1. Calculate descriptive statistics.
• Group 1: Mean = 48.92, SD = 8.78, N = 25
• Group 2: Mean = 43.72, SD = 7.83, N = 25
2. Calculate the SEM for each group.
• $\mathrm{SEM}_1 = s_1 / \sqrt{N_1} = 8.78 / \sqrt{25} = 1.756$
• $\mathrm{SEM}_2 = s_2 / \sqrt{N_2} = 7.83 / \sqrt{25} = 1.566$
3. Calculate the standard error of the difference between groups.
• $s_{\bar{x}_1-\bar{x}_2} = \sqrt{\mathrm{SEM}_1^2 + \mathrm{SEM}_2^2} = \sqrt{(1.756)^2 + (1.566)^2} = 2.353$
4. Calculate the t-ratio by substituting the values in the formula.
• $t = \dfrac{\bar{X}_1 - \bar{X}_2}{s_{\bar{x}_1-\bar{x}_2}} = \dfrac{48.92 - 43.72}{2.353} = 2.210$
5. Compare t to the critical value of t in the table.
• Degrees of freedom = (N1 - 1) + (N2 - 1) or N1 + N2 - 2
• df = (25 – 1) + (25 – 1) = 48
• t = 2.210 (two-tailed)
• 2.210 > 2.060, so this difference is significant at the 0.05 level.
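The calculation steps can be sketched in Python from the raw VO2 max scores. This is a minimal sketch rather than the course's own procedure: the function name `independent_t` is hypothetical, and the assignment of scores to Group 1 and Group 2 is reconstructed from the chapter's Excel listing (it reproduces the reported means of 48.92 and 43.72). Carrying full precision instead of rounded intermediate values gives t of about 2.209, the value reported in the Excel and SPSS output in this chapter.

```python
import math
from statistics import mean, stdev

# Predicted VO2 max (ml/kg/min); group assignment reconstructed from the Excel listing.
group_1 = [42, 50, 57, 45, 56, 69, 45, 43, 46, 51, 61, 55, 40,
           47, 59, 47, 46, 30, 48, 53, 40, 40, 64, 48, 41]
group_2 = [51, 43, 43, 37, 63, 53, 44, 44, 38, 44, 37, 51, 30,
           44, 39, 34, 48, 50, 30, 37, 39, 47, 55, 49, 43]


def independent_t(a, b):
    """Return (t, df, SE of the difference) using the SEM-based steps."""
    sem_a = stdev(a) / math.sqrt(len(a))          # step 2: SEM of each group
    sem_b = stdev(b) / math.sqrt(len(b))
    se_diff = math.sqrt(sem_a ** 2 + sem_b ** 2)  # step 3: SE of the difference
    t = (mean(a) - mean(b)) / se_diff             # step 4: t ratio
    df = (len(a) - 1) + (len(b) - 1)              # degrees of freedom
    return t, df, se_diff


t, df, se_diff = independent_t(group_1, group_2)
print(f"t = {t:.3f}, df = {df}, SE of the difference = {se_diff:.3f}")
```

Note that this SEM-based ratio agrees with a pooled-variance t test here because the two groups are the same size; with unequal group sizes the two calculations can differ.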
• The means are different.

1. Independent t Test - Excel: t-Test Two-Sample Assuming Equal Variance
(Same Group 1 and Group 2 scores and descriptive statistics as in the example above.)

t-Test: Two-Sample Assuming Equal Variances
                               Group 1        Group 2
Mean                           48.92          43.72
Variance                       77.16          61.37666667
Observations                   25             25
Pooled Variance                69.26833333
Hypothesized Mean Difference   0
df                             48
t Stat                         2.208975917
P(T<=t) one-tail               0.015991463
t Critical one-tail            1.677224197
P(T<=t) two-tail               0.031982925
t Critical two-tail            2.010634722

• Degrees of freedom = (N1 - 1) + (N2 - 1) or N1 + N2 - 2
• P(T<=t) two-tail = probability of making a mistake: about a 3 % chance of being in error.

SPSS Output
Group Statistics
          Group   N    Mean    Std. Deviation   Std. Error Mean
VO2max    1       25   48.92   8.784            1.757
          2       25   43.72   7.834            1.567

Independent Samples Test (VO2max)
Levene's Test for Equality of Variances: F = .381, Sig. = .540

t-test for Equality of Means
                              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       2.209   48       .032              5.200             2.354                   .467           9.933
Equal variances not assumed   2.209   47.385   .032              5.200             2.354                   .465           9.935

2. Dependent t Test
• Example: North American football team, comparing two individuals with similar strength.
• Do the scores of two sets of data that are related in some way differ from each other?
• Relationship takes one of two forms:
a) Two groups are matched on one or more characteristics and are thus not independent.
b) One group is tested twice on the same variable (pretest and posttest).
• A one-tailed test is used more often with dependent t-tests: scores are expected to increase or decrease.

Dependent t Test Formula
• Formula is:
$$t = \frac{\sum D / N}{\dfrac{1}{N}\sqrt{\dfrac{N\sum D^2 - \left(\sum D\right)^2}{N - 1}}}$$
• Where: D = difference in paired scores (i.e., pretest – posttest), N = number of paired scores
• Degrees of freedom (df) = N – 1

Example: Dependent t Test
Free throw scores on 25 attempts: effect of 4 weeks of free throw practice

Subject   Pretest   Posttest   D     D²
1         19        20         -1    1
2         17        15         2     4
3         19        20         -1    1
4         14        16         -2    4
5         13        17         -4    16
6         16        16         0     0
7         16        15         1     1
8         17        18         -1    1
9         17        19         -2    4
10        14        17         -3    9
∑         162       173        -11   41
$$t = \frac{-11 / 10}{\dfrac{1}{10}\sqrt{\dfrac{10(41) - (-11)^2}{10 - 1}}} = \frac{-1.10}{0.10\sqrt{\dfrac{410 - 121}{9}}} = \frac{-1.1}{0.10\sqrt{\dfrac{289}{9}}} = \frac{-1.1}{0.10\sqrt{32.11}} = \frac{-1.1}{0.10 \times 5.667} = \frac{-1.1}{0.5667} = -1.941$$
• Compare -1.94 to the one-tailed 0.05 level of significance column at df = (10 – 1) = 9.
• 1.94 > 1.833, so the difference between the pretest and posttest means is significant at the 0.05 level (the means are different).
• 4 weeks of practice improved free throw shooting.

3. Dependent t Test - Excel
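Before the spreadsheet output, the computational formula for the dependent t test can be sketched in Python as a cross-check. This is a minimal sketch, not the course's own procedure: the function name `dependent_t` is hypothetical, and the Pre Test and Post Test columns of this section are assumed to be paired row by row in the order listed. The first call uses the free-throw scores from the worked example; the second uses the Pre Test and Post Test columns.

```python
import math


def dependent_t(first, second):
    """Paired t from sum(D) and sum(D^2), with D = first - second. Returns (t, df)."""
    n = len(first)
    diffs = [a - b for a, b in zip(first, second)]
    sum_d = sum(diffs)
    sum_d2 = sum(d * d for d in diffs)
    denominator = (1 / n) * math.sqrt((n * sum_d2 - sum_d ** 2) / (n - 1))
    return (sum_d / n) / denominator, n - 1


# Free-throw example: pretest and posttest scores for 10 subjects.
ft_pre = [19, 17, 19, 14, 13, 16, 16, 17, 17, 14]
ft_post = [20, 15, 20, 16, 17, 16, 15, 18, 19, 17]
t_ft, df_ft = dependent_t(ft_pre, ft_post)

# Pre Test / Post Test columns (row-by-row pairing is an assumption).
pre = [51, 43, 43, 37, 63, 53, 44, 44, 38, 44, 37, 51, 30,
       44, 39, 34, 48, 50, 30, 37, 39, 47, 55, 49, 43]
post = [42, 50, 57, 45, 56, 69, 45, 43, 46, 51, 61, 55, 40,
        47, 59, 47, 46, 30, 48, 53, 40, 40, 64, 48, 41]
t_xl, df_xl = dependent_t(pre, post)

print(f"Free throws: t = {t_ft:.3f}, df = {df_ft}")
print(f"Pre/Post:    t = {t_xl:.3f}, df = {df_xl}")
```

The two results should agree with the hand calculation (t of about -1.94 at df = 9) and with the t Stat reported by Excel for the paired data (about -2.54 at df = 24).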
Pre Test: 51, 43, 43, 37, 63, 53, 44, 44, 38, 44, 37, 51, 30, 44, 39, 34, 48, 50, 30, 37, 39, 47, 55, 49, 43
Post Test: 42, 50, 57, 45, 56, 69, 45, 43, 46, 51, 61, 55, 40, 47, 59, 47, 46, 30, 48, 53, 40, 40, 64, 48, 41

t-Test: Paired Two Sample for Means
                               Pre Test       Post Test
Mean                           43.72          48.92
Variance                       61.37666667    77.16
Observations                   25             25
Pearson Correlation            0.242453497
Hypothesized Mean Difference   0
df                             24
t Stat                         -2.535328822
P(T<=t) one-tail               0.009081853
t Critical one-tail            1.710882067
P(T<=t) two-tail               0.018163705
t Critical two-tail            2.063898547

• Degrees of freedom = (N – 1) = 25 – 1 = 24

Analysis of Variance (One-way ANOVA)
• Allows for evaluation of 2 or more group means on one dependent variable (an extension of the independent t-test).
• F-ratio: the calculated number associated with ANOVA (like the calculated t value for the t-test).
• Null Hypothesis: M1 = M2 = M3

Example from the literature: Bottaro, M., Martins, B., Gentil, P., and Wagner, D. (2009). Effects of rest duration between sets of resistance training on acute hormonal responses in trained women. Journal of Sci. and Med. in Sport, 12: 73-78.
• Methods:
Standard statistical procedures were used to calculate means and standard deviations (S.D.). Differences in hormonal responses among time point for each trial were evaluated using a one-way ANOVA with repeated measures. The resulting integrated area under the response curve for GH (GHauc) and cortisol (Cauc) were computed using a trapezoidal method after pre-exercise values were subtracted from each time point. Differences among GHauc and among Cauc rest intervals (30, 60, and 120 s) were analyzed using a one-way ANOVA with repeated measures. Multiple comparisons with confidence interval adjustment by the LSD (Least Significant Difference) method were used as post hoc when necessary. The significance level was set at p < 0.05. The SPSS 14.0 (SPSS, Chicago, IL) was used in the current analyses.

ANOVA Table
• SS = sum of squares
– Calculations: each subject's score is squared. The sum of squared scores is used in the calculations.
– Treatment variance (between-group variance)
– Error variance (within-group variance)
– Total variance
• MS = mean squares: SS divided by the appropriate df.

Degrees of Freedom
• Between Groups (Treatment) df = k – 1, where k = number of groups
• Within Groups (Error) df = N – k, where N = total number of subjects in the study and k = number of groups
• Total df = N – 1, where N = total number of subjects
• $F = \dfrac{\text{Between Group (Treatment) Variance}}{\text{Within Group (Error) Variance}}$
– Between-group variance: variance due to the treatment in the study.
– Within-group variance: variance due to chance or sampling error.

Analysis of Variance
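The SS, df, MS and F definitions can be sketched in Python using the three-group quiz scores of this example. This is a minimal sketch with a hypothetical function name (`one_way_anova`); it computes the sums of squares from deviations about the means, which is algebraically equivalent to the squared-score shortcut, and it does not compute the p-value or F crit.

```python
def one_way_anova(groups):
    """Return SS, df, MS and F for a one-way (independent groups) ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between groups (treatment): group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within groups (error): scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Quiz scores (out of 15) for the three groups in this example.
group_1 = [12, 15, 10, 11, 9, 14, 12, 13]
group_2 = [7, 10, 11, 8, 9, 10, 12, 9]
group_3 = [13, 14, 10, 9, 12, 11, 11, 15]

result = one_way_anova([group_1, group_2, group_3])
print(f"SS between = {result['ss_between']:.3f}, SS within = {result['ss_within']:.3f}")
print(f"F({result['df_between']}, {result['df_within']}) = {result['F']:.3f}")
```

The values should match the Excel and SPSS ANOVA tables for this example (SS between 31.75, SS within 74.875, F of about 4.452).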
Group 1   Group 2   Group 3
12        7         13
15        10        14
10        11        10
11        8         9
9         9         12
14        10        11
12        12        11
13        9         15

Anova: Single Factor

SUMMARY
Groups    Count   Sum   Average   Variance
Group 1   8       96    12        4
Group 2   8       76    9.5       2.571429
Group 3   8       95    11.875    4.125

ANOVA
Source of Variation   SS        df   MS         F          P-value    F crit
Between Groups        31.75     2    15.875     4.452421   0.024435   3.4668
Within Groups         74.875    21   3.565476
Total                 106.625   23

• Example: 3 classes, knowledge on personal health (quiz out of 15).
• There is a difference in scores (p < 0.05) between groups 1, 2, and 3.

SPSS Outputs
Descriptives (Score)
Group   N    Mean    Std. Deviation   Std. Error   95% CI Lower Bound   95% CI Upper Bound   Minimum   Maximum
1       8    12.00   2.000            .707         10.33                13.67                9         15
2       8    9.50    1.604            .567         8.16                 10.84                7         12
3       8    11.88   2.031            .718         10.18                13.57                9         15
Total   24   11.13   2.153            .440         10.22                12.03                7         15

ANOVA (Score)
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   31.750           2    15.875        4.452   .024
Within Groups    74.875           21   3.565
Total            106.625          23

Data as entered in SPSS (one row per subject, with a Group column and a Score column):
Group 1: 12, 15, 10, 11, 9, 14, 12, 13
Group 2: 7, 10, 11, 8, 9, 10, 12, 9
Group 3: 13, 14, 10, 9, 12, 11, 11, 15

Post-Hoc Tests
• Significant F-ratio = means are significantly different, but it does not indicate which group is different.
• A post-hoc test is performed to determine which group is different from the others.

Groups    Count   Sum   Average   Variance
Group 1   8       96    12.0      4.00
Group 2   8       76    9.50      2.57
Group 3   8       95    11.88     4.13

• Post-hoc tests perform similar functions as an independent t-test.
• Tests are listed from liberal to stringent.
Names of Post-Hoc Tests:
1. Duncan Multiple Range (liberal end: more readily detects a difference between means)
2. Newman-Keuls
3. Fisher's Least Significant Difference (LSD)
4. Tukey's Honestly Significant Difference (HSD)
5. Scheffé (stringent end: minimizes the chance of making a Type I error)

SPSS Output Post Hoc Tests
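The Mean Difference and Std. Error columns of the SPSS tables can be reproduced by hand, and a simple post-hoc decision can be sketched with Fisher's LSD, which the list above places as less stringent than Tukey's HSD and Scheffé. This is a minimal sketch under stated assumptions: the variable names are hypothetical, the group means and within-groups MS are taken from the ANOVA output in this chapter, and the two-tailed 0.05 critical t for df = 21 (2.080) is assumed from a standard t table. It is not the Tukey or Scheffé procedure that SPSS reports, since those use different critical values.

```python
import math
from itertools import combinations

# Group means and within-groups mean square from the ANOVA output.
means = {"Group 1": 12.0, "Group 2": 9.5, "Group 3": 11.875}
n_per_group = 8
ms_within = 3.565476
t_critical = 2.080  # two-tailed 0.05, df = 21 (assumed from a standard t table)

# Standard error of a difference between two group means.
se_diff = math.sqrt(ms_within * (1 / n_per_group + 1 / n_per_group))
# Least significant difference: smallest mean difference judged significant.
lsd = t_critical * se_diff

results = {}
for a, b in combinations(means, 2):
    diff = means[a] - means[b]
    results[(a, b)] = (diff, abs(diff) > lsd)
    verdict = "significant" if abs(diff) > lsd else "not significant"
    print(f"{a} vs {b}: difference = {diff:+.3f} ({verdict} by LSD)")

print(f"SE of difference = {se_diff:.3f}, LSD = {lsd:.3f}")
```

The standard error should match the .944 shown in the SPSS tables. Under this more liberal criterion the Group 2 vs Group 3 difference (2.375) exceeds the LSD, whereas the Tukey and Scheffé tables report it as just short of significance (Sig. = .051 and .063), which illustrates the liberal-to-stringent ordering.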
Multiple Comparisons (Score), Tukey HSD
(I) Group   (J) Group   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1           2           2.500*                  .944         .038   .12                  4.88
1           3           .125                    .944         .990   -2.25                2.50
2           1           -2.500*                 .944         .038   -4.88                -.12
2           3           -2.375                  .944         .051   -4.75                .00
3           1           -.125                   .944         .990   -2.50                2.25
3           2           2.375                   .944         .051   .00                  4.75
*. The mean difference is significant at the 0.05 level.
• Groups 1 & 2: significant difference
• Groups 1 & 3: not significant
• Groups 2 & 3: just not significant

Multiple Comparisons (Score), Scheffe
(I) Group   (J) Group   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1           2           2.500*                  .944         .049   .01                  4.99
1           3           .125                    .944         .991   -2.36                2.61
2           1           -2.500*                 .944         .049   -4.99                -.01
2           3           -2.375                  .944         .063   -4.86                .11
3           1           -.125                   .944         .991   -2.61                2.36
3           2           2.375                   .944         .063   -.11                 4.86
*. The mean difference is significant at the 0.05 level.

Repeated Measures ANOVA
• The same subjects are tested several times (dependent variable) to determine the effect of the independent variable.
• For example: measure changes in a variable with time (pre, mid, post season).
• When two scores are assessed = dependent t-test for two means.

Selecting the Statistical Test (modified from Figure 14.3, Baumgartner (2006))
• 2 Groups
– Independent: Independent t Test
– Dependent (groups are matched): Paired t Test
• 1 Group
– 2 Tests: Dependent t Test or Paired Samples
– > 2 Tests: ANOVA Repeated Measures & Post Hoc Tests
• 2 or More Groups
– 1-Way ANOVA
– > 2 Groups: Post Hoc Tests
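The selection chart can be sketched as a small decision function. This is one possible reading of the flattened figure, not a definitive rule: the function name `select_test`, its parameters and the returned labels are all hypothetical.

```python
def select_test(n_groups, n_tests=1, matched=False):
    """Suggest a test of difference from the number of groups and test occasions.

    n_groups: number of groups being compared
    n_tests:  number of times the same group is tested (used when n_groups == 1)
    matched:  True when two groups are matched on one or more characteristics
    """
    if n_groups == 1:
        if n_tests == 2:
            return "Dependent (paired) t test"
        if n_tests > 2:
            return "Repeated measures ANOVA with post hoc tests"
        raise ValueError("one group needs at least two test occasions")
    if n_groups == 2:
        return "Dependent (paired) t test" if matched else "Independent t test"
    if n_groups > 2:
        return "One-way ANOVA with post hoc tests"
    raise ValueError("n_groups must be at least 1")


# Examples drawn from this chapter.
print(select_test(2))                  # two independent VO2 max groups
print(select_test(1, n_tests=2))       # free-throw pretest vs posttest
print(select_test(3))                  # three classes, health quiz
print(select_test(1, n_tests=3))       # pre, mid, post season
```

The chart also allows a one-way ANOVA for two independent groups; the sketch returns the independent t test in that case for simplicity.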


- Winter '18
- Gary Schnidler
- Normal Distribution, Variance, Statistical hypothesis testing, Student's t-test