Statistical Methods I (EXST 7005) Page 105

Numerical example
Compare the ovarian weight of 14 fish, 7 randomly assigned to receive injections of gonadotropin
(treatment group) and 7 assigned to receive a saline solution injection (control group). Both
groups are treated identically except for the gonadotropin treatment. Ovarian weights are to
be compared for equality one week after treatment. During the experiment two fish were lost
due to causes not related to the treatment, so the experiment became unbalanced. Raw data
Obs   Treatment   Control
1     134         70
2     146         85
3     104         94
4     119         83
5     124         97
6     *           77
7     *           80

Summary statistics
Statistic   Treatment   Control
n           5           7
ΣYi         627         586
ΣYi²        79,625      49,588
Ȳ           125.4       83.7
SS          999         531
γ           4           6
S²          249.8       88.6

Research question: Does the gonadotropin treatment affect the ovarian weight? (Note: this implies a nondirectional alternative.)

First, which of the various situations for two-sample t-tests do we have? Obviously, n1 ≠ n2. Now check the variances.
1) H0: σ1² = σ2²
2) H1: σ1² ≠ σ2²
3) Assume Yi ~ NIDrv, representing the usual assumptions of normality and independence.
4) α = 0.05 and the critical value for 4, 6 d.f. is Fα/2,4,6 = 6.23.
5) We have the samples, and know that the variances are 249.8 and 88.6, and the d.f. are 4 and
6 respectively. The calculated value is (given that we have a nondirectional alternative and
arbitrarily placing the largest variance in the numerator), F = 249.8/88.6 = 2.82 with 4, 6
d.f.
6) The critical value is larger than the calculated value. We therefore fail to reject the null
hypothesis.
7) We can conclude that the two samples have sufficiently similar variances for pooling.
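The variance check above can be reproduced numerically. This is a sketch, not part of the original notes; it uses Python's standard library and the raw data from the table:

```python
# A numerical check of the F test above; values match the summary table.
import statistics

treatment = [134, 146, 104, 119, 124]    # n = 5 after two fish were lost
control = [70, 85, 94, 83, 97, 77, 80]   # n = 7

var_t = statistics.variance(treatment)   # sample variance, 4 d.f. -> 249.8
var_c = statistics.variance(control)     # sample variance, 6 d.f. -> about 88.6

# Nondirectional alternative: put the larger variance in the numerator
F = max(var_t, var_c) / min(var_t, var_c)

F_crit = 6.23                            # F(0.025; 4, 6), from the text
reject_equal_variances = F > F_crit      # False: fail to reject, OK to pool
```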
James P. Geaghan Copyright 2010

Pooling the variances.
Recall, Sp² = (γ1S1² + γ2S2²)/(γ1 + γ2) = (SS1 + SS2)/(γ1 + γ2)

Sp² = (4(249.8) + 6(88.6))/(4 + 6) = (999 + 531)/10 = 1530/10 = 153, with 10 d.f.
Now calculate the standard error for the test, S(d̄), using the pooled variance. For this case

S(d̄) = S(Ȳ1−Ȳ2) = √(Sp²(1/n1 + 1/n2)) = √(153(1/5 + 1/7)) = √(153(0.343)) = √52.457 = 7.24, with 10 d.f.
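The pooling and standard-error arithmetic above can be sketched as follows (a Python check using the values from the summary table, not part of the original notes):

```python
# Pooling the variances and computing the standard error of the difference.
import math

df1, df2 = 4, 6            # degrees of freedom for each sample
s2_1, s2_2 = 249.8, 88.6   # the two sample variances
n1, n2 = 5, 7

# Weighted mean of the variances: pooled SS over pooled d.f.
sp2 = (df1 * s2_1 + df2 * s2_2) / (df1 + df2)    # about 153, with 10 d.f.

# Standard error of the difference between the two sample means
se_diff = math.sqrt(sp2 * (1 / n1 + 1 / n2))     # about 7.24
```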
Completing the two-sample t-test.
1) H0: μ1 − μ2 = δ. In this case we could state the null as H0: μ1 = μ2 since δ = 0.
2) H1: μ1 − μ2 ≠ δ, or H1: μ1 ≠ μ2.
3) Assume di ~ NIDrv (δ, σδ²). NOTE we have pooled the variances, so obviously we have assumed that all variance is homogeneous and equal to σδ².
4) α = 0.05 and the critical value is 2.228 (a nondirectional alternative for α=0.05 and 10 df)
5) We have the samples and know that the means are 125.4 and 83.7. The calculated t value is:
t = ((Ȳ1 − Ȳ2) − (μ1 − μ2)) / √(Sp²(1/n1 + 1/n2)) = ((Ȳ1 − Ȳ2) − 0)/S(d̄) = (125.4 − 83.7)/7.24 = 41.7/7.24 = 5.76, with 10 d.f.
6) The calculated value (5.76) clearly exceeds the critical value (2.228), so we would
reject the null hypothesis.
7) Conclude that the gonadotropin treatment does affect the gonad weight of the fish. We can
further state that the treatment increases the weight of gonads. How about a confidence interval? Could we use a confidence interval here? You betcha!
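The completed test can be checked with a short numerical sketch (values as given above; this is an added check, not part of the original notes):

```python
# The test statistic for the ovarian-weight comparison.
mean_t, mean_c = 125.4, 83.7   # sample means from the summary table
se_diff = 7.24                  # pooled standard error computed earlier
delta0 = 0                      # hypothesized difference

t = (mean_t - mean_c - delta0) / se_diff   # 41.7 / 7.24, about 5.76
t_crit = 2.228                              # t(0.025) with 10 d.f.
reject = abs(t) > t_crit                    # True: reject H0
```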
Confidence interval for the difference between means
The general formula for a two-tailed confidence interval for normally distributed parameters is: “Some parameter estimate ± tα/2 × standard error”
The difference between the means ( δ = ( μ1 − μ2 ) ) is another parameter for which we may
wish to calculate a confidence interval. For the estimate of the difference between μ1
and μ2 we have already determined that for α = 0.05 we have tα/2 = 2.228 with 10 d.f. We also found the estimate of the difference, d̄ = (Ȳ1 − Ȳ2), is 41.7, and the standard error of the difference, S(d̄) = S(Ȳ1−Ȳ2), is 7.24.

The confidence interval is then d̄ ± tα/2·S(d̄), or 41.7 ± 2.228(7.24), giving 41.7 ± 16.13. The probability statement is

P(d̄ − tα/2·S(d̄) ≤ μ1 − μ2 ≤ d̄ + tα/2·S(d̄)) = 1 − α
P(25.57 ≤ μ1 − μ2 ≤ 57.83) = 0.95
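The interval arithmetic can be sketched numerically (an added check using the values from the text):

```python
# The 95% confidence interval for the difference between the means.
d = 41.7            # observed difference between the means
t_crit = 2.228      # t(alpha/2) with 10 d.f.
se_diff = 7.24      # standard error of the difference

half_width = t_crit * se_diff                    # about 16.13
lower, upper = d - half_width, d + half_width    # about (25.57, 57.83)
contains_zero = lower <= 0 <= upper              # False: consistent with rejecting H0
```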
Note that the interval does not contain zero. This observation is equivalent to doing a test
of hypothesis against zero. Some statistical software calculates intervals instead of doing hypothesis tests. This works for hypothesis tests against zero and is advantageous if the hypothesized value of δ is something other than zero. When software automatically tests for differences it almost always tests for differences from zero.

Summary
Testing for differences between two means can be done with the two-sample t-test, or a two-sample Z test if variances are known.

For two independently sampled populations the variance of the difference will be S1²/n1 + S2²/n2, the variance of a linear combination of the means.

The problem is that the d.f. for this expression are not known.

Degrees of freedom are known if the variances can be pooled, so we start our two-sample t-test with an F test.
Variances are pooled, if not significantly different, by calculating a weighted mean.
Sp² = (γ1S1² + γ2S2²)/(γ1 + γ2) = (SS1 + SS2)/(γ1 + γ2) = (SS1 + SS2)/((n1 − 1) + (n2 − 1))

The error variance is given by Sp²(1/n1 + 1/n2). The standard error is √(Sp²(1/n1 + 1/n2)).
If the variances cannot be pooled, the two-sample t-test can still be done, and degrees of freedom are approximated with Satterthwaite’s approximation.

Once the standard error is calculated, the test proceeds as any other t-test.

Confidence intervals can also be calculated in lieu of doing the t-test.

SAS example 4 – PROC TTEST
We would normally do two-sample t-tests with the SAS procedure called PROC TTEST. This procedure has the structure

proc ttest data = dataset name;
   class group variable;
   var variable of interest;

The PROC statement functions like any other PROC statement.
The VARIABLE or VAR statement works the same as in other procedures we have seen.
The CLASS statement is new. It specifies the variable that will allow SAS to distinguish
between observations from the two groups to be tested.

PROC TTEST Example 4a
Example from Steele & Torrie (1980) Table 5.2.
Corn silage was fed to sheep and steers. The objective was to determine if the percent digestibility
differed for the two types of animals.

Example 1: Raw data

Obs   Sheep   Steers
1     57.8    64.2
2     56.2    58.7
3     61.9    63.1
4     54.4    62.5
5     53.6    59.8
6     56.4    59.2
7     53.2
59.2 Unfortunately this data is not structured properly for PROC TTEST. It has two variables
(sheep and steers) giving the percent digestibility for sheep and steers separately.
We need one variable with percent digestibility for both and a second variable specifying the
type of animal.
This can be fixed in the data step.
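The restructuring the data step performs can be sketched in Python (the names here are illustrative, not taken from the SAS program):

```python
# Two separate columns (wide form) ...
sheep = [57.8, 56.2, 61.9, 54.4, 53.6, 56.4, 53.2]
steers = [64.2, 58.7, 63.1, 62.5, 59.8, 59.2]

# ... become one analysis variable plus one class variable (long form),
# the layout PROC TTEST expects: one (group, value) pair per observation.
long_form = [("sheep", y) for y in sheep] + [("steer", y) for y in steers]
```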
In the program note the following: the change of the data structure from multivariate to univariate style; the PROC TTEST statement; the intermediate statistics, especially the confidence intervals for both means and standard deviations. The tests of the hypotheses for both means and variances are discussed below.

Interpreting the SAS Output
First examine the last lines
Equality of Variances

Variable   Method     Num DF   Den DF   F Value   Pr > F
percent    Folded F   6        5        1.70      0.5764

SAS is testing the equality of variances (H0: σ1² = σ2²). Notice that SAS provides a “folded F”.
Most SAS F tests are one-tailed, but this is one of the few places where SAS does a two-tailed F test (a “folded F”). SAS gives the d.f. and the probability of a greater F by random chance. We would usually set α = 0.05, and would reject for any P-value less than this and fail to reject for any value greater than this. In this case we fail to reject.

Exactly what did SAS do with the “folded F”? Recall that the two-tailed F allows you to place the larger variance in the numerator, but you must use α/2 for the critical value. This is what SAS has done. The P value SAS gives, 0.5764, is a two-tailed P value.
So we conclude that the variances do not differ. If doing the test by hand we would now pool the
variances to calculate the standard error.
NOW, look at the PROC TTEST output, above the F test.

t-tests
Here SAS provides results for both types of test, one calculated assuming equal variances and another assuming unequal variances; the user chooses whichever is appropriate for their case. Since
we had equal variances according to the F test we just examined, we would use the first line.
Variable   Method          Variances   DF     t Value   Pr > |t|
percent    Pooled          Equal       11     3.34      0.0065
percent    Satterthwaite   Unequal     10.9   3.42      0.0058
From the first line we see that the calculated t value was 3.3442 with 11 d.f. The probability of getting a greater value by random chance (i.e., under the H0) is 0.0065, not very likely. We would
conclude that there are statistically significant differences between the two animals in terms of
silage digestibility.
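The pooled line of the output can be reproduced from the raw data. This is an added numerical check, not part of the SAS program:

```python
# Reproducing the pooled (equal-variance) t test from the raw data.
import math
import statistics

sheep = [57.8, 56.2, 61.9, 54.4, 53.6, 56.4, 53.2]
steers = [64.2, 58.7, 63.1, 62.5, 59.8, 59.2]
n1, n2 = len(sheep), len(steers)

# Pooled variance: weighted mean of the two sample variances
sp2 = ((n1 - 1) * statistics.variance(sheep)
       + (n2 - 1) * statistics.variance(steers)) / (n1 + n2 - 2)
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))

# t is about 3.34 with 11 d.f., matching the "Pooled" line of the output
t = (statistics.mean(steers) - statistics.mean(sheep)) / se
```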
What about the other line, for unequal variances?
Variable   Method          Variances   DF     t Value   Pr > |t|
percent    Pooled          Equal       11     3.34      0.0065
percent    Satterthwaite   Unequal     10.9   3.42      0.0058

This line would be used if we rejected the F test of equal variances. In this particular case the
conclusion would be the same since we would also reject H0. Notice that the d.f. for the
calculations for unequal variance are not integer. This is because Satterthwaite's
approximation was used to estimate the variances. Since the variances were actually “equal”,
the estimate is close to (n1–1) + (n2–1) = 11.
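The non-integer d.f. can be verified with Satterthwaite's formula (a sketch, added here as a check on the output):

```python
# Satterthwaite's approximate d.f. for the unequal-variance test.
import statistics

sheep = [57.8, 56.2, 61.9, 54.4, 53.6, 56.4, 53.2]
steers = [64.2, 58.7, 63.1, 62.5, 59.8, 59.2]

v1 = statistics.variance(sheep) / len(sheep)     # s1^2 / n1
v2 = statistics.variance(steers) / len(steers)   # s2^2 / n2

# Approximate d.f.: about 10.9, the non-integer value SAS reports
df = (v1 + v2) ** 2 / (v1 ** 2 / (len(sheep) - 1)
                       + v2 ** 2 / (len(steers) - 1))
```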
From the SAS STATISTICS output we can conclude that the digestibility is higher for the steers, by
about 5 percent.

Example 4b: from Steele & Torrie (1980) Table 5.6
Determine if there is a difference in the percent fine gravel found in surface soils. The data is from
a study comparing characteristics of soil categorized as “good” or “poor”.
The raw data: percent fine sand in good and poor soils

Good   Poor
5.9    7.6
3.8    0.4
6.5    1.1
18.3   3.2
18.2   6.5
16.1   4.1
7.6    4.7
This data is also in the form of two separate variables and must be adjusted to accommodate the
data structure needed by PROC TTEST.
In the program note the following: the change of the data structure from multivariate to univariate style; the PROC TTEST statement; the intermediate statistics, especially the confidence intervals for both means and standard deviations. The tests of the hypotheses for both means and variances are discussed below.
In this case the variances are not quite significantly different, though it is a close call and there is a pretty good
chance of Type II error. Fortunately, the result is the same with either test.
If we go strictly by the “α = 0.05” decision rule that we usually use, we would fail to reject the
hypothesis of equal variances.
We would then examine the line for equal variances and conclude that there was indeed a
difference between the good and poor quality soil in terms of the fine sand present.
The intermediate statistics show that the good soil had about 7 percent more fine sand.
Statistics

Variable   soilqual   N   Lower CL Mean   Mean     Upper CL Mean   Lower CL Std Dev   Std Dev
percent    good       7   5.0559          10.914   16.773          4.0819             6.3344
percent    poor       7   1.5048          3.9429   6.3809          1.6987             2.6362

Example 4c: Steele & Torrie (1980) Exercise 5.5.6
The weights in grams of 10 male and 10 female juvenile ring-necked pheasants trapped in January in Wisconsin are given. Test the H0 that males were 350 grams heavier than females.

In this case the data is in the form needed, one variable for weight and one for sex.

Raw data
Sex    Weight   Sex      Weight
Male   1293     Female   1061
Male   1380     Female   1065
Male   1614     Female   1092
Male   1497     Female   1017
Male   1340     Female   1021
Male   1643     Female   1138
Male   1466     Female   1143
Male   1627     Female   1094
Male   1383     Female   1270
Male   1711     Female   1028
There was, however, one little problem with this analysis. The hypothesis requested was not simply H0: μmale = μfemale, it was H0: μmale = μfemale + 350, or H0: μmale − μfemale = 350. SAS does not have provisions to specify an alternative other than zero, but if we subtract 350 from
the males, we could then test for equality. We know from our discussion of transformations
that the variances will be unaffected.
So we create a new variable called adjwt for “adjusted weight”. See the calculations in the SAS
program.
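The adjustment and its effect can be sketched numerically (an added check; the variable names are illustrative):

```python
# Shifting the male weights by the hypothesized 350 g turns
# H0: mu_male - mu_female = 350 into a test of equal means,
# and leaves the variance unchanged.
import math
import statistics

males = [1293, 1380, 1614, 1497, 1340, 1643, 1466, 1627, 1383, 1711]
females = [1061, 1065, 1092, 1017, 1021, 1138, 1143, 1094, 1270, 1028]

adj_males = [w - 350 for w in males]

# Subtracting a constant does not change the variance
same_variance = math.isclose(statistics.variance(adj_males),
                             statistics.variance(males))

# Adjusted difference between the means: 1145.4 - 1092.9 = 52.5
diff = statistics.mean(adj_males) - statistics.mean(females)
```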
Obs   Sex      Weight   AdjWT
...
8     Female   1094     1094
9     Female   1270     1270
10    Female   1028     1028
11    Male     1293     943
12    Male     1380     1030
13    Male     1614     1264
...
See SAS OUTPUT Appendix 4c
Note intermediate statistics
Note the tests of the hypotheses for both means and variances.
Note that in the PROC TTEST there is another calculation in the statistics. This is the “Diff”
which also gets its calculated value and confidence interval. This difference is not a paired
difference.
Statistics

Variable   sex          N    Lower CL Mean   Mean     Upper CL Mean
AdjWT      Female       10   1038.1          1092.9   1147.7
AdjWT      Male         10   1041            1145.4   1249.8
AdjWT      Diff (1-2)        −162            −52.5    56.989
56.989 Interpretation of the SAS output
First, we fail to reject H0: σ1² = σ2² again (barely). But the adjusted weights do not differ either way (examining Pr > |t|). So we fail to reject H0: μ1 = μ2, but remember we subtracted 350 from the males. So actually we conclude that the males are heavier by an amount not different from 350 grams.

A special case – the paired t-test
One last case. In some circumstances the observations are not separate and distinct in the two
samples. Sometimes they can be paired. This can be good, adding power to the design. For example:
We want to test toothpaste. We may pair on the basis of twins, or siblings in assigning the
toothpaste treatments.
We want to compare deodorants or hand lotions. We assign one arm or hand to one brand and the other to another brand.
In many drug and pharmaceutical studies done on rats or rabbits the treatments are paired on litter mates.
So, how does this pairing affect our analysis? The analysis is done by subtracting one category
of the pair from the other category of the pair. In this way the pair values become
difference values.
As a result, the “twosample ttest” of pairs becomes a onesample ttest.
So, in many ways the paired ttest is easier.
Example: We already did an example of this type of analysis. Recall the Lucerne flowers whose
seeds we compared for flowers at the top and bottom of the plant. This was paired and we
took differences. The difference was “1” with a standard error of “0.5055”.

SAS example 2c examined previously

Tests for Location: Mu0=0

Test          Statistic          p Value
Student's t   t     1.978141     Pr > |t|    0.0793
Sign          M     2            Pr >= |M|   0.3438
Signed Rank   S     19.5         Pr >= |S|   0.0469
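The Student's t in that output is just the mean difference over its standard error, which can be checked directly (an added sketch using the values quoted above):

```python
# The paired t test reduces to a one-sample t on the differences.
mean_diff = 1.0    # mean top-minus-bottom difference from the earlier example
se_diff = 0.5055   # its standard error

t = mean_diff / se_diff   # about 1.978, matching Student's t in the output
```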
So the paired t-test is an alternative analysis for certain data structures. It is better because it eliminates the “between pair” variation and compares the treatments “within pairs”. This reduces variance.
However, note that the degrees of freedom are also cut in half. If the basis for pairing is not good,
the variance is not reduced, but degrees of freedom are lost.

Summary
The SAS PROC TTEST provides all of the tests needed for two-sample t-tests. It provides the test of variances we need to start with, and it provides two alternative calculations, one for equal variances and one for unequal variances. We choose the appropriate case.

We also saw that several previous calculations, such as confidence intervals and sample size, are also feasible for the two-sample t-test case.

The paired t-test, where there is a good strong basis for pairing observations, can gain power by reducing between-pair variation. However, if the basis for pairing is not good, we lose degrees of freedom and power.

Calculating a needed sample size
The Z-test and t-test use a similar formula:

Z = (Ȳ − μ0) / √(σ²/n) = (Ȳ − μ0) / (σ/√n)

Let’s suppose we know everything in the formula except n. Do we really? Maybe not, but we can get some pretty good estimates.

Call the numerator (Ȳ − μ0) a difference, d̄. It is some mean difference we want to be able to detect, so d̄ = Ȳ − μ0.

The value σ² is a variance, the variance of the data that we will be sampling. We need this variance, or an estimate, S².

So we alter the formula to read:

Z = d̄ / √(σ²/n) = d̄ / (σ/√n)
significance. If we are doing a 2tailed test, and we set α = 0.05, then Z will be 1.96.
Any calculated value larger will be “more significant”, any value smaller will not be significant.
So, if we want to detect significance at the 5% level, we can state that ...
We will get a significant difference if Z = d σ2
n = σ d ≥ Zα √n 2 We square both sides and solve for n. Then we will also SHOULD get a significant difference if
2
Zα 2σ 2
2
. Then, if we know the values of d ,σ and Z , we can solve the formula for n. If
n≥
2
d
we are going to use a Z distribution we should have a known value of the variance (σ2). If the
variance is calculated from the sample, use the t distribution. This would give us the sample
size needed to obtain “significance”, in accordance with whatever Z value is chosen. Generic Example
Try an example where d̄ = 2, σ = 5, σ² = 25, and Z = 1.96. So what value of n would detect this difference with this variance and produce a value of Z equal to 1.96 (or greater)?

n ≥ Zα/2²σ² / d̄² = (1.96² × 5²)/2² = 3.8416(25)/4 = 24.01; since n ≥ 24.01, round up to 25.
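The calculation above, including the round-up, can be sketched as:

```python
# Sample size for the generic example: detect d = 2 with sigma^2 = 25.
import math

z = 1.96        # two-tailed critical value for alpha = 0.05
sigma2 = 25.0   # known variance
d = 2.0         # difference to detect

n_exact = z ** 2 * sigma2 / d ** 2   # 24.01
n = math.ceil(n_exact)               # round up to the next whole observation
```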
Answer: n ≥ 25 would produce significant results. Guaranteed? Wouldn't this always produce
significant results? Theoretically, within the limits of statistical probability of error, yes,
but only if the difference was really 2. If the null hypothesis (no difference, μ = μ0) was
really true and we took larger samples, then we would get a better and better estimate of the true difference of zero, and may never show significance.

Considering Type II Error
The formula we have seen contains only Zα/2 or tα/2, depending on whether we have σ² or S². However, a fuller version can contain consideration of the probability of Type II error (β). We can often use Z when working with very large samples.

Remember that to work with Type II or β error we need to know the mean of the real distribution. However, in calculating sample size we have a difference, d̄ = Ȳ − μ0. So we can include consideration of Type II error and power in calculating the sample size. The consideration of β error would be done by adding another Z or t for the error rate. Notice that below I switch to t distributions and use

n ≥ (tα/2 + tβ)²S² / d̄²

Other examples
We have done a number of tests, some yielding significant results and others not. If a test yields
significant results (showing a significant difference between the observed and hypothesized
values), then we don't need to examine sample size because the sample was big enough.
However, some utility may be made of this information if we FAIL to reject the null
hypothesis.
Note: Some textbooks give only the formula I originally gave for Z, without the β error consideration. What is the power if you use the formula omitting tβ from n ≥ (tα/2 + tβ)²S²/d̄²? If you set tβ equal to zero the power is 0.50 and there is a 50% chance of making a Type II error.

An example with t values and β error included
Recall the Rhesus monkey experiment. We hypothesized no effect of a drug, and with a
sample size of 10 were unable to reject the null hypothesis. However, we did observe a
difference of +0.8 change in blood pressure after administering the drug. What if this
change was real? What if we made a Type II error? How large a sample would we need to
test for a difference of 0.8 if we also wanted 90% power?
So we want to know how large a sample we would need to get significance at the α=0.05 level
if power was 0.90. In this case β = 0.10. To do this calculation we need a two-tailed α and a one-tailed β (we know that the observed change is +0.8). We will estimate the variance
from the sample so we will use the t distribution. However, since we don't know the
sample size, we don't know the degrees of freedom! Since we do not know the d.f. we will
start off with some “reasonable” values for tα and tβ. Then after we solve the equation we
will have an estimate of the d.f. We can solve again with better values of tα and tβ, and
refine our estimate. After our second calculation we have even better estimates of d.f., so we get new values for tα and tβ and redo the calculations, etc., until the estimate stabilizes.
So we will approximate to start with. Given the information,
α = 0.05, so the value of tα/2 will be approximately 2
β = 0.10, so the value of tβ will be roughly 1.3
d̄ = Ȳ − μ0 = 0.8 from our previous results
S² = 9.0667 from our previous results
n ≥ (tα/2 + tβ)²S² / d̄²

We do the calculations: n ≥ (2 + 1.3)²(9.0667)/(0.8)² = (3.3)²(9.0667)/0.64 = 154.27

And now we have an estimate of n and the degrees of freedom, n = 155 and d.f. = 154. We can
refine our values for tα/2 and tβ:
for d.f. = 154, tα/2 = 1.97 approx.
for d.f. = 154, tβ = 1.287 approx.
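These two passes of the iteration can be scripted. In this sketch the t values are the hard-coded table approximations quoted in the text, not computed quantiles (computing them would need a stats library):

```python
# Two passes of the sample-size iteration for the Rhesus monkey example.
s2 = 9.0667   # sample variance from the earlier results
d = 0.8       # difference we want to detect with 90% power

# Pass 1: rough starting values t(alpha/2) ~ 2, t(beta) ~ 1.3
n_pass1 = (2.0 + 1.3) ** 2 * s2 / d ** 2      # about 154.3

# Pass 2: t values re-read from the table at about 154 d.f.
n_pass2 = (1.97 + 1.287) ** 2 * s2 / d ** 2   # about 150.3
```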
So we redo the calculations with improved estimates:

n ≥ (1.97 + 1.287)²(9.0667)/(0.8)² = (3.257)²(9.0667)/0.64 = 150.28

A little improvement! If we saw much change in the estimate of n, we could recalculate as often as necessary. Usually 3 or 4 recalculations are enough.

Summary
We developed a formula for calculating sample size n that can be adapted for either t or Z distributions:

n ≥ (tα/2 + tβ)²S² / d̄²
We learned that we need input values of α, β, S² (or σ² for Z tests), and a value for the size of the difference to be detected (d̄).
For the ttest, the first calculation was only approximate since we didn't know the degrees of
freedom. However, after the initial calculation the estimate could be improved by the iterative
recalculation of the estimate of n until it was stable.