Chapter 15
Nonparametric Statistical Tests

Parametric statistics
- Estimation and testing are based on population parameters.
- Parametric statistics are what you have learned about so far.
- Estimation: mean, variance, standard deviation (estimates of the parameters of the distribution).
- Tests: t tests, F tests, etc. (assume a normal distribution).

Parametric tests
Parametric tests are developed with certain assumptions.
1. A common underlying assumption is random sampling.
2. Furthermore, sampling is assumed to draw from a normally distributed population.
3. Sampling variances do not differ significantly.

Nonparametric tests
Practically speaking, nonparametric statistics are what we use when we have unusual data with:
- A nonnormal distribution
- A very small sample size
- A problem with measurement, for example a scale of measure that is ordinal instead of interval or ratio

Types of tests
- Parametric tests: use sample statistics to make inferences about population parameters.
- Nonparametric tests: hypotheses do not state relationships about population parameters.

Nonparametric tests
- Make fewer assumptions about the population distribution than parametric tests do.
- Sometimes called distribution-free tests.

Why not parametric?
- Data may be skewed (e.g., reaction time).
- Data may be ordinal or nominal.
- Must be able to calculate a mean to use parametric tests.

Two types of nonparametric tests
- Contingency table (chi-square)
- Rank tests

Between-Subject Design

Type            Scale of measure     Two levels          > Two levels
Parametric      Ratio or interval    t-test or ANOVA     ANOVA
Nonparametric   Ordinal              Mann-Whitney U      Kruskal-Wallis ANOVA by ranks
Nonparametric   Nominal              Chi-square          Chi-square

Within-Subject Design

Type            Scale of measure     Two levels               > Two levels
Parametric      Ratio or interval    t-test or ANOVA          ANOVA
Nonparametric   Ordinal              Wilcoxon signed ranks    Friedman ANOVA by ranks
Nonparametric   Nominal              -                        -

A common nonparametric test
- If you have qualitative data (which means your variables are based on words or nominal scales), or if the assumptions of the parametric tests are violated, we turn to nonparametric statistical tests.
- χ² is one of the most frequently used nonparametric statistics.

Chi-square tests
- The purpose of χ² is to evaluate whether the “pattern” created by categorical data is typical.
- Two common applications of the χ² test are:
  - Test whether the data fit a particular distribution.
  - Test an H0 about whether one variable is related to the other.

Chi-Square Test
- Analysis of frequency data
- Between-subjects design (not within-subject)
- Scores are nominal: frequency of an event occurring or not

Example 1: Do the data have a certain pattern?
We observe the cigarette smoking of a random sample of 863 men 40-50 years old.
We have the following frequency table, and also the relative frequencies of cigarette smoking for men in that age group 10 years ago.

Categories: A0 = None, A1 = One pack, A2 = Two packs, A3 = Three packs, A4 = Four or more

                                A0      A1      A2      A3      A4      Total
Observed freq                   406     164     189     78      26      863
Relative freq (10 years ago)    0.43    0.17    0.24    0.10    0.06    1.00

Question
Are the “observed numbers” of men by “cigarette use category” (A0-A4) consistent with those “expected” based on the frequencies from ten years ago?

Calculate the expected frequencies for the age group based on 863 total subjects

                 A0          A1          A2          A3          A4          Total
Relative freq    0.43        0.17        0.24        0.10        0.06        1
Calculation      863*0.43    863*0.17    863*0.24    863*0.10    863*0.06    863
Expected         371.09      146.71      207.12      86.3        51.78       863

Example 1
To see if the cigarette use habits of American men in the 40-50 age group are the same now as they were 10 years ago, we can compare the obtained and expected frequencies.
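The expected counts in the calculation table are each the sample size multiplied by the relative frequency from ten years ago. A minimal sketch of that arithmetic follows; the variable names are illustrative and not from the slides, and the numbers are copied from the tables above.

```python
# Expected frequencies for the smoking example: E_j = N * P_j.
# Variable names are illustrative; the data come from the slide tables.
categories = ["A0", "A1", "A2", "A3", "A4"]
relative_freq = [0.43, 0.17, 0.24, 0.10, 0.06]  # proportions from ten years ago
observed = [406, 164, 189, 78, 26]              # current sample of men aged 40-50

n_total = sum(observed)  # total number of men in the sample
expected = [round(n_total * p, 2) for p in relative_freq]

for label, o, e in zip(categories, observed, expected):
    print(f"{label}: observed = {o:3d}, expected = {e:7.2f}")
```

The expected counts sum to the same total as the observed counts because the relative frequencies sum to 1.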
            A0        A1        A2        A3      A4       Total
Observed    406       164       189       78      26       863
Expected    371.09    146.71    207.12    86.3    51.78    863

Observed vs. Expected Frequency
- O is the Observed (or Obtained) frequency.
- E is the Expected frequency, with E_j = N × P_j.
- Compare the Os to the Es.
- The statistical question is whether the differences are likely or unlikely to be due to chance.

Chi-square test statistic

$$\chi^2 = \sum_{j} \frac{(O_j - E_j)^2}{E_j}$$

where the sum is over each category. Each squared difference is weighted by the inverse of the expected frequency, E_j.

Distribution under the null hypothesis of no difference
If the Os and Es are all “big” (> 5), the chi-square statistic has a χ² distribution with J - 1 df (J = 5 in the above example).

Test of smoking patterns
- Calculate χ²_obs.
- Determine df (J - 1, where J is the number of categories).
- Choose a significance level (e.g., α = .05).
- Look up χ²_crit (Table C.7, p. 504).
- If χ²_obs > χ²_crit, reject the hypothesis that the recent smoking pattern is the same as the previous one.
- If χ²_obs < χ²_crit, don’t reject the hypothesis.

Calculations

              A0        A1        A2        A3      A4       Total
Observed      406       164       189       78      26       863
Expected      371.09    146.71    207.12    86.3    51.78    863
(O-E)²/E      3.28      2.04      1.59      0.80    12.84    20.54

- χ²_obs = 20.54, df = 5 - 1 = 4
- χ²_crit(4) = 9.49
- Since χ²_obs > χ²_crit(4), reject H0. Therefore we conclude that smoking patterns have changed.
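The goodness-of-fit calculation for the smoking data can be reproduced in a few lines. This is a sketch under stated assumptions: the function name `chi_square_gof` is hypothetical, and the critical value is hard-coded from a standard chi-square table (df = 4, α = .05) rather than computed.

```python
def chi_square_gof(observed, proportions):
    """Return (chi-square statistic, df) for a goodness-of-fit test.

    observed    -- observed counts per category
    proportions -- hypothesised proportions per category (should sum to 1)
    """
    n_total = sum(observed)
    expected = [n_total * p for p in proportions]
    # Sum of squared differences, each weighted by 1 / expected frequency.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1


observed = [406, 164, 189, 78, 26]
proportions = [0.43, 0.17, 0.24, 0.10, 0.06]
chi2_obs, df = chi_square_gof(observed, proportions)

# Tabled critical value for df = 4 at alpha = .05 (copied, not computed).
CHI2_CRIT_DF4_ALPHA_05 = 9.488

print(f"chi2_obs = {chi2_obs:.2f}, df = {df}")
print("reject H0" if chi2_obs > CHI2_CRIT_DF4_ALPHA_05 else "fail to reject H0")
```

Most of the statistic comes from the A4 category, where far fewer men than expected smoke four or more packs.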
Comparison involving two variables

Adult female TV characters by hair color and career level

                                  Hair color
Career level       Blonde    Dark
Professional       36        24
Nonprofessional    48        72

Raw data
(A = career level: A1 = Professional, A2 = Nonprofessional; B = hair color: B1 = Blonde, B2 = Dark)

         B1    B2    total
A1       36    24    60
A2       48    72    120
total    84    96    180

Expected values under independence

         B1            B2            total
A1       84*60/180     96*60/180     60
A2       84*120/180    96*120/180    120
total    84            96            180

         B1    B2    total
A1       28    32    60
A2       56    64    120
total    84    96    180

Chi-square table
- Also called a rows-by-columns contingency table.
- Each data point fits into one cell.
- Row and column totals are called marginals.
- We use the marginals to calculate expected cell totals, and compare them to the observed cell totals.

Expected Values

E = (row marginal × column marginal) / total responses

                                  Hair color
Career level       Blonde            Dark              Marginals
Professional       O = 36, E = 28    O = 24, E = 32    60
Nonprofessional    O = 48, E = 56    O = 72, E = 64    120
Marginals          84                96                N = 180

Chi-square test

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

$$\chi^2 = \frac{(36-28)^2}{28} + \frac{(24-32)^2}{32} + \frac{(48-56)^2}{56} + \frac{(72-64)^2}{64} = 6.43$$
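The expected values and the chi-square sum for a rows-by-columns table follow mechanically from the marginals. The sketch below assumes the table is given as a list of rows; the function name `chi_square_independence` is hypothetical, and the degrees of freedom are computed with the standard (r - 1)(c - 1) rule.

```python
def chi_square_independence(table):
    """Return (chi2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n_total = sum(row_totals)
    # Expected cell count under independence: row marginal * column marginal / N.
    expected = [[r * c / n_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Rows: Professional, Nonprofessional; columns: Blonde, Dark.
hair_by_career = [[36, 24], [48, 72]]
chi2_obs, df, expected = chi_square_independence(hair_by_career)

print("expected:", expected)
print(f"chi2_obs = {chi2_obs:.2f}, df = {df}")
```

The same function applies unchanged to larger tables, since nothing in it depends on the table being 2 x 2.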
Hypothesis testing
- H0: The row and column variables are independent in the population.
  The job level of the character is independent of the hair color of the character.
- H1: The row and column variables are related in the population.
  The job level of the character is related to the hair color of the character.

Hypothesis testing
- Look up the critical value at an alpha of 0.05 or 0.01 (Table C.7, p. 504).
- df = (r - 1)(c - 1)
- If χ²_obs > χ²_crit, reject H0, accept H1.
- If χ²_obs < χ²_crit, fail to reject H0, do not accept H1.

Example results
- χ²_obs = 6.43
- χ²_crit = 3.84 (df = 1, α = 0.05)
- Reject H0, accept H1.
- There is a relationship between job level and hair color of a TV character.

Example 2
Are firstborn children more creative than laterborn children?

                               Birth order
Creativity test score    Firstborn    Laterborn
Top 1/3                  47           29
Middle 1/3               29           35
Bottom 1/3               24           36

Hypotheses
- H0: The distribution of creativity scores is the same for first and later born.
- H1: The distribution of creativity scores is different for first and later born.

Expected Scores

E = (row marginal × column marginal) / total responses

                               Birth order
Creativity test score    Firstborn            Laterborn            Row marginal
Top 1/3                  O = 47, E = ___      O = 29, E = ___      76
Middle 1/3               O = 29, E = ___      O = 35, E = ___      64
Bottom 1/3               O = 24, E = ___      O = 36, E = ___      60
Column marginal          100                  100                  200

Birth order example
- χ²_obs = 7.23
- df = (3 - 1)(2 - 1) = 2
- χ²_crit = 5.99
- Reject H0 that birth order and creativity are independent.

Rank tests
- Continuous data.
- Rank them.
- The test is a function of the ranks.

Rank tests
- Spearman correlation (test that r = 0)
- Mann-Whitney U
- Wilcoxon

Rank tests: When to use
- Continuous (interval or ratio) data
- Normality assumption doesn’t hold
- Equal variance assumption doesn’t hold

Mann-Whitney U test
Use with:
- Two-group between-subjects design
- At least ordinal measurement (so the values can be “ranked”)
This is the “nonparametric” version of the t-test for two independent groups.

Schroeder staircase: Can you see both directions?
Two groups saw this figure and reported how long it took them to see the reverse.
- One group had no distractions.
- The other group counted backwards by 3s while viewing the figure.

Results

Group           Scores
Control         2  5  6  8  9  13  15  21  42
Experimental    4  10  11  12  14  17  85  98  ∞
This is why we can’t do a t-test!

Mann-Whitney U analyses
(Group 1 = A1 = Control, Group 2 = A2 = Experimental)

Score    Group ID    Rank    # times A1 before A2    # times A2 before A1
2        1           1       9
4        2           2                               8
5        1           3       8
6        1           4       8
8        1           5       8
9        1           6       8
10       2           7                               4
11       2           8                               4
12       2           9                               4
13       1           10      5
14       2           11                              3
15       1           12      4
17       2           13                              2
21       1           14      3
42       1           15      3
85       2           16                              0
98       2           17                              0
∞        2           18                              0

Mann-Whitney U analyses
- U_A1 = 9+8+8+8+8+5+4+3+3 = 56
- U_A2 = 8+4+4+4+3+2 = 25
- Or use the formula U_A2 = n_A1 n_A2 - U_A1:
  U_A2 = 9*9 - 56 = 81 - 56 = 25
- Use the smaller value in the test.

Mann-Whitney U hypotheses
- H0: The population distribution of A1 scores is identical to the population distribution of A2 scores.
- H1: The population distribution of A1 scores is not identical to the population distribution of A2 scores.

Mann-Whitney U hypothesis testing
Red Alert! This test is different! Here a small observed value leads to rejection.
- Look up U_crit (pp. 505-506, Tables C8.A and 8.B).
- If U_obs < U_crit: reject H0, accept H1.
- If U_obs > U_crit: fail to reject H0, do not accept H1.
- U_crit = 17, U_obs = 25: fail to reject H0, do not accept H1.

Computational formula

$$U_{A1} = n_{A1} n_{A2} + \frac{n_{A1}(n_{A1}+1)}{2} - \sum R_{A1}$$

$$U_{A2} = n_{A1} n_{A2} - U_{A1}$$

where
- n_A1 = number of scores in group A1
- n_A2 = number of scores in group A2
- Σ R_A1 = sum of the ranks assigned to scores in group A1

Example
Will receiving an alcohol education program result in reduced estimated daily alcohol consumption?
- A1: 0.31, 0.53, 0.58, 0.14, 0.16, 0.52, 0.53, 0.02
- A2: 0.41, 0.63, 1.14, 0.21, 0.89, 0.55, 0.89, 0.91, 0.08, 0.59

Summary
- Nonparametric tests are appropriate when parametric (normal distribution) tests cannot or should not be used.
- Contingency table (chi-square) tests are of the form Σ (O - E)²/E, where E is the “expected” number under the null hypothesis.
- Rank tests are for continuous data when the normality or equal-variance assumptions do not hold, and are functions of the ranks.
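As a worked illustration of the Mann-Whitney computational formula, the sketch below ranks the pooled scores and applies the rank-sum formula to both data sets from the slides. It is a sketch under assumptions: the function name `mann_whitney_u` is hypothetical, and tied scores receive the average of their ranks (midranks), which the slides do not discuss. The ∞ score in the staircase data is represented as `math.inf`, which ranks last without needing a numeric value.

```python
import math


def mann_whitney_u(a1, a2):
    """Return (U_A1, U_A2) from the rank-sum formula, using midranks for ties."""
    pooled = sorted(a1 + a2)

    def midrank(value):
        # Average of the 1-based positions that `value` occupies in the pooled sort.
        first = pooled.index(value) + 1
        count = pooled.count(value)
        return first + (count - 1) / 2

    n1, n2 = len(a1), len(a2)
    rank_sum_a1 = sum(midrank(v) for v in a1)
    u_a1 = n1 * n2 + n1 * (n1 + 1) / 2 - rank_sum_a1
    u_a2 = n1 * n2 - u_a1
    return u_a1, u_a2


# Schroeder staircase data: the formula reproduces the counts obtained by hand.
control = [2, 5, 6, 8, 9, 13, 15, 21, 42]
experimental = [4, 10, 11, 12, 14, 17, 85, 98, math.inf]
print("staircase:", mann_whitney_u(control, experimental))

# Alcohol education example (A1 has 8 scores, A2 has 10).
alcohol_a1 = [0.31, 0.53, 0.58, 0.14, 0.16, 0.52, 0.53, 0.02]
alcohol_a2 = [0.41, 0.63, 1.14, 0.21, 0.89, 0.55, 0.89, 0.91, 0.08, 0.59]
print("alcohol:", mann_whitney_u(alcohol_a1, alcohol_a2))
```

The smaller of the two U values is the one compared against the tabled critical value for the given group sizes.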
Nezami, Borovay, '07