Probability & the Binomial Test

Outline
Probability
Binomial Distribution
Z-Approximation to the binomial
The Sign Test

Non-Parametric Statistics
Non-parametric statistics: No assumptions about the shape of the population.
Nominal & ordinal data.
This class will only cover chi-square (χ²), Binomial, and Sign Test; all use nominal data.

The Binomial Distribution
There is only one Z-distribution. This is the null hypothesis distribution for the Z-test.
There are INFINITE binomial distributions, one for each combination of N and p.
We can set up these different null hypothesis distributions using three different methods.
Assumptions for the Binomial:
Events are dichotomous.
Events are mutually exclusive, independent, and randomly selected.
The number of observations/trials (N) is fixed.
The probability of success (p) is the same for each trial.
Setting up the Null Hypothesis Distribution
Z-distribution: continuous distribution, defined by the mean and standard deviation.
Special case: the Standard Normal distribution (mean = 0, standard deviation = 1).
One-Sample Z-test: mean of the null = 0, standard deviation = 1.
Critical regions with alpha = 0.05: +/-1.96.
[Figure: standard normal curve with critical values marked at -1.96 and +1.96]
Binomial distribution: discrete distribution, defined by the mean and standard deviation.
N = # of trials
p = probability (P) of a success
q = probability (P) of a failure (or 1 - p)
Mean: $\mu = Np$
SD: $\sigma = \sqrt{Npq}$
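As a quick check of the two formulas, here is a minimal Python sketch that computes the mean and standard deviation for the fair-coin case used in these slides (N = 20, p = 0.50). The function name `binomial_mean_sd` is only illustrative and does not come from the slides.

```python
import math

def binomial_mean_sd(n, p):
    """Return (mean, standard deviation) of a binomial distribution."""
    q = 1.0 - p                  # probability of a failure
    mean = n * p                 # Mean = Np
    sd = math.sqrt(n * p * q)    # SD = sqrt(Npq)
    return mean, sd

mean, sd = binomial_mean_sd(20, 0.50)
print(f"mean = {mean:.2f}, SD = {sd:.4f}")
```

Note that the standard deviation is the square root of Npq; Npq itself is the variance.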
What's the probability of getting a "heads" on a single coin flip?
p = 0.50
q = 0.50 (or 1 - p)
[Figure: Binomial Distribution, N = 20, p = 0.50; probability (0 to 0.2) plotted for each number of successes (0 to 20)]

Method 1: Setting up the Null for the Binomial Distribution
The number of ways a particular outcome can occur and the probability of that outcome.
Let's say we were to flip a fair coin 20 times. What would the binomial distribution look like?
$$P(x) = \frac{N!}{x!\,(N-x)!}\, p^{x} q^{N-x}$$

Combinations: $\frac{N!}{x!\,(N-x)!}$
N = number of independent trials
x = number of outcomes (successes) out of N trials
p = probability that the event occurs (success)
q = probability that the event does not occur (~success)

$$P(x=0) = \frac{20!}{0!\,(20-0)!}\, 0.50^{0}\, q^{20-0} = 0.00000095367$$
$$P(x=1) = \frac{20!}{1!\,(20-1)!}\, 0.50^{1}\, q^{20-1} = 0.00001907349$$
$$P(x=10) = \frac{20!}{10!\,(20-10)!}\, 0.50^{10}\, q^{20-10} = 0.176197$$
$$P(x=19) = \frac{20!}{19!\,(20-19)!}\, 0.50^{19}\, q^{20-19} = 0.00001907349$$
$$P(x=20) = \frac{20!}{20!\,(20-20)!}\, 0.50^{20}\, q^{20-20} = 0.00000095367$$
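The hand calculations above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; `binomial_pmf` is an illustrative name, not something defined in the slides.

```python
import math

def binomial_pmf(x, n, p):
    """Exact probability of x successes in n trials: C(n, x) * p^x * q^(n - x)."""
    q = 1.0 - p
    return math.comb(n, x) * p**x * q**(n - x)

# Fair coin flipped 20 times: the same x values worked out by hand.
for x in (0, 1, 10, 19, 20):
    print(f"P(x = {x:2d}) = {binomial_pmf(x, 20, 0.50):.11f}")
```

`math.comb(n, x)` is the "combinations" term N!/(x!(N - x)!), so the function is a direct transcription of the formula.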
1. Density of 1 under the curve: If we were to add up all of the exact probabilities (from x = 0 to x = 20), then our sum would be equal to 1.
2. Symmetry: The probability for x = 0 equals that of x = 20, the probability for x = 1 equals that of x = 19, etc. This holds ONLY because p = 0.50; if we change p then we'll get a skewed shape.
3. Cumulative probabilities: If we want to know the probability of getting 3 or fewer heads out of 20 coin flips, we add up the exact probabilities:

$$P(x=0) = \frac{20!}{0!\,(20-0)!}\, 0.50^{0}\, q^{20-0} = 0.00000095367$$
$$P(x=1) = \frac{20!}{1!\,(20-1)!}\, 0.50^{1}\, q^{20-1} = 0.00001907349$$
$$P(x=2) = \frac{20!}{2!\,(20-2)!}\, 0.50^{2}\, q^{20-2} = 0.00018119$$
$$P(x=3) = \frac{20!}{3!\,(20-3)!}\, 0.50^{3}\, q^{20-3} = 0.00108719$$
$$P(x \le 3) = P(x=0) + P(x=1) + P(x=2) + P(x=3) = 0.00000095367 + 0.00001907349 + 0.00018119 + 0.00108719 = 0.0012884$$
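The three properties can be checked numerically. The sketch below assumes the fair-coin case (N = 20, p = 0.50); the helper names `binomial_pmf` and `binomial_cdf` are illustrative only.

```python
import math

def binomial_pmf(x, n, p):
    """Exact probability of x successes in n trials."""
    return math.comb(n, x) * p**x * (1.0 - p)**(n - x)

def binomial_cdf(x, n, p):
    """Cumulative probability P(X <= x): the sum of the exact probabilities."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

n, p = 20, 0.50

# 1. The exact probabilities from x = 0 to x = N add up to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

# 2. Symmetry: P(x = k) equals P(x = N - k), which holds only for p = 0.50.
symmetric = all(
    math.isclose(binomial_pmf(k, n, p), binomial_pmf(n - k, n, p))
    for k in range(n + 1)
)

# 3. Cumulative probability of 3 or fewer heads.
p_le_3 = binomial_cdf(3, n, p)

print(f"sum of all probabilities = {total:.6f}")
print(f"symmetric around N/2:      {symmetric}")
print(f"P(x <= 3) = {p_le_3:.7f}")
```

Rerunning the symmetry check with a different p (for example 0.25) shows the skew: the check then fails.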
[Figure: Binomial Distribution, N = 20, p = 0.50]
The binomial probability for obtaining r successes in N trials:
$$P(r) = \frac{N!}{r!\,(N-r)!}\, p^{r} (1-p)^{N-r}$$
Where:
N = number of trials
r = number of successes
p = probability of success on any one trial
Method 2: Setting up the Null for the Binomial Distribution
Using Tables
Adding up the exact probabilities by hand takes way too long. From a binomial table for N = 20, p = 0.50: P(x ≤ 3) ≈ 0.0013.
Binomial tables might be point specific or cumulative. The table I gave you is NOT cumulative, so you need to add up each cell in the column if you want cumulative information.
What does this have to do with statistics? What does this have to do with significance testing?
The Binomial Test
Oscar the Death Cat?
x = 8
N = 20
p = 0.50
Question: Is Oscar the Death Cat?
First: Find the regions of the distribution where the cumulative probability is p ≤ 0.025. We need to find the cumulative probability starting from x = 0 and from x = 20.
So the lower end of our distribution is x ≤ 5, because P(x ≤ 5) ≈ 0.0207 while P(x ≤ 6) ≈ 0.0577. What does this mean?
These are the probabilities we would expect to find BY CHANCE for x = 0, x = 1, x = 2, etc., up to x = 20, for an N of 20. This is the sampling distribution of the null hypothesis. What is our Type I error rate?
The upper end of our distribution is x ≥ 15, because P(x ≥ 15) ≈ 0.0207 while P(x ≥ 14) ≈ 0.0577. What does this mean?
[Figure: Binomial Distribution, N = 20, p = 0.50, with "Reject Ho" regions marked in both tails]
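One way to locate the rejection regions and answer the Type I error question is to accumulate exact probabilities from each tail until the 0.025 limit would be exceeded. The sketch below does this for N = 20, p = 0.50. The helper `lower_critical_value` and the rule "largest cut-off whose cumulative probability stays at or below 0.025" are assumptions made for illustration, not a procedure taken from the slides.

```python
import math

def binomial_pmf(x, n, p):
    """Exact probability of x successes in n trials."""
    return math.comb(n, x) * p**x * (1.0 - p)**(n - x)

def lower_critical_value(n, p, tail_alpha):
    """Largest c with P(X <= c) <= tail_alpha, or -1 if no such c exists."""
    cumulative = 0.0
    c = -1
    for x in range(n + 1):
        cumulative += binomial_pmf(x, n, p)
        if cumulative > tail_alpha:
            break
        c = x
    return c

n, p, x_obs = 20, 0.50, 8
lower = lower_critical_value(n, p, 0.025)
upper = n - lower  # mirror image of the lower cut-off; valid only because p = 0.50

# Actual Type I error rate: total probability inside both rejection regions.
type_i = sum(
    binomial_pmf(x, n, p) for x in range(n + 1) if x <= lower or x >= upper
)
reject = x_obs <= lower or x_obs >= upper

print(f"reject Ho if x <= {lower} or x >= {upper}")
print(f"actual Type I error rate = {type_i:.4f}")
print(f"x = {x_obs}: {'reject' if reject else 'retain'} Ho")
```

Because the distribution is discrete, the actual Type I error rate comes out below the nominal 0.05. Oscar's x = 8 falls between the two cut-offs.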
Three methods of setting up the null hypothesis distribution for each Binomial Distribution:
1. Calculation by Hand = very time consuming!
2. Use of Binomial Tables = dependent upon outside resources.
3. ...

The Standard Normal Distribution
[Figure: standard normal curve with ~68%, ~95%, and ~99% regions marked]

The Z-approximation to the Binomial
Method 3
Assumptions: Same assumptions apply as with the Binomial.
When to use:
Independent events or trials
Two possible outcomes
Outcomes are mutually exclusive.
Np & Nq are both ≥ 5. This is the criterion for determining that a binomial sampling distribution is a sufficiently close approximation of the normal distribution.

"Correction for Continuity"
Aimed at transforming the binomial distribution into the smooth curve of the normal distribution.
Binomial = Discrete Distribution
Z = Continuous Distribution
How do you know when to add or subtract 0.50?
ADD: When x is less than the mean.
SUBTRACT: When x is greater than the mean.
The correction is built into the Z-approximation formula (use the real limit of x):
$$z = \frac{(x - \mu) \pm 0.50}{\sigma} \quad \text{OR} \quad z = \frac{(x - Np) \pm 0.50}{\sqrt{Npq}}$$

Example
A multiple choice test has four possible answers for each of 20 questions. What is the probability of getting a correct answer on any given question?
Answer = 0.25; this is "p".
The conditions of the binomial are assumed to be met:
1. There are 20 questions (N = 20).
2. Each question (i.e., trial) results in one of two possible outcomes (correct or incorrect).
3. The probability of being correct is 0.25 and is constant (i.e., this student did not study at all).
4. The questions are answered independently, given that the student's answer to a question does not influence his/her answer to another question.
The conditions for using the Z-approximation to the binomial are also met:
Binomial conditions
Np (= 5) and Nq (= 15) are both ≥ 5
Expected distribution: N = 20, p = 0.25
Billy is a student taking the exam. He did not study at all, so we can assume Billy is guessing. What is the probability that Billy passes the exam given that he is guessing? OR what is the probability that Billy gets a 13 or better on the exam (65% or better)?
We know that Np (= 5) and Nq (= 15) are both greater than or equal to 5, so we can use the Z-approximation to the binomial to figure out the probability of passing.
$$\mu = Np = 20 \times 0.25 = 5.00$$
$$\sigma = \sqrt{Npq} = \sqrt{20 \times 0.25 \times 0.75} = 1.9365$$
$$Z = \frac{(x - \mu) - 0.50}{\sigma} = \frac{(13 - 5) - 0.50}{1.9365} = 3.873$$
Round Z = 3.87 to Z = 3.90 to fit the format of the table. Billy has about a 0.005% chance of passing the exam if he guessed on all 20 questions.
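The same calculation can be sketched in Python without a Z table. The helper names `z_approx` and `upper_tail` are illustrative, and the tail area is taken from the standard normal distribution via `math.erfc` rather than from the table the slides refer to.

```python
import math

def z_approx(x, n, p):
    """Continuity-corrected z for an observed count x under a binomial null."""
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    # Add 0.50 when x is below the mean, subtract 0.50 when x is above it.
    # (x exactly equal to the mean is not covered by the slides' rule;
    # this sketch subtracts in that case.)
    correction = 0.50 if x < mean else -0.50
    return (x - mean + correction) / sd

def upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Billy needs 13 or more correct out of 20 while guessing (p = 0.25).
z = z_approx(13, 20, 0.25)
print(f"Z = {z:.3f}")
print(f"P(Z >= z) = {upper_tail(z):.6f}")
```

Using the unrounded Z gives a tail area of roughly 0.005%, in line with the table lookup at Z = 3.90.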
Don't forget...
We can also find this out by using the original computation method (i.e., Method 1) or a Binomial table with the expected distribution of N = 20, p = 0.25. These ways are just longer or dependent upon outside resources.

What does this have to do with significance testing?
When α = 0.05 (2-tailed), we can compare the actual number (x) to our cut-offs (critical values) on our null hypothesis distribution (Z-distribution).
Let's take Billy, for example. Billy did not prepare for the exam at all. Let's say that Billy scored an 8/20. Do you suspect Billy of cheating?
Ho: p = 0.25
H1: p ≠ 0.25
$$Z = \frac{(x - \mu) - 0.50}{\sigma} = \frac{(8 - 5) - 0.50}{1.9365} = 1.291$$
OR compare ZOBS to ZCRIT: ZOBS = +1.29 falls between -1.96 and +1.96.
Answer: We retain the null hypothesis. We cannot conclude Billy cheated, Z = 1.291, p = 0.0985 (one-tailed; the two-tailed p is about 0.197). Billy scored similar to chance, i.e., similar to how he would score if he guessed on every question.
Rationale: Our critical Z-values are +/-1.96 (α = 0.05). Our observed value is not within the rejection region (i.e., it is not beyond +1.96, so p > 0.05); therefore we retain the null hypothesis.
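The decision for Billy's score can be traced step by step in code. This is a minimal sketch under the same assumptions as the example (N = 20, p = 0.25, x = 8, α = 0.05 two-tailed); the variable names are illustrative, and the p-values come from `math.erfc` rather than a printed Z table, so they may differ from table values in the last digit.

```python
import math

n, p, x = 20, 0.25, 8
mean = n * p                        # expected number correct when guessing
sd = math.sqrt(n * p * (1.0 - p))   # sqrt(Npq)

# x is above the mean, so the continuity correction subtracts 0.50.
z_obs = ((x - mean) - 0.50) / sd
z_crit = 1.96                       # two-tailed critical value for alpha = 0.05

one_tailed_p = 0.5 * math.erfc(z_obs / math.sqrt(2.0))
two_tailed_p = 2.0 * one_tailed_p
reject = abs(z_obs) > z_crit

print(f"Z_obs = {z_obs:.3f}, Z_crit = +/-{z_crit}")
print(f"one-tailed p = {one_tailed_p:.4f}, two-tailed p = {two_tailed_p:.4f}")
print("reject Ho" if reject else "retain Ho")
```

Either comparison leads to the same decision: the observed Z is inside the critical values, and the two-tailed p is above 0.05.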
The Sign Test
Special Case: Uses the normal approximation to the binomial.
Follows the same assumptions as the binomial.
When to use:
Comparing Time 1 to Time 2.
Code each participant's change in terms of (+) or (-) change.
Changing data to nominal data:
Disregarding magnitude; only keeping direction of change.
Throwing away a lot of information by changing to a binomial distribution.
Will learn later that a repeated-measures t-test is a better approach.
P(+) = 0.50 = p
P(-) = 0.50 = q
Ho: P(+) = P(-). Indicative of no change between times.
H1: P(+) ≠ P(-). Indicative of a change between times.

Before Drug   After Drug   Change?
35            30           +
40            35           +
50            50           0
49            49           0
31            28           +
30            25           +
55            59           -
34            35           -
42            43           -
38            32           +

(+ = fewer symptoms after the drug, - = more symptoms, 0 = no change)
N = 10
+ = 5 (this is x, the number of "successes")
- = 3
0 = 2
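The tally can be reproduced from the before/after scores. In this sketch a "+" is assumed to mean fewer symptoms after the drug, a coding chosen because it matches the count of 5 pluses above; `sign_of_change` is an illustrative name.

```python
before = [35, 40, 50, 49, 31, 30, 55, 34, 42, 38]
after = [30, 35, 50, 49, 28, 25, 59, 35, 43, 32]

def sign_of_change(b, a):
    """Keep only the direction of change, discarding its magnitude."""
    if a < b:
        return "+"   # assumed coding: fewer symptoms after the drug
    if a > b:
        return "-"   # more symptoms after the drug
    return "0"       # tie: no change

signs = [sign_of_change(b, a) for b, a in zip(before, after)]
counts = {s: signs.count(s) for s in "+-0"}

print(" ".join(signs))
print(f"N = {len(signs)}, + = {counts['+']}, - = {counts['-']}, 0 = {counts['0']}")
```

The magnitudes (a drop of 5 versus a rise of 1) play no role once the data are coded this way, which is the information the sign test throws away.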
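$$z = \frac{(x - Np) \pm 0.50}{\sqrt{Npq}}$$
With the two ties split between the groups (+ = 6, - = 4), x = 6:
$$z = \frac{(6 - (10 \times 0.50)) - 0.50}{\sqrt{10 \times 0.50 \times 0.50}} = 0.316$$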
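The sign-test Z can be computed for both ways of handling ties that this section lists (dropping them or splitting them). The sketch below is illustrative: `sign_test_z` is a hypothetical helper, and splitting assumes an even number of ties so they divide equally.

```python
import math

def sign_test_z(x, n, p=0.50):
    """Continuity-corrected z for x pluses out of n signs under Ho: P(+) = P(-)."""
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    # Add 0.50 when x is below the mean, otherwise subtract 0.50.
    correction = 0.50 if x < mean else -0.50
    return (x - mean + correction) / sd

plus, minus, ties = 5, 3, 2

# Split ties: half of the ties go to each sign, so x = 6 and N = 10.
z_split = sign_test_z(plus + ties // 2, plus + minus + ties)

# Drop ties: ignore them entirely, so x = 5 and N = 8.
z_drop = sign_test_z(plus, plus + minus)

print(f"split ties: Z = {z_split:.3f}")
print(f"drop ties:  Z = {z_drop:.3f}")
```

Both values are far inside +/-1.96, so the choice of tie handling does not change the decision for these data.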
Compare our ZOBS to our ZCRIT of +/-1.96. We retain the null hypothesis. We cannot conclude that the antidepressant changed the number of depressive symptoms, Z = 0.316, p = 0.3745 (one-tailed; the two-tailed p is about 0.75).

What to do with the ties?
Option #1: Drop ties
Option #2: Split ties (+ = 6, - = 4)
This makes sense because if our null hypothesis is P(+) = P(-), we would expect a 50/50 split, or 5 pluses and 5 negatives. We found 6 pluses and 4 negatives, very close to the expected value. ZOBS = 0.32

In Sum
Five things you need to do to work with the Binomial:
1. Define "success" (e.g., heads, rolling a 4, death)
2. Define the probability of a "success" (e.g., p = 0.50)
3. Find the probability of a "failure" (e.g., 1 - p)
4. Define the number of trials (e.g., N = 20)
5. Define the number of "successes" (i.e., x) out of those trials (e.g., x = 5)
...
- Spring '08