Basics of the Course
The course is taught using PowerPoint.
These lecture notes WILL change as the
term progresses.
Special Slide Pictures
Data Notation
Denote data by x1, x2, x3, …, xn where n
is the number of data values we have, called
the sample size.
The collection of x1, x2, x3, …, xn is called a
dataset whereas a particular value xi is
called a datum, data value or observation.
Example (Spoof)
http://ca.youtube.com/watch?v=MQw12_kNAhU&feature=related Example
Description
This data set gives the average heights and weights for
American women aged 30–39.
Obs  height  weight
1    58      115
2    59      117
3    60      120
4    61      123
5    62      126
Data Types
NOTE: A quantitative variable can be made qualitative…we’ll see
in a second… Examples with Clickers
1) Height
2) Grades
Clicker Responses:
A) Qualitative
B) Categorical
C) Quantitative
D) Both A and B
Quantitative Data Examples with Clickers
1) Height
2) Number of Cats Owned by Canadians
Clicker Responses:
A) Discrete
B) Continuous
C) Both
D) Neither
Qualitative Data Examples with Clickers
1) Body Size: Skinny, Normal, Obese
2) Type of Stone: Granite
Clicker Responses:
A) Discrete
B) Continuous
C) Both
D) Neither
Analysis
Raw data is hard to analyse. For example,
consider the pH values below. Remember a
pH of 7 is neutral, a pH < 7 is acidic and a pH >
7 is basic.
Are the 100 lakes below acidic, basic or
neutral?
Data
5 6 8 6 7 8 8 7 8 6 8 6 7 7 5 5 7
9 8 7 6 8 8 8 5 6 8 8 5 7 6 7 8 8
4 8 6 6 6 6 6 7 8 8 5 7 7 5 7 8
7 8 6 6 7 6 8 8 6 8 8 7 7 8 7 6
10 6 8 6 6 8 6 6 7 8 7 6 8 7 6 7
7 7 6 5 7 8 7 7 7 6 6 8 7 7 7 7
8 7
Well???
Dataset Characteristics
3 Characteristics
1.
2.
3. Analysis
Two Techniques:
1.
2. Example Dataset
Consider the following data:
1,2,2,3,3,3,4,4,4,4,5,5,5,6,6,7
We can build a display simply by ticking off
every time we see a number. 1,2,2,3,3,3,4,4,4,4,5,5,5,6,6,7 Center
Rough Definition – The middle of the data
Pictorially
Spread
Rough Definition – How separated our data
values are.
Shape
The appearance of the data. Shape
The shape of a dataset can be determined
numerically using measures such as
Kurtosis and Skew – but we will not
investigate these statistics in this course. Center
There are 3 measures of center:
A)
B)
C) Mode
The most popular value. Also the most useless statistic.
e.g.
1,1,1,2,20 Mean
You would call it an “average”.
Notation:
Data:
Mean:
Example
Consider the data:
1, 1, 1, 2, 20
The mean is:
Median
The middle value of the data.
Notation (Median):
Notation (Sorted Data):
Algorithm: Median
Given the data: x1, x2, x3, …, xn.
1. Sort the data from smallest to largest.
2. If n is odd, then take the middle value.
3. Else if n is even, take the average of the
middle 2 values.
Example
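The three steps of the median algorithm can be written as a short function. This is a minimal sketch; the function name is an illustrative choice and not course notation.

```python
def median(values):
    """Median by the three steps: sort, then middle value or mean of the middle two."""
    ordered = sorted(values)          # step 1: smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:                    # step 2: n is odd, take the middle value
        return ordered[mid]
    # step 3: n is even, average the middle two values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 1, 20, 2, 1]))            # n = 5 (odd)
print(median([1, 1, 20, 3, 1, 4, 12, 2]))  # n = 8 (even)
```

Sorting a copy with sorted() leaves the original data order untouched, which matters if the raw order is needed later.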
1. Sort
2. n odd = middle
n even = average 1,1,20,2,1 Example
1,1,20,3,1,4,12,2 What is the median?
A) 1
B) 1.5
C) 2
D) 2.5
E) 3 Example
1. Sort
1, 1, 1, 2, 3, 4,12, 20
2. n odd = middle
n=8
n even = average Q2 = Outliers
Outliers are values that are more extreme
than the others.
For example: 1, 2, 3, 4, 1000
For example: 0.8, 11, 0.1, 0.6, 1, 0.3, 0.9 Summary: 1,1,1,2,20
Mode 1 Mean 5 Median 1 Question
Why is the mean different from the median
and mode? Order Statistics
The median is called the “second quartile”.
This implies there are “other” quartiles.
A quartile derives its name from quarter and
each quartile divides the data into quarters. Pictorially In Words
25% of our data is below Q1, the first quartile.
50% of our data is below Q2, the second
quartile.
75% of our data is below Q3, the third
quartile. Algorithm: Q1
1. Perform the Median Algorithm.
2. Remove all data above the median.
3. Perform the Median Algorithm on the
remaining data.
4. This is the middle of the lower half of the
data, the first quartile. Example
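The Q1 algorithm can be sketched as the median of the lower half of the sorted data. The steps do not say what to do with the median itself when n is odd, so this sketch assumes it is left out of the lower half; that convention and the function names are assumptions, not the course's stated rule.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2


def first_quartile(values):
    """Q1 as the median of the lower half of the sorted data.

    Assumption: when n is odd, the median itself is left out of the lower half.
    Needs at least two data values.
    """
    ordered = sorted(values)
    lower_half = ordered[: len(ordered) // 2]
    return median(lower_half)


data = [0.8, 11, 0.1, 0.6, 1, 0.3]
print(sorted(data))
print(first_quartile(data))
```

For an even sample size such as this one (n = 6), the two halves are unambiguous, so the convention for odd n does not affect the result.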
Given the data:
0.8, 11, 0.1, 0.6, 1, 0.3
1. Sort it
0.1, 0.3, 0.6, 0.8, 1.0, 11.0
2.
RECALL
Dataset Characteristics
3 Characteristics
1. Center
2. Spread
3. Shape Spread
There are several ways in which we can calculate spread:
1.
2.
3.
4.
5.
Range
The range gives the distance between the
largest and smallest values.
Formula in Words: Formula with Notation: Interquartile Range
The interquartile range gives the distance
covered by the middle 50% of the data.
Formula: Data:
Which dataset has
more spread?
A) 1
B) 2
C) 3
D) 1 = 2
E) none of the above Data 1:
1, 2, 3
Data 2:
1, 1, 1, 2, 3, 3, 3
Data 3:
100, 100.5, 101,
101.5, 102 Range Calculation
Data 1:
1, 2, 3
Data 2:
1, 1, 1, 2, 3, 3, 3
Data 3:
100, 100.5, 101, 101.5, 102 IQR Calculation
Data 1:
1, 2, 3
Data 2:
1, 1, 1, 2, 3, 3, 3
Data 3:
100, 100.5, 101, 101.5, 102 Standard Deviation
In words:
the standard deviation is approximately the
average distance the data values are from
the center. Formulas
1. Not nice for Calculation, but great for
interpretation. Formulas
2. Useful for calculation but NOT
interpretation. Formulas
3. Another one! Useful for calculation but
NOT interpretation. 3 Formulae Example
Consider the data 1, 2, 3. Calculate the standard
deviation.
Example
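For the data 1, 2, 3, the calculation can be sketched in code. The formulas themselves are left blank on the slides, so this assumes the usual sample standard deviation with an n − 1 divisor, together with its usual shortcut form based on the sum of x and the sum of x squared. Function names are illustrative.

```python
import math


def sample_sd(values):
    """Sample standard deviation: root of the summed squared deviations over n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))


def sample_sd_from_sums(n, sum_x, sum_x2):
    """Same quantity computed from n, the sum of x and the sum of x squared."""
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))


data = [1, 2, 3]
print(sample_sd(data))
print(sample_sd_from_sums(len(data), sum(data), sum(x * x for x in data)))
```

The first function follows the "average distance from the center" reading; the second needs only the two sums, which is why it is handy when the raw data are not given.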
Given: 10 ∑i 10 2
i x i=1 0 ; ∑ i x =1 5 0 0 Calculate the standard deviation: Example
Given: 10 ∑i 10 2
i x i=1 0 ; ∑ i x =1 5 0 0 Calculate the standard deviation: Interpretation
Deviation
Definition in words: Definition numerically: Standard Deviation
The standard deviation is approximately the
average deviation.
Why approximately???? Deviation Example
Consider the data: 1, 2, 3
Calculate the average deviation: Clicker Question
Pick 3 numbers. Calculate the average
deviation. The answer is:
A) 0
B) >0
C) <0
D) I just want the clicker mark.
E) None of the above.
What's the problem!!!???
How do we correct it?????
Other Issues
But….
1. Square rooting doesn’t undo squared
terms!
Example:
√(1² + 2² + 3²) ≠ √1² + √2² + √3²
2. Because of “1”, our value for s is too small,
so we divide by n − 1 instead of n.
n vs n − 1
Degrees of Freedom
n − 1 is called the degrees of freedom.
Another way of thinking about degrees of
freedom:
Suppose I gave you n data values at
random. They are “free” to be whatever I
want them to be. Degrees of Freedom Continued
Now, instead of n data values, I give you n − 1
values + the average. Is that last data
value, the nth, “free”?
Range Vs. Standard Deviation
(Typical Plot)
[Figure: axis running from minimum through Center to maximum, marked off in standard deviations]
Maximum − Minimum = 6s
Which means....
s = Range/6
Note: Sometimes it is not 6 but 4 or another
constant...this depends on the data. Interpretation
For a set of data, the standard deviation is 5.
Is this big, small or uncertain?
A) Big
B) Small
C) Uncertain
Interpretation and Units
Variance
The variance is merely the square of the
standard deviation.
Notation: Coefficient of Variation (CV)
Formula: Interpretation/Use: Example
The length of fish Riley catches, in m on
Monday: 1, 2, 3. In cm on Tuesday: 100, 200, 300
Surface Investigation
Monday Tuesday Which has the
greatest spread?
A) Monday
B) Tuesday
C) Neither Answer Units
The standard deviation, mean, mode,
median all have the same units as the
data.
The variance, which is equal to the standard
deviation squared, has units squared.
Graphical Techniques
In addition to numeric techniques, we have
graphical techniques that can be used to
analyze data.
These graphical techniques include
boxplots, dot plots etc… Example Dataset
Consider the following data:
1,2,2,3,3,3,4,4,4,4,5,5,5,6,6,7
We can build a display simply by ticking off
every time we see a number. 1,2,2,3,3,3,4,4,4,4,5,5,5,6,6,7 Dotplots
A dot plot is similar to this tick mark game
that we've played since children. Each
data value is plotted and replaced by a
point.
Hence the data 1,2,3 would look like: 1 2 3 Dotplots with Repeats
For a single set of data we may be
interested in the repeats. In such a case
we may draw a dot for every repeat.
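A dot plot with repeats can be sketched in plain text by counting how often each value occurs. The plot is drawn sideways here (dots run to the right) purely to keep the printing simple; that layout and the function name are assumptions of this sketch.

```python
from collections import Counter


def dot_plot(values):
    """Return a text dot plot: one row per distinct value, one dot per repeat."""
    counts = Counter(values)
    return "\n".join(f"{value}: " + "." * counts[value] for value in sorted(counts))


data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 7]
print(dot_plot(data))
```

This is the same tally as the tick-mark display built earlier from this dataset, with a dot in place of each tick.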
Eg. 1,1,2,3 1 2 3 Example: Soybean What can you see with this
plot?? Frequency Distribution Example
Example: Who is your favourite actor?
A) Brad Pitt
B) This guy
C) Angelina Jolie
D) Her
E) Someone else/don't want to answer
Frequency
We build bars which have a height equal to
the frequency with which a response
occurs.
Non-Categorical Data
If our data is not categorical, we first build
intervals for the data.
Intervals are created subjectively but should
all be the same size.
The x axis contains the intervals while the y is
the frequency. Example: Grades
What is your Calculus 1 grade?
A) 85% to 100%
B) 70% to 85%
C) 55% to 70%
D) 40% to 55%
E) Prefer not to say. Intervals
These intervals are chosen subjectively.
I could have chosen any set. I did try to
choose them to make them all the same
size.
Clicker Questions
The shape is:
A) Bell
B) Skewed left
C) Skewed right
D) uniform (flat)
E) none of the above Clicker Questions
The center is:
A) 576
B) 578
C) 579
D) 581
E) none of the above Relative Frequency Example
We divide each frequency by n.
The plot is otherwise the same.
Example
[Figure: density histogram, "Depth of Lake Huron in Feet 1875–1972"; x-axis: Lake Huron (575 to 582), y-axis: Density (0.00 to 0.35)]
Clicker Question
What is the proportion of times that Lake
Huron was less than 578 feet deep?
A) 10%
B) 12%
C) 24%
D) Not able to say.
Boxplots
Unmodified Boxplot
[Figure: boxplot labelled Min, Q1, Q2, Q3, Max, with IQR = Q3 − Q1 and Range = Max − Min]
Recall: Outliers
Outliers: Data values that are more extreme
(larger or smaller) than the others.
E.g. 1,1,2,2,3,3,4,4,5,5,6,6,25 Finding Outliers
What is an outlier mathematically? Obviously from
the data above the number 25 is suspect.
Any value that is:
Less than the lower limit: LL = Q1 − 1.5(IQR)
Greater than the upper limit: UL = Q3 + 1.5(IQR)
Why 1.5 times??
Math to Prove 25 is an Outlier
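The check against the lower and upper limits can be sketched in code for the data 1,1,2,2,3,3,4,4,5,5,6,6,25. The quartile convention is an assumption of this sketch: Q1 and Q3 are taken as the medians of the lower and upper halves, with the overall median left out of both halves when n is odd. Function names are illustrative.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2


def outlier_limits(values):
    """Return (LL, UL) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    with the overall median left out of both halves when n is odd.
    """
    if len(values) < 2:
        raise ValueError("need at least two data values")
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[-half:])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


data = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 25]
lower, upper = outlier_limits(data)
outliers = [x for x in data if x < lower or x > upper]
print(lower, upper, outliers)
```

Other quartile conventions move the limits slightly, but for this dataset the value 25 still lies above the upper limit if the median is kept in both halves instead.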
1,1,2,2,3,3,4,4,5,5,6,6,25 Example Continued
1,1,2,2,3,3,4,4,5,5,6,6,25 Example Continued
1,1,2,2,3,3,4,4,5,5,6,6,25 Modified Boxplot
Unless stated otherwise I am asking
about the modified boxplot!
The difference: The upper whisker ends at
either the maximum or the largest data value
below the UL, whichever is closer to the center.
The lower whisker ends at either the minimum
or the smallest data value above the LL, whichever is
closer to the center.
Modified Boxplot
[Figure: modified boxplot marking Q1, Q2, Q3 and an outlier, with IQR = Q3 − Q1 and Range = Max − Min]
Example Using:
1,1,2,2,3,3,4,4,5,5,6,6,25 Boxplots and Shape
• The box (Q1 to Q3) gives a good
indication of the shape of our data.
[Figure: three boxplots labelled A, B and C]
Boxplot A is:
A) Symmetric (Bell)
B) Skewed left
C) Skewed right
D) Uniform (flat)
E) None of the above. Boxplots and Shape
• The box (Q1 to Q3) gives a good
indication of the shape of our data.
[Figure: three boxplots labelled A, B and C]
Boxplot B is:
A) Symmetric (Bell)
B) Skewed left
C) Skewed right
D) Uniform (flat)
E) None of the above. Stem And Leaf Plots Loss of Information
Individual data values are lost when we
draw a boxplot, histogram, dot plot etc…
The Stem and Leaf plot attempts to counter
this issue. Example:
Problem: Measurements of the annual flow
of the river Nile at Aswan 1871–1970.
Plan: Not relevant.
Data
1120 1160 963 1210 1160 1160 813 1230 1370
1140 995 935 1110 994 1020 960 1180 799
958 1140 1100 1210 1150 1250 1260 1220 1030
1100 774 840 874 694 940 833 701 916
692 1020 1050 969 831 726 456 824 702
1120 1100 832 764 821 768 845 864 862
698 845 744 796 1040 759 781 865 845 944
984 897 822 1010 771 676 649 846 812
742 801 1040 860 874 848 890 744 749 838
1050 918 986 797 923 975 815 1020 906
901 1170 912 746 919 718 714 740 Stem and Leaf Plot Parts
The decimal point is 2 digit(s) to the right of the |
 4 | 6
 5 |
 6 | 5899
 7 | 000123444455667778
 8 | 000011222233344555556667779
 9 | 0011222244466678899
10 | 0122234455
11 | 00012244566678
12 | 112356
13 | 7
Stem and Leaf Plot Example
The decimal point is 2 digit(s) to the right of the |
 4 | 6
 5 |
 6 | 5899
 7 | 000123444455667778
 8 | 000011222233344555556667779
 9 | 0011222244466678899
10 | 0122234455
11 | 00012244566678
12 | 112356
13 | 7
Stem and Leaf Plot
What do you notice????
The decimal point is 2 digit(s) to the right of the |
 4 | 6
 5 |
 6 | 5899
 7 | 000123444455667778
 8 | 000011222233344555556667779
 9 | 0011222244466678899
10 | 0122234455
11 | 00012244566678
12 | 112356
13 | 7
Parts
1) Legend: “The decimal point is 2 digit(s) to the
right of the |”
a) This tells me that the numbers are 4|6 = 460.
b) If it had said “2 digit(s) to the LEFT of the |” then
4|6 = 0.046
2) Stem is the part to the left of “|”
3) Leaves are the parts to the right of the “|”
4) Each leaf represents a data value. Hence we
have 6 data values starting with 12.
Example
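The parts just listed (legend, stem, leaves) can be sketched as a small program. It assumes each value is rounded to the nearest ten before being split into stem and leaf, which appears consistent with the Nile plot (456 is shown as 4|6). The function name, the leaf_unit parameter and the short sample are illustrative choices.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=10):
    """Text stem-and-leaf plot; each value is rounded to the nearest leaf_unit."""
    rows = defaultdict(list)
    for value in values:
        rounded = int(value / leaf_unit + 0.5)   # round half up (non-negative data)
        stem, leaf = divmod(rounded, 10)
        rows[stem].append(leaf)
    lines = []
    for stem in range(min(rows), max(rows) + 1):  # keep empty stems so gaps show
        leaves = "".join(str(leaf) for leaf in sorted(rows.get(stem, [])))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return "\n".join(lines)


# A few of the Nile flows listed earlier, chosen only to keep the example short.
sample = [456, 649, 676, 692, 694, 698, 701, 702, 714, 1370]
print(stem_and_leaf(sample))
```

Empty stems are printed on purpose: dropping them would hide gaps in the data, which is part of what the plot is meant to show.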
Measurements of vein diameters were taken
on 100 patients. The following stem and
leaf plot was obtained. Example Continued
The decimal point is 2 digit(s) to the left of the |
32 | 78
33 | 224
33 | 5577777899
34 | 0000011111233333444
34 | 5566666678888888999
35 | 0001111111122223344
35 | 5555677788889999
36 | 0112244
36 | 56678
Based on the Legend 32|1 Means:
A) 321
B) 32.1
C) 3201
D) 3.21
E) None of the above
The decimal point is 2 digit(s) to the
left of the |
32 | 78
33 | 224
33 | 5577777899
34 | 0000011111233333444
34 | 5566666678888888999
35 | 0001111111122223344
35 | 5555677788889999
36 | 0112244
36 | 56678
What do you notice that is
interesting about the stems???
Why was this done??
The decimal point is 2 digit(s) to the
left of the |
32 | 78
33 | 224
33 | 5577777899
34 | 0000011111233333444
34 | 5566666678888888999
35 | 0001111111122223344
35 | 5555677788889999
36 | 0112244
36 | 56678
Example:
Problem: Does the stress of machinery
affect the ability of a soya plant to grow?
Further, does the amount of light influence
its ability to grow?
Plan:
52 seeds were potted with one seed per pot. The
52 seeds were randomly divided into 4 samples
with 13 seeds per sample. The seeds in 2
samples were stressed by being shaken for 20
minutes daily, while the seeds in the other two
were not shaken (no stress). The two samples
that received the same exposure to stress were
grown under different levels of light. Thus the
four samples of plants were allocated to one of 4
treatments that were defined by 2 basic
treatments, stress and light. Data:
ln   ly   mn   my
264  235  314  283
200  188  320  312
225  195  310  291
268  205  340  259
215  212  299  216
241  214  268  201
232  182  345  267
256  215  271  326
229  272  285  241
288  163  309  291
253  230  337  269
288  255  282  282
230  202  273  257
Analysis: Under which conditions would you want to
grow your Soybeans?
A) B) C) D)
Moderate Light, Stress
Low Light, Stress
Moderate Light, no stress
Low light, no stress
Example 2 – View Article
From:
Medical Article
http://www.amstat.org/publications/jse/v11n2/datasets.heinz.html Problem: To
investigate the human
body.
Plan: Measure the
items shown at left on
males and females.
Data: Measurements
of 247 men & 260
women
Analysis: See article
on last slide.
Is it possible for the
Biacromial
Measurement of a
particular female
to exceed that of a
particular male?
A) Yes
B) No
C) zzzzzzz
Probability
We can define probability in 3 ways.
Subjective
Relative frequency
Mathematical / classical
Subjective
Based on intuition we guess what the
probability is.
e.g. There’s a 99% chance I’ll pass!
Subjective
Adv: Disad: Relative frequency
The probability of something happening is the number
of times it occurs divided by the # of attempts.
e.g. Coins
Pretend everyone in class is using the same coin. Flip it.
What did you get??
A) Heads
B) Tails Question
Will you write the quizzes more than once
even if you got 100% on the first try?
A) Yes
B) No Relative Frequency
Adv: Disad: Classical
Experiment: A theoretically repeatable process or
phenomenon
e.g.
Trial: One repetition of an experiment
e.g.
Classical ctd.
Outcome: The result of our experiment.
Also called a “simple” event
We use capital letters to denote outcomes
e.g. A
Classical continued
Compound Event: If an event A is made up of more
than one “simple event”
e.g.
Classical Ctd
Universe or Sample Space:
The collection of all outcomes of an
experiment.
We denote it by “S”.
e.g.
Review
An outcome might be A = roll a one
An event might be, get an even #,
B = {2, 4, 6}
The size of an event/sample space is the # of
objects/simple events in it. We denote the
size by |B|
e.g. B = {2, 4, 6}, |B| = 3
Probability
Let E be an event containing |E| simple
outcomes.
Let S be the sample space with |S| simple
outcomes.
Then the probability E occurs is Pr(E) = |E|/|S|
Example
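The counting rule Pr(E) = |E|/|S| can be sketched with sets. This is a minimal illustration that assumes every simple outcome in S is equally likely; the function name is hypothetical.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """Pr(E) = |E| / |S| for equally likely simple outcomes."""
    event, sample_space = set(event), set(sample_space)
    if not event <= sample_space:
        raise ValueError("every outcome in E must belong to S")
    return Fraction(len(event), len(sample_space))


die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
print(classical_probability(even, die))          # |E| = 3, |S| = 6
print(classical_probability({"H"}, {"H", "T"}))  # one head out of two faces
```

Using Fraction keeps the answer as an exact ratio of counts, which mirrors how the probability is defined.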
1. What is the probability of getting a head on a
coin?
e.g.
A biologist classifies a colony of wild baboons
by fur colour.
E = having light-coloured fur
Of 150 animals observed, 5 are light-coloured
P(light-coloured fur) =
Example
In a genetic experiment brown rabbits are
crossed with black rabbits. As a result, of the
44 progeny, 13 are brown and 5 are black. The
remainder are mottled (various colours). What
is the probability you select a mottled rabbit? Properties of Probabilities
1. 2.
Properties of Probabilities
3. 4.
Properties of Probabilities
1) 0 ≤ P(E) ≤ 1
2) P(E) = 0 ≡ E never happens
3) P(E) = 1 ≡ E always happens
4) E1, E2, …, Em represent all possible and mutually
exclusive simple events:
P(E1) + P(E2) + … + P(Em) = 1
Leading Questions
What if…
we want to know the probability we select either a
brown OR mottled rabbit?
We want to know the probability that in 2 tries we
select a brown AND a mottled rabbit? Symbol 1
“OR”
Notationally we write:
In words we mean: Symbol 2
“AND”
Notationally we write:
In words we mean: Symbol 3
“Not”
Notationally we write:
In words we mean: Venn Diagrams
A Venn diagram is a pictorial representation of our
probability
The box is the sample space.
e.g. A circle within the box denotes a probability for an event.
e.g. Mutually Exclusive
Two events are mutually exclusive (ME) if they have no
outcomes in common or cannot occur together.
e.g. ME Events: e.g. Not ME Events: Clicker ME
Is the event “Person wears glasses” mutually
exclusive from the event “Person has freckles”?
A) Yes
B) No
C) Uncertain Mutual Exclusion
Are the events A = Roll a one on die 1; B =
Roll a one on die 2; mutually exclusive
(ME)?
A) Yes
B) No
Venn Diagram
In the following Venn diagram, the square
represents the…(best answer)
A) Event
B) Simple Event
C) An Outcome
D) Sample Space
ME
ME and VENN Diagrams
If two events are ME or disjoint, the circles are also
disjoint.
e.g.
Hence Pr(A or B) = Pr(A) + Pr(B).
Or in terms of our notation:
ME and VENN Diagrams
If two events are not ME, they overlap:
e.g.
Hence Pr(A or B) = Pr(A) + Pr(B) − Pr(AB)
Proof by Picture:
Pr(A or B) = Pr(A) + Pr(B) − Pr(AB)
Example
Problem: To investigate Seal pup fur colour.
Plan: Pups Categorized by Coat Colour and Sex
Data
             Sex
Colour       Male  Female  Total
Yellow       25    10      35
Thin White   10    5       15
Fat White    25    5       30
Grey         15    5       20
Total        75    25      N = 100
Notation
Let G denote Grey.
Let Y denote Yellow.
Let M denote Male.
Let W denote White.
Let T denote Thin. Question 0
What is the probability
a pup is not thin and
white? Sex
Colour M F Total Y 25 10 35 TW 10 5 15 FW 25 5 30 G 15 5 20 Total 75 25 N=
100 Question 1
What is the probability
a coat is Yellow?
A) 25/100
B) 10/100
C) 35/100
D) 25/75
        Sex
Colour  M   F   Total
Y       25  10  35
TW      10  5   15
FW      25  5   30
G       15  5   20
Total   75  25  N = 100
Details Details....
Sex
Colour M F Total Y 25 10 35 TW 10 5 15 FW 25 5 30 G 15 5 20 Total 75 25 N=
100 Question 2
What is the probability
a coat is Yellow or
Grey?
A) 25/100
B) 40/100
C) 55/100
D) 40/75
E) None of the Above
        Sex
Colour  M   F   Total
Y       25  10  35
TW      10  5   15
FW      25  5   30
G       15  5   20
Total   75  25  N = 100
Details Details....
Sex
Colour M F Total Y 25 10 35 TW 10 5 15 FW 25 5 30 G 15 5 20 Total 75 25 N=
100 Question 3
What is the probability
a randomly selected
pup is yellow and
male?
A) 85/100
B) 75/100
C) 35/100
D) 25/100 Sex
Colour M F Total Y 25 10 35 TW 10 5 15 FW 25 5 30 G 15 5 20 Total 75 25 N=
100 Details Details....
Sex
Colour M F Total Y 25 10 35 TW 10 5 15 FW 25 5 30 G 15 5 20 Total 75 25 N=
100 Question 4
Are the events yellow
and male ME?
A) Yes
B) No
C) Can't say
        Sex
Colour  M   F   Total
Y       25  10  35
TW      10  5   15
FW      25  5   30
G       15  5   20
Total   75  25  N = 100
Question 4 – Start
What about Yellow OR
male?? What is the
probability a randomly
selected pup is yellow
OR male?
        Sex
Colour  M   F   Total
Y       25  10  35
TW      10  5   15
FW      25  5   30
G       15  5   20
Total   75  25  N = 100
Details, Details…
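For the yellow OR male question, the rule for events that are not ME can be sketched with the counts from the seal pup table. The dictionary layout and variable names are illustrative; the counts are the ones in the table.

```python
from fractions import Fraction

# Counts from the seal pup table: colour -> (male, female).
counts = {"Y": (25, 10), "TW": (10, 5), "FW": (25, 5), "G": (15, 5)}
total = sum(m + f for m, f in counts.values())

pr_yellow = Fraction(sum(counts["Y"]), total)
pr_male = Fraction(sum(m for m, _ in counts.values()), total)
pr_yellow_and_male = Fraction(counts["Y"][0], total)

# Yellow and male overlap, so the overlap is subtracted once.
pr_yellow_or_male = pr_yellow + pr_male - pr_yellow_and_male

# Direct count as a check: every yellow pup plus every non-yellow male.
non_yellow_males = sum(m for colour, (m, _) in counts.items() if colour != "Y")
direct = Fraction(sum(counts["Y"]) + non_yellow_males, total)

print(pr_yellow_or_male, direct)
```

The direct count and the formula agree because the 25 yellow males would otherwise be counted twice, once as yellow and once as male.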
What about Yellow OR
male?? What is the
probability a randomly
selected pup is yellow
OR male?
        Sex
Colour  M   F   Total
Y       25  10  35
TW      10  5   15
FW      25  5   30
G       15  5   20
Total   75  25  N = 100
Independent Events
CAREFUL!
Independence is a statistical (read
MATHEMATICAL) concept. Two events are
independent IF
Pr(AB) = Pr(A)Pr(B)
Also, and this is mathematically subtle,
IF Pr(AB) = Pr(A)Pr(B)
then A and B are independent.
Independence in WORDS
Two events are independent (not associated) if the
chance that one event occurs is not affected by the
knowledge of whether or not the other event occurred. Smoking
It is known that 5% of people get lung cancer, 10%
of people smoke and 0.5% of people smoke and get
lung cancer.
Is smoking independent of lung cancer?
A) Yes
B) No
C) Too little information to say. Details Details... Example
Problem: To investigate the demographic of
those people who watch “House”.
Plan: Divide by gender and age.
Data
                   Sex
Age                Male  Female  Total
Youth (<=16)       25    10      35
Gen X (17 to 35)   10    5       15
Middle (36 to 64)  25    5       30
Senior (>=65)      15    5       20
Total              75    25      N = 100
Notation
Let G denote Gen X.
Let Y denote Youth.
Let M denote Middle.
Let T denote Senior. Question
Is the event Youth
independent of the
event Male?
A) Yes
B) No
C) Not enough information
        Sex
Age     M   F   Total
Y       25  10  35
G       10  5   15
M       25  5   30
T       15  5   20
Total   75  25  N = 100
Details Details....
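The independence test Pr(AB) = Pr(A)Pr(B) can be sketched as a one-line check. The example applies it to Youth and Male from the viewer table and to the numbers in the earlier smoking example; the function name is illustrative, and exact fractions are used so that the equality test is not disturbed by rounding.

```python
from fractions import Fraction


def is_independent(pr_a, pr_b, pr_a_and_b):
    """True when Pr(AB) equals Pr(A)Pr(B) exactly."""
    return pr_a_and_b == pr_a * pr_b


# Youth and Male, read off the viewer table (N = 100).
youth_male = is_independent(Fraction(35, 100), Fraction(75, 100), Fraction(25, 100))

# Smoking example: 10% smoke, 5% get lung cancer, 0.5% do both.
smoking = is_independent(Fraction(10, 100), Fraction(5, 100), Fraction(5, 1000))

print(youth_male, smoking)
```

The check is purely arithmetic, which matches the warning that independence is a mathematical concept: the stated numbers decide the answer, not intuition about the events.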
Sex
Age M F Total Y 25 10 35 G 10 5 15 M 25 5 30 T 15 5 20 Total 75 25 N=
100 With and Without Replacement
With Replacement – After selecting an item we
replace it. E.g. Dice, coins, selecting
a card and replacing it in
the deck
Without Replacement – After selecting an item we
do NOT replace it. E.g. Dealing cards
Example
What is the probability of
selecting 2 male
viewers at random in a
row and with
replacement?
A) 56.25%
B) 75%
C) 150%
D) None of the above
        Sex
Age     M   F   Total
Y       25  10  35
G       10  5   15
M       25  5   30
T       15  5   20
Total   75  25  N = 100
Details Details....
Sex
Age M F Total Y 25 10 35 G 10 5 15 M 25 5 30 T 15 5 20 Total 75 25 N=
100 Example
What is the probability of
selecting 2 male viewers
in a row at random and
without replacement?
A) 55.9%
B) 56.06%
C) 56.25%
D) None of the above
        Sex
Age     M   F   Total
Y       25  10  35
G       10  5   15
M       25  5   30
T       15  5   20
Total   75  25  N = 100
Details Details....
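The two sampling schemes can be compared side by side with the totals from the viewer table (75 males among 100 viewers). This is a small sketch; the variable names are illustrative.

```python
from fractions import Fraction

males, viewers = 75, 100   # totals from the viewer table

# With replacement: the second draw sees the same 75 males out of 100.
with_replacement = Fraction(males, viewers) * Fraction(males, viewers)

# Without replacement: one male is gone, leaving 74 males out of 99 viewers.
without_replacement = Fraction(males, viewers) * Fraction(males - 1, viewers - 1)

print(f"{float(with_replacement):.2%}", f"{float(without_replacement):.2%}")
```

The only change between the two lines is the second factor, which is exactly where the first selection does or does not affect the second.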
Sex
Age M F Total Y 25 10 35 G 10 5 15 M 25 5 30 T 15 5 20 Total 75 25 N=
100 Independence Vs. ME
Independence
Pr(AB) = Pr(A)Pr(B)
ME
Pr(AB) = 0
Questions
Can you think of 2 events which are…
Independent but are not ME?
Independent but are ME?
Not independent but are not ME?
Not independent but are ME?
Conditional Probability
Often we ask questions like:
What is the probability I pass if I study?
What is the probability I win a race if I stretch first?
What is the probability I get a job if I don’t wear ripped
jeans?
In all cases we are asking for a probability under the
assumption something else has already occurred.
Conditional Probability
In Words: Conditional Probability
Mathematically/Formulaically: Using dice...
What is the probability of rolling a 2 or 3 given the
number rolled is odd?
Let A be the event odd.
Let B be the event of a 2 or 3. Numerical Venn Diagram: Conditional Probability
Pictorially with a Venn Diagram: Example
e.g. The probability Joe studies for an exam is 75%.
The probability Joe passes and studies is 55%.
Find the probability Joe passes given he studies. Step 1: Diamond Mining….. e.g. The probability Joe studies for an exam is 75%.
The probability Joe passes and studies is 55%.
Find the probability Joe passes given he studies. Step 2: Putting it together…. Example
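Joe's question and the dice question can both be worked with one small function. The formula slide is blank in these notes, so this sketch assumes the standard definition Pr(A | B) = Pr(AB) / Pr(B); the function name is illustrative.

```python
from fractions import Fraction


def conditional(pr_a_and_b, pr_b):
    """Pr(A | B) = Pr(AB) / Pr(B), defined only when Pr(B) > 0."""
    if pr_b <= 0:
        raise ValueError("Pr(B) must be positive")
    return pr_a_and_b / pr_b


# Joe: Pr(study) = 75%, Pr(pass and study) = 55%.
pr_pass_given_study = conditional(Fraction(55, 100), Fraction(75, 100))

# Die: A = odd = {1, 3, 5}, B = {2, 3}; only 3 is in both.
pr_b_given_odd = conditional(Fraction(1, 6), Fraction(3, 6))

print(pr_pass_given_study, pr_b_given_odd)
```

Rearranging the same definition gives Pr(AB) = Pr(A | B) Pr(B), which is the form needed when the conditional probability is given and the joint probability is asked for.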
e.g. The probability that a company goes bankrupt
given their stock has decreased in value at least 17% this
year is 26%. The probability that the stock decreased in
value by at least 17% is 62%. What is the probability that
a company goes both bankrupt and has at least a 17%
decrease in stock? e.g. The probability that a company goes bankrupt
given their stock has decreased in value at least 17% this
year is 26%. The probability that the stock decreased in
value by at least 17% is 62%. What is the probability that
a company goes both bankrupt and has at least a 17%
decrease in stock? Conditional Probability
Tree Diagram Example Tree Diagrams
What? A display of branches and nodes. At
each node we make choices (branches).
Use when? In conditional probability
problems.
How? We label each branch as Pr(A|…)
where “…” are the events that have
occurred before A.
Method
1. Read the question and find the two
events.
2. Determine which event is first.
3. Determine the probabilities, Pr(A|…)
4. Label a tree with the probabilities.
5. Find Pr(AB) for each branch.
Recall
The conditional probability formula: Example Tree Diagram Tree Diagram Notes
1. Adding the branches “Pr(AB)” gives a total of
100%. Tree Diagram Notes
2. Along a branch we say the word AND
Study   Pass | Study   Study AND Pass | Study
3. Between branches we say the word OR
OR
Not Study AND Pass | Not Study
HIV Example
A certain test of HIV is correct 95% of the
time if a person has HIV and 98% of the time
if the person does not have HIV.
8% of people tested are thought to be HIV
positive.
1. Read the question and find the two events.
2. Determine which event is first.
3. Determine the probabilities, Pr(A|…)
4. Label a tree with the probabilities.
5. Find Pr(AB) for each
branch.
Events, Outcomes and Order
A certain test of HIV is correct 95% of the
time if a person has HIV and 98% of the time
if the person does not have HIV. 8% of people
tested are thought to be HIV
positive.
What is the order of the events?
A) Have HIV (H) then Test Positive (P)
B) Test Positive (P) and then Have HIV (H)
C) none of the above
Using Numbers
Pr(H) = 8%
    Pr(P | H) = 95%
    Pr(Not P | H) = 5%
Pr(Not H) = 92%
    Pr(Not P | Not H) = 98%
    Pr(P | Not H) = 2%
Using Numbers
Pr(H) = 8%
    Pr(P | H) = 95%           Pr(P, H) = 7.6%
    Pr(Not P | H) = 5%        Pr(Not P, H) = 0.4%
Pr(Not H) = 92%
    Pr(Not P | Not H) = 98%   Pr(Not P, Not H) = 90.16%
    Pr(P | Not H) = 2%        Pr(P, Not H) = 1.84%
Answering the Questions…
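The branch products in the HIV tree can be reproduced in a few lines, following the method: multiply along a branch (AND), add between branches (OR). This is a sketch using the three percentages from the problem statement; the variable names and the dictionary layout are illustrative.

```python
pr_h = 0.08                 # Pr(H): has HIV
pr_p_given_h = 0.95         # Pr(P | H): test correct when HIV is present
pr_notp_given_noth = 0.98   # Pr(Not P | Not H): test correct when HIV is absent

# Multiply along each branch to get the joint probabilities.
branches = {
    ("H", "P"): pr_h * pr_p_given_h,
    ("H", "Not P"): pr_h * (1 - pr_p_given_h),
    ("Not H", "P"): (1 - pr_h) * (1 - pr_notp_given_noth),
    ("Not H", "Not P"): (1 - pr_h) * pr_notp_given_noth,
}

# Add across branches ("OR") for the total chance of a positive test.
pr_positive = branches[("H", "P")] + branches[("Not H", "P")]

# Conditional probability: not having HIV given a positive test.
pr_noth_given_p = branches[("Not H", "P")] / pr_positive

for branch, pr in branches.items():
    print(branch, f"{pr:.2%}")
print(f"{pr_positive:.2%}", f"{pr_noth_given_p:.2%}")
```

The four branch products add to 100%, which is a quick way to catch a mislabelled branch before answering any of the questions.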
0. What is the probability of
having HIV and testing
positive for HIV?
A) 7.6%
B) 8%
C) 95%
D) 90.16% Solution Answering the Questions…
1. What is the probability of
testing positive for HIV?
A) 7.6%
B) 9.4%
C) 95%
D) 90.16% Solution Answering the Questions…
2. What is the probability a randomly selected
person does not have HIV given they tested
positive? Solution Question 3: Aviation
In any crash landing it is known that a black box will
survive (be found, still work) 85% of the time. After a
flight there is a 0.5% probability that a black box will
not work. The black box is tested after each flight.
Airplane crashes occur on 1% of flights.
What is the probability that a black box will survive a
flight? Method
1. Read the question and find the two
events.
2. Determine which event is first.
3. Determine the probabilities, Pr(A|…)
4. Label a tree with the probabilities.
5. Find Pr(AB) for each branch.
Conditional Probability and
Independence
Recall: Two events are independent
(not associated) if the chance that
one event occurs is not affected by
the knowledge of whether or not the
other event occurred. Proof
Let A and B be independent events:
Then Pr(AB) = Example
What is the probability of flipping a head on
the next flip of a fair coin given we have
already flipped 100 heads in a row?
A) 0
B) Very very small
C) 0.5
D) 120%
E) none of the above Sampling without Replacement
There are 12 people in my class, 7 males
and 5 females. I select two people at
random and without replacement. What is
the probability that my first selected person
is male if it is known I selected one male and
one female? Method
1. Read the question and find the two
events.
2. Determine which event is first.
3. Determine the probabilities, Pr(A|…)
4. Label a tree with the probabilities.
5. Find Pr(AB) for each branch.
Monty Hall Problem
Monty Hall was the game show host of
“Let’s Make a Deal”.
The game worked as follows:
1. Pick a Door
A B C
2. You pick 1 door…
A B C
3. Monty Hall shows you a dud…
A B C
4. And asks if you would like to
switch your choice from A to B…?
A B C
Do you select door A (no switch) or
B (switch)?
A) No switch from A.
B) Switch to door B.
The question…does switching change your
chances of winning?
A) If you switch your chance of winning is greater.
B) If you do not switch your chance of winning is
greater.
C) Doesn’t matter if you switch, your chance of
winning is 0.5. Experiment
In groups of 2 (a host and a contestant),
draw three doors.
The host will pick a winning door but NOT
tell the contestant. Experiment Part 2
The contestant will pick a door.
The host will then tell the contestant which
of the other two doors is a dud. Experiment Part 3
Finally, the host will let the contestant either
stick with their door or switch. After the
contestant has made their choice, the host
will tell them where the right answer is. Switch Data
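The classroom experiment can also be run many times on a computer. This is a simulation sketch under stated assumptions: the host picks the winning door at random, and when two dud doors are available to reveal, the host chooses between them at random. The function names, the seed and the number of rounds are illustrative choices.

```python
import random


def play(switch, rng):
    """One round of the experiment; returns True if the contestant wins."""
    doors = ["A", "B", "C"]
    winning = rng.choice(doors)    # host picks the winning door
    pick = rng.choice(doors)       # contestant picks a door
    # Host reveals a dud among the other two doors.
    dud = rng.choice([d for d in doors if d != pick and d != winning])
    if switch:
        pick = next(d for d in doors if d != pick and d != dud)
    return pick == winning


def win_rate(switch, rounds=10_000, seed=0):
    """Fraction of rounds won when the contestant always (or never) switches."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(rounds)) / rounds


print("switch:", win_rate(True))
print("stay:  ", win_rate(False))
```

Tallying the class results into the switch and no-switch categories is the same estimate by relative frequency, just with far fewer rounds.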
A) You switched and WON
B) You switched and LOST
C) You did not switch and WON
D) You did not switch and LOST
E)
The Events
Define the events:
Which is first?
A) Selected Door
B) Switch
Tree Diagram
Solution – Do you have a higher
chance of winning if you switch???
Random Variables
Definition: A random variable is a variable (think x, …) that
depends on the outcomes of a chance operation.
Concept: It turns outcomes into numbers.
e.g. Coin Flipping
Random Variable
Notation:
Note: X, capitalized, is a random variable (r.v.)
x vs X
e.g. Coin Flipping
f(x)
f(x) = Pr(X=x) –> the probability that X
becomes x.
We call this a probability function. It has a
value for every value of x.
Since f(x) is a probability it has all the same
properties as a probability. Example 1
f(x) = Example 2
We often build a table of x and f(x) values,
called a distribution.
e.g. Coin Flipping Probability Properties
1. 2. Histogram
A diagram for a distribution. Histogram
We draw a bar for every value x
and height f(x). The area/height of
the bar represents the probability of
x occurring. Example
For the distribution below, find c.
x    1   0   2
f(x) 0.3 c   0.6
Clicker Example
How many siblings do you have; A=0, B=1,
C=2, D=3, E>=4?
Notice how the clicker builds for us a
(Relative) Frequency Diagram, hereafter
called a histogram.
Example 1
In a particular stock portfolio consisting of
100 companies 33 are considered to have
no risk, 21 are considered to be
conservative, 42 are moderately risky and
the remainder are risky.
Build the distribution and histogram for the
above data.
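One way to build this distribution is sketched below. It is an illustration only: the dictionary names and the text "histogram" are assumptions, not the slide's own solution. The "risky" count is the remainder out of 100.

```python
# Counts given in the example; "risky" is whatever remains out of 100.
counts = {"no risk": 33, "conservative": 21, "moderately risky": 42}
total = 100
counts["risky"] = total - sum(counts.values())

# The distribution: f(x) = count / total for each risk level.
f = {level: n / total for level, n in counts.items()}

# A rough text histogram: bar length is proportional to f(x).
for level, p in f.items():
    bar = "#" * round(p * 50)
    print(f"{level:<17}{p:5.2f} {bar}")
```

The values of f(x) add to 1, which is a quick check that the table is a valid distribution.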
Wording
Consider the numbers 1, 2, 3, 4, 5, 6.
If we say that our answer is at most 4, then
our answer can
A) Include 4
B) Not include 4
Wording
Consider the numbers 1, 2, 3, 4, 5, 6.
If we say that our answer is at least 4, then
our answer can
A) Include 4
B) Not include 4
Wording
Consider the numbers 1, 2, 3, 4, 5, 6.
If we say that our answer is less than 4, then
our answer can
A) Include 4
B) Not include 4
Example 2
Let X be the number of rabbit progeny from
one union. The probability distribution for
X is below.
X    0    1    2    3    4    5
f(x) 0.25 0.3  0.05 0.22 0.17 0.01
Example 2 Continued
What is the probability that a rabbit union
results in less than 2 progeny?
X    0    1    2    3    4    5
f(x) 0.25 0.3  0.05 0.22 0.17 0.01
Clicker:
A) 60%
B) 55%
C) 30%
D) 25%
Example 2 Continued
What is the probability that a rabbit union
results in at most 2 progeny?
X    0    1    2    3    4    5
f(x) 0.25 0.3  0.05 0.22 0.17 0.01
Clicker:
A) 60%
B) 55%
C) 30%
D) 25%
Cumulative Distribution
Function
The probability distribution function is f(x) =
Pr(X=x).
The cumulative distribution function is
F(x)=Pr(X<=x).
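As a rough illustration of these two definitions, the sketch below builds F(x) from f(x) as a running total, using the rabbit progeny distribution from Example 2. The helper name `cdf` and the variable names are assumptions made for this sketch.

```python
from itertools import accumulate

# Rabbit progeny distribution from Example 2.
xs = [0, 1, 2, 3, 4, 5]
f = [0.25, 0.30, 0.05, 0.22, 0.17, 0.01]

# F(x) = Pr(X <= x) is the running total of f(x).
F = list(accumulate(f))

def cdf(x):
    """F(x) for an integer x (hypothetical helper for this sketch)."""
    if x < xs[0]:
        return 0.0
    return F[min(x, xs[-1]) - xs[0]]

p_at_most_2 = cdf(2)            # Pr(X <= 2)
p_less_than_2 = cdf(1)          # Pr(X < 2) = Pr(X <= 1)
p_more_than_2 = 1 - cdf(2)      # Pr(X > 2) = 1 - F(2)
p_exactly_2 = cdf(2) - cdf(1)   # f(2) = F(2) - F(1)

print(p_at_most_2, p_less_than_2, p_more_than_2, p_exactly_2)
```

Note how "less than 2" and "at most 2" differ only in which F value is read off.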
Properties
1. f(x) = F(x) – F(x–1)
2. Pr(X>x) = 1 – Pr(X<=x)
Example 1
A polar bear gives birth to 0, 1 or 2 live cubs
with probability 0.1, 0.2 and 0.7.
A) Draw the probability distribution function B) State and Draw the cumulative
distribution function. Recall
The PDF (or PF) is denoted by f(x) =
Pr(X=x).
The CDF, denoted by F(x) = Pr(X<=x), has
the following useful relationships:
i. Pr(X>x) = 1 – Pr(X<=x) = 1 – F(x)
ii. f(x) = F(x) – F(x–1)
Example 2
The number of celery seeds that germinate
in a packet of 5 seeds has the following cdf:
X    0   1   2   3   4   5
F(x) 0.1 0.2 0.3 0.5 0.8 1
Questions follow…..
A) What is the probability that less than 2
seeds germinate?
B) What is the probability that exactly 2
seeds germinate?
Histogram Analyses
When we look at a histogram, what things do
you think we are interested in?? Expectation and Variance Notation
n– N– Equiprobable – Populations Vs Samples
Pictorially: Parameters Statistics Populations vs Samples
Example: Expectation
Formula: Expectation
Concept:
e.g. I roll a die 3 times and get 3, 2, 4
a) What is the sample mean?
b) What is the expected value?
Variance
Formula: Variance
Concept:
e.g. I roll a die 3 times and get 3, 2, 4
a) What is the sample variance?
b) What is the long run variance?
Question
In a game of chance the outcomes are −1, 1,
3 with probabilities 0.1, 0.3 and 0.6. Which
outcome is most probable?
A) −1
B) 1
C) 2
D) 3
σ vs s
If σ is the population standard deviation and
s is the sample standard deviation
then…
A)
σ=s
B)
σ>s
C)
σ<s
D)
σ=s Properties of Variance and Mean
I want to show you how a change to your
data (i.e. multiply the values by 2.5), affects
our statistics. A Silly Math Proof
Let X represent your ‘data’…. Properties of Expectation
Properties: 1) E(X + c) = E(X) + c
Proof by example:
X  c  X+c
1  2  3
2  2  4
3  2  5
You told your apprentice to measure
something in cm’s but your apprentice
is always off by 2 cms! How do you fix
your mean???
Properties of Expectation
Properties: 2) E(c) = c
Proof by example:
Properties of Expectation
Properties: 3) E(cX) = cE(X)
Proof by example:
X  c  Xc
1  2  2
2  2  4
3  2  6
You told your apprentice to measure
something in cm’s but your apprentice
is always off by a multiple of 2! How do
you fix your mean???
Basil, join
the Dark
Side!!!
Properties of Variance
1) Var(c) = 0
Properties of Variance
2) Var(cX) = c²Var(X) ← proof using st. dev.
Properties of Variance
3) Var(c + X) = Var(X) ← proof by argument picture
e.g. The number of cubs born to polar bear mothers in a
given year is denoted by r.v. X
x        0    1    2
Pr(X=x)  0.3  0.1  c
a) Find c.
b) Find µ.
c) Find σ².
d) Find the st. dev.
e) Find the probability a polar bear has less than 2 cubs.
f) Find the probability a polar bear has more than 2 cubs.
Example
The temperature (in Celsius) in Ontario in
August was:
Temp
30 32 33
Frequency 5/30 15/30 10/30
Let X denote the temperature. Find E(X)
and Var(X). Example
Temp       30   32    33
Frequency  5/30 15/30 10/30
E(X) =
In Fahrenheit
Let Y be the temperature in Fahrenheit. If
the relationship between Celsius (X) and
Fahrenheit is 9X/5+32 = Y,
find E(Y) and Var(Y).
Example
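The temperature example can be worked through as below. This is a sketch under stated assumptions: the helper names `expectation` and `variance` are made up here, and exact fractions are used only to avoid rounding. It applies the properties E(cX + b) = cE(X) + b and Var(cX + b) = c²Var(X), then checks them by converting every temperature directly.

```python
from fractions import Fraction as Fr

temps = [30, 32, 33]                          # Celsius values
probs = [Fr(5, 30), Fr(15, 30), Fr(10, 30)]   # relative frequencies

def expectation(values, probs):
    """E(X) = sum of x * Pr(X = x)."""
    return sum(v * p for v, p in zip(values, probs))

def variance(values, probs):
    """Var(X) = sum of (x - mu)^2 * Pr(X = x)."""
    mu = expectation(values, probs)
    return sum((v - mu) ** 2 * p for v, p in zip(values, probs))

e_x = expectation(temps, probs)
var_x = variance(temps, probs)

# Y = 9X/5 + 32, so use the properties of expectation and variance.
a, b = Fr(9, 5), 32
e_y = a * e_x + b
var_y = a ** 2 * var_x

# Cross-check by converting each temperature to Fahrenheit first.
fahrenheit = [a * t + b for t in temps]
assert e_y == expectation(fahrenheit, probs)
assert var_y == variance(fahrenheit, probs)

print(float(e_x), float(var_x), float(e_y), float(var_y))
```

The added 32 shifts the mean but drops out of the variance, which is property 3 of variance in action.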
Temp       30   32    33
Frequency  5/30 15/30 10/30
Var(X) =
A New Tool: Factorial Notation
n!
= n factorial
= n(n−1)(n−2)…(2)(1)
e.g.
3! =
A) 3
B) 6
C) 5
D) 12
Arrangements Interpretation
Factorial, Special Case
We define n to be an integer greater than or
equal to zero.
What is 0!=? What is 1!=? A New Tool: The Choose
Function
Suppose I want to select 2 objects from 3.
For example, I want to select two letters
from the word BIO. Order does not matter.
Hence if I select BI or IB, I do not count
this twice.
In how many ways can I do this (order does
not matter)?
A) 1 B) 2 C) 3 D) 4 Choose Function Notation
The notation is:
The formula is:
Choose
e.g. 5 choose 3 = 5!/(3!(5−3)!) = (5)(4)(3)(2)(1) / ([(3)(2)(1)][(2)(1)]) = 10
Choose Function – Special Cases
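The factorial and choose calculations can be checked with a few lines of code. This is a sketch only: the function name `choose` is an assumption, and it uses the standard formula n!/(x!(n−x)!). It also lists the BIO selections explicitly and tries the special cases with n = 7.

```python
import math
from itertools import combinations

def choose(n, x):
    """Number of ways to select x objects from n when order does not matter."""
    return math.factorial(n) // (math.factorial(x) * math.factorial(n - x))

# Selecting 2 letters from BIO, order not mattering.
letters = list(combinations("BIO", 2))
print(letters, choose(3, 2))

# The worked example and the special cases (n = 7 is an arbitrary choice).
print(choose(5, 3))
print(choose(7, 1), choose(7, 0), choose(7, 7))
```

Python's built-in `math.comb(n, x)` gives the same values and can replace the hand-written helper.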
N choose 1 N choose 0 N choose N Binomial Distribution Example
In how many ways can I toss a coin 3 times and
get 2 heads?
A) 2
B) 3
C) 4
D) 5
E) None of the above.
What is the probability of HHT?
A) (0.5)²(0.5)
B) (0.5)(0.5)²
C) 1/8
D) 12.5%
E) All of the above. Putting the last two slides
together…?
What is the probability of getting 2 heads and 1
tail in 3 flips? In general….????? BINOMIAL
The above is an example of a binomial probability function. An experiment where:
T
I
M
S Formula and Notation Expectation and Variance in
Formula Expectation by Example
If I flip a fair coin 10 times, how many heads do
you expect to get?
A) 3 to 8
B) 5
C) Depends on information not given
D) 0.5 Review BINOMIAL
Formula: Pr(X = x) = nCx p^x (1 − p)^(n−x)
Concept: A binomial probability function occurs when we have an
experiment that follows:
T Two outcomes
e.g. Pass a course, fail a course
e.g. Heads, tails
e.g. 0 or 1
I Independent trials
e.g. There is no chance trial one will affect trial 2.
M Multiple trials
e.g. We flip a coin more than once
S Same probability of success
Notation:
n  number of trials
p  probability of a “success”
x  # of successes you see.
Expectation & Variance
E(X) = Σ x Pr(X = x) = Σ x nCx p^x (1 − p)^(n−x) = np
V(X) = Σ (x − µ)² Pr(X = x) = σ²
e.g. Mendel Genetics
It is known that when you cross a
black rabbit with a brown, 10% of the
progeny are mottled. Find the probability
that in a litter of 5 rabbits,
a) 3 are mottled
b) At most 1 is mottled
Experiment
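For the Mendel Genetics example above, the binomial formula can be evaluated directly. This is a minimal sketch, assuming n = 5 progeny and p = 0.10 as stated in the example; the function name `binom_pmf` is hypothetical.

```python
from math import comb

def binom_pmf(x, n, p):
    """Pr(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 5, 0.10   # litter of 5, 10% of progeny mottled

p_three = binom_pmf(3, n, p)                             # a) exactly 3 mottled
p_at_most_one = binom_pmf(0, n, p) + binom_pmf(1, n, p)  # b) X = 0 or X = 1

# The expectation sum should agree with the shortcut E(X) = np.
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))

print(p_three, p_at_most_one, mean)
```

"At most 1" includes 1, so part b) adds the x = 0 and x = 1 terms.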
You don’t need to write
this down… The Experiment
You will write a multiple choice test involving
4 questions on BioChem questions.
Let’s see how well you do…(there are no
marks for being right)…
Keep track of your answers. Class Section 1 Question 1
My middle name is?
A) Randolf
B) Anthony
C) David
D) Pierce
E) Adam Question 2
One of my sons middle names is:
A) Xiao
B) Cinder
C) Tae
D) Felix
E) Mike Question 3
My daughters age is:
A) 3
B) 5
C) 7
D) 9
E) You don’t have a daughter. Question 4
My original degree was in:
A) Pure Math
B) Applied Math
C) Combinatorics and Optimization
D) Actuarial Science
E) Operations Research Answers
1.
2.
3.
4. Be honest, mark yourself out of 4. Number Correct
A) 0
B) 1
C) 2
D) 3
E) 4 Theoretically???? Poisson Distribution POISSON
2 Formulas: Notation: Expectation and Variance in
Formula A note on Mu or Lambda Poisson Example 1
Sunspots appear according to a Poisson
process, on average 5 times a year. In
thirteen years, how many sunspots would
we expect?
A) 5
B) 13 x 5
C) 13/5
D) 13
E) None of the above Poisson Example 2
I have a 4 year old. Stickers randomly
appear on the walls of my house at a rate of
4 per square meter. In 2 square meters,
how many should I expect?
A) 2
B) 4
C) 8
D) 16
E) None of the above Poisson Concept
I I H
e.g. An employee checks his email 3 times per
5 minutes.
a) In five minutes, what is the probability the
employee checked his email 7 times? 2%
b) In ten minutes, what is the probability the
employee checked his email 7 times? 14%
c) In two minutes, what is the probability they
check their email less than 2 times? 66%
Poisson Vs. Binomial
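The three email answers can be reproduced with the standard Poisson probability function, Pr(X = x) = e^(−µ) µ^x / x!. The slides leave the formula blank, so its use here is an assumption, as is the function name `poisson_pmf`. The key step is rescaling the mean to the time window of each part.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """Pr(X = x) for a Poisson count with mean mu."""
    return exp(-mu) * mu**x / factorial(x)

rate_per_minute = 3 / 5   # 3 checks per 5 minutes

# a) 7 checks in five minutes: mu = 3
p_a = poisson_pmf(7, rate_per_minute * 5)
# b) 7 checks in ten minutes: mu = 6
p_b = poisson_pmf(7, rate_per_minute * 10)
# c) fewer than 2 checks in two minutes: mu = 1.2, add x = 0 and x = 1
p_c = sum(poisson_pmf(x, rate_per_minute * 2) for x in range(2))

print(round(p_a * 100), round(p_b * 100), round(p_c * 100))
```

The printed percentages should match the 2%, 14% and 66% quoted on the slides.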
1. 2.
A= Poisson, B=Binomial, C=Other
Fish travel upstream in schools of, on
average, 24 fish. Schools of fish appear, on
average, every 4 minutes. What is the
probability that 3 schools of fish appear in 12
minutes? A= Poisson, B=Binomial,
C=Other In a particular production process 12 widgets
are placed in a shipping box. To determine if
a box contains all 12 widgets, the weight is
obtained.
The probability that a widget is missing is
independently 0.22. What is the probability
that 2 are missing from a box? A= Poisson, B=Binomial,
C=Other Cars on the highway arrive at KW according
to a Poisson process. On average 50 cars
arrive per hour. The probability that an hour
is “heavy” in traffic is 0.1. If hours are
disjoint, what is the probability that in 12
hours, 6 are “heavy” in traffic? e.g.
The number of deer found in a 1
acre plot appear homogeneously,
independently and individually. Usually I
see two a day.
a) Find the probability I see 4 in the day?
9%
b) Find the probability I see 3 in the day?
A) 18%
B) 22%
C) 36%
D) 54%
E) None of the above
c) Find the probability I see less than 4.
86%.
d) Find the probability I see more than 4.
A) 9%
B) 86%
C) 14%
D) 91%
e) If seeing 4 deer in a day is called a
“quad”, find the probability I see 3 quad
in 10 days. I can see at most one quad
in a day. (4%)
Example
Coop is always investigating student hiring.
On average 10 students are hired per day.
Students are hired independently.
A) In a work week, find the probability that 60
students are hired.
B) What is the probability that a dozen
students are hired in 1 day?
C) What is the probability that, in one work
week, there are 3 days of exactly a dozen
students employed?
Continuous Random Variables
Cts Case: Consider the continuous random variable X then
1) The area beneath the curve f(x) is 1.
2) f(x) >= 0
f(x) is called the probability density function.
f(x) is not a probability.
Properties of Discrete Probability
Function Properties of Continuous Probability
Function A Note on Equality and
Probability The Discrete Cumulative
Distribution Function F(x) The Continuous Cumulative
Distribution Function F(x) BIGGEST HINT DRAW
PICTURES Pr(X<a) Pr(X>a) Pr(a<X<b) Pr(X<=a) vs. Pr(X<a)
A) Pr(X<=a) is larger
B) Pr(X<a) is larger
C) Pr(X<=a)=Pr(X<a)
D) none of the above Example
e.g. Consider the density f(x)=3x^2 for x from 0 to 1.
A) Sketch f(x). Note: F(x) = x^3.
B) What is the probability that x is less than 0.25?
C) What is the probability that x is equal to 0.25?
D) What is the probability that x is greater than 0.75?
Normal Curves
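Parts B) to D) of the density example above can be read straight off F(x) = x^3. The sketch below is illustrative only: the helper name `F` and the midpoint-sum check are assumptions. It also confirms numerically that the area under f(x) = 3x^2 on 0 to 1 is 1.

```python
def F(x):
    """Cumulative distribution F(x) = x^3 on [0, 1], from the example."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    return x ** 3

p_less = F(0.25)             # B) Pr(X < 0.25)
p_equal = F(0.25) - F(0.25)  # C) a single point has no area under the curve
p_greater = 1 - F(0.75)      # D) Pr(X > 0.75) = 1 - F(0.75)

# Numerical check that the density integrates to 1 (midpoint rule).
n = 100_000
area = sum(3 * ((i + 0.5) / n) ** 2 for i in range(n)) / n

print(p_less, p_equal, p_greater, area)
```

Part C) is the reason Pr(X<=a) and Pr(X<a) are equal for a continuous random variable.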
At the start of the course we could divide the data into
1) Discrete
2) Continuous
Random variables can also be divided into these 2 groups.
Normal Curves
• Mean & Variance
We may write
Alternatively as X is Gaussian with mean µ and st. dev. σ. The concept is exactly
the same.
• Picture:
• Formula
Pdf: Cdf:
Because the formula is ugly we tend to use a
table to calculate our probabilities
• Used where?
The Abnormal Curve????
The Standard Normal
• The standard normal is N(0,1). It has a
mean of 0 and variance of 1. Normal Distribution TRICKS
• The normal distribution is SYMMETRIC! Because of this.....
Pr(Z<a) =
Why do we need a table???
Reading the Table
1. Let z=a.bc (i.e. a is the first digit, and bc are the next two after
the decimal), e.g. 1.23
2. Look up, in column 1, a.b
3. Look up, in row 1, 0.0c
4. The intersection is the Pr(Z<a.bc)
Tables Used in Course
Example
Find the Pr(Z<0.25) Example
The probability that Z~N(0,1) is less than
0.63.
A) 0.63 B) 0.7357 C) 0.2643 D) 0.37 Example
The probability that Z~N(0,1) is less than or
equal to 0.63.
A) 0.63 B) 0.7357 C) 0.2643 D) 0.37 Example
The probability that Z~N(0,1) is more than
0.63.
A) 0.63 B) 0.7357 C) 0.2643 D) 0.37 Example
The probability that Z~N(0,1) is more than
0.63.
A) 0.63 B) 0.7357 C) 0.2643 D) 0.37 Example
The probability Z~N(0,1) is less than 0 is:
A) 0
B) 1
C) 0.5
D) Can't Determine Example
Find Pr(0.23<Z<0.46). Example
Let Z~N(0,1).
1. The probability that Z is less than 0.5?
A) 0.5199
B) 0.6915
C) 0.7088
D) 0
Example
Let Z~N(0,1).
2. The probability that Z is less than −1.46?
A) Cannot calculate
B) 0.0721
C) 0.9279
D) 0
Example
3. The probability that Z is between −1.46 and
0.5 is:
A) 0
B) 0.6194
C) 0.6915
D) 0.9279
Standard Normal Curve
1 Standard Deviation from 0 Interpretation and Relationship to
Non Standard Normal Standard Normal Curve
2 Standard Deviations from 0 Standard Normal Curve
3 Standard Deviations from 0 Standard Normal Curve Six Sigma and Range Normal Calculations
Calculating normal probabilities involves
converting the N(µ,σ²) to N(0,1).
Z Score
Z score transforms a r.v. X~N(µ,σ²) to Z~N(0,1).
The transformation is:
Z = (X − µ)/σ
Proof
Goal
N(µ,σ²) to Z~N(0,1)
Concept
Consider the data:
1,2,2,3,3,3,
4,4,4,4,5,5,
5,5,5,6,6,6,6,
7,7,7,8,8,9
The sample mean is 5 and standard deviation
is approximately 2. Graphically Transforming
Transform x=1 by the z score function using
mean 5 and standard deviation 2: Transforming 2
Transform x=6 by the z score function using mean 5
and standard deviation 2:
CLICKER
The answer is:
A) −0.5
B) 0
C) 0.5
D) 1
E) None of the above
Transformed Data
In the rest of the cases:
−2.0, −1.5, −1.5, −1.0, −1.0, −1.0, −0.5, −0.5, −0.5,
−0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5,
0.5, 1.0, 1.0, 1.0, 1.5, 1.5, 2.0
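The transformed values can be reproduced with a few lines of code. This is a sketch: the function name `z_score` is an assumption, and it uses the rounded mean 5 and standard deviation 2 quoted on the Concept slide.

```python
from statistics import mean, stdev

data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5,
        6, 6, 6, 6, 7, 7, 7, 8, 8, 9]

def z_score(x, mu, sigma):
    """Subtract the mean, then divide by the standard deviation."""
    return (x - mu) / sigma

# Transform every value using mean 5 and standard deviation 2.
z = [z_score(x, 5, 2) for x in data]

print(mean(data), round(stdev(data), 2))   # centre and spread before
print(mean(z), round(stdev(z), 2))         # centre and spread after
```

Because the true sample standard deviation is only approximately 2, the transformed spread comes out close to, rather than exactly, 1.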
The mean of the transformed data is 0 with
standard deviation 1. Graphically In words…. Example (Storyless)
Let X~N(60,4). What is the probability X is
more than 58? Method
1. IF we are given Z~N(0,1) go to 3.
2. IF we are given X~N(µ,σ²) then transform
and go to 3.
3. Calculate as before.
Example
The income of the average Canuck
is normally distributed and is on
average 30 thousand dollars a
year with standard
deviation 10 thousand.
What is the probability a randomly
selected Canadian makes more
than 50 thousand dollars? St.Dev=10, Mean=30, X>50 Example
The income of the average Canuck
is normally distributed and is on
average 30 thousand dollars a
year with standard
deviation 10 thousand.
What is the probability a randomly
selected Canadian makes between
20 and 40 thousand? St.Dev=10, Mean=30, 20<X<40 Review
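Both income questions can be checked in software instead of with the table. The sketch below uses Python's `statistics.NormalDist` (available from Python 3.8); the variable names are assumptions. It also repeats the first answer by hand with a z score to show the two routes agree.

```python
from statistics import NormalDist

income = NormalDist(mu=30, sigma=10)   # thousands of dollars

# Pr(X > 50) = 1 - Pr(X <= 50)
p_more_than_50 = 1 - income.cdf(50)

# Pr(20 < X < 40) = Pr(X < 40) - Pr(X < 20)
p_20_to_40 = income.cdf(40) - income.cdf(20)

# Same first answer via the z score: Z = (50 - 30) / 10 = 2
z = (50 - 30) / 10
p_via_z = 1 - NormalDist().cdf(z)

print(round(p_more_than_50, 4), round(p_20_to_40, 4), round(p_via_z, 4))
```

The second answer is the "1 standard deviation from the mean" area, since 20 and 40 are one standard deviation either side of 30.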
The process…
1. If X~N(µ,σ²), transform using Z score.
2. When Z~N(0,1) make sure your
probabilities are Pr(Z<a) where a is
positive.
3. If a is negative use symmetry.
4. If ‘<’ is ‘>’ use ‘1 −’.
5. Look up your probabilities in the table.
Reading the Normal Table Backwards
Reading BACKWARDS
Instead of the probability, I want the z score.
1. Let p be the probability of interest.
2.
Look for it in the table
3. Look up the corresponding column, 0.0c
and row a.b
4. The intersection is the Pr(Z<a.bc) Z score…In reverse….
X~N(µ,σ2 ) Z ~ N(0,1) N(µ,σ2 ) Ζ∼ Ν(0 ,1 )
N(µ,σ2 ) Example
Find z, Pr(Z<z)=0.75 To Interpolate or Not… Example
The Pr(Z<z)=0.7357. What is z?
A) 0.63 B) 0.47 C) 0.53 D) 0.62 Example
Find z, Pr(Z<z)=0.25 …Work… Example
Find the x value (X~N(3,9)) when
Pr(X<x)=0.6915. Example
The income of the average Canuck
is normally distributed and is on
average 30 thousand dollars a
year with standard
deviation 10 thousand.
What is the income such that 95%
of Canadians make less than this
amount? …Work… Example
The income of the average Canuck
is normally distributed and is on
average 30 thousand dollars a
year with standard
deviation 10 thousand.
What are the upper and lower
bounds such that the middle 95%
of Canadians make between these
amounts? …Work… …Work… Central Limit Theorem (CLT)
Let Xi be a random variable with mean µ and
variance σ2
If we have n of these Xi’s and they are all
independent then
1. The mean:
2. The sum: Class Example
How many brothers and sisters do you have?
A) 0
B) 1
C)
2
D)
3
E)
4 Class Example Continued
With the people around you (our ‘random’
sample), take your answer(s) to the last
question and average them.
All of you can answer the following: Which
number below is closest to your average A) 0
B) 0.5
C) 1
D) 1.5
E) 2 What did you see? Proof Proof Proof Central Idea: CLT
It takes everything and makes it normal… Except The CLT In Words Logic Behind Reduced Variance:
Consider the data 4, 2, 10 Our variance is relatively ‘large’ meaning we
are far from the center. Logic Behind Reduced Variance:
The average of 2 values is: Variance has decreased – values are closer
to the mean. Example 1
Students in this course have a mean age of
19 with a standard deviation of 4. Assume
ages are normally distributed.
A) What is the probability that a randomly
selected student is younger than 20? Example 1
Students in this course have a mean age of
19 with a standard deviation of 4. Assume
ages are normally distributed.
B) What is the probability that the average
age of a group of 9 randomly selected
students is younger than 20? CLT
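Parts A) and B) of Example 1 differ only in the standard deviation used. The sketch below assumes the usual result that the average of n independent values has standard deviation σ/√n (the slides leave this formula blank), and the variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 19, 4, 9

# A) one randomly selected student: X ~ N(19, 4^2)
single = NormalDist(mu, sigma)
p_single = single.cdf(20)

# B) the average of 9 students: standard deviation shrinks to 4 / sqrt(9)
average = NormalDist(mu, sigma / sqrt(n))
p_average = average.cdf(20)

print(round(p_single, 4), round(p_average, 4))
```

The average is more tightly packed around 19, so it is more likely than a single age to fall below 20.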
What is the Central Limit Theorem??? HARD Clicker Test Question
For sufficiently large n, we see that the
mean of our data is normally distributed.
What is the distribution of our original data??
A)
Normal
B)
Poisson
C)
Binomial
D)
We are uncertain Review
A) Set up the probability.
B) Given
1. Mean – Standardize using:
2. A single value – Standardize using:
3. Total –
C) At which point you will have a Z…so
convert to a probability but beware:
1) The > symbol
2) Negative values
3) Probabilities less than 50%
Example
The midterm average was 75 with standard
deviation 5. The grades are normally
distributed. What is the probability that…
A) A randomly selected person has a grade
more than 80?
B) The average of 25 randomly selected
people’s grades is more than 80?
Example
X~N(75,5²)
A) A randomly selected person has a grade
more than 80?
Example
X~N(75,5²)
B) The average of 25 randomly selected
people’s grades is more than 80?
Binomial Approximation
Goal…to approximate the binomial with a
normal distribution! RECALL! Standardization
Recall: Let X ~ N(µ,σ²)
Then
RECALL! Binomial
Recall: Let X have a binomial distribution.
Then
E(X) =
Var(X) = Binomial to Normal
Thus we can approximate a binomial random
variable by a normal random variable:
X ~ Binomial (n, p)
Is approximately: Counts vs Proportions Example
I flip a coin 100 times (yep, I’m bored). What
is the probability I get more than 55 heads?
Method 1 – Using the binomial (Chuck Norris
could do this in 2 seconds!)
(13.56%)
Example
I flip a coin 100 times (yep, I’m bored). What
is the probability I get more than 55 heads?
Method 2 – Using the normal approximation
to the binomial… When do we approximate!?
This approximation is:
• Best when n is really large and p is not
too close to 0 or 1.
• Performed when, as in the last example, we
are looking for what could be a lot of terms.
PROBLEM!!!
The normal is continuous whereas the
binomial is discrete. There are issues with
approximating a binomial with a normal.
Consider: X~Bin(10,0.3)
The exact probability that X is less than 2 is:
The approximate probability that X is less than 2 is:
Graphically – Binomial vs Normal
[Figure: binomial and normal plots, x axis 0 to 7]
Graphically – Overlap
[Figure: overlap plot, x axis 0 to 7]
Normal Probability Questions
Confusion arises regarding WHEN to apply
a continuity correction.
The answer is: We apply it when the
underlying distribution is BINOMIAL (or
more to the point, discrete). Examples
In each of the cases below, is the underlying
distribution binomial or not? If it is apply a
continuity correction.
Key Words (not black and white):
Apply Continuity – Count, Proportion
Do Not – Average, mean, total Question 1
It is known that, of 500 people in a class,
230 are male. What is the approximate
probability that the number of males
selected at random for a sample of size 50
is more than 25?
A) Apply Continuity Correction
B) Do not apply Continuity Correction Question 2
It is known that the height of males is on
average 72 inches with a standard deviation
of 3 inches. What is the probability that 5
randomly selected males are on average
more than 75 inches?
A) Apply Continuity Correction
B) Do not apply Continuity Correction Question 3
Of 50000 applicants to American Idol, we
believe 500 can sing. What is the
approximate probability that in a sample of
10000 people over 10% can sing?
A) Apply Continuity Correction
B) Do not apply Continuity Correction Continuity Correction
To ensure that the approximation is more
accurate we make, what is called a continuity
correction. This implies we add or subtract
an amount from the value of X to get a better
approximation for the probability.
E.g. Next slide using X~Bin(10,0.3) Examples
In the examples below, normal areas will be
___________ and binomial areas will be
HIGHLIGHTED.
Graphically – Pr(X<2)
[Figure: x axis 0 to 7]
Graphically – Pr(X<=2)
[Figure: x axis 0 to 7]
Graphically – Pr(X=2)
[Figure: x axis 0 to 7]
Pr(a<X<b)
If X is binomial then the continuity correction
for X should be…
A) Pr(a+0.5<X<b−0.5)
B) Pr(a+0.5<X<b+0.5)
C) Pr(a−0.5<X<b−0.5)
D) Pr(a−0.5<X<b+0.5)
E) None of the above.
Pr(a<X<=b)
If X is binomial then the continuity correction
for X should be…
A) Pr(a+0.5<X<b−0.5)
B) Pr(a+0.5<X<b+0.5)
C) Pr(a−0.5<X<b−0.5)
D) Pr(a−0.5<X<b+0.5)
E) None of the above.
Pr(a<=X<=b)
If X is binomial then the continuity correction
for X should be…
A) Pr(a+0.5<X<b−0.5)
B) Pr(a+0.5<X<b+0.5)
C) Pr(a−0.5<X<b−0.5)
D) Pr(a−0.5<X<b+0.5)
E) None of the above.
Binomial Approximations with
Counts
Definition: Example
I flip a coin 100 times (yep, I’m bored). What
is the probability I get more than 55 heads?
Approximate using a continuity correction…
recall…
Exact = 13.56%
Approximate = 15.87%
(no continuity correction)
Continuity Correction Example
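For the 100 coin flips, all three numbers can be computed side by side. This is a sketch under stated assumptions: the variable names are made up, and "more than 55" is corrected to 55.5 because it means 56 or more heads.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5

# Exact binomial: Pr(X > 55) = Pr(X = 56) + ... + Pr(X = 100)
exact = sum(comb(n, x) for x in range(56, n + 1)) / 2**n

# Normal approximation with mean np and variance np(1 - p)
approx = NormalDist(n * p, sqrt(n * p * (1 - p)))
no_correction = 1 - approx.cdf(55)
with_correction = 1 - approx.cdf(55.5)   # boundary moved half a unit

print(round(exact * 100, 2),
      round(no_correction * 100, 2),
      round(with_correction * 100, 2))
```

The corrected approximation should land much closer to the exact value than the uncorrected one.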
Let X~Bin(10,0.3). Find Pr(X<2).
From last class we saw:
1. The exact probability is 13.56%
2. The approximate probability without a
continuity correction is 15.87%
3. The approximate probability WITH a
continuity correction is __________ Binomial Approximations with
Proportions
Definition: Example
(Without Continuity Correction)
The Leafs have won 1 game in 9 attempts.
What is the approximate probability they will
win more than 50% of the next 20
games they play?
Example
(WITH Continuity Correction)
The Leafs have won 1 game in 9 attempts.
What is the approximate probability they will
win more than 50% of the next 20
games they play?
Example
(As a COUNT!)
The Leafs have won 1 game in 9 attempts.
What is the approximate probability they will
win more than 50% of the next 20
games they play?
WHERE NEEDED WE
ALWAYS ALWAYS
ALWAYS USE A
CONTINUITY
CORRECTION!!!!!
Populations
Target Population: Populations
Study Population: …and Parameters
Parameter: Samples
Sample: … and Statistics
Statistics Estimates Example
A study of smoking and lung cancer involved
giving cigarettes to mice and watching their
health over time.
Question
What is the target population?
A) Mice
B) People
C) Unknown
What is the sample?
D) Mice
E) People
F) Unknown
Example
A study from 1957 to 1964 in Albany (NY)
and Montreal involving brainwashing and
LSD. The main researcher, Dr. Ewen
Cameron was hired by the CIA to determine
whether or not LSD could be used to
brainwash people. To test these claims he
used psych patients (nonvolunteers) with
minor issues. Question
What is the target population?
A) People
B) Psych Patients
C) Unknown
What is the study population?
D) People
E) Psych Patients
F) Psych Patients from 1957–1966
G) Unknown
You WILL LOVE STATISTICS You WILL LOVE STATISTICS You WILL LOVE STATISTICS
ERRORS
Study Error  ERRORS
Sample Error  ERRORS
Measurement Error  Where are the errors?
Graphically… Parameters and Statistics
Goal of Statistics: MAGIC! Example
Consider the group of people around you.
What is your best guess for their age? The Actual Answer
Using clickers the instructor will now obtain
the class’s average age:
Example continued
Using clickers, were you right (A) or wrong
(B)?
MAGIC!
Your instructor’s attempt to guess the class’s
age….?
CI’s
The Confidence Interval (C.I.)
Goal: To build an interval for µ using our
sample data.
Assume: µ is unknown, but σ2 known Confidence Level (C.L.)
The confidence level, or CL is the level of
confidence I have that my parameter is in
the interval. Driving Example
Recall the standard normal…
How much of the data is 1 standard deviation
from the mean?
How much of the data is 2 standard
deviations from the mean?
How much of the data is 3 standard
deviations from the mean? Graphically… Confidence Interval
A confidence interval does the same thing
where:
1. The center is the middle of the
distribution
2. The C.L. tells me how many standard
deviations we should be from the mean,
denoted by c.
3. The standard error is the standard
deviation of the value of interest.
In General,
If the distribution was…
A~N(B,D)
The confidence interval is:
Where…we estimate the unknowns…
Specifically…
The distribution of the mean is: Hence the confidence interval is: Steps
1. From the C.L., determine c.
2. Determine the sample mean (our
estimate) and our variance.
3. Plug them into the formula.
C?????
We determine c by using the confidence
level (CL). We begin by assuming that the
confidence level is the middle probability
about the mean.
i.e. We want Pr(−c<Z<c)=CL
Pictorially:
Example for c
If the CL is 90%, then c is: Example for c
If the CL is 95%, then c is:
A) 1.28
B) 1.645
C) 1.96
D) Cannot Calculate Example for c
If the CL is 89%, then c is:
A) 1.28
B) 1.6
C) 1.96
D) Cannot Calculate Example
The weekly income of a 1A coop student
was a question on the mind of a student in
grade 12. She knew that σ² was 12000
dollars² and that a random sample of 16 of
her friends had an average income of 1000
dollars. What would a 95% confidence
interval be in this case?
We’ll do this question in parts….
Step 1 – What is z and –z given
the CL is 95%?
Recall that Z~N(0,1), so we want Pr(−z<Z<z)=.95.
A) 1.64
B) 1.65
C) 1.96
D) 2 Step 2
The important numbers/statistics are: The confidence interval is: Recall CIs Example
Suppose we were looking at the average age
of students in class and found it was, for 7
students, on average 19 with population
standard deviation of 2 years. Build a 95%
CI.
Steps
1. Find your diamonds (CL, mean, st. dev
etc)
2. Draw a picture for your CL – to find c.
3. Fill the values into your CI.
CI
Estimate +/− (table value) x (standard error)
Example
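The recipe "Estimate +/− (table value) x (standard error)" can be sketched in code and applied to the age example (7 students, mean 19, population standard deviation 2, 95% CL). The function name `z_interval` and the use of `NormalDist().inv_cdf` in place of the printed table are assumptions made for this sketch.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(estimate, sigma, n, cl):
    """Confidence interval for the mean when sigma is known."""
    # c is the table value with Pr(-c < Z < c) = CL, so look up (1 + CL) / 2.
    c = NormalDist().inv_cdf((1 + cl) / 2)
    standard_error = sigma / sqrt(n)
    return estimate - c * standard_error, estimate + c * standard_error

lower, upper = z_interval(estimate=19, sigma=2, n=7, cl=0.95)
print(round(lower, 2), round(upper, 2))
```

Looking up (1 + CL)/2 rather than CL itself is the same step as drawing the picture to find c.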
What germs are on our hands?
Objective:
This experiment will show the bacteria that
normally are found on our hands from daily
activities. Materials for each student:
Two petri plates containing agar at room
temperature. The maximum number of
plates provided is 50. Procedure:
1. Each student needs two agar plates .
Have the student write his/her name on both
plates and also label one plate "before hand
washing " and the other plate "after hand
washing".
2. Students should NOT wash hands prior to
this experiment and they SHOULD touch
objects about the room as they normally
would during the day. Procedure 2:
3. Instruct each student to open "before hand
washing " petri dish and run fingers gently
across the surface of the agar with unwashed
fingers being careful not to tear agar. It is
important to have them gently rock fingers so
that nails make a light imprint into the agar. Procedure 3:
4. Close first plate and using
proper hand washing technique wash
hands and have them repeat the process
on second petri dish. That plate should be
labeled with name and "after hand
washing".
5. Incubate the plates inverted (agar in top)
at 35C or room temperature until the next
period. (Usually 24 hours at 35C or 48
hours at room temperature.) Procedure 4:
6. Record results:
a. 4+ = maximum growth
b. 3+ = moderate growth
c. 2+ = some growth
d. 1+ = a little growth
e. neg = no growth
7. Compare colonies to Bacterial Growth Chart
and have students compare number of bacterial
colonies before hand washing to after hand
washing. Data and Statistics
Consider the plates resulting from the
washed hands only.
It is known that the population variance is 1.3.
The mean bacteria growth on the 50 petri
dishes is 1.9. Build a 99% confidence
interval for the mean bacteria growth for
those people in the population who have
washed their hands. Diamond Mining
It is known that the population variance is 1.3. The mean
bacteria growth on the 50 petri dishes is 1.9. Build a 99%
confidence interval for the mean bacteria growth for those
people in the population who have washed their hands.
Clicker
We should look up,
A) 99%
B) 99.5%
C) 98.5%
D) None of the above.
99% Confidence Interval
Confidence Interval Example
Using Clickers
1. The study population (SP) of interest is
those people who are in class today. The SP mean is: Confidence Interval Example
The SP variancesing Clickers
U is: Confidence Interval Example
Using Clickers
Normally…We would NOT know the SP
mean. Confidence Interval Example
Using Clickers
2. Find a group of 3 people (including
yourself). Calculate the average age of the people in
your group. Confidence Interval Example
Using Clickers
3. Find c for a 95% Confidence Interval. A) 1.645
B) 1.65
C) 1.96
D) 2 Confidence Interval Example
Using Clickers
4. Calculate the confidence interval using: Confidence Interval Example
Using Clickers
Nomenclature…
Let the confidence interval be (L,U). L is the
lower end of the interval and U is the
upper end of the confidence interval.
Define the width to be W=UL. Confidence Interval Example
Using Clickers
5. Class Examples:
L U Is the parameter in the
interval? Confidence Interval Example
Using Clickers
6. Was the SP mean in your interval?
A) YES
B) NO
Confidence Interval Example
Using Clickers
Summary… Interpretation
We say: We mean: A CI is NOT a probability! Blocks Lab
Each group selected 10 blocks at random and
calculated an average for those 10 blocks.
We build a CI (95%) for each random
selection.
We count the proportion of times 32 is in the
interval. CI Width Width
Recall: We let the upper confidence limit be
U and the lower confidence limit be L.
The width is W = U − L.
U=
L=
W=
CI Width
Assuming all else remains equal, if n
increases, what happens to W?
A) W increases
B) W decreases
C) W stays the same
Example
Assume the mean is 5, sigma = 2 and c = 1.96.
n = 9
n = 16
CI Width
Assuming all else remains equal, if c
increases, what happens to W?
A) W increases
B) W decreases
C) W stays the same
Example
Assume the mean is 5, sigma = 2 and n = 9.
c = 1
c = 2
CI Width
Assuming all else remains equal, if the CL
increases, what happens to W?
A) W increases
B) W decreases
C) W stays the same
Example
Assume the mean is 5, sigma = 2 and n = 9.
CL = 90%
CL = 95%
CI Width
Assuming all else remains equal, if sigma
increases, what happens to W?
A) W increases
B) W decreases
C) W stays the same
Example
Assume the mean is 5, sigma = ?, c = 2 and n = 9.
sigma = 2
sigma = 3
CI Width
Assuming all else remains equal, if the
sample mean increases, what happens to W?
A) W increases
B) W decreases
C) W stays the same
Example
Assume the mean = ?, sigma = 2, c = 2 and n = 9.
Mean = 5
Mean = 10
Time and Money
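The CI Width questions can be explored with a short sketch. It assumes the known-sigma interval (mean ± c·sigma/√n), so that W = U − L = 2·c·sigma/√n. The helper name `ci_width` is hypothetical.

```python
from math import sqrt

def ci_width(c, sigma, n):
    """Width W = U - L of the interval mean +/- c * sigma / sqrt(n)."""
    # The sample mean cancels when we subtract L from U, so it is not an input.
    return 2 * c * sigma / sqrt(n)

# Larger n (sigma = 2, c = 1.96)
print(ci_width(1.96, 2, 9), ci_width(1.96, 2, 16))
# Larger c (sigma = 2, n = 9)
print(ci_width(1, 2, 9), ci_width(2, 2, 9))
# Larger sigma (c = 2, n = 9)
print(ci_width(2, 2, 9), ci_width(2, 3, 9))
```

Raising the confidence level raises c, so it behaves like the second comparison.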
When asking a statistical question we could
simply get the largest sample imaginable …
OR… we could find the smallest that will do
the job.
Hopefully this would save time and money. Process
1. Make a pilot study.
2. From the pilot study determine s.
3. Use s to find n.
4. Make a study for the n units.
Sample Size Calculations
The Width of our confidence interval is: Rearranging for n gives: Example
Problem: To determine the strength of a particular
glass.
Plan: Break 10 panes of glass (EXPENSIVE) and
measure the tensile strength of each pane.
Data/Statistics: For the 10 panes of glass it was
determined that the mean tensile strength was 50
with standard deviation 5.
What should the sample size be if we want to be
accurate to ±0.1 units, 19 times out of 20? Example
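A sketch of the sample size calculation for the glass example. It assumes "accurate to ±E" means the margin c·s/√n should be at most E, that the pilot s stands in for sigma, and that c comes from the normal table; the course formula may be written in terms of the width W = 2E instead. The name `sample_size` is illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size(s, margin, confidence):
    """Smallest n with c * s / sqrt(n) <= margin (normal critical value assumed)."""
    c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Rearranging c * s / sqrt(n) = margin for n, then rounding up.
    return ceil((c * s / margin) ** 2)

# Glass example: pilot s = 5, accuracy +/- 0.1, 19 times out of 20 (95%).
n_glass = sample_size(s=5, margin=0.1, confidence=0.95)
print(n_glass)
```

We always round up, because rounding down would give a margin slightly larger than requested.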
Problem: To determine the amount of toxin in a
particular field.
Plan: 5 holes are dug and the toxin levels are
measured.
Data/Statistics: For the 5 holes the toxin levels
had a mean of 82 and variance of 4.2.
What should the sample size be if we want to be
accurate to ±0.5 units, 90% of the time? Answer
A) 45
B) 46
C) 47
D) None of the above. What is (are) the problem(s) with
these CIs?
CI if σ² is Unknown
If we don’t know σ² then we need to
estimate it. Our estimate of σ² is s².
HOWEVER, in estimating σ², we no longer
have a normally distributed random variable.
Instead we have a Student t. What is (are) the problem(s) with
these CIs? The Student t Distribution Degrees of Freedom and Shape Reading the t Table
1. Let df be the degrees of freedom. Look
up the degrees of freedom in the first column.
2. Let the critical value (i.e. z score from a
normal table) be c. Look for this number in
the row indicated by 1.
3. Look to the intersecting columns. The first
row gives the desired probability, p.
i.e. p=Pr(C<c) Tables Used in Course Example
Find the Pr(T<0.90) if T has 5 degrees of
freedom. Example
Find the Pr(T<3) if T has 6 degrees of
freedom. Example
Find the Pr(T>3) if T has 6 degrees of
freedom.
A) 1% B) 2% C) 1 to 2.5% D) 97.5 to 99 % Example
The probability that T on 8 degrees of
freedom is equal to 0.63.
A) 0.63 B) 0.7357 C) 0.2643 D) 0.37
E) None of the above
• READING T BACKWARDS
1. Let df be the degrees of freedom. Look
up the degrees of freedom in the first column.
2. Let p be the probability of interest. Look
up the probability in the first row.
3. Look to the intersecting row and columns
for the critical value c.
i.e. p=Pr(C<c) Example
Df = 8 and p = 0.975 = Pr(C<c). Then the
critical value is:
A) 2.31 B) 2.36 C) 2.26 D) 1.86 Example
Df = 8 and p = 0.95 = Pr(−c<C<c). Then the
critical value is:
A) 2.31 B) 2.36 C) 2.26 D) 1.86 Degrees of Freedom and Shape
and the Table… RECALL – CI if σ² is Unknown
The confidence interval is: Where c is:
And the degrees of freedom are: Digression – Degrees of
Freedom
Consider the data x, y and z.
Let x = ____________
Let y = ____________
Let the sample mean be =
_______________ Digression – DF continued Example
The weekly income of a 1A coop student
was a question on the mind of a student in
grade 12. She knew that s² was 12000
dollars squared and that a random sample of
16 of her friends had an average income of
1000 dollars. What would a 95% confidence
interval be in this case? An appropriate conclusion… Tough Questions…1
The interval from the last example was:
This means that the population mean wage is
in this interval.
A) Yes
B) No
C) We don’t know Tough Questions…2
The interval from the last example was: This means that the sample mean wage is in
this interval.
A) Yes
B) No
C) We don’t know Tough Questions…3
The 95% confidence interval from the last
example was: This means that the probability our parameter
is in the interval is 95%.
A) Yes
B) No
C) We don’t know RECALL Example
The weekly income of a 1A coop student
was a question on the mind of a student in
grade 12. She knew that σ² was 12000
dollars squared and that a random sample of
16 of her friends had an average income of
1000 dollars. What would a 95% confidence
interval be in this case?
The interval was: (946,1054) Comparison
The interval with sigma known: (946,1054)
The interval with sigma unknown: What do you notice and why? Example
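The comparison for the co-op income example can be checked with a short sketch. The critical values are hard-coded from standard tables (1.96 from the normal table, 2.131 from the t table on 15 degrees of freedom); the degrees of freedom n − 1 = 15 and the helper name `interval` are assumptions.

```python
from math import sqrt

def interval(mean, variance, n, c):
    """Estimate +/- c * (standard error) for a sample mean."""
    margin = c * sqrt(variance / n)
    return mean - margin, mean + margin

Z_975 = 1.96         # normal table, 97.5% point
T_975_DF15 = 2.131   # t table, 97.5% point, 15 degrees of freedom

known = interval(1000, 12000, 16, Z_975)         # sigma^2 = 12000 known
unknown = interval(1000, 12000, 16, T_975_DF15)  # s^2 = 12000 estimated

print("sigma known:  ", tuple(round(v) for v in known))
print("sigma unknown:", tuple(round(v) for v in unknown))
```

The same numbers give a wider interval with the t value, which reflects the extra uncertainty from estimating the variance.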
Actuaries study insurance. They are
interested in the age at death of a person
because the ability to predict such a thing can
help in determining the price of a policy. If 12
people are observed to have an average age
of death of 73 years with a variance of 4
years squared, what is the 99% confidence interval for
the mean age at death? Confidence Interval RECALL
The CI is: Estimate ± c × (Standard Error)
Example: The Mean…
CI for a Mean – Basic Assumptions
1. The mean is unknown
2. The sample is obtained randomly and
independently
3. The CLT applies to make the mean
normally distributed
Note:
If sigma is unknown we use a t table.
If sigma is known we use a normal table.
Confidence Interval for a
Proportion
Recall: We can approximate a proportion
by… Hence, the CI is:
For c we always use a _____________
table. Example
A sample of 500 nursing applications
included 60 from men. Find the 90% CI of
the true proportion of men who applied to the
program. An appropriate conclusion
(and as it would be stated in a
newspaper) Example
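A sketch of the nursing applications interval, assuming the usual normal approximation for a proportion with standard error √(p̂(1 − p̂)/n). The function name `proportion_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Normal-approximation CI for a population proportion."""
    p_hat = successes / n
    c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = c * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 60 applications from men out of 500, 90% confidence.
low, high = proportion_interval(60, 500, 0.90)
print(f"90% CI: ({low:.3f}, {high:.3f})")
```

A newspaper would usually quote this as the estimate plus or minus the margin, 9 times out of 10.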
Of the 1000 people surveyed by Statistics
Canada last week 182 were looking for a job.
Build a 99% confidence interval for the
country’s unemployment rate. An appropriate conclusion
(and…as stated in a newspaper) CI for a Proportion – Basic
Assumptions
1. We don’t know the proportion.
2. The sample is obtained randomly and
independently
3. The proportion is approximately normally
distributed A Mathematical Interlude Properties of Expectation
Let a and b be constants and X and Y r.v.s, then
1. E(aX) = aE(X)
2. E(X+b) = E(X) + b
3. E(X+Y) = E(X) + E(Y) Why?
E(X+Y) = E(X) + E(Y)
Example: Let X be the time it takes to go to
UW from WLU and Y the time it takes to go to
WLU from UW. How long on average does it
take me to make the circuit? Properties of Variance
Let a and b be constants and X a r.v. then
1. Var(aX) = a²Var(X)
2. Var(X+b) = Var(X)
3. IF X and Y are independent then
Var(X+Y) = Var(X) + Var(Y) Explaining the Sampling
Distribution of the Mean Explaining the Sampling
Distribution of the Proportion End of the Interlude Hypothesis Tests (HT)
An Introduction HTs Vs CIs
Both are…. CIs are… HTs are…. Class Brainstorm
What things, concepts etc make up a court
case? Hypothesis Tests (HT)
The following is an allegory to help you understand
HTs.
The Cheat Model. Suppose for a second that there
was some question about the honesty (μo) of a
student (μ).
Two possibilities exist.
Either the student
cheated (Ha: μ ≠ μo)
or they did not (Ho: μ = μo). Clicker Question
Should we treat the student as:
A) innocent until proven guilty
B) guilty until proven innocent
As a result we assume (Ho: μ = μo) is true.
The Cheat Model
To make a decision we need to gather
evidence (i.e. draw a sample). Our evidence
includes means, standard deviations etc
accumulated into the ultimate decision, a
p-value.
On the basis of the evidence we must make a
decision. What kind of decision?
A) Guilty or Innocent
B) Not Guilty or Innocent
C) Guilty or Not Guilty
D) None of the above
Steps
We will break the process of an
hypothesis test into several steps:
1. The Hypothesis (the charge!)
2. The Formula (the evidence!)
3. The P-value (the decision!)
4. The Conclusion (formal description)
The P Value
(Step 3) P value
I have a friend…yes, one…
My friend has a fair coin that he starts to flip. He flips the coin once…. Two times…. Three times…. 10 Times…
And gets tails every single time….
At what point do we begin to doubt the
sincerity of our friend?????????? Probability
What is the probability we see ten tails in a
row…if the coin is fair?? p Value
A p value is the probability that we see a
value as extreme as our estimate if our
parameter is what we hypothesize.
In terms of our coin example… Significance Level
The significance level is the point at which we
decide that what we are seeing is too
improbable for our hypothesis to be true.
Typically it is about 5%.
In other words, because it is so unlikely
(<5%) that we would see 10 tails in a row…
we argue that the hypothesis is wrong. Significance Level – 2 Views
1. Black and White
The question gives you a significance
level, e.g. 5%. If p-value < significance level
then reject Ho.
Significance Levels – 2 Views
2. Grey
If the p-value is p then
5% < p <= 10%: there is some evidence against Ho
1% < p <= 5%: there is lots of evidence against Ho
p <= 1%: there is tons of evidence against Ho
Example
The p-value is 0.04. Do we reject or not?
A) Yes.
B) No.
C) Uncertain from the given data.
The Hypothesis
(Step 1) The Hypothesis
Ho
– We say “H NOT”.
We call this the Null Hypothesis
Typically this is the “Status Quo”, what is
typical.
From our “cheat model” –
 For the coin example – Ha
We say “Heh”
We call this the alternative hypothesis.
This is typically the hypothesis that is
stated in the question.
From our “cheat model” –
 For the coin example – Ho vs Ha for a Mean Ho Ha Hypothesis Tests (HTs) HT  RECALL
1.a) HT vs CI
1.b) Allegory…. HT  RECALL
2. P-value:
Tails  Probability
10     0.1%
9      1.1%
8      5.5%
7      17.2%
6      37.7%
5      62.3%
4      82.8%
HT – RECALL
3. Hypotheses Ho vs Ha for a Mean Ho Ha What do you notice???? A Special Note On Equality and
Ho 2 Sided vs 1 Sided HTs Examples
1. The IQ of 20 students in Alberta is
recorded. A scientist argues that because
the sample average is 105, students in
Alberta are more intelligent than the
average (average = 100).
A) (Ha: μ ≠ 100) (Ho: μ = 100)
B) (Ha: μ > 100) (Ho: μ = 100)
C) (Ha: μ < 100) (Ho: μ = 100)
D) (Ha: μ = 100) (Ho: μ ≠ 100)
Examples
2. The typical dog lives to be 14 years old. A
certain breeder loves wiener dogs and
notices that 18 pups have lived for, on
average, 18 years. Does the age of a
wiener dog differ from that of the typical
dog?
A) (Ha: μ ≠ 14) (Ho: μ = 14)
B) (Ha: μ > 14) (Ho: μ = 14)
C) (Ha: μ < 14) (Ho: μ = 14)
D) (Ha: μ = 14) (Ho: μ ≠ 14)
Recall!
The next 10 slides are semi-review.
HW: We are in the week of Nov 8th. The Hypothesis Test Driving
Example
Problem: My friend tells me they have a
fair coin.
Plan: To test this theory he flips the coin 10
times. He gets a tail every time. I then
record the probability of seeing at least x
tails.
i.e. Pr(X>=x) Distribution of X
A) Binomial
B) Poisson
C) Normal
D) t
E) None of the above. The Hypothesis
Ho: The coin is fair.
Ha: The coin is NOT fair. HT  RECALL
In theory, Pr(X>=x given the coin is fair):
Tails  Probability
10     0.1%
9      1.1%
8      5.5%
7      17.2%
6      37.7%
5      62.3%
4      82.8%
The Fence (Significance Level)
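The tail probabilities in the table can be reproduced with a short sketch, assuming X (the number of tails in 10 flips) is Binomial with n = 10 and p = 0.5 when the coin is fair. The function name `tail_probability` is illustrative.

```python
from math import comb

def tail_probability(x, n=10, p=0.5):
    """Pr(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

# Pr(X >= x given the coin is fair), for x = 10 down to 4.
for tails in range(10, 3, -1):
    print(tails, f"{tail_probability(tails):.1%}")
```

Each row is the chance of seeing at least that many tails, not exactly that many.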
We argue that 5% is ‘small enough’. If an
event is more rare than 5%, we reject our
assumption. Where should the fence go…?
Tails  Probability
10     0.1%
9      1.1%
8      5.5%
7      17.2%
6      37.7%
5      62.3%
4      82.8%
A) 6 B) 7 C) 8 D) 9 E) 10
Conclusion
If we see 10 tails in a row, we
A) Reject the coin being fair
B) Not reject the coin being fair
Formally….
An hypothesis test involves 4 steps:
1. Hypothesis
2. Formula
3. P value
4. Conclusion
Step 3. P value
A p value is the probability that we see a
value as extreme as our estimate if our
parameter is what we hypothesize. Step 1. Ho vs Ha for a Mean Ho Ha Return to your regularly
scheduled slides…. The Formula
(Step 2) The Formula
RECALL!
Let X be N(10,25). If we want to find the
probability that X is more than 15 then… What does the z score
represent??? The Formula
The formula used in hypothesis tests is
always the same.
The formula is: We call this a discrepancy. Typical Examples
For a mean (sigma known), the sampling
distribution is: Therefore the hypothesis formula is: Typical Examples
For a mean (sigma unknown), the sampling
distribution is: Therefore the hypothesis formula is: P value and Ha
Ha P value The Conclusion
(Step 4) The Conclusion
In a court case which do we say…?
A) Guilty
B) Innocent
C) Not Guilty
D) Not Innocent
E) C or A
RECALL Steps
We will break the process of an hypothesis
test into several steps:
1. The Hypothesis (the charge!)
2. The Formula (the evidence!)
3. The P-value (the decision!)
4. The Conclusion (formal description)
Hypothesis Tests
Putting it all together… Pvalues, Court Cases etc…
On Trial: The evidence: The decision: The conclusion: A Picture to Put it Together… Particular Recipe for HT
Sigma Known
1. Determine the hypotheses
2. Use the formula: 3. Calculate a p value: 4. Make a conclusion Hypothesis Tests (HT)
Methodology Clicker
The following are possible hypotheses:
A)(Ha: μ ≠ 100) (Ho: μ = 100)
B)(Ha: μ > 100) (Ho: μ = 100)
C)(Ha: μ < 100) (Ho: μ = 100)
D)(Ha: μ =100) (Ho: μ ≠ 100)
E)A, B and C Clicker
A p-value is:
A) A probability we see an estimate this far from
the hypothesis if Ha is true.
B) A probability we see an estimate this far from
the hypothesis if Ho is true.
C) A probability we see a parameter this far from
the hypothesis if Ha is true.
D) A probability we see a parameter this far from
the hypothesis if Ho is false.
Clicker
In the conclusion stage we never say:
A)Reject Ha
B)Accept Ho
C)Accept Ha
D)All of the above General Recipe for HT
1. Determine the hypotheses
2. Use the formula
3. Calculate a p value
4. Make a conclusion Particular Recipe for HT
Sigma Known
1. Determine the hypotheses
2. Use the formula: 3. Calculate a p value: 4. Make a conclusion Example
The number of hours of TV watched by American
students per week is known to be 24 with a
standard deviation of 3 hours. Canadian
researchers believe that the same standard
deviation holds for Canadians. Further, in a
survey of 400 Canadian students the average TV
hours watched per week was 23.8. Are Canadian
researchers right in assuming that Canadian
students watch less TV? Use a 5% level of
significance to make your decision. Clicker
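A sketch of the four steps for the TV example with sigma known. It assumes Ho: μ = 24 against Ha: μ < 24 (the researchers suspect Canadians watch less), and a discrepancy of the form (estimate − hypothesized value) / standard error. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Step 1 - Hypotheses (assumed): Ho: mu = 24, Ha: mu < 24
mu0, sigma, n, xbar, alpha = 24, 3, 400, 23.8, 0.05

# Step 2 - Formula: discrepancy between the estimate and the hypothesis
d = (xbar - mu0) / (sigma / sqrt(n))

# Step 3 - P value: Ha points to the left, so use the lower tail Pr(D < d)
p_value = NormalDist().cdf(d)

# Step 4 - Conclusion at the 5% level
reject = p_value < alpha
print(f"d = {d:.3f}, p-value = {p_value:.4f}, reject Ho: {reject}")
```

With these numbers the p-value is larger than 5%, so the sketch does not reject Ho.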
Ha is (where A is a number):
A) (Ha: μ ≠ A)
B) (Ha: μ > A)
C) (Ha: μ < A)
D) (Ha: μ = A)
Step 1 – Hypothesis
Step 2 – Formula
Clicker
The p value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D>d)
D) Pr(D=d)
Step 3 – P Value
Clicker
Therefore we:
A) Reject the null
B) Reject the alternative
C) Accept the null
D) Accept the alternative
E) Do not reject the null
Step 4 – Conclusion
Particular Recipe for HT
Sigma UNknown
1. Determine the hypotheses
2. Use the formula: 3. Calculate a p value: 4. Make a conclusion Example
The pH levels of 8 Northern Ontario lakes are
measured. The average is 6.4 with a standard
deviation of 0.7. The researchers suspect that
based on this information, the pH level differs from
neutral. Clicker
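A sketch of the lakes example with sigma unknown. It assumes Ho: μ = 7 against Ha: μ ≠ 7 and n − 1 = 7 degrees of freedom. The two critical values are hard-coded from a standard t table, and the 5% level in the comment is an assumption, since the example states none.

```python
from math import sqrt

mu0, n, xbar, s = 7.0, 8, 6.4, 0.7

# Discrepancy using the sample standard deviation s in place of sigma
d = (xbar - mu0) / (s / sqrt(n))
df = n - 1

T_975_DF7 = 2.365  # t table: Pr(T < 2.365) = 0.975 on 7 df
T_990_DF7 = 2.998  # t table: Pr(T < 2.998) = 0.99 on 7 df

# Two-sided test: compare |d| with the table values to bracket the p-value.
beyond_5_percent = abs(d) > T_975_DF7   # two-sided p-value below 5%
beyond_2_percent = abs(d) > T_990_DF7   # two-sided p-value below 2%
print(f"d = {d:.3f} on {df} df")
print("p < 5%:", beyond_5_percent, " p < 2%:", beyond_2_percent)
```

As with the printed tables, this only brackets the p-value between two percentages; it does not give an exact figure.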
Ha is (where A is a number):
A) (Ha: μ ≠ A)
B) (Ha: μ > A)
C) (Ha: μ < A)
D) (Ha: μ = A)
Step 1 – Hypothesis
Step 2 – Formula
Clicker
The p value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D>d)
D) Pr(D=d)
Step 3 – P Value
Clicker
Therefore we:
A) Reject the null
B) Reject the alternative
C) Accept the null
D) Accept the alternative
E) Do not reject the null
Step 4 – Conclusion
Example – Clicker Question
There has been some talk in the media of Vitamin
D (VitD) preventing colds (there is some biological
logic to this…). A study was conducted of those
who take VitD versus those that did not. All
patients initially had no colds. The length of time
before they got their first cold was recorded. Example Continued
In the VitD group, of 14 people, the average
length of time before their first cold was 23 days
with a standard deviation of 12 days. A medical
doctor suggests that the information is not
significant unless the average number of days is
greater than 25.
The hypotheses are:
A) (Ha: μ ≠ 25) (Ho: μ > 25)
B) (Ha: μ > 23) (Ho: μ = 23)
C) (Ha: μ < 25) (Ho: μ = 25)
D) (Ha: μ > 25) (Ho: μ = 25)
Example Continued
In the VitD group, of 14 people, the average
length of time before their first cold was 23 with
a standard deviation of 12. A medical doctor
suggests that the information is not significant
unless the average days is greater than 25.
The discrepancy is:
A) −2.16
B) −0.624
C) 0.624
D) 2.16
E) None of the above.
P-value
The p-value is [next two slides are tables]:
A) 73.24%
B) 25.76%
C) 20% to 30%
D) 70% to 80%
Therefore we:
A) Reject Ho
B) Accept Ho
C) Reject Ha
D) Do not reject Ho
E) Accept Ha
Hypothesis Tests (HT)
Errors Clicker Question
In an hypothesis test of grades we are testing
whether the class average is less than 75.
We reject Ho. What does this mean…? We
are 100% certain that…
A) The class average is less than 75
B) The class average is more than 75
C) We cannot be 100% certain Errors
Notice that we aren’t certain…hence it is
possible that we are wrong. How??
Truth ↓ \ Test →   Test Rejects Ho   Test Does Not Reject Ho
Ho is TRUE         Type 1 Error
Ho is FALSE                          Type 2 Error
Type 1 Errors
We want to reduce the possibility of a type 1
error.
In terms of our cheating case this means we
want to reduce the probability that… Type 2 Errors
Sadly, for a fixed sample size we cannot reduce
the chance of both Type 1 and Type 2 errors at once…
Hence we purposely fix the probability of a
type 1 error at a small value (the significance level)…
sacrificing the possibility of a type 2 error. To Reduce the Possibility of a
Type 2 Error… Clicker Test
We’re testing whether or not Robert
Pattinson has more fans than the average
actor. We reject Ho BUT WE’RE WRONG!
What kind of error have
we made?
A) Type 3
B) Type 1
C) Type 2
D) NO ERROR!
RECALL – Errors
Truth ↓ \ Test →   Test Rejects Ho   Test Does Not Reject Ho
Ho is TRUE         Type 1 Error
Ho is FALSE                          Type 2 Error
Practical vs Statistical
Significance
When we reject Ho, we say the result is
“Statistically Significant”.
This result may have no practical
significance. That requires the knowledge of
a subject-matter expert (e.g. medical doctor,
engineer, …) Silly Simile Example
In calculating your taxes you find that on
average you owe $204.352 to the
government.
In fact the value is STATISTICALLY more
than $204.35.
Does this hold PRACTICAL significance? Although that example was silly…. CIs vs HTs
In some cases a CI and a 2 sided HT can be
exchanged and will give the same result.
But these times are rare and difficult to
define. Hence, don’t exchange them! Pictorially HTs for Proportions
Hypotheses: Formula: The Table: A) Normal B) T C) Mahogany Example
Market Research Inc wants to know if
shoppers are sensitive to the prices of items
sold in a supermarket. It obtained a random
sample of 802 shoppers and found that 378
shoppers were able to state the correct price
of an item immediately after putting it into
their cart. Test at the 7% level the hypothesis
that at least one-half of all shoppers are able
to state the correct price. Clicker
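A sketch of the shopper example as a test for a proportion. It assumes Ho: p = 0.5 against Ha: p < 0.5 (the claim "at least one-half" is placed in Ho), with the standard error computed under Ho and c taken from the normal table. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0, alpha = 802, 378, 0.5, 0.07

p_hat = successes / n
# Under Ho the proportion is p0, so the standard error uses p0, not p_hat.
d = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Ha: p < 0.5, so the p-value is the lower tail Pr(D < d).
p_value = NormalDist().cdf(d)
reject = p_value < alpha
print(f"p_hat = {p_hat:.4f}, d = {d:.3f}, p-value = {p_value:.4f}, reject Ho: {reject}")
```

Note that the decision depends on the level: the p-value here falls between 5% and 7%.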
Ha is (where A is a number):
A) (Ha: p ≠ A)
B) (Ha: p > A)
C) (Ha: p < A)
D) (Ha: p = A)
The Hypothesis
The Formula
Clicker
The p value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D>d)
D) Pr(D=d)
The P Value
Clicker
Therefore we:
A) Reject Ho
B) Accept Ho
C) Reject Ha
D) Do not reject Ho
E) Accept Ha
The Conclusion
Question
If we made a mistake, what kind of mistake
would it be…?
A) Type 1
B) Type 2
C) Type 1 and 2
D) Type 3
E) We did not make a mistake
RECALL – Errors
Truth ↓ \ Test →   Test Rejects Ho   Test Does Not Reject Ho
Ho is TRUE         Type 1 Error
Ho is FALSE                          Type 2 Error
3 Hypothesis Tests
and Confidence Intervals
HTs
Mean
(sigma known)
Mean
(sigma unknown) Proportion CIs Liberals and Conservatives
(Democrats vs Republicans)
In a local newspaper, The Record, the
following article was headlined: “LIBERAL
SUPPORT DWINDLING”.
The newspaper states: Newspaper Article
Liberal support, from a sample of 400 people
this week is at 33% plus or minus 4%, 19
times out of 20.
This is a drastic drop from last month when
support was at 35%.
Test this assumption… The “DIAMONDS” Clicker
Ha is (where A is a number):
A) (Ha: p ≠ A)
B) (Ha: p > A)
C) (Ha: p < A)
D) (Ha: p = A)
The Hypothesis
The Formula
Clicker
The p value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D>d)
D) Pr(D=d)
The P-value
Clicker
Therefore we:
A) Reject Ho
B) Accept Ho
C) Reject Ha
D) Do not reject Ho
E) Accept Ha
The Conclusion
Question
If we made a mistake, what kind of mistake
would it be…?
A) Type 1
B) Type 2
C) Type 1 and 2
D) Type 3
E) We did not make a mistake
RECALL – Errors
Truth ↓ \ Test →   Test Rejects Ho   Test Does Not Reject Ho
Ho is TRUE         Type 1 Error
Ho is FALSE                          Type 2 Error
Comparing Two Samples
What if we want to compare two groups?
Males to Females
New Drug to Old Drug
Salmon and Perch
Tech stocks and Financial Stocks Comparing Two Groups
How we compare those groups depends on
whether or not they are dependent or
independent…and the kind of study we are
performing… Problem Segue: Variates
Response Variate: Explanatory Variate: Focal Explanatory Variate: Example – Students Grades
Problem: Compare Female to Male Grades
Response: Grade on Midterm
Explanatory Variate: Focal Explanatory Variate: Studies
2 types:
Experimental Observational Dependent Groups
There exists a relationship between
individuals in the groups, either
real or artificial.
Examples:
1. Twins Examples:
2. Same Units Twice 3. Artificial Twins Independent Groups
Independent groups are groups that we
assume have no relationship between the
individuals in the groups.
In other words, Study vs Dependent
If our study is observational we “match”
similar individuals (units) by explanatory
variates.
If our study is experimental we “block”
similar individuals (units) by explanatory
variates.
In both cases we call this process “pairing”. Example
50 fish are caught from a stream. 20 of them
are placed in a low pH solution and 30 are
placed in a high pH solution.
Is this…?
A) Experimental
B) Observational
Are the groups…?
A) Independent B) Dependent Clicking Nomenclature
i.e. How good is your memory…?
A Farmer wants to test the effectiveness of a
new fertilizer. She decides to break her field
into 1 acre plots and randomly spread the
new fertilizer on 8 of the plots while using
the remaining 27 plots to spread the old
fertilizer.
A Very Basic and Very Good
Experimental Design
Step 1 – Pairing (Blocking/Matching)
Step 2 – Randomization
Step 3 – Replication (i.e. Repeat 1,2) Explanation of the Next Few Slides Example – Pairing
The following shapes/colours are people. Example – Randomization
The following shapes/colours are people. Summary of the Blocking
Process
1. We pair our units up.
2. We flip a coin and put one unit of the pair
in group 1. The other unit goes to group 2.
2 (In)Dependent Groups Picture
(with notation) Example  Differencing
18
32 25 19 30 30 2 Dependent Group Differences
Group 1
Unit 1
Unit 2
Unit 3
Variance Group 2 Difference Why does this happen????
Consider the fake experiment where we
know that the following is exactly true:
Weight (lbs)
= Height (cms) + Age (years) + Gender
Our response varies due to our explanatory
variates.
Gender is 10 if male, 0 if female. Paired
Now suppose we match by Age (A1=A2)
and Gender (G1=G2)…Then the difference
in weight is only related by HEIGHT.
W1 = H1 + A1 + G1
W2 = H2 + A2 + G2 HT: for a mean difference with
dependent units
1. Determine the hypotheses
2. Use the formula:
4. Calculate a p value
5. Write a conclusion
CI: for a mean difference with
dependent units
Formula: Example
The speed of 9 students is measured before and
after inebriation. The difference (after –
before) is determined to be on average 18
km/h with a standard deviation of the
differences of 7 km/h. Build a 95%
confidence interval for the difference in
speed. Formula Conclusion Example
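A sketch of the paired interval for the speed example. It treats the 9 differences as a single sample, so the interval is mean difference ± c × (s_d / √n) with n − 1 = 8 degrees of freedom. The critical value 2.306 is hard-coded from a standard t table; variable names are illustrative.

```python
from math import sqrt

n, mean_diff, sd_diff = 9, 18, 7   # differences are (after - before), in km/h
T_975_DF8 = 2.306                  # t table: Pr(T < 2.306) = 0.975 on 8 df

# Standard error of the mean difference, then the usual estimate +/- margin.
margin = T_975_DF8 * sd_diff / sqrt(n)
lower, upper = mean_diff - margin, mean_diff + margin
print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f}) km/h")
```

Because the units are paired, only the standard deviation of the differences is needed, not the two group standard deviations.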
The speed of 9 students is measured before and
after inebriation. The difference (after –
before) is determined to be on average 18
km/h with a standard deviation of the
differences of 7 km/h. Test whether the
difference in speeds of an inebriated person
with that of a sober individual is different from
zero. Clicker
The hypothesis is:
A) Ha: µ < A
B) Ha: µ > A
C) Ha: µ ≠ A
D) Ha: µ ≤ A
E) Ha: µ ≥ A
The Hypothesis
The Formula
Clicker
The P-value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D<d)
D) 2Pr(D>d)
E) 2Pr(D<d)
The P-value
Clicker
A) Reject Ho
B) Do Not Reject Ho
C) Accept Ho
D) Do not reject Ha
E) Reject Ha
Conclusion
The Prior Example
This was an example of a PAIRED (matched
or dependent) situation… Sample Example
As a researcher for the Ontario Ministry of
Environment, you have been asked to
determine if Ontario’s air quality index (AQI)
has changed in the past 2 years. You select
a random sample of 10 cities and find the air
quality on the same day in 2 consecutive
years.
A comparison was made. Are the samples…
A) Independent
B) Dependent Sample Example
A survey found that the average hotel rate in
Toronto was $175.53 and the average rate in
Vancouver is $171.31. Assume that the data
were obtained from two samples of 50 hotels
each with standard deviation of 9.52 and
10.89 respectively.
A comparison was made. Are the samples…
A) Independent
B) Dependent Independent Notational Picture Independent Samples Picture Independent Samples – CI
Formula: Independent Samples – HT
1. Hypotheses:
2. Formula:
3. P-value
4. Conclusion
NOTE
1. 3. In class I will only cover the case where
the standard deviations are assumed to
be different. If we have evidence to
assume the two are the same a different
formula (involving sp) would be needed!
Degrees of freedom in this case are: CAREFUL!
How you define your Ha affects:
1.
P value
2.
The calculation of d
IF you change the direction of your Ha then
everything else must change as well.
Your final answer WILL NOT change. Example
A researcher hypothesizes that the average
number of sports that colleges offer for males
is greater than the average number of sports
that colleges offer for females. In both cases
there were 50 colleges. A sample of the data
is given below:
Sample Mean (males) = 8.6
Sample Mean (females) = 7.9
Standard deviation (males) = 3.3
Standard deviation (females) = 3.7 Comparison
A comparison was made. Are the samples…
A) Independent
B) Dependent
Perform your comparison at a significance
level of 10%. Clicker
The hypothesis is:
A) Ha: µm − µf < A
B) Ha: µm − µf > A
C) Ha: µm − µf ≠ A
D) Ha: µm − µf ≤ A
E) Ha: µm − µf ≥ A
Hypothesis
Formula
Clicker
The P-value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D<d)
D) 2Pr(D>d)
E) 2Pr(D<d)
P-value
Clicker
A) Reject Ho
B) Do Not Reject Ho
C) Accept Ho
D) Do not reject Ha
E) Reject Ha
Conclusion
Example
A researcher hypothesizes that the average number of
sports that colleges offer for males is greater than the
average number of sports that colleges offer for females.
In both cases there were 50 colleges. A sample of the
data is given below:
Sample Mean (males) = 8.6
Sample Mean (females) = 7.9
Standard deviation (males) = 3.3
Standard deviation (females) = 3.7
Build a 90% confidence interval for the difference. Math Conclusion Clicker Question
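A sketch of the college sports comparison for independent samples with unequal standard deviations. The notes do not state the degrees of freedom rule, so this sketch uses the normal table as an approximation (each sample has 50 colleges); that choice is an assumption, and a t table would give slightly different numbers. Ha: µm − µf > 0 is also assumed.

```python
from math import sqrt
from statistics import NormalDist

mean_m, sd_m, n_m = 8.6, 3.3, 50   # males
mean_f, sd_f, n_f = 7.9, 3.7, 50   # females

diff = mean_m - mean_f
# Standard error of the difference when the two variances are not pooled.
se = sqrt(sd_m**2 / n_m + sd_f**2 / n_f)

# Hypothesis test at the 10% level, Ha: mu_m - mu_f > 0 (upper tail).
d = diff / se
p_value = 1 - NormalDist().cdf(d)
reject = p_value < 0.10

# 90% confidence interval for the difference in means.
c = NormalDist().inv_cdf(0.95)
ci = (diff - c * se, diff + c * se)

print(f"d = {d:.3f}, p-value = {p_value:.3f}, reject Ho: {reject}")
print(f"90% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval contains zero, which agrees with the test not rejecting Ho under these assumptions.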
What was your grade (%) on Test 2?
A) 90 – 100
B) 85 – 89
C) 70 – 84
D) 55 – 69
E) < 55
Picture for 2 Proportions
(Notation) Difference in Proportions – CI
Formula:
Difference in Proportions – HT
Estimated value for p:
1. Hypothesis (ONLY 1):
2. Formula:
3. P-value
4. Conclusion
Example
(Statistics in the Classroom!)
The example today will involve using
something called a POG. POG is short for
“Poggendorff”. The Poggendorff
a b c The Goal…
To visually, without using a straight edge, just
the naked eye, guess where a will strike c. Problem…?
Are evil or good classes better at seeing
through the trick????? Data
a b c BEWARE!
New Slides Ahead Clicker
The hypothesis is:
A) Ha: pe − pg < A
B) Ha: pe − pg > A
C) Ha: pe − pg ≠ A
D) Ha: pe − pg ≤ A
E) Ha: pe − pg ≥ A
Hypothesis
Formula
Clicker
The P-value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D<d)
D) 2Pr(D>d)
E) 2Pr(D<d)
P-value
Clicker
A) Reject Ho
B) Do Not Reject Ho
C) Accept Ho
D) Do not reject Ha
E) Reject Ha
Conclusion
Relationships
Ch. 12.5 A Measure of Relationship
Often we want to look at the relationship
between two numbers.
For example:
Gender and Age at Death
Amount of sleep and a midterm grade
Lung cancer and smoking Two Measure of Relationship
1.
2. Correlation (and covariance)
Slope Correlation/Covariance Notation
We use the letter “r” to denote correlation. At
times will we write rxy to denote the
correlation between x and y. Correlation Interpretation
The correlation is a number between −1 and
1.
There are 2 important features about this
number.
a)
Magnitude – the size of the number
b)
Direction – the sign of the number Magnitude
The closer |r| is to 1 the stronger the
relationship. Graphically this is indicated by a
tightness in the data about a line.
If r = ±1 we call it a perfectly linear relationship.
The closer r is to zero, the more randomly
scattered and less linear the graph. Correlation of 1 Correlation of 0.9 Correlation of 0.5 Correlation of 0 Direction
The sign of r indicates the direction.
A positive r indicates that the points have a
positive slope.
A negative r indicates that the points have a
negative slope.
[Figures: scatterplots with correlation of 0.9 and −0.9]
Clicker Correlation
For the graph to the right,
the correlation is:
A)
Large and negative
B)
Small and positive
C)
Large and positive
D)
Small and negative
E)
None of the Above Other Datasets Linear Relationships Only!!
The only relationships studied by correlation
are linear. Correlation cannot study other
types, say quadratic (as in the next example).
A Quadratic Relationship
X = −3, −2, −1, 0, 1, 2, 3
Y = 9, 4, 1, 0, 1, 4, 9 (i.e. Y = X²)
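A quick numerical check of this example, as a minimal sketch in plain Python. The helper name `correlation` is ours and is not part of the course material.

```python
import math

def correlation(xs, ys):
    """Sample correlation r = sxy / (sx * sy); the n - 1 divisors cancel."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]   # Y = X^2: an exact relationship, but not a linear one
r_quadratic = correlation(x, y)
print(r_quadratic)
```

The positive and negative cross-products cancel, so the correlation comes out as zero even though Y is completely determined by X.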
rxy=0 Summary Correlation
A number from −1 to 1 indicating the strength
(magnitude) and direction (positive or
negative) of a relationship.
Clicker Test 1
The Correlation is
A)
Positive and weak
B)
Negative and weak
C)
Positive and strong
D)
Negative and strong
E)
None of the above Clicker Test 2
The Correlation is
A)
Positive and weak
B)
Negative and weak
C)
Positive and strong
D)
Negative and strong
E)
None of the above Clicker Test 3
The Correlation is 0
which means
A)
No relationship
B)
No linear relationship
C) Weak relationship
D) None of the above What is a “STRONG”
relationship?
If r =
+.70 or higher    Very strong positive relationship
+.40 to +.69      Strong positive relationship
+.30 to +.39      Moderate positive relationship
+.20 to +.29      Weak positive relationship
+.01 to +.19      No or negligible relationship
−.01 to −.19      No or negligible relationship
−.20 to −.29      Weak negative relationship
−.30 to −.39      Moderate negative relationship
−.40 to −.69      Strong negative relationship
−.70 or lower     Very strong negative relationship
Covariance
(Correlation's Useless Cousin)
Notation:
The Covariance is denoted by sxy.
Purpose:
Covariance is more useful from a
statistician's perspective.
We use sxy to calculate r. No Magnitude, Just Direction
With a covariance the magnitude is NOT
important. It can have values from minus
infinity to positive infinity and the size of the
number is meaningless.
Only the direction can be determined and is
based on the sign. Clicker Covariance
For the graph to the right,
the covariance is:
A)
large
B)
positive
C)
small
D)
negative
E)
None of the Above Example:
Women Heights and Weights
Description: The heights and weights of
women aged 30 to 39. Data
women
     height  weight
 1       58     115
 2       59     117
 3       60     120
 4       61     123
 5       62     126
 6       63     129
 7       64     132
 8       65     135
 9       66     139
10       67     142
11       68     146
12       69     150
13       70     154
14       71     159
15       72     164
Scatterplot
Covariance, Correlation
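The two summary values for this dataset can be reproduced directly from the table. This is a minimal sketch in plain Python, assuming the sample definitions with an n − 1 divisor; the variable names are ours.

```python
import math

heights = list(range(58, 73))   # 58, 59, ..., 72
weights = [115, 117, 120, 123, 126, 129, 132, 135,
           139, 142, 146, 150, 154, 159, 164]

n = len(heights)
mean_h = sum(heights) / n
mean_w = sum(weights) / n

# Sample covariance: summed cross-products of deviations, divided by n - 1.
s_xy = sum((h - mean_h) * (w - mean_w)
           for h, w in zip(heights, weights)) / (n - 1)

# Sample standard deviations, also with the n - 1 divisor.
s_x = math.sqrt(sum((h - mean_h) ** 2 for h in heights) / (n - 1))
s_y = math.sqrt(sum((w - mean_w) ** 2 for w in weights) / (n - 1))

# Correlation is the covariance rescaled by both standard deviations.
r_xy = s_xy / (s_x * s_y)
print(round(s_xy, 2), round(r_xy, 3))
```

Dividing the covariance by sx·sy is what removes the units, which is why r has an interpretable magnitude and sxy does not.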
Covariance = 69
Correlation = 0.995 How do we calculate a
Covariance…?
sxy= WHY??? How do we calculate a
Correlation…?
rxy= Why? Example
Consider the data:
X = {1,2,3}
Y= {3,2,1}
What is the covariance??
What is the correlation?? Covariance Correlation Causation
An implication that Y changes due to X.
e.g. Smoking Causes Lung Cancer
e.g. Lack of sleep causes poor grades.
Causation is NOT correlation.
The following is a proof by one example… Example
Problem: Does smoking cause lung cancer?
Plan: The age at death (“AgeDeath”) was
recorded for each person, along with the number
of cigarettes smoked per day (“Cigs”) and the
number of lighters owned (“Lighters”). Gender was
also recorded, where 1 = male and 0 = female.
Data
    AgeDeath  Cigs  Lighters  Gender
 1        65     0         1       0
 2        42    20         8       1
 3        82     0         2       0
 4        55    15         6       1
 5        60    20         9       0
 6        57     0         2       1
 7        64    10         5       0
 8        78     0         0       1
 9        95     0         2       0
10        39    40         8       1
11        52    30         5       0
12        49    25         9       1
Plots
Clicker
Based on the plots.
1.
The correlation between cigs and lighters
is:
A)
Positive
B)
Negative
C)
No linear relationship. 2. The correlation between age at death and
cigs is:
A)
Positive
B)
Negative
C)
No linear relationship. Correlations
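The correlation matrix for the twelve people in the data table can be recomputed from the raw values. This is a minimal sketch in plain Python; the `correlation` helper and the dictionary layout are our own choices, not the course's.

```python
import math

data = {
    "AgeDeath": [65, 42, 82, 55, 60, 57, 64, 78, 95, 39, 52, 49],
    "Cigs":     [0, 20, 0, 15, 20, 0, 10, 0, 0, 40, 30, 25],
    "Lighters": [1, 8, 2, 6, 9, 2, 5, 0, 2, 8, 5, 9],
    "Gender":   [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
}

def correlation(xs, ys):
    """Sample correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Every pairwise correlation, stored as a nested dictionary.
corr = {a: {b: correlation(data[a], data[b]) for b in data} for a in data}

for a in data:
    print(f"{a:>8}", "  ".join(f"{corr[a][b]:6.2f}" for b in data))
```

Age at death comes out negatively correlated with cigarettes, with lighters, and with gender, while cigarettes and lighters are positively correlated with each other.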
          AgeDeath   Cigs  Lighters  Gender
AgeDeath      1.00  −0.79     −0.72   −0.51
Cigs         −0.79   1.00      0.83    0.25
Lighters     −0.72   0.83      1.00    0.24
Gender       −0.51   0.25      0.24    1.00
What do you notice?
The correlation between age at
death and lighters is:
A) Positive
B) Negative
C) No linear relationship.
So if correlation implies
causation then what causes
cancer…..
Regression
Two Measures of Relationship
1. Correlation (and covariance)
2. Slope
BUT how do we get the slope of a set of
points…we need a method to build a line
through the points.
For this we use regression.
Regression Line
We wish to build a regression line (or line of
best fit), which is a line through our data.
As with any line it will have a slope b1 and an
intercept, b0.
i.e. y=mx+b => y= b0+b1x
There are many ways to do that. Consider the data Different Lines One Way to Build a Line is to
Use Regression
Concept: We want to minimize the vertical
distance between our observed points and
the line.
Picture:
[Figure: the vertical distance between an observed Y value and the fitted value y^ on the line]
Distance: Residual = r = y − y^
Residuals
Denote by y(x), a response for explanatory
variate x.
Denote by y^(x), a response on the line for
explanatory variate x.
Denote by r = y(x) − y^(x) a residual.
The Sum of Residuals is Zero
The sum: Hence…
We want to minimize the sum of squared residuals,
∑r².
Clicker Question
Which line is the one chosen by regression:
A)
Y=12x with residuals (8,0,8)
B)
Y=4+2x with residuals (2, 4, 6)
C)
Y=3x with residuals (1, 4, 5)
D)
Y=1x with residuals (6,2,4) Example
Problem: To investigate the relationship
between heights and weights of women. Data
     height  weight
 1       58     115
 2       59     117
 3       60     120
 4       61     123
 5       62     126
 6       63     129
 7       64     132
 8       65     135
 9       66     139
10       67     142
11       68     146
12       69     150
13       70     154
14       71     159
15       72     164
Graphed
The Line of Best Fit
(Regression Line)
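The line for this dataset can be reproduced with the usual least-squares formulas: slope b1 = Sxy/Sxx and intercept b0 = ȳ − b1·x̄. This is a minimal sketch in plain Python with variable names of our own; it also checks that the residuals sum to zero.

```python
heights = list(range(58, 73))   # explanatory variate x
weights = [115, 117, 120, 123, 126, 129, 132, 135,
           139, 142, 146, 150, 154, 159, 164]   # response y

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(weights) / n

# Sums of squared deviations and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in heights)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))

b1 = sxy / sxx               # slope
b0 = mean_y - b1 * mean_x    # intercept: the line passes through (x-bar, y-bar)

# Residual = observed y minus the fitted value on the line.
residuals = [y - (b0 + b1 * x) for x, y in zip(heights, weights)]

print(round(b0, 2), round(b1, 2))
print(round(sum(residuals), 6))
```

The intercept is negative. It has no physical meaning here because a height of 0 is far outside the observed range of 58 to 72.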
Y = −87.52 + 3.45x
Predictions (Extrapolation)
As a notation we’ll use y^ as a prediction.
Hence our predictions are on the line.
We use the line to help us make our
predictions.
Hence: y^ = −87.52 + 3.45x
Example
Predict the height of a woman who weighs
200lbs. Formulas
y= b0+b1x
b0 =
b1 =        OR    b1 = r(sy/sx)
Example: Consider the Data
Data
X  Y
1  3
2  2
3  1
Does a Relationship Exist??
To test whether or not a relationship exists we
can perform two tests:
1. We can test to see if r=0
2. We can test to see if b1=0
Both tests give the same results, so we will
test r.
This means that…
1. If we find a positive correlation, we have a
positive slope
2. If we find a negative correlation, we have a
negative slope
AND
3. If we have no correlation, we have no
slope.
Hypothesis Test for ρ (rho)
1. Hypothesis:
We use the notation ρ (rho) for a correlation
from the population. Hence we ask, is ρ = 0??
Ho:
Ha:
2. Formula:
d = r√[(n − 2)/(1 − r²)]
This has a t distribution on n − 2 degrees of
freedom.
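The test statistic can be evaluated numerically. This is a minimal sketch in plain Python, with a function name of our own, using the values from the two examples later in this section (r = 0.4 with n = 120, and r = 0.944 with n = 7 as printed). The P-value is not computed here because the standard library has no t distribution; d would be compared against t tables on n − 2 degrees of freedom.

```python
import math

def correlation_test_statistic(r, n):
    """d = r * sqrt((n - 2) / (1 - r^2)), referred to a t distribution on n - 2 df."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Midterm mark vs clicker mark: 120 students, r = 0.4.
d_midterm = correlation_test_statistic(0.4, 120)
df_midterm = 120 - 2

# Midterm absences vs clicker mark: 7 students, r = 0.944 as printed.
d_absences = correlation_test_statistic(0.944, 7)
df_absences = 7 - 2

print(round(d_midterm, 2), "on", df_midterm, "degrees of freedom")
print(round(d_absences, 2), "on", df_absences, "degrees of freedom")
```

The statistic takes the sign of r, so a negative correlation gives a negative d of the same size.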
Pvalue: Same
Conclusion: Same Example
In one of my courses there are 120 students.
The correlation between their midterm mark
and their clicker mark is 0.4. Is this
significant? Clicker
The hypothesis is:
A)
Ha: ρ < A
B) Ha: ρ > A
C) Ha: ρ ≠ A
D) Ha: ρ ≤ A
E) Ha: ρ ≥ A
Hypothesis
Formula
Clicker
The Pvalue is:
A)
Pr(D>d)
B)
Pr(D<d)
C)
2Pr(D<d)
D)
2Pr(D>d)
E)
2Pr(D<d) Pvalue Clicker
A) Reject Ho
B) Do Not Reject Ho
C) Accept Ho
D) Do not reject Ha
E) Reject Ha
Conclusion
What can we say…?
Clicker point and hence attending class
A) leads to higher marks.
B) does not improve marks.
C) are positively correlated.
D) are not correlated Hence what can we say about
the slope…?
The slope is
A) Negative
B) Positive
C) Nonexistent
Example
The correlation between number of midterm
absences and clicker mark was also
recorded for 7 students. The correlation was
0.944. Perform a test to see if this is
significant. The Hypothesis
What should the hypothesis be??
A) Ho: μ=0
Ha: μ≠0
B) Ho: μ ≠ 0
Ha: μ=0
C) Ho: ρ = 0
Ha: ρ ≠ 0
D) Ho: ρ ≠ 0
Ha: ρ = 0
The Formula
The value for the formula is:
A) 6.4
B) 6.4
C) 1.51
D) 1.51 Degrees of Freedom
The degrees of freedom are:
A) 7
B) 6
C) 5
D) 4
E) None of the above The Pvalue Conclusion Correlation Coefficient
vs
Coefficient of Determination
The correlation coefficient is r.
The coefficient of determination is r².
The Coefficient of Determination
The coefficient of determination is a measure
of how good our model is. In other words, it
is a measure of how tight our points are
about the line.
Value:
Interpretation
Rule of thumb
0 < r² < 0.3     => weak
0.3 < r² < 0.7   => moderate
r² > 0.7         => strong
The coefficient of determination tells us…
Example
Problem: To investigate the relationship
between heights and weights of women. Data
     height  weight
 1       58     115
 2       59     117
 3       60     120
 4       61     123
 5       62     126
 6       63     129
 7       64     132
 8       65     135
 9       66     139
10       67     142
11       68     146
12       69     150
13       70     154
14       71     159
15       72     164
Graphed
Important Stats
Regression line: y^ = −87.52 + 3.45x (slope 3.45)
Correlation Coefficient: 0.995
Coefficient of Determination: (0.995)² = 0.99
Example
The speed of a car and the distance it takes
to stop are recorded: The data
> cars
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10
7    10   18
Important Stats
Regression line: y^ = −17.579 + 3.932x (slope 3.932)
Correlation Coefficient: 0.81
Coefficient of Determination: (0.81)² = 0.6561
This means…
A) The relationship is strong
B) The relationship is positive
C) The amount of variability explained by the
model is good
D) All of the above
E) None of the above
Causation
Causation is NOT association…
So how do we prove something causes
something else….x causes y? Experimental Studies
In an experimental study, we can:
1. Block units together according to all
explanatory variates except the focal
2. Randomly determine which member of the
pair goes into which group
3. Use more than 1 pair (lots and lots!). We
call this replication.
4. Measure the response in each group. If
there is a difference in the average
response it is due to the focal.
But…to prove causation…
To prove causation in the last slide we want
to apply it to every member of the population.
Which is NOT likely…
KEYS: Blocking, Randomization and
Replication Observational Studies
Problems –
We may NOT be able to block…nor
randomize… Farewell’s Criterion
1. A relationship should be observed in many
studies of different types, in different
settings
2. A relationship should hold when other
plausible variates are controlled (e.g.
lighter example)
3. A plausible scientific explanation is
required for the direct influence of x on y
and no other strong explanations
4. There must be a consistent dose-response relationship.
Good Luck on your Finals!
Midterm 2 review
Some Review…Covered in TUTORIAL
Understandably this stuff is hard. So let’s put
it together. Sampling Distributions
A single value –
A mean –
A total –
A proportion –
Count Data –
Which one(s) involve a continuity
correction…?
A) Mean
B) Total
C) Count
D) Proportion
E) C and D
The Tricks
1. Decide what the question wants. Often it
gives hints like “average”, “total”, …
2. Use the appropriate sampling distribution
to standardize
Assuming z is positive,
3. If you have Pr(Z < −z) set equal to Pr(Z > z)
4. If you have Pr(Z > z) set equal to 1 − Pr(Z < z)
Example
The number of bags lost at an airport terminal
on a particular day is reported to be 15 on
average per day with standard deviation 3;
find…
A) The probability that in 1 day 17 bags are
lost.
B) The probability that in 5 days 60 bags are
lost.
C) The probability that in 5 days an average
of 10 bags are lost.
Confidence Intervals
EXPLORATORY!
Estimate +/− c(SE)
If sigma is known:
If sigma is unknown:
If we are dealing with a proportion: Clicker
In the question that follows is it a CI for:
A) mean, sigma known
B) mean, sigma unknown
C) proportion
D) sigma, mean known
E) sigma, mean unknown Example
The number of bags lost at an airport terminal
on a particular day is reported to be 15 on
average per day with standard deviation 3;
find a 95% confidence interval for the number
of lost bags per day. Hypothesis Tests
CONFIRMATORY
4 steps:
1. Hypothesis
2. Formula
3. Pvalue
4. Conclusion Hypothesis Tests
4 steps – Mean (sigma known)
1. Hypothesis
2. Formula
3. Pvalue
4. Conclusion
Hypothesis Tests
4 steps – Mean (sigma unknown)
1. Hypothesis
2. Formula
3. Pvalue
4. Conclusion
Hypothesis Tests
4 steps – Proportion
1. Hypothesis
2. Formula
3. Pvalue
4. Conclusion
Example
The number of bags lost at an airport terminal
on a particular day is reported to be 15 on
average per day with standard deviation 3;
use a significance level of 6% to test whether
the average per day differs from 14.
Final Exam Review
Agenda
• My questions
  – I have put them together at the start.
    Please try them before the help session
    on your own, without scrolling down
    and with your textbook closed.
• Your questions
Example 1
Scientists are curious about CO2 levels and acid
rain. 120 areas are measured for CO2 and
acidity. It is found that the standard deviation of
CO2 and acid levels are 7 and 0.3 respectively.
Further, the covariance between CO2 and acidity
is 1.
a) Interpret the covariance.
b) What is the coefficient of determination?
c) Test whether the slope of the linear relationship
between CO2 and acidity is less than zero. Example 2
Scientists believe that elephants who ended their lives in
captivity live shorter lives than those kept in the wild. The
length of time the elephant lives is recorded. In each of the
following situations, build an appropriate hypothesis test:
A. Several groups of elephants are selected. The first
group, comprised of 10 elephants who have lived in the wild
their entire lives. We compare this group to the second
group of 10 elephants who have lived in a zoo their entire
lives. A third group of 14 elephants who started their lives in
the wild and ended their lives in captivity are compared to a
group of 14 elephants who started their lives in captivity and
were released to the wild. If w, d and c denote “wild”,
“difference” and “captivity” then: sw=4, sc=6, sd=3, xw=68,
xc=65, xd=3, Example 2
Scientists believe that elephants who ended their lives in
captivity live shorter lives than those kept in the wild. The
length of time the elephant lives is recorded. In each of the
following situations, build an appropriate hypothesis test:
B. 10 zoos were selected at random. From each zoo two
elephants from the same litter were selected. One of the
elephants was released to the wild while the other was kept
in captivity. Another 8 wild areas were selected at random.
From each wild area two elephants from a litter were
selected. One of the two elephants was captured while the
other was allowed to remain in the wild. If w, d and c denote
“wild”, “difference” and “captivity” then:
sw=4, sc=6, sd=3, xw=68, xc=65, xd=3, Example 3
A student is given 3 choices for courses in the
winter term. She will select a math course with 60%
probability, a Science course with 35% and the
remainder for a Business course. If she takes the
business course, the chance she passes is 85%. If
she takes the math, the chance she passes is 92%.
If she takes the science course, the chance she
passes is 73%.
A) What is the probability that she passes?
B) What is the probability that the course she
passed was a science? Example 4
The number of toxins in a politician's blood is determined.
(Dalton McGuinty had 41 in the last election). The average
number of toxins in 400 20 year olds is 24 with a standard
deviation of 3 toxins.
a) Build a 95% interval for the number of toxins in the typical
Canadian?
b) Do you believe we have evidence that the number could
be zero? Answer using information from part a.
c) Assuming these 400 people form a population, estimate
the probability that the average number of toxins in 16
people selected at random from these 400 is more than 22. Example 5
The number of homes in default is in decline. In
the Canadian population 1/53 homes are in
default. If 100 homes are selected at random,
what is the approximate probability that less than
2 homes are in default? ANSWERS....Given in Tutorial Please note:
1. If you do not come to the
help session, I do not guarantee
that the notes/video will be
available to you online.
2. I may not have time to cover
everything in these slides and
have no solutions for the Example 1
Scientists are curious about CO2 levels and acid
rain. 120 areas are measured for CO2 and
acidity. It is found that the standard deviation of
CO2 and acid levels are 7 and 0.3 respectively.
Further, the covariance between CO2 and acidity
is 1.
a) Interpret the covariance.
b) What is the coefficient of determination?
c) Test whether the slope of the linear relationship
between CO2 and acidity is less than zero. Notes
Formula:
d = r√[(n − 2)/(1 − r²)]
This has a t distribution on n − 2 degrees of
freedom.
Pvalue: Same
Conclusion: Same Example 1
Scientists are curious about CO2 levels and acid
rain. 120 areas are measured for CO2 and
acidity. It is found that the standard deviation of
CO2 and acid levels are 7 and 0.3 respectively.
Further, the covariance between CO2 and acidity
is 1.
a) Interpret the covariance. Example 1
Scientists are curious about CO2 levels and acid
rain. 120 areas are measured for CO2 and
acidity. It is found that the standard deviation of
CO2 and acid levels are 7 and 0.3 respectively.
Further, the covariance between CO2 and acidity
is 1.
b) What is the coefficient of determination? Example 1
Scientists are curious about CO2 levels and acid
rain. 120 areas are measured for CO2 and
acidity. It is found that the standard deviation of
CO2 and acid levels are 7 and 0.3 respectively.
Further, the covariance between CO2 and acidity
is 1.
c) Test whether the slope of the linear relationship
between CO2 and acidity is less than zero. Example 2
Scientists believe that elephants who ended their lives in
captivity live shorter lives than those kept in the wild. The
length of time the elephant lives is recorded. In each of the
following situations, build an appropriate hypothesis test:
A. Several groups of elephants are selected. The first
group, comprised of 10 elephants who have lived in the wild
their entire lives. We compare this group to the second
group of 10 elephants who have lived in a zoo their entire
lives. A third group of 14 elephants who started their lives in
the wild and ended their lives in captivity are compared to a
group of 14 elephants who started their lives in captivity and
were released to the wild. If w, d and c denote “wild”,
“difference” and “captivity” then: sw=4, sc=6, sd=3, xw=68,
xc=65, xd=3, Hypothesis Formula Pvalue Conclusion Example 2
Scientists believe that elephants who ended their lives in
captivity live shorter lives than those kept in the wild. The
length of time the elephant lives is recorded. In each of the
following situations, build an appropriate hypothesis test:
B. 10 zoos were selected at random. From each zoo two
elephants from the same litter were selected. One of the
elephants was released to the wild while the other was kept
in captivity. Another 8 wild areas were selected at random.
From each wild area two elephants from a litter were
selected. One of the two elephants was captured while the
other was allowed to remain in the wild. If w, d and c denote
“wild”, “difference” and “captivity” then:
sw=4, sc=6, sd=3, xw=68, xc=65, xd=3, Hypothesis Formula Pvalue Conclusion Example 3
A student is given 3 choices for courses in the
winter term. She will select a math course with 60%
probability, a Science course with 35% and the
remainder for a Business course. If she takes the
business course, the chance she passes is 85%. If
she takes the math, the chance she passes is 92%.
If she takes the science course, the chance she
passes is 73%.
A) What is the probability that she passes?
B) What is the probability that the course she
passed was a science? Example 4
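For Example 3 (the course-choice question above), both parts follow from the law of total probability and Bayes' rule. This is a minimal sketch in plain Python with dictionary names of our own, not the tutorial's worked solution.

```python
# Probability of choosing each course; business takes the remainder.
prior = {"math": 0.60, "science": 0.35}
prior["business"] = 1 - sum(prior.values())

# Probability of passing, given the course taken.
pass_given = {"math": 0.92, "science": 0.73, "business": 0.85}

# A) Law of total probability: weight each pass rate by the chance of that course.
p_pass = sum(prior[c] * pass_given[c] for c in prior)

# B) Bayes' rule: the share of all passes that come from the science course.
p_science_given_pass = prior["science"] * pass_given["science"] / p_pass

print(round(p_pass, 4), round(p_science_given_pass, 4))
```

Part B comes out lower than the 35% chance of choosing science because science has the lowest pass rate of the three courses.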
The number of toxins in a politician's blood is determined.
(Dalton McGuinty had 41 in the last election). The average
number of toxins in 400 20 year olds is 24 with a standard
deviation of 3 toxins.
a) Build a 95% interval for the number of toxins in the typical
Canadian?
b) Do you believe we have evidence that the number could
be zero? Answer using information from part a.
c) Assuming these 400 people form a population, estimate
the probability that the average number of toxins in 16
people selected at random from these 400 is more than 22. Example 5
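For Example 4 (the toxin question above), parts a and c can be sketched numerically. The assumptions here are ours: with n = 400 the interval uses the normal multiplier 1.96 in place of a t value, part c treats 24 and 3 as the population mean and standard deviation, and no finite-population correction is applied.

```python
import math

def normal_cdf(z):
    """Pr(Z < z) for a standard normal, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, mean, sd = 400, 24, 3

# a) 95% interval: estimate +/- c * SE, with c = 1.96 and SE = sd / sqrt(n).
z_star = 1.96
margin = z_star * sd / math.sqrt(n)
ci = (mean - margin, mean + margin)

# c) The sample mean of m = 16 people has standard error sd / sqrt(m).
m = 16
z = (22 - mean) / (sd / math.sqrt(m))
p_more_than_22 = 1 - normal_cdf(z)   # Pr(Z > z) = 1 - Pr(Z < z)

print(tuple(round(v, 3) for v in ci), round(p_more_than_22, 4))
```

For part b, the interval sits far above zero, so these data give no support for a value of zero.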
The number of homes in default is in decline. In
the Canadian population 1/53 homes are in
default. If 100 homes are selected at random,
what is the approximate probability that less than
2 homes are in default? ...
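For Example 5, the count of homes in default is binomial with n = 100 and p = 1/53. This is a minimal sketch in plain Python (Python 3.8+ for `math.comb`) comparing the exact probability with a normal approximation using a continuity correction. The choice of the normal approximation is our assumption about what "approximate" means here.

```python
import math

def normal_cdf(z):
    """Pr(Z < z) for a standard normal, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p = 100, 1 / 53

# Exact binomial: "less than 2" means 0 or 1 homes in default.
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(2))

# Normal approximation with continuity correction: X < 2 becomes X < 1.5.
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx = normal_cdf((1.5 - mu) / sigma)

print(round(exact, 3), round(approx, 3))
```

The two answers differ noticeably because the expected count np is below 2, which is small for a normal approximation.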
This note was uploaded on 02/01/2011 for the course STAT 202 taught by Professor Springer during the Spring '09 term at Waterloo.