
# Stats 202 - Basics of the Course



The course is taught using PowerPoint. These lecture notes WILL change as the term progresses.

## Data Notation

Denote data by x1, x2, x3, …, xn, where n is the number of data values we have, called the sample size. The collection x1, x2, x3, …, xn is called a dataset, whereas a particular value xi is called a datum, data value or observation.

Example (Spoof): http://ca.youtube.com/watch?v=MQw12_kNAhU&feature=related

## Example

Description: This data set gives the average heights and weights for American women aged 30–39.

| Obs | height | weight |
|---|---|---|
| 1 | 58 | 115 |
| 2 | 59 | 117 |
| 3 | 60 | 120 |
| 4 | 61 | 123 |
| 5 | 62 | 126 |

## Data Types

NOTE: A quantitative variable can be made qualitative… we'll see in a second…

Examples with Clickers: 1) Height 2) Grades

Clicker Responses: A) Qualitative B) Categorical C) Quantitative D) Both A and B

## Quantitative Data

Examples with Clickers: 1) Height 2) Number of Cats Owned by Canadians

Clicker Responses: A) Discrete B) Continuous C) Both D) Neither

## Qualitative Data

Examples with Clickers: 1) Body Size: Skinny, Normal, Obese 2) Type of Stone: Granite

Clicker Responses: A) Discrete B) Continuous C) Both D) Neither

## Analysis

Raw data is hard to analyse. For example, consider the pH values below. Remember that a pH of 7 is neutral, a pH < 7 is acidic and a pH > 7 is basic. Are the 100 lakes below acidic, basic or neutral?

Data:

    5 6 8 6 7 8 8 7 8 6 8 6 7 7 5 5 7 9 8 7
    6 8 8 8 5 6 8 8 5 7 6 7 8 8 4 8 6 6 6 6
    6 7 8 8 5 7 7 5 7 8 7 8 6 6 7 6 8 8 6 8
    8 7 7 8 7 6 10 6 8 6 6 8 6 6 7 8 7 6 8 7
    6 7 7 7 6 5 7 8 7 7 7 6 6 8 7 7 7 7 8 7

Well???

## Dataset Characteristics

3 Characteristics:

1. Center
2. Spread
3. Shape

## Analysis

Two Techniques:

1. Numeric
2. Graphical

## Example Dataset

Consider the following data: 1,2,2,3,3,3,4,4,4,4,5,5,5,6,6,7

We can build a display simply by ticking off every time we see a number.

## Center

Rough Definition: The middle of the data.

Pictorially:

## Spread

Rough Definition: How separated our data values are.

## Shape

The appearance of the data.
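The "ticking off" display from the Example Dataset slide can be sketched in Python. This is only an illustration under assumptions: the helper name `tally` and the use of `x` as the tick mark are choices made here, not part of the slides.

```python
from collections import Counter

# Example dataset from the slide.
data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 7]


def tally(values):
    """Return a dict mapping each distinct value to how often it occurs."""
    # `tally` is a hypothetical helper name used only for this sketch.
    return dict(sorted(Counter(values).items()))


counts = tally(data)

# One row per distinct value, one tick ("x") per occurrence.
for value, count in counts.items():
    print(f"{value} | {'x' * count}")
```

Reading down the rows gives a rough picture of center, spread and shape before any statistic is computed.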
## Shape

The shape of a dataset can be determined numerically using measures such as kurtosis and skew, but we will not investigate these statistics in this course.

## Center

There are 3 measures of center:

A) Mode
B) Mean
C) Median

## Mode

The most popular value. Also the most useless statistic.

e.g. 1,1,1,2,20

## Mean

You would call it an "average".

Notation:
Data:
Mean:

Example: Consider the data 1, 1, 1, 2, 20. The mean is: (1 + 1 + 1 + 2 + 20)/5 = 5

## Median

The middle value of the data.

Notation (Median):
Notation (Sorted Data):

## Algorithm: Median

Given the data x1, x2, x3, …, xn:

1. Sort the data from smallest to largest.
2. If n is odd, take the middle value.
3. Else, if n is even, take the average of the middle 2 values.

Example: 1,1,20,2,1

1. Sort
2. n odd = middle; n even = average

Example: 1,1,20,3,1,4,12,2. What is the median?

A) 1 B) 1.5 C) 2 D) 2.5 E) 3

1. Sort: 1, 1, 1, 2, 3, 4, 12, 20
2. n = 8 is even, so take the average of the middle 2 values: Q2 = (2 + 3)/2 = 2.5

## Outliers

Outliers are values that are more extreme than the others.

For example: 1, 2, 3, 4, 1000

For example: -0.8, -11, 0.1, -0.6, 1, 0.3, -0.9

## Summary: 1,1,1,2,20

| Statistic | Value |
|---|---|
| Mode | 1 |
| Mean | 5 |
| Median | 1 |

Question: Why is the mean different from the median and mode?

## Order Statistics

The median is called the "second quartile". This implies there are "other" quartiles. A quartile derives its name from "quarter", and the quartiles divide the data into quarters.

Pictorially:

In Words:

- 25% of our data is below Q1, the first quartile.
- 50% of our data is below Q2, the second quartile.
- 75% of our data is below Q3, the third quartile.

## Algorithm: Q1

1. Perform the Median Algorithm.
2. Remove all data values above the median.
3. Perform the Median Algorithm on the remaining data.
4. This is the middle of the lower half of the data, the first quartile.

Example: Given the data -0.8, -11, 0.1, -0.6, 1, 0.3

1. Sort it: -11.0, -0.8, -0.6, 0.1, 0.3, 1.0
2.

## RECALL: Dataset Characteristics

3 Characteristics:

1. Center
2. Spread
3. Shape

## Spread

There are several ways in which we can calculate spread:

1. Range
2. Interquartile Range
3. Standard Deviation
4. Variance
5. Coefficient of Variation
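The Median and Q1 algorithms from the Center and Order Statistics slides can be sketched in Python as follows. This is a minimal sketch, not the course's official procedure: the function names are hypothetical, and `first_quartile` assumes a literal reading of "remove all data values above the median", so values equal to the median are kept. Other textbooks and software use slightly different quartile conventions.

```python
def median(values):
    """Median Algorithm: sort, then take the middle value (n odd)
    or the average of the middle two values (n even)."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2


def first_quartile(values):
    """Q1 Algorithm: drop every value above the median, then take
    the median of what remains.

    Assumption: values equal to the median are kept.
    """
    q2 = median(values)
    lower = [x for x in values if x <= q2]
    return median(lower)


# The worked examples from the slides.
print(median([1, 1, 20, 2, 1]))               # n = 5 (odd)
print(median([1, 1, 20, 3, 1, 4, 12, 2]))     # n = 8 (even)
print(first_quartile([-0.8, -11, 0.1, -0.6, 1, 0.3]))
```

For the six-value example the median falls between -0.6 and 0.1, so the lower half is -11.0, -0.8, -0.6 and its median is the first quartile.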
## Range

The range gives the distance between the largest and smallest values.

Formula in Words: Range = largest value - smallest value

Formula with Notation: Range = Max - Min

## Interquartile Range

The interquartile range gives the distance covered by the middle 50% of the data.

Formula: IQR = Q3 - Q1

## Data

Which dataset has more spread?

A) 1 B) 2 C) 3 D) 1 = 2 E) none of the above

- Data 1: 1, 2, 3
- Data 2: 1, 1, 1, 2, 3, 3, 3
- Data 3: 100, 100.5, 101, 101.5, 102

## Range Calculation

- Data 1: 3 - 1 = 2
- Data 2: 3 - 1 = 2
- Data 3: 102 - 100 = 2

## IQR Calculation

- Data 1: 1, 2, 3
- Data 2: 1, 1, 1, 2, 3, 3, 3
- Data 3: 100, 100.5, 101, 101.5, 102

## Standard Deviation

In words: the standard deviation is approximately the average distance the data values are from the center.

3 Formulae:

- Formula 1: Not nice for calculation, but great for interpretation.
- Formula 2: Useful for calculation but NOT interpretation.
- Formula 3: Another one! Useful for calculation but NOT interpretation.

Example: Consider the data 1, 2, 3. Calculate the st. dev.

Example: Given $\sum_{i=1}^{10} x_i = 10$ and $\sum_{i=1}^{10} x_i^2 = 1500$, calculate the standard deviation.

## Interpretation

Deviation

Definition in words:

Definition numerically:

Standard Deviation: The standard deviation is, approximately, the average deviation. Why approximately????

## Deviation Example

Consider the data: 1, 2, 3. Calculate the average deviation.

## Clicker Question

Pick 3 numbers. Calculate the average deviation. The answer is:

A) 0 B) >0 C) <0 D) I just want the clicker mark. E) None of the above.

What's the problem!!!??? How do we correct it?????

## Other Issues

But….

1. Square rooting doesn't undo squared terms! Example: $\sqrt{1^2 + 2^2 + 3^2} \neq \sqrt{1^2} + \sqrt{2^2} + \sqrt{3^2}$
2. Because of "1", our value for s is too small, so we divide by n-1 instead of n.

## n vs n-1: Degrees of Freedom

n-1 is called the degrees of freedom.

Another way of thinking about degrees of freedom: Suppose I gave you n data values at random.
They are "free" to be whatever I want them to be.

## Degrees of Freedom Continued

Now, instead of n data values, I give you n-1 values + the average. Is that last data value, the nth, "free"?

## Range Vs. Standard Deviation

[Figure: typical plot marking the minimum, the center and the maximum in units of standard deviations]

Maximum - Minimum = 6s

Which means.... s = Range/6

Note: Sometimes it is not 6 but 4 or another constant... this depends on the data.

## Interpretation

For a set of data, the standard deviation is 5. Is this big, small or uncertain?

A) Big B) Small C) Uncertain

## Interpretation and Units

## Variance

The variance is merely the square of the standard deviation.

Notation:

## Coefficient of Variation (CV)

Formula:

Interpretation/Use:

Example: The lengths of the fish Riley catches,

- in m on Monday: 1, 2, 3
- in cm on Tuesday: 100, 200, 300

Surface Investigation:

| | Monday | Tuesday |
|---|---|---|

Which has the greatest spread?

A) Monday B) Tuesday C) Neither

Answer:

## Units

The standard deviation, mean, mode and median all have the same units as the data. The variance, which is equal to the standard deviation squared, has units squared.

## Graphical Techniques

In addition to numeric techniques, we have graphical techniques that can be used to analyze data. These graphical techniques include boxplots, dot plots, etc.

## Example Dataset

Consider the following data: 1,2,2,3,3,3,4,4,4,4,5,5,5,6,6,7

We can build a display simply by ticking off every time we see a number.

## Dotplots

A dot plot is similar to the tick mark game that we've played since we were children. Each data value is plotted and replaced by a point. Hence the data 1,2,3 would look like:

[Figure: dot plot with one dot above each of 1, 2 and 3]

## Dotplots with Repeats

For a single set of data we may be interested in the repeats. In such a case we may draw a dot for every repeat. E.g. 1,1,2,3:

[Figure: dot plot on the axis 1, 2, 3 with two dots above 1]

## Example: Soybean

What can you see with this plot??

## Frequency Distribution Example

Example: Who is your favourite actor?
A) Brad Pitt B) This guy C) Angelina Jolie D) Her E) Someone else/don't want to answer

## Frequency

We build bars which have a height equal to the frequency with which a response occurs.

## Non-Categorical Data

If our data is not categorical, we first build intervals for the data. Intervals are created subjectively but should all be the same size. The x axis contains the intervals while the y axis is the frequency.

## Example: Grades

What is your Calculus 1 grade?

A) 85% to 100% B) 70% to 85% C) 55% to 70% D) 40% to 55% E) Prefer not to say.

## Intervals

These intervals are chosen subjectively. I could have chosen any set. I did try to choose them so that they are all the same size.

## Clicker Questions

The shape is:

A) Bell B) Skewed left C) Skewed right D) Uniform (flat) E) None of the above

The center is:

A) 576 B) 578 C) 579 D) 581 E) None of the above

## Relative Frequency Example

We divide each frequency by n. The plot is otherwise the same.

Example:

[Figure: histogram "Depth of Lake Huron in Feet 1875-1972"; x axis: Lake Huron, 575 to 582; y axis: Density, 0.00 to 0.35]

## Clicker Question

What is the proportion of times that Lake Huron was less than 578 feet deep?

A) 10% B) 12% C) 24% D) Not able to say.

## Boxplots

Unmodified Boxplot

[Figure: unmodified boxplot labelled Min, Q1, Q2, Q3 and Max, with IQR = Q3 - Q1 and Range = Max - Min]

## Recall: Outliers

Outliers: Data values that are more extreme (larger or smaller) than the others.

E.g. 1,1,2,2,3,3,4,4,5,5,6,6,25

## Finding Outliers

What is an outlier mathematically? Obviously, from the data above, the number 25 is suspect.

An outlier is any value that is:

- Less than the lower limit: LL = Q1 - 1.5(IQR)
- Greater than the upper limit: UL = Q3 + 1.5(IQR)

Why 1.5 times??

## Math to Prove 25 is an Outlier

1,1,2,2,3,3,4,4,5,5,6,6,25

## Modified Boxplot

Unless stated otherwise I am asking about the modified boxplot!
The difference: The upper whisker ends at either the maximum or the closest point below the UL, whichever is closer to the center. The lower whisker ends at either the minimum or the closest point above the LL, whichever is closer to the center.

[Figure: modified boxplot labelled Q1, Q2 and Q3, with one outlier marked, IQR = Q3 - Q1 and Range = Max - Min]

Example: Using 1,1,2,2,3,3,4,4,5,5,6,6,25

## Boxplots and Shape

The box (Q1 to Q3) gives a good indication of the shape of our data.

[Figure: three boxplots labelled A, B and C]

Boxplot A is:

A) Symmetric (Bell) B) Skewed left C) Skewed right D) Uniform (flat) E) None of the above.

Boxplot B is:

A) Symmetric (Bell) B) Skewed left C) Skewed right D) Uniform (flat) E) None of the above.

## Stem And Leaf Plots

Loss of Information: Individual data values are lost when we draw a boxplot, histogram, dot plot, etc. The stem and leaf plot attempts to counter this issue.

## Example

Problem: Measurements of the annual flow of the river Nile at Aswan, 1871–1970.

Plan: Not relevant.

Data:

    1120 1160  963 1210 1160 1160  813 1230 1370 1140
     995  935 1110  994 1020  960 1180  799  958 1140
    1100 1210 1150 1250 1260 1220 1030 1100  774  840
     874  694  940  833  701  916  692 1020 1050  969
     831  726  456  824  702 1120 1100  832  764  821
     768  845  864  862  698  845  744  796 1040  759
     781  865  845  944  984  897  822 1010  771  676
     649  846  812  742  801 1040  860  874  848  890
     744  749  838 1050  918  986  797  923  975  815
    1020  906  901 1170  912  746  919  718  714  740

## Stem and Leaf Plot Example

The decimal point is 2 digit(s) to the right of the |

     4 | 6
     5 |
     6 | 5899
     7 | 000123444455667778
     8 | 000011222233344555556667779
     9 | 0011222244466678899
    10 | 0122234455
    11 | 00012244566678
    12 | 112356
    13 | 7

## Stem and Leaf Plot

What do you notice????
The decimal point is 2 digit(s) to the right of the |

     4 | 6
     5 |
     6 | 5899
     7 | 000123444455667778
     8 | 000011222233344555556667779
     9 | 0011222244466678899
    10 | 0122234455
    11 | 00012244566678
    12 | 112356
    13 | 7

## Parts

1) Legend: "The decimal point is 2 digit(s) to the right of the |"
   a) This tells me that the numbers are 4|6 = 460.
   b) If it had said "2 digit(s) to the LEFT of the |" then 4|6 = 0.046.
2) The stem is the part to the left of the "|".
3) The leaves are the parts to the right of the "|".
4) Each leaf represents a data value. Hence we have 6 data values starting with 12.

## Example

Measurements of vein diameters were taken on 100 patients. The following stem and leaf plot was obtained.

The decimal point is 2 digit(s) to the left of the |

    32 | 78
    33 | 224
    33 | 5577777899
    34 | 0000011111233333444
    34 | 5566666678888888999
    35 | 0001111111122223344
    35 | 5555677788889999
    36 | 0112244
    36 | 56678

Based on the legend, 32|1 means:

A) 321 B) 32.1 C) 3201 D) 3.21 E) None of the above

What do you notice that is interesting about the stems??? Why was this done??

## Example

Problem: Does the stress of machinery affect the ability of a soya plant to grow? Further, does the amount of light influence its ability to grow?

Plan: 52 seeds were potted with one seed per pot. The 52 seeds were randomly divided into 4 samples with 13 seeds per sample. The seeds in 2 samples were stressed by being shaken for 20 minutes daily, while the seeds in the other two were not shaken (no stress). The two samples that received the same exposure to stress were grown under different levels of light.
Thus the four samples of plants were allocated to one of 4 treatments that were defined by 2 basic treatments, stress and light.

Data:

| ln | ly | mn | my |
|---|---|---|---|
| 264 | 235 | 314 | 283 |
| 200 | 188 | 320 | 312 |
| 225 | 195 | 310 | 291 |
| 268 | 205 | 340 | 259 |
| 215 | 212 | 299 | 216 |
| 241 | 214 | 268 | 201 |
| 232 | 182 | 345 | 267 |
| 256 | 215 | 271 | 326 |
| 229 | 272 | 285 | 241 |
| 288 | 163 | 309 | 291 |
| 253 | 230 | 337 | 269 |
| 288 | 255 | 282 | 282 |
| 230 | 202 | 273 | 257 |

Analysis: Under which conditions would you want to grow your soybeans?

A) Moderate light, stress
B) Low light, stress
C) Moderate light, no stress
D) Low light, no stress

## Example 2 - View Article

From: Medical Article, http://www.amstat.org/publications/jse/v11n2/datasets.heinz.html

Problem: To investigate the human body.

Plan: Measure the items shown at left on males and females.

Data: Measurements of 247 men & 260 women.

Analysis: See article on last slide.

Is it possible for the biacromial measurement of a particular female to exceed that of a particular male?

A) Yes B) No C) zzzzzzz

## Probability

We can define probability in 3 ways:

- Subjective
- Relative frequency
- Mathematical / classical

## Subjective

Based on intuition, we guess what the probability is. E.g. There's a 99% chance I'll pass!

Adv:

Disad:

## Relative frequency

The probability of something happening is the number of times it occurs divided by the # of attempts.

e.g. Coins: Pretend everyone in class is using the same coin. Flip it. What did you get??

A) Heads B) Tails

Question: Will you write the quizzes more than once even if you got 100% on the first try?

A) Yes B) No

Adv:

Disad:

## Classical

Experiment: A theoretically repeatable process or phenomenon. e.g.

Trial: One repetition of an experiment. e.g.

Outcome: The result of our experiment. Also called a "simple" event. We use capital letters to denote outcomes, e.g. A.

Compound Event: An event A that is made up of more than one "simple event". e.g.

Universe or Sample Space: The collection of all outcomes of an experiment. We denote it by "S". e.g.
## Review

An outcome might be A = roll a one.

An event might be "get an even #": B = {2, 4, 6}.

The size of an event/sample space is the # of objects/simple events in it. We denote the size by |B|, e.g. B = {2, 4, 6}, |B| = 3.

## Probability

Let E be an event containing |E| simple outcomes. Let S be the sample space with |S| simple outcomes. Then the probability E occurs is

Pr(E) = |E|/|S|

Example 1: What is the probability of getting a head on a coin?

e.g. A biologist classifies a colony of wild baboons by fur colour. E = having light-coloured fur. Of 150 animals observed, 5 are light-coloured.

P(light-coloured fur) = 5/150

Example: In a genetic experiment brown rabbits are crossed with black rabbits. As a result, of the 44 progeny, 13 are brown and 5 are black. The remainder are mottled (various colours). What is the probability you select a mottled rabbit?

## Properties of Probabilities

1) $0 \le P(E) \le 1$
2) $P(E) = 0 \equiv$ E never happens
3) $P(E) = 1 \equiv$ E always happens
4) If $E_1, E_2, \ldots, E_m$ represent all possible mutually exclusive simple events, then $P(E_1) + P(E_2) + \cdots + P(E_m) = 1$

## Leading Questions

What if…

- we want to know the probability we select either a brown OR mottled rabbit?
- we want to know the probability that in 2 tries we select a brown AND a mottled rabbit?

## Symbol 1: "OR"

Notationally we write:

In words we mean:

## Symbol 2: "AND"

Notationally we write:

In words we mean:

## Symbol 3: "Not"

Notationally we write:

In words we mean:

## Venn Diagrams

A Venn diagram is a pictorial representation of our probability.

The box is the sample space. e.g.

A circle within the box denotes a probability for an event. e.g.

## Mutually Exclusive

Two events are mutually exclusive (ME) if they have no outcomes in common or cannot occur together.

e.g. ME Events:

e.g. Not ME Events:

## Clicker ME

Is the event "Person wears glasses" mutually exclusive from the event "Person has freckles"?
A) Yes B) No C) Uncertain

## Mutual Exclusion

Are the events A = roll a one on die 1 and B = roll a one on die 2 mutually exclusive (ME)?

A) Yes B) No

## Venn Diagram

In the following Venn diagram, the square represents the… (best answer)

A) Event B) Simple Event C) An Outcome D) Sample Space

## ME and Venn Diagrams

If two events are ME or disjoint, the circles are also disjoint. e.g.

Hence Pr(A or B) = Pr(A) + Pr(B).

Or in terms of our notation:

If two events are not ME, they overlap. e.g.

Hence Pr(A or B) = Pr(A) + Pr(B) − Pr(AB).

Proof by Picture: Pr(A or B) = Pr(A) + Pr(B) − Pr(AB)

## Example

Problem: To investigate seal pup fur colour.

Plan: Pups categorized by coat colour and sex.

Data:

| Colour | Male | Female | Total |
|---|---|---|---|
| Yellow | 25 | 10 | 35 |
| Thin White | 10 | 5 | 15 |
| Fat White | 25 | 5 | 30 |
| Grey | 15 | 5 | 20 |
| Total | 75 | 25 | N = 100 |

Notation: Let G denote Grey. Let Y denote Yellow. Let M denote Male. Let W denote White. Let T denote Thin.

The questions below all refer to this table (Y = Yellow, TW = Thin White, FW = Fat White, G = Grey; M = Male, F = Female).

## Question 0

What is the probability a pup is not thin and white?

## Question 1

What is the probability a coat is Yellow?

A) 25/100 B) 10/100 C) 35/100 D) 25/75

## Question 2

What is the probability a coat is Yellow or Grey?

A) 25/100 B) 40/100 C) 55/100 D) 40/75 E) None of the above

## Question 3

What is the probability a randomly selected pup is yellow and male?

A) 85/100 B) 75/100 C) 35/100 D) 25/100

Details, details…
| Colour | M | F | Total |
|---|---|---|---|
| Y | 25 | 10 | 35 |
| TW | 10 | 5 | 15 |
| FW | 25 | 5 | 30 |
| G | 15 | 5 | 20 |
| Total | 75 | 25 | N = 100 |

## Question 4

Are the events yellow and male ME?

A) Yes B) No C) Can't say

## Question 4 - Start

What about Yellow OR male?? What is the probability a randomly selected pup is yellow OR male?

Details, details…

## Independent Events

CAREFUL! Independence is a statistical (read MATHEMATICAL) concept.

If two events A and B are independent, then Pr(AB) = Pr(A)Pr(B).

Also, and this is mathematically subtle, if Pr(AB) = Pr(A)Pr(B), then A and B are independent.

## Independence in WORDS

Two events are independent (not associated) if the chance that one event occurs is not affected by the knowledge of whether or not the other event occurred.

## Smoking

It is known that 5% of people get lung cancer, 10% of people smoke and 0.5% of people smoke and get lung cancer. Is smoking independent of lung cancer?

A) Yes B) No C) Too little information to say.

Details, details…

## Example

Problem: To investigate the demographic of those people who watch "House".

Plan: Divide by gender and age.

Data:

| Age | Male | Female | Total |
|---|---|---|---|
| Youth (<=16) | 25 | 10 | 35 |
| Gen X (17 to 35) | 10 | 5 | 15 |
| Middle (36 to 64) | 25 | 5 | 30 |
| Senior (>=65) | 15 | 5 | 20 |
| Total | 75 | 25 | N = 100 |

Notation: Let G denote Gen X. Let Y denote Youth. Let M denote Middle. Let T denote Senior.

## Question

Is the event Youth independent of the event Male?

A) Yes B) No C) Not enough information

Details, details…
| Age | M | F | Total |
|---|---|---|---|
| Y | 25 | 10 | 35 |
| G | 10 | 5 | 15 |
| M | 25 | 5 | 30 |
| T | 15 | 5 | 20 |
| Total | 75 | 25 | N = 100 |

## With and Without Replacement

With Replacement: After selecting an item we replace it. E.g. dice, coins, selecting a card and replacing it in the deck.

Without Replacement: After selecting an item we do NOT replace it. E.g. dealing cards.

## Example

What is the probability of selecting 2 male viewers in a row at random and with replacement? (Use the table above.)

A) 56.25% B) 75% C) 150% D) None of the above

Details, details…

## Example

What is the probability of selecting 2 male viewers in a row at random and without replacement?

A) 55.9% B) 56.06% C) 56.25% D) None of the above

Details, details…

## Independence Vs. ME

Independence: Pr(AB) = Pr(A)Pr(B)

ME: Pr(AB) = 0

Questions: Can you think of 2 events which are…

- Independent but are not ME?
- Independent but are ME?
- Not independent but are not ME?
- Not independent but are ME?

## Conditional Probability

Often we ask questions like:

- What is the probability I pass if I study?
- What is the probability I win a race if I stretch first?
- What is the probability I get a job if I don't wear ripped jeans?

In all cases we are asking for a probability under the assumption that something else has already occurred.

In Words:

Mathematically/Formulaically:

## Using dice...

What is the probability of rolling a 2 or 3 given the number rolled is odd?

Let A be the event odd. Let B be the event of a 2 or 3.

Numerical:

Venn Diagram:

## Conditional Probability

Pictorially with a Venn Diagram:

## Example

e.g. The probability Joe studies for an exam is 75%. The probability Joe passes and studies is 55%. Find the probability Joe passes given he studies.
Step 1: Diamond Mining…

Step 2: Putting it together…

## Example

e.g. The probability that a company goes bankrupt, given that its stock has decreased in value by at least 17% this year, is 26%. The probability that the stock decreased in value by at least 17% is 62%. What is the probability that a company both goes bankrupt and has at least a 17% decrease in stock?

## Conditional Probability Tree Diagram Example

## Tree Diagrams

What? A display of branches and nodes. At each node we make choices (branches).

Use when? In conditional probability problems.

How? We label each branch as Pr(A|…), where "…" are the events that have occurred before A.

## Method

1. Read the question and find the two events.
2. Determine which event is first.
3. Determine the probabilities, Pr(A|…).
4. Label a tree with the probabilities.
5. Find Pr(AB) for each branch.

Recall the conditional probability formula:

## Example Tree Diagram

## Tree Diagram Notes

1. Adding the branches "Pr(AB)" gives a total of 100%.
2. Along a branch we say the word AND: Study AND Pass | Study.
3. Between branches we say the word OR: (Study AND Pass | Study) OR (Not Study AND Pass | Not Study).

## HIV Example

A certain test of HIV is correct 95% of the time if a person has HIV and 98% of the time if the person does not have HIV. 8% of people tested are thought to be HIV positive.
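The five-step tree method can be sketched in Python for the HIV example just stated. This is a hedged illustration, not the lecture's worked solution: the variable names are invented here, H stands for "has HIV" and P for "tests positive", and the last line uses the standard definition Pr(A|B) = Pr(AB)/Pr(B), which the slides refer to as the conditional probability formula without printing it.

```python
# Step 3: probabilities read from the question.
p_h = 0.08                 # Pr(H): person has HIV
p_pos_given_h = 0.95       # Pr(P | H): test correct when HIV is present
p_neg_given_not_h = 0.98   # Pr(not P | not H): test correct when HIV is absent

# Steps 4 and 5: multiply along each branch to get Pr(AB).
branches = {
    ("H", "P"): p_h * p_pos_given_h,
    ("H", "not P"): p_h * (1 - p_pos_given_h),
    ("not H", "P"): (1 - p_h) * (1 - p_neg_given_not_h),
    ("not H", "not P"): (1 - p_h) * p_neg_given_not_h,
}

# Tree Diagram Note 1: the four branch probabilities add to 100%.
total = sum(branches.values())

# "OR" between branches: testing positive happens on two branches.
p_pos = branches[("H", "P")] + branches[("not H", "P")]

# Conditional probability (assumed standard formula): Pr(not H | P).
p_not_h_given_pos = branches[("not H", "P")] / p_pos

for events, prob in branches.items():
    print(" AND ".join(events), f"{prob:.2%}")
print(f"Pr(P) = {p_pos:.2%}")
print(f"Pr(not H | P) = {p_not_h_given_pos:.2%}")
```

Keeping the four joint probabilities in one dictionary mirrors the tree: each key is a path from the root to a leaf, and any later question is answered by adding or dividing leaves.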
## Events, Outcomes and Order

A certain test of HIV is correct 95% of the time if a person has HIV and 98% of the time if the person does not have HIV. 8% of people tested are thought to be HIV positive.

What is the order of the events?

A) Have HIV (H) then Test Positive (P)
B) Test Positive (P) and then Have HIV (H)
C) None of the above

## Using Numbers

- Pr(H) = 8%
  - Pr(P|H) = 95%
  - Pr(not P|H) = 5%
- Pr(not H) = 92%
  - Pr(P|not H) = 2%
  - Pr(not P|not H) = 98%

## HIV Example

Recall the method:
1. Read the question and find the two events.
2. Determine which event is first.
3. Determine the probabilities, Pr(A|…).
4. Label a tree with the probabilities.
5. Find Pr(AB) for each branch.

## Using Numbers

- Pr(H) = 8%
  - Pr(P|H) = 95%, so Pr(P and H) = 7.6%
  - Pr(not P|H) = 5%, so Pr(not P and H) = 0.4%
- Pr(not H) = 92%
  - Pr(P|not H) = 2%, so Pr(P and not H) = 1.84%
  - Pr(not P|not H) = 98%, so Pr(not P and not H) = 90.16%

## Answering the Questions…

0. What is the probability of having HIV and testing positive for HIV?

A) 7.6% B) 8% C) 95% D) 90.16%

Solution:

1. What is the probability of testing positive for HIV?

A) 7.6% B) 9.4% C) 95% D) 90.16%

Solution:

2. What is the probability a randomly selected person does not have HIV given they tested positive?

Solution:

## Question 3: Aviation

In any crash landing it is known that a black box will survive (be found, still work) 85% of the time. After a flight there is a 0.5% probability that a black box will not work. The black box is tested after each flight. Airplane crashes occur on 1% of flights. What is the probability that a black box will survive a flight?

## Conditional Probability and Independence

Recall: Two events are independent (not associated) if the chance that one event occurs is not affected by the knowledge of whether or not the other event occurred.

Proof: Let A and B be independent events. Then Pr(A|B) =

## Example

What is the probability of flipping a head on the next flip of a fair coin given we have already flipped 100 heads in a row?

A) 0 B) Very very small C) 0.5 D) 120% E) None of the above

## Sampling without Replacement

There are 12 people in my class, 7 males and 5 females. I select two people at random and without replacement.
What is the probability that my first selected person is male if it is known I selected one male and one female?

Method:

1. Read the question and find the two events.
2. Determine which event is first.
3. Determine the probabilities, Pr(A|…).
4. Label a tree with the probabilities.
5. Find Pr(AB) for each branch.

## Monty Hall Problem

Monty Hall was the game show host of "Let's Make a Deal". The game worked as follows:

1. There are three doors: A, B, C.
2. You pick 1 door…
3. Monty Hall shows you a dud…
4. And asks if you would like to switch your choice from A to B…?

Do you select door A (no switch) or B (switch)?

A) No switch from A.
B) Switch to door B.

The question… does switching change your chances of winning?

A) If you switch, your chance of winning is greater.
B) If you do not switch, your chance of winning is greater.
C) It doesn't matter if you switch; your chance of winning is 0.5.

## Experiment

In groups of 2 (a host and a contestant), draw three doors. The host will pick a winning door but NOT tell the contestant.

## Experiment Part 2

The contestant will pick a door. The host will then tell the contestant which of the other two doors is a dud.

## Experiment Part 3

Finally, the host will let the contestant either stick with their door or switch.
After the contestant has made their choice, the host will tell them where the right answer is.

## Switch Data

Clicker responses:

- You switched and WON
- You switched and LOST
- You did not switch and WON
- You did not switch and LOST

## The Events

Define the events:

Which is first?

A) Selected Door B) Switch

## Tree Diagram

## Solution

Do you have a higher chance of winning if you switch???

## Random Variables

Definition: A random variable is a variable (think x, …) that depends on the outcomes of a chance operation.

Concept: It turns outcomes into numbers.

e.g. Coin Flipping

## Random Variable Notation

Note: X, capitalized, is a random variable (r.v.).

x vs X:

e.g. Coin Flipping

## f(x)

f(x) = Pr(X=x) is the probability that X becomes x. We call this a probability function. It has a value for every value of x.

Since f(x) is a probability, it has all the same properties as a probability.

Example 1: f(x) =

Example 2: We often build a table of x and f(x) values, called a distribution.

e.g. Coin Flipping

## Probability Properties

1.
2.

## Histogram

A diagram for a distribution. We draw a bar for every value x with height f(x). The area/height of the bar represents the probability of x occurring.

## Example

For the distribution below, find c.

| x | -1 | 0 | 2 |
|---|---|---|---|
| f(x) | 0.3 | c | 0.6 |

## Clicker Example

How many siblings do you have? A = 0, B = 1, C = 2, D = 3, E >= 4

Notice how the clicker builds for us a (relative) frequency diagram, hereafter called a histogram.

## Example 1

In a particular stock portfolio consisting of 100 companies, 33 are considered to have no risk, 21 are considered to be conservative, 42 are moderately risky and the remainder are risky. Build the distribution and histogram for the above data.

## Wording

Consider the numbers 1, 2, 3, 4, 5, 6.
If we say that our answer is at most 4, then our answer can
A) Include 4
B) Not include 4

Wording
Consider the numbers 1, 2, 3, 4, 5, 6. If we say that our answer is at least 4, then our answer can
A) Include 4
B) Not include 4

Wording
Consider the numbers 1, 2, 3, 4, 5, 6. If we say that our answer is less than 4, then our answer can
A) Include 4
B) Not include 4

Example 2
Let X be the number of rabbit progeny from one union. The probability distribution for X is below.

X      0     1     2     3     4     5
f(x)   0.25  0.3   0.05  0.22  0.17  0.01

Example 2 Continued
What is the probability that a rabbit union results in less than 2 progeny?
Clicker: A) 60% B) 55% C) 30% D) 25%

Example 2 Continued
What is the probability that a rabbit union results in at most 2 progeny?
Clicker: A) 60% B) 55% C) 30% D) 25%

Cumulative Distribution Function
The probability distribution function is f(x) = Pr(X=x). The cumulative distribution function is F(x) = Pr(X<=x).

Properties
1. f(x) = F(x) – F(x-1)
2. Pr(X>x) = 1 – Pr(X<=x)

Example 1
A polar bear gives birth to 0, 1 or 2 live cubs with probability 0.1, 0.2 and 0.7.
A) Draw the probability distribution function.
B) State and draw the cumulative distribution function.

Recall
The PDF (or PF) is denoted by f(x) = Pr(X=x). The CDF, denoted by F(x) = Pr(X<=x), has the following useful relationships:
i. Pr(X>x) = 1 – Pr(X<=x) = 1 – F(x)
ii. f(x) = F(x) – F(x-1)

Example 2
The number of celery seeds that germinate in a packet of 5 seeds has the following cdf:

X      0    1    2    3    4    5
F(x)   0.1  0.2  0.3  0.5  0.8  1

A) What is the probability that less than 2 seeds germinate?
B) What is the probability that exactly 2 seeds germinate?

Histogram Analyses
When we look at a histogram, what things do you think we are interested in??
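Returning to the celery-seed cdf from Example 2, the two recalled relationships can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `F` and the helper `pmf_from_cdf` are names invented here, not part of the course notes.

```python
# Cumulative distribution F(x) = Pr(X <= x) from the celery-seed example.
F = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.5, 4: 0.8, 5: 1.0}


def pmf_from_cdf(cdf, x):
    """Return f(x) = F(x) - F(x-1); F is taken to be 0 below the smallest x."""
    return cdf[x] - cdf.get(x - 1, 0.0)


# A) "less than 2" excludes 2, so Pr(X < 2) = Pr(X <= 1) = F(1).
less_than_2 = F[1]

# B) "exactly 2" is one step of the cdf: f(2) = F(2) - F(1).
exactly_2 = pmf_from_cdf(F, 2)

# Complement rule: Pr(X > 2) = 1 - F(2).
more_than_2 = 1 - F[2]

print(round(less_than_2, 2), round(exactly_2, 2), round(more_than_2, 2))
```

The wording matters more than the arithmetic here: "less than 2" stops at F(1), whereas "at most 2" would use F(2).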
Expectation and Variance

Notation
n –
N –
Equiprobable –

Populations vs Samples
Pictorially:
Parameters
Statistics

Populations vs Samples
Example:

Expectation
Formula:

Expectation
Concept:
e.g. I roll a die 3 times and get 3, 2, 4.
a) What is the sample mean?
b) What is the expected value?

Variance
Formula:

Variance
Concept:
e.g. I roll a die 3 times and get 3, 2, 4.
a) What is the sample variance?
b) What is the long run variance?

Question
In a game of chance the outcomes are -1, 1, 3 with probabilities 0.1, 0.3 and 0.6. Which outcome is most probable?
A) -1 B) 1 C) 2 D) 3

σ vs s
If σ is the population standard deviation and s is the sample standard deviation then…
A) σ=s B) σ>s C) σ<s D) σ=s

Properties of Variance and Mean
I want to show you how a change to your data (e.g. multiplying the values by 2.5) affects our statistics.

A Silly Math Proof
Let X represent your ‘data’….

Properties of Expectation
Property 1: E(c + X) = c + E(X)
Proof by example:

X   c   X+c
1   2   3
2   2   4
3   2   5

You told your apprentice to measure something in cm but your apprentice is always off by 2 cm! How do you fix your mean???

Properties of Expectation
Property 2: E(c) = c
Proof by example:

Properties of Expectation
Property 3: E(cX) = cE(X)
Proof by example:

X   c   Xc
1   2   2
2   2   4
3   2   6

You told your apprentice to measure something in cm but your apprentice is always off by a multiple of 2! How do you fix your mean???

Basil, join the Dark Side!!!

Properties of Variance
1) Var(c) = 0
2) Var(cX) = c²Var(X) ← proof using st. dev.
3) Var(c + X) = Var(X) ← proof by argument picture

e.g. The number of cubs born to polar bear mothers in a given year is denoted by r.v. X.

x         0    1    2
Pr(X=x)   0.3  0.1  c

a) Find c.
b) Find µ.
c) Find σ².
d) Find the st. dev.
e) Find the probability a polar bear has less than 2 cubs.
f) Find the probability a polar bear has more than 2 cubs.

Example
The temperature (in Celsius) in Ontario in August was:

Temp        30    32     33
Frequency   5/30  15/30  10/30

Let X denote the temperature. Find E(X) and Var(X).

Example
E(X) =

In Fahrenheit
Let Y be the temperature in Fahrenheit. If the relationship between Celsius (X) and Fahrenheit (Y) is Y = 9X/5 + 32, find E(Y) and Var(Y).

Example
Var(X) =

A New Tool: Factorial Notation
n! = n factorial = n(n-1)(n-2)…(2)(1)
e.g. 3! =
A) 3 B) 6 C) 5 D) 12

Arrangements Interpretation

Factorial, Special Case
We define n to be an integer greater than or equal to zero.
What is 0!=?
What is 1!=?

A New Tool: The Choose Function
Suppose I want to select 2 objects from 3. For example, I want to select two letters from the word BIO. Order does not matter. Hence if I select BI or IB, I do not count this twice. In how many ways can I do this (order does not matter)?
A) 1 B) 2 C) 3 D) 4

Choose Function Notation
The notation is:
The formula is:

Choose
e.g. $\binom{5}{3} = \frac{5!}{3!\,(5-3)!} = \frac{(5)(4)(3)(2)(1)}{[(3)(2)(1)]\,[(2)(1)]} = 10$

Choose Function – Special Cases
N choose 1
N choose 0
N choose N

Binomial Distribution Example
In how many ways can I toss a coin 3 times and get 2 heads?
A) 2 B) 3 C) 4 D) 5 E) None of the above.

What is the probability of HHT?
A) (0.5)²(0.5) B) (0.5)(0.5)² C) 1/8 D) 12.5% E) All of the above.

Putting the last two slides together…?
What is the probability of getting 2 heads and 1 tail in 3 flips?

In general….?????

BINOMIAL
The above is an example of a binomial probability function. An experiment where:
T
I
M
S

Formula and Notation

Expectation and Variance in Formula

Expectation by Example
If I flip a fair coin 10 times, how many heads do you expect to get?
A) 3 to 8 B) 5 C) Depends on information not given D) 0.5

Review BINOMIAL
Formula:
$\Pr(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$

Concept: A binomial probability function occurs when we have an experiment that follows:
T Two outcomes, e.g. pass a course, fail a course; e.g. heads, tails; e.g. 0 or 1
I Independent trials, e.g. there is no chance trial 1 will affect trial 2
M Multiple trials, e.g. we flip a coin more than once
S Same probability of success

Notation:
n - number of trials
p - probability of a “success”
x - # of successes you see

Expectation & Variance
$E(X) = \sum x \Pr(X = x) = \sum x \binom{n}{x} p^{x} (1 - p)^{n - x} = np$
$V(X) = \sum (x - \mu)^{2} \Pr(X = x) = \sigma^{2}$

e.g. Mendel Genetics
It is known that when you cross a black rabbit with a brown, 10% of the progeny are mottled. Find the probability that in a litter of 5 rabbits,
a) 3 are mottled
b) At most 1 is mottled

Experiment
You don’t need to write this down…

The Experiment
You will write a multiple choice test of 4 Bio-Chem questions. Let’s see how well you do…(there are no marks for being right)… Keep track of your answers.

Class Section 1

Question 1
My middle name is?
A) Randolf B) Anthony C) David D) Pierce E) Adam

Question 2
One of my sons’ middle names is:
A) Xiao B) Cinder C) Tae D) Felix E) Mike

Question 3
My daughter’s age is:
A) 3 B) 5 C) 7 D) 9 E) You don’t have a daughter.

Question 4
My original degree was in:
A) Pure Math B) Applied Math C) Combinatorics and Optimization D) Actuarial Science E) Operations Research

Answers
1.
2.
3.
4.
Be honest, mark yourself out of 4.

Number Correct
A) 0 B) 1 C) 2 D) 3 E) 4

Theoretically????

Poisson Distribution

POISSON
2 Formulas:
Notation:

Expectation and Variance in Formula

A note on Mu or Lambda

Poisson Example 1
Sunspots appear according to a Poisson process, on average 5 times a year. In thirteen years, how many sunspots would we expect?
A) 5 B) 13 x 5 C) 13/5 D) 13 E) None of the above

Poisson Example 2
I have a 4 year old.
Stickers randomly appear on the walls of my house at a rate of 4 per square meter. In 2 square meters, how many should I expect? A) 2 B) 4 C) 8 D) 16 E) None of the above Poisson Concept I I H e.g. An employee checks his email 3 times per 5 minutes. a) In five minutes, what is the probability the employee checked his email 7 times. 2% e.g. An employee checks his email 3 times per 5 minutes. b) In ten minutes, what is the probability the employee checked his email 7 times. 14% e.g. An employee checks his email 3 times per 5 minutes. c) In two minutes, what is the probability they check their email less than 2 times? 66% Poisson Vs. Binomial 1. 2. A= Poisson, B=Binomial, C=Other Fish travel upstream is schools of, on average, 24 fish. Schools of fish appear, on average, every 4 minutes. What is the probability that 3 schools of fish appear in 12 minutes? A= Poisson, B=Binomial, C=Other In a particular production process 12 widgets are placed in a shipping box. To determine if a box contains all 12 widgets, the weight is obtained. The probability that a widget is missing is independently 0.22. What is the probability that 2 are missing from a box? A= Poisson, B=Binomial, C=Other Cars on the highway arrive at KW according to a poisson process. On average 50 cars arrive per hour. The probability that an hour is “heavy” in traffic is 0.1. If hours are disjoint, what is the probability that in 12 hours, 6 are “heavy” in traffic? e.g. The number of deer found in a 1 acre plot appear homogenously, independently and individually. Usually I see two a day. a) Find the probability I see 4 in the day? 4% e.g. The number of deer found in a 1 acre plot appear homogenously, independently and individually. Usually I see two a day. b) Find the probability I see 3 in the day? A) 18% B) 22% C) 36% D) 54% E) None of the above e.g. The number of deer found in a 1 acre plot appear homogenously, independently and individually. Usually I see two a day. 
c) Find the probability I see less than 4. 86%. e.g. The number of deer found in a 1 acre plot appear homogenously, independently and individually. Usually I see two a day. d) Find the probability I see more than 4. A) 9% B) 86% C) 14% D) 91% e) If seeing 4 deer in a day is called a “quad”, find the probability I see 3 quad in 10 days. I can see at most one quad in a day. (4%) Example Co-op is always investigating student hiring. On average 10 students are hired per day. Students are hired independently. A) In a work week, find the probability that 60 students are hired. Example Co-op is always investigating student hiring. On average 10 students are hired per day. Students are hired independently. B) What is the probability that a dozen students are hired in 1 day? Example Co-op is always investigating student hiring. On average 10 students are hired per day. Students are hired independently. C) What is the probability that, in one work week, there are 3 days of exactly a dozen students employed? Continuous Random Variables Cts Case: Consider the continuous random variable X then 1) 2) The area beneath the curve f(x) is 1. f(x) = 0 f(x) is called the probability density function. f(x) is not a probability. Properties of Discrete Probability Function Properties of Continuous Probability Function A Note on Equality and Probability The Discrete Cumulative Distribution Function F(x) The Continuous Cumulative Distribution Function F(x) BIGGEST HINT DRAW PICTURES Pr(X<a) Pr(X>a) Pr(a<X<b) Pr(X<=a) vs. Pr(X<a) A) Pr(X<=a) is larger B) Pr(X<a) is larger C) Pr(X<=a)=Pr(X<a) D) none of the above Example e.g. Consider the density f(x)=3x2 for x from 0 to 1. A) Sketch f(x). Note: F(x) = x3. Example Continued e.g. Consider the density f(x)=3x2 for x from 0 to 1. B) What is the probability that x is less than 0.25? Example Continued e.g. Consider the density f(x)=3x2 for x from 0 to 1. C) What is the probability that x is equal to 0.25? Example Continued e.g. 
Consider the density f(x)=3x² for x from 0 to 1. D) What is the probability that x is greater than 0.75?

Normal Curves
At the start of the course we could divide the data into
1) Discrete
2) Continuous
Random variables can also be divided into these 2 groups.

Normal Curves
• Picture:
• Formula: Pdf: Cdf: Because the formula is ugly we tend to use a table to calculate our probabilities.
• Mean & Variance
• Used where?
Alternatively we may write this as: X is Gaussian with mean µ and st. dev. σ. The concept is exactly the same.

The Abnormal Curve????

The Standard Normal
• The standard normal is N(0,1). It has a mean of 0 and variance of 1.

Normal Distribution TRICKS
• The normal distribution is SYMMETRIC! Because of this..... Pr(Z<-a) =

Why do we need a table???

Reading the Table
1. Let z = a.bc, e.g. 1.23 (i.e. a is the first digit, and bc are the next two after the decimal).
2. Look up, in column 1, a.b
3. Look up, in row 1, 0.0c
4. The intersection is the Pr(Z<a.bc)

Tables Used in Course

Example
Find the Pr(Z<0.25)

Example
The probability that Z~N(0,1) is less than 0.63.
A) 0.63 B) 0.7357 C) 0.2643 D) 0.37

Example
The probability that Z~N(0,1) is less than or equal to 0.63.
A) 0.63 B) 0.7357 C) 0.2643 D) 0.37

Example
The probability that Z~N(0,1) is more than 0.63.
A) 0.63 B) 0.7357 C) 0.2643 D) 0.37

Example
The probability that Z~N(0,1) is more than -0.63.
A) 0.63 B) 0.7357 C) 0.2643 D) 0.37

Example
The probability Z~N(0,1) is less than 0 is:
A) 0 B) 1 C) 0.5 D) Can't Determine

Example
Find Pr(0.23<Z<0.46).

Example
Let Z~N(0,1). The probability that Z is less than 0.5?
A) 0.5199 B) 0.6915 C) 0.7088 D) 0

Example
Let Z~N(0,1). The probability that Z is less than -1.46?
A) Cannot calculate B) 0.0721 C) 0.9279 D) 0

Example
The probability that Z is between -1.46 and 0.5 is:
A) 0 B) 0.6194 C) 0.6915 D) 0.9279
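The table look-ups in the examples above can be cross-checked numerically. The sketch below is a minimal illustration, assuming the standard normal cdf is computed from Python's `math.erf` rather than read from the course table; the function name `phi` is a label chosen here, not notation from the slides.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cdf Pr(Z < z), i.e. the value a z table lists."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


p_less = phi(0.63)                 # Pr(Z < 0.63): read directly
p_more = 1.0 - phi(0.63)           # Pr(Z > 0.63): "more than" uses 1 - ...
p_neg = 1.0 - phi(1.46)            # Pr(Z < -1.46): symmetry, Pr(Z < -a) = 1 - Pr(Z < a)
p_between = phi(0.5) - phi(-1.46)  # Pr(-1.46 < Z < 0.5): difference of two cdf values

for label, p in [("Pr(Z < 0.63)", p_less),
                 ("Pr(Z > 0.63)", p_more),
                 ("Pr(Z < -1.46)", p_neg),
                 ("Pr(-1.46 < Z < 0.5)", p_between)]:
    print(f"{label}: {p:.4f}")
```

Because a printed table rounds each entry to four decimals, values built from two look-ups (such as the "between" probability) may differ from the computed value in the last digit.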
Standard Normal Curve 1 Standard Deviation from 0 Interpretation and Relationship to Non Standard Normal Standard Normal Curve 2 Standard Deviations from 0 Standard Normal Curve 3 Standard Deviations from 0 Standard Normal Curve Six Sigma and Range Normal Calculations Calculating normal probabilities involves converting the N(µ,σ2 ) το Ν(0 ,1 ). Z Score Z score transforms a r.v. X~ N(µ,σ2 ) το Ζ∼ Ν(0 ,1 ). The transformation is: Z= Proof Goal N(µ,σ2 ) Ζ ∼ Ν(0 ,1 ) N(µ,σ2 ) Ζ∼ Ν(0 ,1 ) N(µ,σ2 ) Concept Consider the data: 1,2,2,3,3,3, 4,4,4,4,5,5, 5,5,5,6,6,6,6, 7,7,7,8,8,9 The sample mean is 5 and standard deviation is approximately 2. Graphically Transforming Transform x=1 by the z score function using mean 5 and standard deviation 2: Transforming 2 Transform x=6 by the z score function using mean 5 and standard deviation 2: CLICKER The answer is: A) -0.5 B) 0 C) 0.5 D) 1 E) None of the above Transformed Data In the rest of the cases: -2.0, -1.5, -1.5, -1.0, -1.0, -1.0, -0.5, -0.5, -0.5, -0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5, 0.5, 1.0, 1.0, 1.0, 1.5, 1.5, 2.0 The mean of the transformed data is 0 with standard deviation 1. Graphically In words…. Example (Storyless) Let X~N(60,4). What is the probability X is more than 58? Method 1. 2. 3. IF we are given Z~N(0,1) goto 3. IF we are given X~N(µ,σ2 ) then transform and go to 3. Calculate as before. Example The income of the average Canuck is normally distributed and is on average 30 thousand dollars a year with standard deviation 10 thousand. What is the probability a randomly selected Canadian makes more than 50 thousand dollars? St.Dev=10, Mean=30, X>50 Example The income of the average Canuck is normally distributed and is on average 30 thousand dollars a year with standard deviation 10 thousand. What is the probability a randomly selected Canadian makes between 20 and 40 thousand? St.Dev=10, Mean=30, 20<X<40 Review The process… 1. If X~N(µ,σ2 ). Transform using Z score. 2. 
When Z~N(0,1) make sure your probabilities are Pr(Z<a) where a is positive. 3. If a is negative use symmetry. 4. If ‘<‘ is ‘>’ use ‘1-’ 5. Look up your probabilities in the table. Reading the Normal Table Backwards Reading BACKWARDS Instead of the probability, I want the z score. 1. Let p be the probability of interest. 2. Look for it in the table 3. Look up the corresponding column, 0.0c and row a.b 4. The intersection is the Pr(Z<a.bc) Z score…In reverse…. X~N(µ,σ2 ) Z ~ N(0,1) N(µ,σ2 ) Ζ∼ Ν(0 ,1 ) N(µ,σ2 ) Example Find z, Pr(Z<z)=0.75 To Interpolate or Not… Example The Pr(Z<z)=0.7357. What is z? A) 0.63 B) 0.47 C) 0.53 D) 0.62 Example Find z, Pr(Z<z)=0.25 …Work… Example Find the x value (X~N(3,9)) when Pr(X<x)=0.6915. Example The income of the average Canuck is normally distributed and is on average 30 thousand dollars a year with standard deviation 10 thousand. What is the income such that 95% of Canadians make less than this amount? …Work… Example The income of the average Canuck is normally distributed and is on average 30 thousand dollars a year with standard deviation 10 thousand. What is the upper and lower bounds such that the middle 95% of Canadians make between these amounts? …Work… …Work… Central Limit Theorem (CLT) Let Xi be a random variable with mean µ and variance σ2 If we have n of these Xi’s and they are all independent then 1. The mean: 2. The sum: Class Example How many brothers and sisters do you have? A) 0 B) 1 C) 2 D) 3 E) 4 Class Example Continued With the people around you (our ‘random’ sample), take your answer(s) to the last question and average them. All of you can answer the following: Which number below is closest to your average A) 0 B) 0.5 C) 1 D) 1.5 E) 2 What did you see? Proof Proof Proof Central Idea: CLT It takes everything and makes it normal… Except The CLT In Words Logic Behind Reduced Variance: Consider the data 4, 2, 10 Our variance is relatively ‘large’ meaning we are far from the center. 
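The reduced-variance idea can be checked on the three data values 4, 2, 10. The sketch below is a rough illustration only: it assumes "the average of 2 values" means averaging every possible pair, and it uses the population variance from Python's `statistics` module; neither choice is stated in the notes.

```python
from itertools import combinations
from statistics import mean, pvariance

data = [4, 2, 10]

# Average every pair of values (an assumed reading of "the average of 2 values").
pair_means = [mean(pair) for pair in combinations(data, 2)]

# The centre is unchanged, but the averages sit closer to it than the raw values do.
print("data:      ", data, "mean =", mean(data), "variance =", pvariance(data))
print("pair means:", pair_means, "mean =", mean(pair_means),
      "variance =", pvariance(pair_means))
```

The pair averages keep the same mean as the original data while their variance is smaller, which is the behaviour the Central Limit Theorem relies on for sample means.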
Logic Behind Reduced Variance: The average of 2 values is: Variance has decreased – values are closer to the mean. Example 1 Students in this course have a mean age of 19 with a standard deviation of 4. Assume ages are normally distributed. A) What is the probability that a randomly selected student is younger than 20? Example 1 Students in this course have a mean age of 19 with a standard deviation of 4. Assume ages are normally distributed. B) What is the probability that the average age of a group of 9 randomly selected students is younger than 20? CLT What is the Central Limit Theorem??? HARD Clicker Test Question For sufficiently large n, we see that the mean of our data is normally distributed. What is the distribution of our original data?? A) Normal B) Poisson C) Binomial D) We are uncertain Review A) Set up the probability. B) Given 1. Mean – Standardize using: 2. A single value – Standardize using: 4. Total - C) At which point you will have a Z…so convert to a probability but beware: 1) The > symbol 2) Negative values 3) Probabilities less than 50% Example The midterm average was 75 with standard deviation 5. The grades are normally distributed. What is the probability that… A) A randomly selected person has a grade more than 80? B) The average of 25 randomly selected people’s grades is more than 80? Example X~N(75,52) A) A randomly selected person has a grade more than 80? Example X~N(75,52) B) The average of 25 randomly selected people’s grades is more than 80? Binomial Approximation Goal…to approximate the binomial with a normal distribution! RECALL! Standardization Recall: Let X ~ N(mu,sigma2) Then RECALL! Binomial Recall: Let X have a binomial distribution. Then E(X) = Var(X) = Binomial to Normal Thus we can approximate a binomial random variable by a normal random variable: X ~ Binomial (n, p) Is approximately: Counts vs Proportions Example I flip a coin 100 times (yep, I’m bored). What is the probability I get more than 55 heads? 
Method 1 – Using the binomial (Chuck Norris seconds!) could do this in 2 (13.56) Example I flip a coin 100 times (yep, I’m bored). What is the probability I get more than 55 heads? Method 2 – Using the normal approximation to the binomial… When do we approximate!? This approximation is: - Best when n is really large and p is either really small or really large. - Performed when, as in the last example, we are looking for what could be a lot of terms. PROBLEM!!! The normal is continuous whereas the binomial is discrete. There are issues with approximating a binomial with a normal. Consider: X~Bin(10,0.3) The exact probability that X is less than 2 is: The approximate probability that X is less than 2 is: Graphically – Binomial vs Normal 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 Graphically – Overlap 0 1 2 3 4 5 6 7 Normal Probability Questions Confusion arises regarding WHEN to apply a continuity correction. The answer is: We apply it when the underlying distribution is BINOMIAL (or more to the point, discrete). Examples In each of the cases below, is the underlying distribution binomial or not? If it is apply a continuity correction. Key Words (not black and white): Apply Continuity – Count, Proportion Do Not – Average, mean, total Question 1 It is known that, of 500 people in a class, 230 are male. What is the approximate probability that the number of males selected at random for a sample of size 50 is more than 25? A) Apply Continuity Correction B) Do not apply Continuity Correction Question 2 It is known that the height of males is on average 72 inches with a standard deviation of 3 inches. What is the probability that 5 randomly selected males are on average more than 75 inches? A) Apply Continuity Correction B) Do not apply Continuity Correction Question 3 Of 50000 applicants to American Idol, we believe 500 can sing. What is the approximate probability that in a sample of 10000 people over 10% can sing? 
A) Apply Continuity Correction B) Do not apply Continuity Correction Continuity Correction To ensure that the approximation is more accurate we make, what is called a continuity correction. This implies we add or subtract an amount from the value of X to get a better approximation for the probability. E.g. Next slide using X~Bin(10,0.3) Examples In the examples below, normal areas will be ___________ and binomial areas will be HIGHLIGHTED. Graphically – Pr(X<2) 0 1 2 3 4 5 6 7 Graphically – Pr(X<=2) 0 1 2 3 4 5 6 7 Graphically – Pr(X=2) 0 1 2 3 4 5 6 7 Pr(a<X<b) If X is binomial then the continuity correction for X should be… A) Pr(a+0.5<X<b-0.5) B) Pr(a+0.5<X<b+0.5) C) Pr(a-0.5<X<b-0.5) D) Pr(a-0.5<X<b+0.5) E) None of the above. Pr(a<X<=b) If X is binomial then the continuity correction for X should be… A) Pr(a+0.5<X<b-0.5) B) Pr(a+0.5<X<b+0.5) C) Pr(a-0.5<X<b-0.5) D) Pr(a-0.5<X<b+0.5) E) None of the above. Pr(a<=X<=b) If X is binomial then the continuity correction for X should be… A) Pr(a+0.5<X<b-0.5) B) Pr(a+0.5<X<b+0.5) C) Pr(a-0.5<X<b-0.5) D) Pr(a-0.5<X<b+0.5) E) None of the above. Binomial Approximations with Counts Definition: Example I flip a coin 100 times (yep, I’m bored). What is the probability I get more than 55 heads? Approximate using a continuity correction… recall… Exact = 13.56 Approximate = 15.87 (no continuity correction) Continuity Correction Example Let X~Bin(10,0.3). Find Pr(X<2). From last class we saw: 1. The exact probability is 13.56% 2. The approximate probability without a continuity correction is 15.87% 3. The approximate probability WITH a continuity correction is __________ Binomial Approximations with Proportions Definition: Example (Without Continuity Correction) The Leafs have won 1 game in 9 attempts. What is the approximate probability they will win more than 50% of the of the next 20 games they play? Example (WITH Continuity Correction) The Leafs have won 1 game in 9 attempts. 
What is the approximate probability they will win more than 50% of the of the next 20 games they play? Example (As a COUNT!) The Leafs have won 1 game in 9 attempts. What is the approximate probability they will win more than 50% of the of the next 20 games they play? WHERE NEEDED WE ALWAYS ALWAYS ALWAYS USE A CONTINUITY CORRECTION!!!!! Populations Target Population: Populations Study Population: …and Parameters Parameter: Samples Sample: … and Statistics Statistics Estimates Example A study of smoking and lung cancer involved giving cigarets to mice and watching their health over time. Question What is the target population? A) Mice B) People C) Unknown What is the sample? D) Mice E) People F) Unknown Example A study from 1957 to 1964 in Albany (NY) and Montreal involving brainwashing and LSD. The main researcher, Dr. Ewen Cameron was hired by the CIA to determine whether or not LSD could be used to brainwash people. To test these claims he used psych patients (non-volunteers) with minor issues. Question What is the target population? A) People B) Psych Patients C) Unknown What is the study population? D) People E) Psych Patients F) Psych Patients from 1957-1966 G) Unknown You WILL LOVE STATISTICS You WILL LOVE STATISTICS You WILL LOVE STATISTICS ERRORS Study Error - ERRORS Sample Error - ERRORS Measurement Error - Where are the errors? Graphically… Parameters and Statistics Goal of Statistics: MAGIC! Example Consider the group of people around you. What is your best guess for their age? The Actual Answer Using clickers the instructor will now obtain the classes average age: Example continued Using clickers, were you right (A) or wrong (B)? MAGIC! Your instructors attempt to guess the classes age….? CI’s The Confidence Interval (C.I.) Goal: To build an interval for µ using our sample data. Assume: µ is unknown, but σ2 known Confidence Level (C.L.) The confidence level, or CL is the level of confidence I have that my parameter is in the interval. 
Driving Example Recall the standard normal… How much of the data is 1 standard deviation from the mean? How much of the data is 2 standard deviations from the mean? How much of the data is 3 standard deviations from the mean? Graphically… Confidence Interval A confidence interval does the same thing where; 1. The center is the middle of the distribution 2. The C.L. tells me how many standard deviations we should be from the mean, denoted by c. 3. The standard error is the standard deviation of the value of interest. In General, If the distribution was… A~N(B,D) The confidence interval is: Where…we estimate the unknowns… Specifically… The distribution of the mean is: Hence the confidence interval is: Steps 1. 2. 3. From the C.L., determine c. Determine the sample mean (our estimate) and our variance. Plug them into the formula. C????? We determine c by using the confidence level (CL). We begin by assuming that the confidence level is the middle probability about the mean. i.e. We want Pr(-c<Z<c)=CL Pictorially: Example for c If the CL is 90%, then c is: Example for c If the CL is 95%, then c is: A) 1.28 B) 1.645 C) 1.96 D) Cannot Calculate Example for c If the CL is 89%, then c is: A) 1.28 B) 1.6 C) 1.96 D) Cannot Calculate Example The weekly income of a 1A co-op student was a question on the mind of a student in grade 12. She knew that σ2 was 12000 dollars2 and that a random sample of 16 of her friends had an average income of 1000 dollars. What would a 95% confidence interval be in this case? We’ll do this question in parts…. Step 1– What is z and –z given the CL is 95%? Recall that Z~N(0,1), so we want Pr(z<Z<z)=.95. A) 1.64 B) 1.65 C) 1.96 D) 2 Step 2 The important numbers/statistics are: The confidence interval is: Recall CIs Example Suppose we were looking at the average age of students in class and found it was, for 7 students on average19 with population standard deviation of 2 years. Build a 95% CI. Steps 1 1. 2. 3. Find your diamonds (CL, mean, st. 
dev etc) Draw a picture for your CL – to find c. Fill the values into your CI. Step 1 Find your diamonds (CL, mean, st. dev etc) Suppose we were looking at the average age of students in class and found it was, for 7 students on average19 with population standard deviation of 2 years. Build a 95% CI. Step 2 Draw a picture for your CL – to find c. Suppose we were looking at the average age of students in class and found it was, for 7 students on average19 with population standard deviation of 2 years. Build a 95% CI. Step 3 Fill the values into your CI. Suppose we were looking at the average age of students in class and found it was, for 7 students on average19 with population standard deviation of 2 years. Build a 95% CI. CI Estimate +/- (table value) x (standard error) Example What germs are on our hands? Objective: This experiment will show the bacteria that normally are found on our hands from daily activities. Materials for each student: Two petri plates containing agar at room temperature. The maximum number of plates provided is 50. Procedure: 1. Each student needs two agar plates . Have the student write his/her name on both plates and also label one plate "before hand washing " and the other plate "after hand washing". 2. Students should NOT wash hands prior to this experiment and they SHOULD touch objects about the room as they normally would during the day. Procedure 2: 3. Instruct each student to open "before hand washing " petri dish and run fingers gently across the surface of the agar with unwashed fingers being careful not to tear agar. It is important to have them gently rock fingers so that nails make a light imprint into the agar. Procedure 3: • • 4. Close first plate and using proper hand washing technique wash hands and have them repeat the process on second petri dish. That plate should be labeled with name and "after hand washing". 5. Incubate the plates inverted (agar in top) at 35C or room temperature until the next period. 
(Usually 24 hours at 35C or 48 hours at room temperature.) Procedure 4: 6. Record results: a. 4+ = maximum growth b. 3+ = moderate growth c. 2+ = some growth d. 1+ = a little growth e. neg = no growth 7. Compare colonies to Bacterial Growth Chart and have students compare number of bacterial colonies before hand washing to after hand washing. Data and Statistics Consider the plates resulting from the washed hands only. It is known that the population variance is 1.3. The mean bacteria growth on the 50 petri dishes’ is 1.9. Build a 99% confidence interval for the mean bacteria growth for those people in the population who have washed their hands. Diamond Mining It is known that the population variance is 1.3. The mean bacteria growth on the 50 petri dishes’ is 1.9. Build a 99% confidence interval for the mean bacteria growth for those people in the population who have washed their hands. Clicker We should look up, A) B) C) D) 99% 99.5% 98.5% None of the above. 99% Confidence Interval Confidence Interval Example Using Clickers 1. The study population (SP) of interest is those people who are in class today. The SP mean is: Confidence Interval Example The SP variancesing Clickers U is: Confidence Interval Example Using Clickers Normally…We would NOT know the SP mean. Confidence Interval Example Using Clickers 2. Find a group of 3 people (including yourself). Calculate the average age of the people in your group. Confidence Interval Example Using Clickers 3. Find c for a 95% Confidence Interval. A) 1.645 B) 1.65 C) 1.96 D) 2 Confidence Interval Example Using Clickers 4. Calculate the confidence interval using: Confidence Interval Example Using Clickers Nomenclature… Let the confidence interval be (L,U). L is the lower end of the interval and U is the upper end of the confidence interval. Define the width to be W=U-L. Confidence Interval Example Using Clickers 5. Class Examples: L U Is the parameter in the interval? Confidence Interval Example Using Clickers 6. 
Was the SP mean in your interval?
A) YES
B) NO

Confidence Interval Example Using Clickers
Summary…

Interpretation
We say:
We mean:
A CI is NOT a probability!

Blocks Lab
Each group selected 10 blocks at random, and calculated an average for those 10 blocks. We build a CI (95%) for each random selection. We count the proportion of times 32 is in the interval.

CI Width
Recall: We let the upper confidence limit be U and the lower confidence limit be L. The width, W=U-L.
U=
L=
---------------------------
W=

CI Width
Assuming all else remains equal, if n increases, what happens to W?
A) W increases
B) W decreases
C) W stays the same

Example
Assume the mean is 5, sigma = 2 and c = 1.96.
N=9
N=16

CI Width
Assuming all else remains equal, if c increases, what happens to W?
A) W increases
B) W decreases
C) W stays the same

Example
Assume the mean is 5, sigma = 2 and n = 9.
C=1
C=2

CI Width
Assuming all else remains equal, if the CL increases, what happens to W?
A) W increases
B) W decreases
C) W stays the same

Example
Assume the mean is 5, sigma = 2 and n = 9.
CL = 90%
CL = 95%

CI Width
Assuming all else remains equal, if sigma increases, what happens to W?
A) W increases
B) W decreases
C) W stays the same

Example
Assume the mean is 5, sigma = ?, C=2 and n = 9.
Sigma = 2
Sigma = 3

CI Width
Assuming all else remains equal, if the sample mean increases, what happens to W?
A) W increases
B) W decreases
C) W stays the same

Example
Assume the mean = ?, sigma = 2, C=2 and n = 9.
Mean = 5
Mean = 10

Time and Money
When asking a statistical question we could simply get the largest sample imaginable … OR… we could find the smallest that will do the job. Hopefully this would save time and money.

Process
1. Make a pilot study.
2. From the pilot study determine s.
3. Use s to find n.
4. Make a study for the n units.

Sample Size Calculations
The width of our confidence interval is:
Rearranging for n gives:

Example
Problem: To determine the strength of a particular glass.
Plan: Break 10 panes of glass (EXPENSIVE) and measure the tensile strength of the glass.
Data/Statistics: For the 10 panes of glass it was determined that the mean tensile strength was 50 with standard deviation 5.
What should the sample size be if we want to be accurate to +/- 0.1 units, 19 times out of 20?

Example
Problem: To determine the amount of toxin in a particular field.
Plan: 5 holes are dug and the toxin levels are measured.
Data/Statistics: For the 5 holes the toxin levels had a mean of 82 and variance of 4.2.
What should the sample size be if we want to be accurate to +/- 0.5 units, 90% of the time?

Answer
A) 45
B) 46
C) 47
D) None of the above.

What is (are) the problem(s) with these CIs?

CI if σ² is Unknown
If we don’t know σ² then we need to estimate it. Our estimate of σ² is s². HOWEVER, in estimating σ², we no longer have a normally distributed random variable. Instead we have a Student t.

The Student t Distribution
Degrees of Freedom and Shape

Reading the t Table
1. Let df be the degrees of freedom. Look up the degrees of freedom in the first column.
2. Let the critical value (i.e. z score from a normal table) be c. Look for this number in the row found in step 1.
3. Look to the intersecting column. The first row gives the desired probability, p. i.e. p=Pr(C<c)

Tables Used in Course

Example
Find Pr(T<0.90) if T has 5 degrees of freedom.

Example
Find Pr(T<3) if T has 6 degrees of freedom.

Example
Find Pr(T>3) if T has 6 degrees of freedom.
A) 1%
B) 2%
C) 1 to 2.5%
D) 97.5 to 99%

Example
The probability that T on 8 degrees of freedom is equal to 0.63.
A) 0.63
B) 0.7357
C) 0.2643
D) 0.37
E) None of the above

Reading t Backwards
1. Let df be the degrees of freedom. Look up the degrees of freedom in the first column.
2. Let p be the probability of interest. Look up the probability in the first row.
3. Look to the intersecting row and column for the critical value c. i.e.
p=Pr(C<c)

Example
Df = 8 and p = 0.975 = Pr(C<c). Then the critical value is:
A) 2.31
B) 2.36
C) 2.26
D) 1.86

Example
Df = 8 and p = 0.95 = Pr(-c<C<c). Then the critical value is:
A) 2.31
B) 2.36
C) 2.26
D) 1.86

Degrees of Freedom and Shape and the Table…

RECALL
CI if σ² is Unknown
The confidence interval is:
Where c is:
And the degrees of freedom are:

Digression – Degrees of Freedom
Consider the data x, y and z.
Let x = ____________
Let y = ____________
Let the sample mean be = _______________

Digression – DF continued

Example
The weekly income of a 1A co-op student was a question on the mind of a student in grade 12. She knew that s² was 12000 dollars squared and that a random sample of 16 of her friends had an average income of 1000 dollars. What would a 95% confidence interval be in this case?
An appropriate conclusion…

Tough Questions…1
The interval from the last example was:
This means that the population mean wage is in this interval.
A) Yes
B) No
C) We don’t know

Tough Questions…2
The interval from the last example was:
This means that the sample mean wage is in this interval.
A) Yes
B) No
C) We don’t know

Tough Questions…3
The 95% confidence interval from the last example was:
This means that the probability our parameter is in the interval is 95%.
A) Yes
B) No
C) We don’t know

RECALL
Example
The weekly income of a 1A co-op student was a question on the mind of a student in grade 12. She knew that σ² was 12000 dollars squared and that a random sample of 16 of her friends had an average income of 1000 dollars. What would a 95% confidence interval be in this case?
The interval was: (946,1054)

Comparison
The interval with sigma known: (946,1054)
The interval with sigma unknown:
What do you notice and why?

Example
Actuaries study insurance. They are interested in the age at death of a person because the ability to predict such a thing can help in determining the price of a policy.
If 12 people are observed to have an average age at death of 73 years with a variance of 4 years squared, what is the 99% confidence interval for the mean age at death?

Confidence Interval
RECALL
The CI is: Estimate +/- c(Standard Error)
Example: The Mean…

CI for a Mean - Basic Assumptions
1. The mean is unknown.
2. The sample is obtained randomly and independently.
3. The CLT applies, making the sample mean normally distributed.
Note: If sigma is unknown we use a t table. If sigma is known we use a normal table.

Confidence Interval for a Proportion
Recall: We can approximate a proportion by…
Hence, the CI is:
For c we always use a _____________ table.

Example
A sample of 500 nursing applications included 60 from men. Find the 90% CI of the true proportion of men who applied to the program.
An appropriate conclusion (and as it would be stated in a newspaper)

Example
Of the 1000 people surveyed by Statistics Canada last week, 182 were looking for a job. Build a 99% confidence interval for the country's unemployment rate.
An appropriate conclusion (and… as stated in a newspaper)

CI for a Proportion – Basic Assumptions
1. We don’t know the proportion.
2. The sample is obtained randomly and independently.
3. The proportion is approximately normally distributed.

A Mathematical Interlude

Properties of Expectation
Let a and b be constants and X and Y be r.v.s. Then
1. E(aX) = aE(X)
2. E(X+b) = E(X) + b
3. E(X+Y) = E(X) + E(Y)

Why? E(X+Y) = E(X) + E(Y)
Example: Let X be the time it takes to go to UW from WLU and Y the time it takes to go to WLU from UW. How long on average does it take me to make the circuit?

Properties of Variance
Let a and b be constants and X and Y be r.v.s. Then
1. Var(aX) = a²Var(X)
2. Var(X+b) = Var(X)
3. IF X and Y are independent then Var(X+Y) = Var(X) + Var(Y)

Explaining the Sampling Distribution of the Mean

Explaining the Sampling Distribution of the Proportion

End of the Interlude

Hypothesis Tests (HT)
An Introduction

HTs Vs CIs
Both are….
CIs are…
HTs are….
Class Brainstorm
What things, concepts etc. make up a court case?

Hypothesis Tests (HT)
The following is an allegory to help you understand HTs.

The Cheat Model
Suppose for a second that there was some question about the honesty (μo) of a student (μ). Two possibilities exist. Either the student cheated (Ha: μ ≠ μo) or they did not (Ho: μ = μo).

Clicker Question
Should we treat the student as:
A) innocent until proven guilty
B) guilty until proven innocent
As a result we assume (Ho: μ = μo) is true.

The Cheat Model
To make a decision we need to gather evidence (i.e. draw a sample). Our evidence includes means, standard deviations etc. accumulated into the ultimate decision, a p value. On the basis of the evidence we must make a decision. What kind of decision?
A) Guilty or Innocent
B) Not Guilty or Innocent
C) Guilty or Not Guilty
D) None of the above

Steps
We will break the process of a hypothesis test into several steps:
1. The Hypothesis (the charge!)
2. The Formula (the evidence!)
3. The P value (the decision!)
4. The Conclusion (formal description)

The P Value (Step 3)

P value
I have a friend… yes, one… My friend has a fair coin that he starts to flip. He flips the coin once…. Two times…. Three times…. 10 times… And gets tails every single time…. At what point do we begin to doubt the sincerity of our friend?

Probability
What is the probability we see ten tails in a row… if the coin is fair?

p Value
A p value is the probability that we see a value as extreme as our estimate if our parameter is what we hypothesize.
In terms of our coin example…

Significance Level
The significance level is the point at which we decide that what we are seeing is too improbable for our hypothesis to be true. Typically it is about 5%. In other words, because it is so unlikely (<5%) that we would see 10 tails in a row… we argue that the hypothesis is wrong.

Significance Level – 2 Views
1. Black and White - The question gives you a significance level, i.e. 5%.
If p value < significance level then reject Ho.

Significance Levels – 2 Views
2. Grey - If the p value is p then:
5% < p <= 10%: there is some evidence against Ho
1% < p <= 5%: there is lots of evidence against Ho
p <= 1%: there is tons of evidence against Ho

Example
The p value is 0.04. Do we reject or not?
A) Yes.
B) No.
C) Uncertain from the given data.

The Hypothesis (Step 1)

The Hypothesis
Ho
We say “H naught”.
We call this the Null Hypothesis.
Typically this is the “Status Quo”, what is typical.
From our “cheat model” –
For the coin example –

Ha
We say “H-eh”.
We call this the alternative hypothesis.
This is typically the hypothesis that is stated in the question.
From our “cheat model” –
For the coin example –

Ho vs Ha for a Mean
Ho        Ha

Hypothesis Tests (HTs)

HT - RECALL
1.a) HT vs CI
1.b) Allegory….

HT - RECALL
2. P value:
Tails    Probability
10       0.1%
9        1.1%
8        5.5%
7        17.2%
6        37.7%
5        62.3%
4        82.8%

HT - RECALL
3. Hypotheses
Ho vs Ha for a Mean
Ho        Ha
What do you notice?

A Special Note On Equality and Ho

2 Sided vs 1 Sided HTs

Examples
1. The IQ is recorded of 20 students in Alberta. A scientist argues that because the sample average is 105, students in Alberta are more intelligent than the average (average = 100).
A) (Ha: μ ≠ 100) (Ho: μ = 100)
B) (Ha: μ > 100) (Ho: μ = 100)
C) (Ha: μ < 100) (Ho: μ = 100)
D) (Ha: μ = 100) (Ho: μ ≠ 100)

Examples
2. The typical dog lives to be 14 years old. A certain breeder loves wiener dogs and notices that 18 pups have lived for, on average, 18 years. Does the age of a wiener dog differ from that of the typical dog?
A) (Ha: μ ≠ 14) (Ho: μ = 14)
B) (Ha: μ > 14) (Ho: μ = 14)
C) (Ha: μ < 14) (Ho: μ = 14)
D) (Ha: μ = 14) (Ho: μ ≠ 14)

Recall!
The next 10 slides are semi-review.
HW: We are in the week of Nov 8th.

The Hypothesis Test
Driving Example
Problem: My friend tells me they have a fair coin.
Plan: To test this theory he flips the coin 10 times. He gets a tail every time. I then record the probability of seeing at least x tails. i.e.
Pr(X>=x)

Distribution of X
A) Binomial
B) Poisson
C) Normal
D) t
E) None of the above.

The Hypothesis
Ho: The coin is fair.
Ha: The coin is NOT fair.

HT - RECALL
In theory, Pr(X>=x given the coin is fair):
Tails    Probability
10       0.1%
9        1.1%
8        5.5%
7        17.2%
6        37.7%
5        62.3%
4        82.8%

The Fence (Significance Level)
We argue that 5% is ‘small enough’. If an event is more rare than 5%, we reject our assumption. Where should the fence go…?
A) 6
B) 7
C) 8
D) 9
E) 10

Conclusion
If we see 10 tails in a row, we
A) Reject the coin being fair
B) Not reject the coin being fair

Formally….
A hypothesis test involves 4 steps:
1. Hypothesis
2. Formula
3. P value
4. Conclusion

Step 3. P value
A p value is the probability that we see a value as extreme as our estimate if our parameter is what we hypothesize.

Step 1. Ho vs Ha for a Mean
Ho        Ha

Return to your regularly scheduled slides….

The Formula (Step 2)

The Formula
RECALL! Let X be N(10,25). If we want to find the probability that X is more than 15 then…
What does the z score represent?

The Formula
The formula used in hypothesis tests is always the same. The formula is:
We call this a discrepancy.

Typical Examples
For a mean (sigma known), the sampling distribution is:
Therefore the hypothesis formula is:

Typical Examples
For a mean (sigma unknown), the sampling distribution is:
Therefore the hypothesis formula is:

P value and Ha
Ha        P value

The Conclusion (Step 4)

The Conclusion
In a court case which do we say…?
A) Guilty
B) Innocent
C) Not Guilty
D) Not Innocent
E) C or A

RECALL
Steps
We will break the process of a hypothesis test into several steps:
1. The Hypothesis (the charge!)
2. The Formula (the evidence!)
3. The P value (the decision!)
4. The Conclusion
(formal description)

Hypothesis Tests
Putting it all together…

P values, Court Cases etc…
On Trial:
The evidence:
The decision:
The conclusion:

A Picture to Put it Together…

Particular Recipe for HT
Sigma Known
1. Determine the hypotheses
2. Use the formula:
3. Calculate a p value:
4. Make a conclusion

Hypothesis Tests (HT)
Methodology

Clicker
The following are possible hypotheses:
A) (Ha: μ ≠ 100) (Ho: μ = 100)
B) (Ha: μ > 100) (Ho: μ = 100)
C) (Ha: μ < 100) (Ho: μ = 100)
D) (Ha: μ = 100) (Ho: μ ≠ 100)
E) A, B and C

Clicker
A p value is:
A) A probability we see an estimate this far from the hypothesis if Ha is true.
B) A probability we see an estimate this far from the hypothesis if Ho is true.
C) A probability we see a parameter this far from the hypothesis if Ha is true.
D) A probability we see a parameter this far from the hypothesis if Ho is false.

Clicker
In the conclusion stage we never say:
A) Reject Ha
B) Accept Ho
C) Accept Ha
D) All of the above

General Recipe for HT
1. Determine the hypotheses
2. Use the formula
3. Calculate a p value
4. Make a conclusion

Example
The number of hours of TV watched by American students per week is known to be 24 with a standard deviation of 3 hours. Canadian researchers believe that the same standard deviation holds for Canadians. Further, in a survey of 400 Canadian students the average TV hours watched per week was 23.8. Are Canadian researchers right in assuming that Canadian students watch less TV? Use a 5% level of significance to make your decision.

Clicker
Ha is (where A is a number):
A) (Ha: μ ≠ A)
B) (Ha: μ > A)
C) (Ha: μ < A)
D) (Ha: μ = A)

The number of hours of TV watched by American students per week is known to be 24 with a standard deviation of 3 hours. Canadian researchers believe that the same standard deviation holds for Canadians.
Further, in a survey of 400 Canadian students the average TV hours watched per week was 23.8. Are Canadian researchers right in assuming that Canadian students watch less TV? Use a 5% level of significance to make your decision.

Step 1 - Hypothesis

Step 2 - Formula

Clicker
The p value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D>d)
D) Pr(D=d)

Step 3 – P Value

Clicker
Therefore we:
A) Reject the null
B) Reject the alternative
C) Accept the null
D) Accept the alternative
E) Do not reject the null

Step 4 - Conclusion

Particular Recipe for HT
Sigma UNknown
1. Determine the hypotheses
2. Use the formula:
3. Calculate a p value:
4. Make a conclusion

Example
The pH levels of 8 Northern Ontario lakes are measured. The average is 6.4 with a standard deviation of 0.7. The researchers suspect that, based on this information, the pH level differs from neutral.

Clicker
Ha is (where A is a number):
A) (Ha: μ ≠ A)
B) (Ha: μ > A)
C) (Ha: μ < A)
D) (Ha: μ = A)

Step 1 - Hypothesis

Step 2 - Formula

Clicker
The p value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D>d)
D) Pr(D=d)
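The Northern Ontario lakes example can be worked through numerically. The sketch below is only an illustration under stated assumptions: "neutral" is taken to mean a hypothesized mean of 7, the discrepancy is taken to be the usual one-sample form (estimate minus hypothesized value, divided by s/sqrt(n)) on n - 1 degrees of freedom, and the two critical values are standard t-table entries for 7 degrees of freedom. The slides leave the formula blank, so the function name and layout here are hypothetical.

```python
import math


def discrepancy(estimate, hypothesized, std_dev, n):
    """Return d = (estimate - hypothesized) / (std_dev / sqrt(n))."""
    standard_error = std_dev / math.sqrt(n)
    return (estimate - hypothesized) / standard_error


# Values from the lakes example; mu0 = 7 is an assumed reading of "neutral".
n = 8
sample_mean = 6.4
s = 0.7
mu0 = 7.0

d = discrepancy(sample_mean, mu0, s, n)  # about -2.42
df = n - 1

# Standard t-table critical values for 7 degrees of freedom:
# Pr(T < 2.365) = 0.975 and Pr(T < 2.998) = 0.99.
lower_crit, upper_crit = 2.365, 2.998

# Ha is two sided, so the p value is 2Pr(T > |d|); the table only brackets it.
p_lower, p_upper = 2 * (1 - 0.99), 2 * (1 - 0.975)
p_is_bracketed = lower_crit < abs(d) < upper_crit

print(f"d = {d:.3f} on {df} degrees of freedom")
if p_is_bracketed:
    print(f"two-sided p value is between {p_lower:.0%} and {p_upper:.0%}")
```

Because a printed t table only lists a few probabilities, the p value comes out as a range rather than a single number, which matches how the table is read in the earlier t-table examples.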
Step 3 – P Value

Clicker
Therefore we:
A) Reject the null
B) Reject the alternative
C) Accept the null
D) Accept the alternative
E) Do not reject the null

Step 4 - Conclusion

Example – Clicker Question
There has been some talk in the media of Vitamin D (VitD) preventing colds (there is some biological logic to this…). A study was conducted of those who take VitD versus those that did not. All patients initially had no colds. The length of time before they got their first cold was recorded.

Example Continued
In the VitD group, of 14 people, the average length of time before their first cold was 23 days with a standard deviation of 12 days. A medical doctor suggests that the information is not significant unless the average number of days is greater than 25.
The hypotheses are:
A) (Ha: μ ≠ 25) (Ho: μ > 25)
B) (Ha: μ > 23) (Ho: μ = 23)
C) (Ha: μ < 25) (Ho: μ = 25)
D) (Ha: μ > 25) (Ho: μ = 25)

Example Continued
The discrepancy is:
A) -2.16
B) -0.624
C) 0.624
D) 2.16
E) None of the above.

P value
The p value is [next two slides are tables]:
A) 73.24%
B) 25.76%
C) 20% to 30%
D) 70% to 80%

Therefore we:
A) Reject Ho
B) Accept Ho
C) Reject Ha
D) Do not reject Ho
E) Accept Ha

Hypothesis Tests (HT)
Errors

Clicker Question
In a hypothesis test of grades we are testing whether the class average is less than 75. We reject Ho. What does this mean…? We are 100% certain that…
A) The class average is less than 75
B) The class average is more than 75
C) We cannot be 100% certain

Errors
Notice that we aren’t certain… hence it is possible that we are wrong. How?

Truth \ Test     Test Rejects Ho    Test Does Not Reject Ho
Ho is TRUE       Type 1 Error
Ho is FALSE                         Type 2 Error

Type 1 Errors
We want to reduce the possibility of a type 1 error.
In terms of our cheating case this means we want to reduce the probability that…

Type 2 Errors
Sadly we can only reduce Type 1 OR Type 2 errors, not both… Hence we purposely set the possibility of a type 1 error to a small value (the significance level)… accepting a larger possibility of a type 2 error.

To Reduce the Possibility of a Type 2 Error…

Clicker Test
We’re testing whether or not Robert Pattinson has more fans than the average actor. We reject Ho BUT WE’RE WRONG! What kind of error have we made?
A) Type 3
B) Type 1
C) Type 2
D) NO ERROR!

RECALL - Errors
Truth \ Test     Test Rejects Ho    Test Does Not Reject Ho
Ho is TRUE       Type 1 Error
Ho is FALSE                         Type 2 Error

Practical vs Statistical Significance
When we reject Ho, we say the result is “Statistically Significant”. This result may have no practical significance. That requires the knowledge of a subject-matter expert (i.e. medical doctor, engineer, …).

Silly Simile Example
In calculating your taxes you find that on average you owe \$204.352 to the government. In fact the value is STATISTICALLY more than \$204.35. Does this hold PRACTICAL significance?
Although that example was silly….

CIs vs HTs
In some cases a CI and a 2 sided HT can be exchanged and will give the same result. But these times are rare and difficult to define. Hence, don’t exchange them!

Pictorially

HTs for Proportions
Hypotheses:
Formula:
The Table:
A) Normal
B) T
C) Mahogany

Example
Market Research Inc wants to know if shoppers are sensitive to the prices of items sold in a supermarket. It obtained a random sample of 802 shoppers and found that 378 shoppers were able to state the correct price of an item immediately after putting it into their cart. Test at the 7% level the hypothesis that at least one-half of all shoppers are able to state the correct price.
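The shopper example can be sketched numerically as follows. This is an illustration only, under these assumptions: Ho is p = 0.5 against Ha: p < 0.5 (our reading of "at least one-half"), the standard error is computed with the hypothesized proportion, and the normal table is replaced by the error function from Python's standard library. The slides leave the formula blank, so the helper name and layout are hypothetical.

```python
import math


def normal_cdf(z):
    """Pr(Z < z) for a standard normal Z, computed with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Values from the Market Research Inc example.
n = 802
successes = 378
p0 = 0.5       # hypothesized proportion under Ho (assumed reading)
alpha = 0.07   # significance level given in the question

p_hat = successes / n
# Assumption: the standard error uses the hypothesized proportion p0.
standard_error = math.sqrt(p0 * (1 - p0) / n)
d = (p_hat - p0) / standard_error  # about -1.62

# Ha: p < 0.5, so the p value is the lower tail Pr(D < d), about 0.052.
p_value = normal_cdf(d)
reject_h0 = p_value < alpha

print(f"p_hat = {p_hat:.4f}, d = {d:.3f}, p value = {p_value:.4f}")
print("Reject Ho" if reject_h0 else "Do not reject Ho")
```

Note that the same p value would not lead to rejection at a 5% level, which is one reason the question states its significance level explicitly.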
Clicker
Ha is (where A is a number):
A) (Ha: p ≠ A)
B) (Ha: p > A)
C) (Ha: p < A)
D) (Ha: p = A)

The Hypothesis

The Formula

Clicker
The p value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D>|d|)
D) Pr(D=d)

The P Value

Clicker
Therefore we:
A) Reject Ho
B) Accept Ho
C) Reject Ha
D) Do not reject Ho
E) Accept Ha

The Conclusion

Question
If we made a mistake, what kind of mistake would it be…?
A) Type 1
B) Type 2
C) Type 1 and 2
D) Type 3
E) We did not make a mistake

RECALL - Errors
Truth \ Test     Test Rejects Ho    Test Does Not Reject Ho
Ho is TRUE       Type 1 Error
Ho is FALSE                         Type 2 Error

3 Hypothesis Tests and Confidence Intervals
                        HTs        CIs
Mean (sigma known)
Mean (sigma unknown)
Proportion

Liberals and Conservatives (Democrats vs Republicans)
In a local newspaper, The Record, the following article was headlined: “LIBERAL SUPPORT DWINDLING”. The newspaper states:

Newspaper Article
Liberal support, from a sample of 400 people this week is at 33% plus or minus 4%, 19 times out of 20. This is a drastic drop from last month when support was at 35%.
Test this assumption…

The “DIAMONDS”

Clicker
Ha is (where A is a number):
A) (Ha: p ≠ A)
B) (Ha: p > A)
C) (Ha: p < A)
D) (Ha: p = A)

The Hypothesis

The Formula

Clicker
The p value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D>|d|)
D) Pr(D=d)

The P value

Clicker
Therefore we:
A) Reject Ho
B) Accept Ho
C) Reject Ha
D) Do not reject Ho
E) Accept Ha

The Conclusion

Question
If we made a mistake, what kind of mistake would it be…?
A) Type 1
B) Type 2
C) Type 1 and 2
D) Type 3
E) We did not make a mistake

Comparing Two Samples
What if we want to compare two groups?
Males to Females
New Drug to Old Drug
Salmon and Perch
Tech stocks and Financial Stocks

Comparing Two Groups
How we compare those groups depends on whether or not they are dependent or independent… and the kind of study we are performing…

Problem Segue: Variates
Response Variate:
Explanatory Variate:
Focal Explanatory Variate:

Example – Students' Grades
Problem: Compare Female to Male Grades
Response: Grade on Midterm
Explanatory Variate:
Focal Explanatory Variate:

Studies
2 types:
Experimental
Observational

Dependent Groups
There exists a relationship between individuals in the groups, either real or artificial.
Examples:
1. Twins
2. Same Units Twice
3. Artificial Twins

Independent Groups
Independent groups are groups that we assume have no relationship between the individuals in the groups. In other words,

Study vs Dependent
If our study is observational we “match” similar individuals (units) by explanatory variates.
If our study is experimental we “block” similar individuals (units) by explanatory variates.
In both cases we call this process “pairing”.

Example
50 fish are caught from a stream. 20 of them are placed in a low pH solution and 30 are placed in a high pH solution.
Is this…?
A) Experimental
B) Observational
Are the groups…?
A) Independent
B) Dependent

Clicking Nomenclature
i.e. How good is your memory…?
A farmer wants to test the effectiveness of a new fertilizer. She decides to break her field into 1 acre plots and randomly spread the new fertilizer on 8 of the plots while using the remaining 27 plots to spread the old fertilizer.

A Very Basic and Very Good Experimental Design
Step 1 – Pairing (Blocking/Matching)
Step 2 – Randomization
Step 3 – Replication (i.e. Repeat 1,2)

Explanation of the Next Few Slides

Example – Pairing
The following shapes/colours are people.

Example – Randomization
The following shapes/colours are people.

Summary of the Blocking Process
1. We pair our units up.
2. We flip a coin and put one unit of the pair in group 1. The other unit goes to group 2.

2 (In)Dependent Groups
Picture (with notation)

Example - Differencing
18 32 25 19 30 30

2 Dependent Group Differences
            Group 1    Group 2    Difference
Unit 1
Unit 2
Unit 3
Variance

Why does this happen?
Consider the fake experiment where we know that the following is exactly true:
Weight (lbs) = Height (cms) + Age (years) + Gender
Our response varies due to our explanatory variates. Gender is 10 if male, 0 if female.

Paired
Now suppose we match by Age (A1=A2) and Gender (G1=G2)… Then the difference in weight is only related to HEIGHT.
W1 = H1 + A1 + G1
W2 = H2 + A2 + G2
Subtracting: W1 - W2 = H1 - H2

HT: for a mean difference with dependent units
1. Determine the hypotheses
2. Use the formula:
3. Calculate a p value
4. Write a conclusion

CI: for a mean difference with dependent units
Formula:

Example
The speed of 9 students is measured before and after inebriation. The difference (after – before) is determined to be on average 18 km/h with a standard deviation of the differences of 7 km/h.
Build a 95% confidence interval for the difference in speed.
Formula
Conclusion

Example
The speed of 9 students is measured before and after inebriation. The difference (after – before) is determined to be on average 18 km/h with a standard deviation of the differences of 7 km/h. Test whether the difference between the speed of an inebriated person and that of a sober individual is different from zero.

Clicker
The hypothesis is:
A) Ha: µ < A
B) Ha: µ > A
C) Ha: µ ≠ A
D) Ha: µ ≤ A
E) Ha: µ ≥ A

The Hypothesis

The Formula

Clicker
The p value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D<d)
D) 2Pr(D>|d|)
E) 2Pr(D<|d|)

The P value

Clicker
A) Reject Ho
B) Do Not Reject Ho
C) Accept Ho
D) Do not reject Ha
E) Reject Ha

Conclusion

The Prior Example
This was an example of a PAIRED (matched or dependent) situation…

Sample Example
As a researcher for the Ontario Ministry of Environment, you have been asked to determine if Ontario’s air quality index (AQI) has changed in the past 2 years. You select a random sample of 10 cities and find the air quality on the same day in 2 consecutive years. A comparison was made. Are the samples…
A) Independent
B) Dependent

Sample Example
A survey found that the average hotel rate in Toronto was \$175.53 and the average rate in Vancouver was \$171.31. Assume that the data were obtained from two samples of 50 hotels each with standard deviations of 9.52 and 10.89 respectively. A comparison was made. Are the samples…
A) Independent
B) Dependent

Independent Notational Picture

Independent Samples Picture

Independent Samples – CI
Formula:

Independent Samples – HT
1. Hypotheses:
2. Formula:
3. P value
4. Conclusion

NOTE
1. In class I will only cover the case where the standard deviations are assumed to be different.
2. If we have evidence to assume the two are the same, a different formula (involving sp) would be needed!
3. Degrees of freedom in this case are:

CAREFUL!
How you define your Ha affects:
1. P value
2.
The calculation of d
IF you change the direction of your Ha then everything else must change as well. Your final answer WILL NOT change.

Example
A researcher hypothesizes that the average number of sports that colleges offer for males is greater than the average number of sports that colleges offer for females. In both cases there were 50 colleges. A summary of the data is given below:
Sample Mean (males) = 8.6
Sample Mean (females) = 7.9
Standard deviation (males) = 3.3
Standard deviation (females) = 3.7

Comparison
A comparison was made. Are the samples…
A) Independent
B) Dependent
Perform your comparison at a significance level of 10%.

Clicker
The hypothesis is:
A) Ha: µm - µf < A
B) Ha: µm - µf > A
C) Ha: µm - µf ≠ A
D) Ha: µm - µf ≤ A
E) Ha: µm - µf ≥ A

Hypothesis

Formula

Clicker
The p value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D<d)
D) 2Pr(D>|d|)
E) 2Pr(D<|d|)

P value

Clicker
A) Reject Ho
B) Do Not Reject Ho
C) Accept Ho
D) Do not reject Ha
E) Reject Ha

Conclusion

Example
A researcher hypothesizes that the average number of sports that colleges offer for males is greater than the average number of sports that colleges offer for females. In both cases there were 50 colleges. A summary of the data is given below:
Sample Mean (males) = 8.6
Sample Mean (females) = 7.9
Standard deviation (males) = 3.3
Standard deviation (females) = 3.7
Build a 90% confidence interval for the difference.
Math
Conclusion

Clicker Question
What was your grade (%) on Test 2?
A) 90 – 100
B) 85 – 89
C) 70 – 84
D) 55 – 69
E) < 55

Picture for 2 Proportions (Notation)

Difference in Proportions – CI
Formula:

Difference in Proportions - HT
Estimated value for p:
1. Hypothesis (ONLY 1):
2. Formula:
3. P value
4. Conclusion

Example (Statistics in the Classroom!)
The example today will involve using something called a POG. POG is short for “Poggendorff”.

The Poggendorff
[Figure: the Poggendorff illusion, with lines labelled a, b and c]

The Goal…
To visually, without using a straight edge, just the naked eye, guess where a will strike c.

Problem…?
Are evil or good classes better at seeing through the trick?

Data
[Figure: lines labelled a, b and c]

BEWARE! New Slides Ahead

Clicker
The hypothesis is:
A) Ha: pe - pg < A
B) Ha: pe - pg > A
C) Ha: pe - pg ≠ A
D) Ha: pe - pg ≤ A
E) Ha: pe - pg ≥ A

Hypothesis

Formula

Clicker
The p value is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D<d)
D) 2Pr(D>|d|)
E) 2Pr(D<|d|)

P value

Clicker
A) Reject Ho
B) Do Not Reject Ho
C) Accept Ho
D) Do not reject Ha
E) Reject Ha

Conclusion

Relationships
Ch. 12.5

A Measure of Relationship
Often we want to look at the relationship between two numbers. For example:
Gender and Age at Death
Amount of sleep and a midterm grade
Lung cancer and smoking

Two Measures of Relationship
1. Correlation (and covariance)
2. Slope

Correlation/Covariance

Notation
We use the letter “r” to denote correlation. At times we will write rxy to denote the correlation between x and y.

Correlation Interpretation
The correlation is a number between -1 and 1. There are 2 important features about this number.
a) Magnitude – the size of the number
b) Direction – the sign of the number

Magnitude
The closer |r| is to 1 the stronger the relationship. Graphically this is indicated by a tightness in the data about a line. If |r|=1 we call it a perfectly linear relationship. The closer |r| is to zero, the more randomly scattered and less linear the graph.
[Plots: correlation of 1, correlation of 0.9, correlation of 0.5, correlation of 0]

Direction
The sign of r indicates the direction. A positive r indicates that the points have a positive slope. A negative r indicates that the points have a negative slope.
[Plots: correlation of 0.9, correlation of -0.9]

Clicker Correlation
For the graph to the right, the correlation is:
A) Large and negative
B) Small and positive
C) Large and positive
D) Small and negative
E) None of the Above

Other Datasets

Linear Relationships Only!!
The only relationships studied by correlation are linear. Correlation cannot study other types, say quadratic (as in the next example).
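The restriction to linear relationships can be checked numerically. The sketch below is illustrative only: it uses the quadratic data X = -3,…,3 with Y = X², and it assumes the usual sample covariance with an n - 1 divisor and r = sxy/(sx·sy), since the slides leave both formulas blank. The function names and the straight-line comparison data are hypothetical additions for contrast.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sample covariance s_xy with an n - 1 divisor (assumed convention)."""
    x_bar, y_bar = mean(xs), mean(ys)
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


def correlation(xs, ys):
    """Correlation r_xy = s_xy / (s_x * s_y)."""
    s_x = math.sqrt(covariance(xs, xs))
    s_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (s_x * s_y)


x = [-3, -2, -1, 0, 1, 2, 3]
y_quadratic = [v ** 2 for v in x]    # a perfect, but non-linear, relationship
y_linear = [2 * v + 1 for v in x]    # hypothetical straight-line data for contrast

r_quadratic = correlation(x, y_quadratic)
r_linear = correlation(x, y_linear)

print(f"quadratic data: r = {r_quadratic:.3f}")
print(f"linear data:    r = {r_linear:.3f}")
```

Y is completely determined by X in both datasets, yet only the straight-line data produces a correlation near 1, so a correlation of zero should be read as "no linear relationship" rather than "no relationship".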
A Quadratic Relationship
X = -3,-2,-1,0,1,2,3
Y = 9,4,1,0,1,4,9 (i.e. Y = X²)
rxy = 0

Summary
Correlation
A number from -1 to 1 indicating the strength (magnitude) and direction (positive or negative) of a relationship.

Clicker Test 1
The Correlation is
A) Positive and weak
B) Negative and weak
C) Positive and strong
D) Negative and strong
E) None of the above

Clicker Test 2
The Correlation is
A) Positive and weak
B) Negative and weak
C) Positive and strong
D) Negative and strong
E) None of the above

Clicker Test 3
The Correlation is 0 which means
A) No relationship
B) No linear relationship
C) Weak relationship
D) None of the above

What is a “STRONG” relationship?
If r =
+.70 or higher     Very strong positive relationship
+.40 to +.69       Strong positive relationship
+.30 to +.39       Moderate positive relationship
+.20 to +.29       Weak positive relationship
+.01 to +.19       No or negligible relationship
-.01 to -.19       No or negligible relationship
-.20 to -.29       Weak negative relationship
-.30 to -.39       Moderate negative relationship
-.40 to -.69       Strong negative relationship
-.70 or lower      Very strong negative relationship

Covariance (Correlation's Useless Cousin)
Notation: The covariance is denoted by sxy.
Purpose: Covariance is more useful from a statistician's perspective. We use sxy to calculate r.

No Magnitude, Just Direction
With a covariance the magnitude is NOT important. It can have values from minus infinity to positive infinity and the size of the number is meaningless. Only the direction can be determined and is based on the sign.

Clicker Covariance
For the graph to the right, the covariance is:
A) large
B) positive
C) small
D) negative
E) None of the Above

Example: Women Heights and Weights
Description: The heights and weights of women aged 30 to 39.
Data
women height weight
1 58 115
2 59 117
3 60 120
4 61 123
5 62 126
6 63 129
7 64 132
8 65 135
9 66 139
10 67 142
11 68 146
12 69 150
13 70 154
14 71 159
15 72 164

Scatterplot

Covariance, Correlation
Covariance = 69
Correlation = 0.995

How do we calculate a Covariance…?
sxy =
WHY???

How do we calculate a Correlation…?
rxy =
Why?

Example
Consider the data: X = {1,2,3}, Y = {3,2,1}
What is the covariance?
What is the correlation?

Covariance

Correlation

Causation
An implication that Y changes due to X.
e.g. Smoking causes lung cancer.
e.g. Lack of sleep causes poor grades.
Causation is NOT correlation. The following is a proof by one example…

Example
Problem: Does smoking cause lung cancer?
Plan: For each person we recorded the age at death (“AgeDeath”), the number of cigarettes smoked per day (“Cigs”) and the number of lighters owned (“Lighters”). Gender was also recorded, where 1=male, 0=female.

Data
obs AgeDeath Cigs Lighters Gender
1 65 0 1 0
2 42 20 8 1
3 82 0 2 0
4 55 15 6 1
5 60 20 9 0
6 57 0 2 1
7 64 10 5 0
8 78 0 0 1
9 95 0 2 0
10 39 40 8 1
11 52 30 5 0
12 49 25 9 1

Plots

Clicker
Based on the plots:
1. The correlation between cigs and lighters is:
A) Positive
B) Negative
C) No linear relationship.
2. The correlation between age at death and cigs is:
A) Positive
B) Negative
C) No linear relationship.

Correlations
         AgeDeath  Cigs  Lighters  Gender
AgeDeath     1.00 -0.79     -0.72   -0.51
Cigs        -0.79  1.00      0.83    0.25
Lighters    -0.72  0.83      1.00    0.24
Gender      -0.51  0.25      0.24    1.00

What do you notice?
The correlation between age at death and lighters is:
A) Positive
B) Negative
C) No linear relationship.

So if correlation implies causation, then what causes cancer…?

Regression

Two Measures of Relationship
1.
Correlation (and covariance)
2. Slope

BUT how do we get the slope of a set of points? We need a method to build a line through the points. For this we use regression.

Regression Line
We wish to build a regression line (or line of best fit), which is a line through our data. As with any line it will have a slope, b1, and an intercept, b0, i.e. y = mx + b => y = b0 + b1x. There are many ways to do that.

Consider the data

Different Lines

One Way to Build a Line is to Use Regression
Concept: We want to minimize the vertical distance between our observed points and the line.
Picture: [observed Y value and the fitted value y^ on the line]
Distance: Residual = r = y - y^

Residuals
Denote by y(x) a response for explanatory variate x.
Denote by y^(x) a response on the line for explanatory variate x.
Denote by r = y(x) - y^(x) a residual.

The Sum of the Residuals is Zero
The sum:
Hence… we want to minimize the sum of the squared residuals, ∑r².

Clicker Question
Which line is the one chosen by regression:
A) Y=1-2x with residuals (-8, 0, 8)
B) Y=-4+2x with residuals (2, 4, -6)
C) Y=-3-x with residuals (-1, -4, 5)
D) Y=-1-x with residuals (-6, 2, 4)

Example
Problem: To investigate the relationship between heights and weights of women.

Data
women height weight
1 58 115
2 59 117
3 60 120
4 61 123
5 62 126
6 63 129
7 64 132
8 65 135
9 66 139
10 67 142
11 68 146
12 69 150
13 70 154
14 71 159
15 72 164

Graphed

The Line of Best Fit (Regression Line)
Y = -87.52 + 3.45x

Predictions (Extrapolation)
As a notation we’ll use y^ as a prediction. Hence our predictions are on the line. We use the line to help us make our predictions. Hence: y^ = -87.52 + 3.45x

Example
Predict the height of a woman who weighs 200lbs.

Formulas
y = b0 + b1x
b0 =
b1 =        OR r·sy/sx

Example: Consider the Data
X Y
1 3
2 2
3 1

Does a Relationship Exist??
To test whether or not a relationship exists we can perform two tests:
1. We can test to see if r=0
2. We can test to see if b1=0
Both tests give the same results, so we will test r. This means that…
1. If we find a positive correlation, we have a positive slope
2.
If we find a negative correlation, we have a negative slope, AND
3. If we have no correlation, we have no slope.

Hypothesis Test for ρ (rho)
1. Hypothesis: We use the notation ρ (rho) for a correlation from the population. Hence we ask, is ρ=0?
Ho:
Ha:
2. Formula: d = r√[(n-2)/(1-r²)]
This has a t distribution on n-2 degrees of freedom.
Pvalue: Same
Conclusion: Same

Example
In one of my courses there are 120 students. The correlation between their midterm mark and their clicker mark is 0.4. Is this significant?

Clicker
The hypothesis is:
A) Ha: ρ < A
B) Ha: ρ > A
C) Ha: ρ ≠ A
D) Ha: ρ ≤ A
E) Ha: ρ ≥ A

Hypothesis

Formula

Clicker
The Pvalue is:
A) Pr(D>d)
B) Pr(D<d)
C) 2Pr(D<d)
D) 2Pr(D>|d|)
E) 2Pr(D<|d|)

Pvalue

Clicker
A) Reject Ho
B) Do Not Reject Ho
C) Accept Ho
D) Do not reject Ha
E) Reject Ha

Conclusion

What can we say…?
Clicker points, and hence attending class,
A) leads to higher marks.
B) does not improve marks.
C) are positively correlated.
D) are not correlated.

Hence what can we say about the slope…?
The slope is
A) Negative
B) Positive
C) Non-existent

Example
The number of midterm absences and the clicker mark were also recorded for 7 students. The correlation was -0.944. Perform a test to see if this is significant.

The Hypothesis
What should the hypothesis be?
A) Ho: μ=0 Ha: μ≠0
B) Ho: μ≠0 Ha: μ=0
C) Ho: ρ=0 Ha: ρ≠0
D) Ho: ρ≠0 Ha: ρ=0

The Formula
The value for the formula is:
A) -6.4
B) 6.4
C) 1.51
D) -1.51

Degrees of Freedom
The degrees of freedom are:
A) 7
B) 6
C) 5
D) 4
E) None of the above

The Pvalue

Conclusion

Correlation Coefficient vs Coefficient of Determination
The correlation coefficient is r. The coefficient of determination is r².

The Coefficient of Determination
The coefficient of determination is a measure of how good our model is. In other words, it is a measure of how tight our points are about the line.
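The quantities introduced so far (covariance, correlation, the line of best fit and the coefficient of determination) can be reproduced for the women's height and weight data. The sketch below is one way to do it, assuming the usual sample formulas with divisor n - 1, the slope b1 = r·sy/sx from the Formulas slide, and the standard least-squares intercept b0 = ȳ - b1·x̄ (the slides leave b0 blank, so that formula is an assumption here). Variable names are illustrative.

```python
import math

# Women's heights and weights as listed in these notes.
height = [58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72]
weight = [115, 117, 120, 123, 126, 129, 132, 135, 139, 142, 146, 150, 154, 159, 164]

n = len(height)
mean_x = sum(height) / n
mean_y = sum(weight) / n

# Sample covariance and standard deviations (divisor n - 1, assumed).
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(height, weight)) / (n - 1)
s_x = math.sqrt(sum((x - mean_x) ** 2 for x in height) / (n - 1))
s_y = math.sqrt(sum((y - mean_y) ** 2 for y in weight) / (n - 1))

r = s_xy / (s_x * s_y)       # correlation
b1 = r * s_y / s_x           # slope, as on the Formulas slide
b0 = mean_y - b1 * mean_x    # intercept (standard least-squares formula, assumed)
r_squared = r ** 2           # coefficient of determination

print(f"covariance = {s_xy:.2f}")
print(f"correlation = {r:.3f}")
print(f"line: y^ = {b0:.2f} + {b1:.2f}x")
print(f"r^2 = {r_squared:.2f}")
```

Under these assumptions the results agree with the values quoted in the notes: covariance 69, correlation 0.995, line y^ = -87.52 + 3.45x and r² of about 0.99.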
Value:

Interpretation
Rule of thumb:
0 < r² < 0.3 => weak
0.3 < r² < 0.7 => moderate
r² > 0.7 => strong
The coefficient of determination tells us…

Example
Problem: To investigate the relationship between heights and weights of women.

Data
women height weight
1 58 115
2 59 117
3 60 120
4 61 123
5 62 126
6 63 129
7 64 132
8 65 135
9 66 139
10 67 142
11 68 146
12 69 150
13 70 154
14 71 159
15 72 164

Graphed

Important Stats
Regression line: y^ = -87.52 + 3.45x
Correlation Coefficient: 0.995
Coefficient of Determination: (0.995)² = 0.99

Example
The speed of a car and the distance it takes to stop are recorded.

The data
> cars
  speed dist
1 4 2
2 4 10
3 7 4
4 7 22
5 8 16
6 9 10
7 10 18

Important Stats
Regression line: y^ = -17.579 + 3.932x
Correlation Coefficient: 0.81
Coefficient of Determination: (0.81)² = 0.6561

This means…
A) The relationship is strong
B) The relationship is positive
C) The amount of variability explained by the model is good
D) All of the above
E) None of the above

Causation
Causation is NOT association… So how do we prove that something causes something else, i.e. that x causes y?

Experimental Studies
In an experimental study, we can:
1. Block units together according to all explanatory variates except the focal one.
2. Randomly determine which member of the pair goes into which group.
3. Use more than 1 pair (lots and lots!). We call this replication.
4. Measure the response in each group. If there is a difference in the average response, it is due to the focal variate.

But…to prove causation…
To prove causation in the last slide we want to apply it to every member of the population. Which is NOT likely…
KEYS: Blocking, Randomization and Replication

Observational Studies
Problems – We may NOT be able to block…nor randomize…

Farewell’s Criterion
1. A relationship should be observed in many studies of different types, in different settings.
2. A relationship should hold when other plausible variates are controlled (e.g. lighter example).
3.
A plausible scientific explanation is required for the direct influence of x on y, with no other strong explanations.
4. There must be a consistent dose-response relationship.

Good Luck on your Finals!

Midterm 2 review

Some Review…Covered in TUTORIAL
Understandably this stuff is hard. So let’s put it together.

Sampling Distributions
A single value –
A mean –
A total –
A proportion –
Count Data –

Which one(s) involve a continuity correction…?
A) Mean
B) Total
C) Count
D) Proportion
E) C and D

The Tricks
1. Decide what the question wants. Often it gives hints like “average”, “total”, …
2. Use the appropriate sampling distribution to standardize.
Assuming z is positive,
3. If you have Pr(Z<-z), set it equal to Pr(Z>z).
4. If you have Pr(Z>z), set it equal to 1-Pr(Z<z).

Example
The number of bags lost at an airport terminal on a particular day is reported to be 15 on average per day with standard deviation 3; find…
A) The probability that in 1 day 17 bags are lost.
B) The probability that in 5 days 60 bags are lost.
C) The probability that in 5 days an average of 10 bags are lost.

Confidence Intervals
EXPLORATORY!
Estimate +/- c(SE)
If sigma is known:
If sigma is unknown:
If we are dealing with a proportion:

Clicker
In the question that follows, is it a CI for:
A) mean, sigma known
B) mean, sigma unknown
C) proportion
D) sigma, mean known
E) sigma, mean unknown

Example
The number of bags lost at an airport terminal on a particular day is reported to be 15 on average per day with standard deviation 3; find a 95% confidence interval for the number of lost bags per day.

Hypothesis Tests
CONFIRMATORY
4 steps:
1. Hypothesis
2. Formula
3. Pvalue
4. Conclusion

Hypothesis Tests
4 steps – Mean (sigma known)
1. Hypothesis
2. Formula
3. Pvalue
4. Conclusion

Hypothesis Tests
4 steps – Mean (sigma unknown)
1. Hypothesis
2. Formula
3. Pvalue
4. Conclusion

Hypothesis Tests
4 steps – Proportion
1. Hypothesis
2.
Formula
3. Pvalue
4. Conclusion

Example
The number of bags lost at an airport terminal on a particular day is reported to be 15 on average per day with standard deviation 3; use a significance level of 6% to test whether the average per day differs from 14.

Final Exam Review

Agenda
• My questions – I have put them together at the start. Please try them before the help session on your own, without scrolling down and with your textbook closed.
• Your questions

Example 1
Scientists are curious about CO2 levels and acid rain. 120 areas are measured for CO2 and acidity. It is found that the standard deviations of CO2 and acid levels are 7 and 0.3 respectively. Further, the covariance between CO2 and acidity is 1.
a) Interpret the covariance.
b) What is the coefficient of determination?
c) Test whether the slope of the linear relationship between CO2 and acidity is less than zero.

Example 2
Scientists believe that elephants who ended their lives in captivity live shorter lives than those kept in the wild. The length of time the elephant lives is recorded. In each of the following situations, build an appropriate hypothesis test:
A. Several groups of elephants are selected. The first group comprises 10 elephants who have lived in the wild their entire lives. We compare this group to the second group of 10 elephants who have lived in a zoo their entire lives. A third group of 14 elephants who started their lives in the wild and ended their lives in captivity is compared to a group of 14 elephants who started their lives in captivity and were released to the wild. If w, d and c denote “wild”, “difference” and “captivity” then: sw=4, sc=6, sd=3, xw=68, xc=65, xd=3.

Example 2
Scientists believe that elephants who ended their lives in captivity live shorter lives than those kept in the wild. The length of time the elephant lives is recorded.
In each of the following situations, build an appropriate hypothesis test:
B. 10 zoos were selected at random. From each zoo two elephants from the same litter were selected. One of the elephants was released to the wild while the other was kept in captivity. Another 8 wild areas were selected at random. From each wild area two elephants from a litter were selected. One of the two elephants was captured while the other was allowed to remain in the wild. If w, d and c denote “wild”, “difference” and “captivity” then: sw=4, sc=6, sd=3, xw=68, xc=65, xd=3.

Example 3
A student is given 3 choices for courses in the winter term. She will select a math course with probability 60%, a science course with probability 35%, and a business course with the remaining probability. If she takes the business course, the chance she passes is 85%. If she takes the math course, the chance she passes is 92%. If she takes the science course, the chance she passes is 73%.
A) What is the probability that she passes?
B) What is the probability that the course she passed was a science course?

Example 4
The number of toxins in a politician's blood is determined. (Dalton McGinty had 41 in the last election). The average number of toxins in 400 20-year-olds is 24 with a standard deviation of 3 toxins.
a) Build a 95% interval for the number of toxins in the typical Canadian.
b) Do you believe we have evidence that the number could be zero? Answer using information from part a.
c) Assuming these 400 people form a population, estimate the probability that the average number of toxins in 16 people selected at random from these 400 is more than 22.

Example 5
The number of homes in default is in decline. In the Canadian population 1/53 homes are in default. If 100 homes are selected at random, what is the approximate probability that fewer than 2 homes are in default?

ANSWERS....Given in Tutorial
Please note:
1.
If you do not come to the help session, I do not guarantee that the notes/video will be available to you online.
2. I may not have time to cover everything in these slides and have no solutions for the

Example 1
Scientists are curious about CO2 levels and acid rain. 120 areas are measured for CO2 and acidity. It is found that the standard deviations of CO2 and acid levels are 7 and 0.3 respectively. Further, the covariance between CO2 and acidity is 1.
a) Interpret the covariance.
b) What is the coefficient of determination?
c) Test whether the slope of the linear relationship between CO2 and acidity is less than zero.

Notes
Formula: d = r√[(n-2)/(1-r²)]
This has a t distribution on n-2 degrees of freedom.
Pvalue: Same
Conclusion: Same

Example 1
a) Interpret the covariance.

Example 1
b) What is the coefficient of determination?

Example 1
c) Test whether the slope of the linear relationship between CO2 and acidity is less than zero.

Example 2
Scientists believe that elephants who ended their lives in captivity live shorter lives than those kept in the wild. The length of time the elephant lives is recorded. In each of the following situations, build an appropriate hypothesis test:
A.
Several groups of elephants are selected. The first group comprises 10 elephants who have lived in the wild their entire lives. We compare this group to the second group of 10 elephants who have lived in a zoo their entire lives. A third group of 14 elephants who started their lives in the wild and ended their lives in captivity is compared to a group of 14 elephants who started their lives in captivity and were released to the wild. If w, d and c denote “wild”, “difference” and “captivity” then: sw=4, sc=6, sd=3, xw=68, xc=65, xd=3.
Hypothesis
Formula
Pvalue
Conclusion

Example 2
Scientists believe that elephants who ended their lives in captivity live shorter lives than those kept in the wild. The length of time the elephant lives is recorded. In each of the following situations, build an appropriate hypothesis test:
B. 10 zoos were selected at random. From each zoo two elephants from the same litter were selected. One of the elephants was released to the wild while the other was kept in captivity. Another 8 wild areas were selected at random. From each wild area two elephants from a litter were selected. One of the two elephants was captured while the other was allowed to remain in the wild. If w, d and c denote “wild”, “difference” and “captivity” then: sw=4, sc=6, sd=3, xw=68, xc=65, xd=3.
Hypothesis
Formula
Pvalue
Conclusion

Example 3
A student is given 3 choices for courses in the winter term. She will select a math course with probability 60%, a science course with probability 35%, and a business course with the remaining probability. If she takes the business course, the chance she passes is 85%. If she takes the math course, the chance she passes is 92%. If she takes the science course, the chance she passes is 73%.
A) What is the probability that she passes?
B) What is the probability that the course she passed was a science course?

Example 4
The number of toxins in a politician's blood is determined. (Dalton McGinty had 41 in the last election).
The average number of toxins in 400 20-year-olds is 24 with a standard deviation of 3 toxins.
a) Build a 95% interval for the number of toxins in the typical Canadian.
b) Do you believe we have evidence that the number could be zero? Answer using information from part a.
c) Assuming these 400 people form a population, estimate the probability that the average number of toxins in 16 people selected at random from these 400 is more than 22.

Example 5
The number of homes in default is in decline. In the Canadian population 1/53 homes are in default. If 100 homes are selected at random, what is the approximate probability that fewer than 2 homes are in default? ...
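The test statistic for a correlation quoted in the notes, d = r√[(n-2)/(1-r²)] on n-2 degrees of freedom, can be evaluated for the two in-class examples (n = 120 with r = 0.4, and n = 7 with r = -0.944). This is only a sketch of the formula step: the function name `correlation_t_statistic` is hypothetical, and the Pvalue step is left out because it needs a t table or a statistics library.

```python
import math

def correlation_t_statistic(r, n):
    """Return (d, df) for testing Ho: rho = 0, using d = r * sqrt((n - 2) / (1 - r^2))."""
    if n <= 2:
        raise ValueError("need more than 2 observations")
    if abs(r) >= 1:
        raise ValueError("statistic is undefined when |r| = 1")
    df = n - 2
    d = r * math.sqrt(df / (1 - r ** 2))
    return d, df

# Midterm mark vs clicker mark: 120 students, r = 0.4.
d_midterm, df_midterm = correlation_t_statistic(0.4, 120)

# Midterm absences vs clicker mark: 7 students, r = -0.944.
d_absences, df_absences = correlation_t_statistic(-0.944, 7)

print(f"midterm:  d = {d_midterm:.2f} on {df_midterm} degrees of freedom")
print(f"absences: d = {d_absences:.2f} on {df_absences} degrees of freedom")
```

The statistic carries the sign of r, so a negative correlation gives a negative d; comparing it with a t distribution on n-2 degrees of freedom then proceeds as in the other hypothesis tests.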

## This note was uploaded on 02/01/2011 for the course STAT 202 taught by Professor Springer during the Spring '09 term at Waterloo.
