stats_review_card_triola.pdf - Triola Statistics Series...

Triola Statistics Series Review

Statistics: Methods for planning experiments, obtaining data, organizing, summarizing, analyzing, interpreting, and drawing conclusions based on data.
Population: Collection of all elements to be studied.
Census: Data from every member of a population.
Sample: Subcollection of members from a population.
Parameter: Numerical measurement of characteristic of a population.
Statistic: Numerical measurement of characteristic of a sample.
Random Sample: Every member of population has same chance of being selected.
Simple Random Sample: Every sample of same size n has the same chance of being selected.
Rare Event Rule: If, under a given assumption, the probability of a particular observed event is very small and the observed event occurs significantly less than or significantly greater than what we typically expect with that assumption, conclude that the assumption is probably not correct.
Relative Frequency: $P(A) = \frac{\text{number of times A occurred}}{\text{number of trials}}$
Random Variable: Variable that has a single numerical value, determined by chance, for each outcome.

Describing, Exploring, and Comparing Data

Measures of Center:
Population mean: $\mu$
Sample mean: $\bar{x} = \frac{\sum x}{n}$
Mean from frequency dist.: $\bar{x} = \frac{\sum (f \cdot x)}{n}$
Median: Middle value of data arranged in order.
Mode: Most frequent data value(s).
Midrange: $\frac{\text{maximum} + \text{minimum}}{2}$

Measures of Variation:
Range: maximum - minimum
Sample standard deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ or $s = \sqrt{\frac{n(\sum x^2) - (\sum x)^2}{n(n - 1)}}$
St. dev. from frequency dist.: $s = \sqrt{\frac{n[\sum (f \cdot x^2)] - [\sum (f \cdot x)]^2}{n(n - 1)}}$
Sample variance: $s^2$
Population st. dev.: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
Population variance: $\sigma^2$

Distribution: Explore using frequency distribution, histogram, dotplot, stemplot, boxplot.
Outlier: Value far away from almost all other values.
Time: Consider effects of changes in data over time. (Use time-series graphs, control charts.)
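As a rough numerical check of the measures of center and variation listed above, the sketch below computes them for a small made-up sample. The function name `describe` and the data values are illustrative assumptions, not part of the review card; it also compares the definitional and shortcut forms of the sample standard deviation.

```python
import math
import statistics


def describe(data):
    """Return the card's measures of center and variation for a sample."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    # Shortcut form: s = sqrt((n*sum(x^2) - (sum x)^2) / (n*(n - 1)))
    s_shortcut = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))
    return {
        "mean": sum_x / n,
        "median": statistics.median(data),
        "midrange": (max(data) + min(data)) / 2,
        "range": max(data) - min(data),
        # Definitional form: s = sqrt(sum((x - mean)^2) / (n - 1))
        "s": statistics.stdev(data),
        "s_shortcut": s_shortcut,
    }


sample = [2, 4, 4, 5, 7, 8]  # hypothetical data, chosen only for illustration
summary = describe(sample)
print(summary)
```

Both forms of the standard deviation should agree up to floating-point rounding; the shortcut form is mainly convenient for hand calculation.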
Classical Approach: $P(A) = \frac{s}{n}$ (equally likely outcomes)
Complement of Event A: $P(\bar{A}) = 1 - P(A)$
Addition Rule:
Disjoint Events: Cannot occur together.
If A, B are disjoint: $P(A \text{ or } B) = P(A) + P(B)$
If A, B are not disjoint: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
Multiplication Rule:
Independent Events: No event affects probability of other event.
If A, B are independent: $P(A \text{ and } B) = P(A) \cdot P(B)$
If A, B are dependent: $P(A \text{ and } B) = P(A) \cdot P(B \mid A)$, where $P(B \mid A)$ is $P(B)$ assuming that event A has already occurred.

Probability Distribution: Graph, table, or formula that gives the probability for each value of the random variable.
Requirements of random variable: 1. $\sum P(x) = 1$  2. $0 \le P(x) \le 1$
Parameters of random variable: $\mu = \sum [x \cdot P(x)]$ and $\sigma = \sqrt{\sum [x^2 \cdot P(x)] - \mu^2}$
Expected value: $E = \sum [x \cdot P(x)]$

Binomial Distribution: Requires fixed number of independent trials with all outcomes in two categories, and constant probability.
n: Fixed number of trials
x: Number of successes in n trials
p: Probability of success in one trial
q: Probability of failure in one trial
P(x): Probability of x successes in n trials
$P(x) = \frac{n!}{(n - x)!\,x!} \cdot p^x \cdot q^{n - x}$
Mean (binomial): $\mu = np$
St. dev. (binomial): $\sigma = \sqrt{npq}$
Normal approximation to Binomial: Requires $np \ge 5$ and $nq \ge 5$. Use $\mu = np$ and $\sigma = \sqrt{npq}$.
Poisson probability: $P(x) = \frac{\mu^x \cdot e^{-\mu}}{x!}$

Determining Sample Size
Proportion: $n = \frac{[z_{\alpha/2}]^2 \cdot 0.25}{E^2}$, or $n = \frac{[z_{\alpha/2}]^2\,\hat{p}\hat{q}}{E^2}$ ($\hat{p}$ and $\hat{q}$ known)
Mean: $n = \left[\frac{z_{\alpha/2}\,\sigma}{E}\right]^2$

Confidence Intervals
Proportion: $\hat{p} - E < p < \hat{p} + E$ where $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$ and $\hat{p} = \frac{x}{n}$
Mean: $\bar{x} - E < \mu < \bar{x} + E$ where $E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$ ($\sigma$ not known)
Standard Deviation: $\sqrt{\frac{(n - 1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi^2_L}}$
Two Means (Independent): $(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$ where $E = t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ and df = smaller of $n_1 - 1$ and $n_2 - 1$.

ISBN-13: 978-0-13-446509-8 ISBN-10: 0-13-446509-1
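The binomial formulas above can be sketched in a few lines. This is a minimal illustration, assuming the helper names `binomial_pmf`, `binomial_mean_sd`, and `normal_approx_ok` (all hypothetical, not taken from the card) and a made-up case of n = 10 trials with p = 0.5.

```python
import math


def binomial_pmf(x, n, p):
    """P(x) = n! / ((n - x)! x!) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return math.comb(n, x) * p ** x * q ** (n - x)


def binomial_mean_sd(n, p):
    """Return (mu, sigma) with mu = n*p and sigma = sqrt(n*p*q)."""
    q = 1 - p
    return n * p, math.sqrt(n * p * q)


def normal_approx_ok(n, p):
    """Check the card's requirement np >= 5 and nq >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5


# Hypothetical example: probability of exactly 3 successes in 10 fair trials.
prob_3 = binomial_pmf(3, 10, 0.5)
mu, sigma = binomial_mean_sd(10, 0.5)
print(prob_3, mu, sigma, normal_approx_ok(10, 0.5))
```

`math.comb` evaluates the factorial ratio exactly, which avoids overflow and rounding problems from computing the three factorials separately.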
Alternative Cases for Two Independent Means:
If $\sigma_1$, $\sigma_2$ unknown but assumed equal, use pooled variance $s_p^2$: $E = t_{\alpha/2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$ where $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$
If $\sigma_1$, $\sigma_2$ known: $E = z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Matched Pairs: $\bar{d} - E < \mu_d < \bar{d} + E$ where $E = t_{\alpha/2}\frac{s_d}{\sqrt{n}}$ and df = $n - 1$

Hypothesis Testing
Hypothesis test: Procedure for testing claim about a population characteristic.
Null Hypothesis $H_0$: Statement that value of population parameter is equal to some claimed value.
Alternative Hypothesis $H_1$: Statement that population parameter has a value that somehow differs from value in the null hypothesis.
Critical region: All values of test statistic leading to rejection of null hypothesis.
Significance level $\alpha$: Probability that test statistic falls in critical region, assuming null hypothesis is true.
Type I error: Rejecting null hypothesis when it is true. Probability of type I error is significance level $\alpha$.
Type II error: Failing to reject null hypothesis when it is false. Probability of type II error is denoted by $\beta$.
Power of test: Probability of rejecting a false null hypothesis.

Common Critical z Values

Confidence interval:
Confidence level    Critical value
0.90                1.645
0.95                1.96
0.99                2.575

Hypothesis test, right-tailed:
Significance level $\alpha$    Critical value
0.05                1.645
0.025               1.96
0.01                2.33
0.005               2.575

Hypothesis test, left-tailed:
Significance level $\alpha$    Critical value
0.05                −1.645
0.025               −1.96
0.01                −2.33
0.005               −2.575

Hypothesis test, two-tailed:
Significance level $\alpha$    Critical value
0.05                ±1.96
0.01                ±2.575
0.10                ±1.645

Choosing Between t and z for Inferences about Mean
• $\sigma$ unknown and normally distributed population: use t
• $\sigma$ unknown and n > 30: use t
• $\sigma$ known and normally distributed population: use z
• $\sigma$ known and n > 30: use z
If none of the above apply, use nonparametric method or bootstrapping.
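The tabulated critical z values can be reproduced, up to the card's rounding, from the inverse cumulative distribution of the standard normal. The sketch below uses `statistics.NormalDist` from the Python standard library; the function names are assumptions introduced for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_two_tailed(alpha):
    """Positive critical value with area alpha/2 in each tail."""
    return STANDARD_NORMAL.inv_cdf(1 - alpha / 2)


def z_right_tailed(alpha):
    """Critical value with area alpha in the right tail."""
    return STANDARD_NORMAL.inv_cdf(1 - alpha)


def z_left_tailed(alpha):
    """Critical value with area alpha in the left tail."""
    return STANDARD_NORMAL.inv_cdf(alpha)


for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(z_two_tailed(alpha), 3), round(z_right_tailed(alpha), 3))
```

Computed values can differ slightly from the table in the last digit because the card rounds them (for example, it lists 2.575 and 2.33).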
Wording of Conclusion
Original claim includes equality, reject $H_0$: “There is sufficient evidence to warrant rejection of the claim that … [original claim].”
Original claim includes equality, fail to reject $H_0$: “There is not sufficient evidence to warrant rejection of the claim that … [original claim].”
Original claim does not include equality, reject $H_0$: “There is sufficient evidence to support the claim that … [original claim].”
Original claim does not include equality, fail to reject $H_0$: “There is not sufficient evidence to support the claim that … [original claim].”

Two Proportions test statistic: $z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}}$ where $\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$ and $\bar{q} = 1 - \bar{p}$ and $\hat{p}_1 = \frac{x_1}{n_1}$ and $\hat{p}_2 = \frac{x_2}{n_2}$

Two Means (independent samples): Requires two independent simple random samples with both populations normally distributed or $n_1 > 30$ and $n_2 > 30$. The population standard deviations $\sigma_1$ and $\sigma_2$ are usually unknown. Recommendation: do not assume that $\sigma_1 = \sigma_2$.
Test statistic (unknown $\sigma_1$ and $\sigma_2$, and not assuming $\sigma_1 = \sigma_2$): $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ with df = smaller of $n_1 - 1$ and $n_2 - 1$

Hypothesis Testing (One Sample)
Critical Value Method of Testing Hypotheses: Uses decision criterion of rejecting null hypothesis only if test statistic falls within critical region bounded by critical value.
Critical Value: Any value separating critical region from values of test statistic that do not lead to rejection of null hypothesis.
P-value Method of Testing Hypotheses: Uses decision criterion of rejecting null hypothesis only if P-value $\le \alpha$ (where $\alpha$ = significance level).
P-value: Probability of getting value of test statistic at least as extreme as the one found from sample data, assuming that null hypothesis is true.
Left-Tailed Test: P-value = area to left of test statistic
Right-Tailed Test: P-value = area to right of test statistic
Two-Tailed Test: P-value = twice the area in tail beyond test statistic

One Proportion: Requires simple random sample, $np \ge 5$ and $nq \ge 5$, and conditions for binomial distribution.
Test statistic: $z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$ where $\hat{p} = \frac{x}{n}$
One Mean: Requires simple random sample and either n > 30 or normally distributed population.
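A minimal sketch of the one-proportion test using the P-value method described above. The function name `one_proportion_z_test`, its `tail` parameter, and the example counts are assumptions made for illustration; the normal tail areas come from `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(x, n, p0, tail="two"):
    """Return (z, p_value) for H0: p = p0 given x successes in n trials.

    tail is "left", "right", or "two" (hypothetical parameter name).
    """
    p_hat = x / n
    q0 = 1 - p0
    z = (p_hat - p0) / math.sqrt(p0 * q0 / n)
    area_left = NormalDist().cdf(z)
    if tail == "left":
        p_value = area_left
    elif tail == "right":
        p_value = 1 - area_left
    else:
        # Two-tailed: twice the area in the tail beyond the test statistic.
        p_value = 2 * min(area_left, 1 - area_left)
    return z, p_value


# Hypothetical data: 60 successes in 100 trials, testing the claim p = 0.5.
z_stat, p_val = one_proportion_z_test(60, 100, 0.5, tail="two")
alpha = 0.05
print(z_stat, p_val, "reject H0" if p_val <= alpha else "fail to reject H0")
```

The decision line applies the card's criterion directly: reject the null hypothesis only if the P-value is at most the significance level.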
Hypothesis Testing (Alternative Cases for Two Means with Independent Samples)
Requires two independent simple random samples and either of these two conditions: Both populations normally distributed or $n_1 > 30$ and $n_2 > 30$.
Alternative case when $\sigma_1$ and $\sigma_2$ are not known, but it is assumed that $\sigma_1 = \sigma_2$: Pool variances and use test statistic $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}$ where $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$ and df = $n_1 + n_2 - 2$
Alternative case when $\sigma_1$ and $\sigma_2$ are both known values: $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

One Mean test statistic: $t = \frac{\bar{x} - \mu_{\bar{x}}}{s / \sqrt{n}}$ (for $\sigma$ not known) where df = $n - 1$
One Standard Dev. or Variance: Requires simple random sample and normally distributed population. Test statistic: $\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$ where df = $n - 1$
Two Proportions: Requires two independent simple random samples and $np \ge 5$ and $nq \ge 5$ for each.

Procedure
1. Identify original claim, then state null hypothesis (with equality) and alternative hypothesis (without equality).
2. Select significance level $\alpha$.
3. Evaluate test statistic.
4. Proceed with critical value method or P-value method.

Combinations (order doesn’t count) of r items selected from n different items: $_nC_r = \frac{n!}{(n - r)!\,r!}$
Permutations when some items are identical to others: $\frac{n!}{n_1!\,n_2! \cdots n_k!}$

Confidence Intervals (Using One Sample)
Mean ($\sigma$ known): $\bar{x} - E < \mu < \bar{x} + E$ where $E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$

Confidence Intervals (Using Two Samples)
Two Proportions: $(\hat{p}_1 - \hat{p}_2) - E < (p_1 - p_2) < (\hat{p}_1 - \hat{p}_2) + E$ where $E = z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

Poisson Distribution: Discrete probability distribution that applies to occurrences of some event over a specified interval. Its probability formula uses $e \approx 2.71828$.
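The counting formulas and the Poisson probability $P(x) = \mu^x e^{-\mu} / x!$ can be evaluated with the standard `math` module. In this sketch the helper names `permutations_with_identical` and `poisson_pmf` are hypothetical, and the inputs are made-up examples.

```python
import math


def permutations_with_identical(counts):
    """n! / (n1! n2! ... nk!), where counts holds n1..nk and n is their sum."""
    n = sum(counts)
    denominator = math.prod(math.factorial(c) for c in counts)
    return math.factorial(n) // denominator


def poisson_pmf(x, mu):
    """P(x) = mu^x * e^(-mu) / x! for x occurrences with mean mu."""
    return mu ** x * math.exp(-mu) / math.factorial(x)


# Order counts: arrangements of 2 items chosen from 5 different items.
n_p_r = math.perm(5, 2)
# Order doesn't count: selections of 2 items from 5 different items.
n_c_r = math.comb(5, 2)
# Arrangements of the letters of MISSISSIPPI (1 M, 4 I, 4 S, 2 P).
arrangements = permutations_with_identical([1, 4, 4, 2])
# Probability of 0 occurrences when the mean is 2 per interval.
p_zero = poisson_pmf(0, 2.0)
print(n_p_r, n_c_r, arrangements, p_zero)
```

Integer division is safe in `permutations_with_identical` because the multinomial coefficient is always a whole number.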
Counting
Multiplication Counting Rule: If an event can occur m ways and a second event can occur n ways, together they can occur $m \cdot n$ ways.
Factorial Rule: n different items can be arranged n! different ways.
Permutations (order counts) of r items selected from n different items: $_nP_r = \frac{n!}{(n - r)!}$

Probability property: $0 \le P(A) \le 1$

Normal Distribution
Continuous random variable having bell-shaped and symmetric graph and defined by specific equation.
Standard Normal Distribution: Normal distribution with $\mu = 0$ and $\sigma = 1$.
Standard z score: $z = \frac{x - \mu}{\sigma}$
Central limit theorem: As sample size increases, sample means $\bar{x}$ approach normal distribution; $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ so that $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma / \sqrt{n}}$

Matched Pairs
Requires simple random samples of matched pairs and either the number of matched pairs is n > 30 or the pairs have differences from a population with a distribution that is approximately normal.
$d$: Individual difference between values in a single matched pair
$\mu_d$: Population mean difference for all matched pairs
$\bar{d}$: Mean of all sample differences d
$s_d$: Standard deviation of all sample differences d
$n$: Number of pairs of data
Test statistic: $t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$ where df = $n - 1$

Hypothesis Testing (Two Variances or Two Standard Deviations)
Requires independent simple random samples from populations with normal distributions.
$s_1^2$: larger of the two sample variances
$n_1$: size of the sample with the larger variance
$\sigma_1^2$: variance of the population with the larger sample variance
Test statistic: $F = \frac{s_1^2}{s_2^2}$ where $s_1^2$ is the larger of the two sample variances and numerator df = $n_1 - 1$ and denominator df = $n_2 - 1$

Correlation
Scatterplot: Graph of paired (x, y) sample data.
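The matched-pairs t statistic and the F statistic above can be sketched as follows. The function names and all data values are assumptions chosen for illustration; critical values or P-values would still have to come from t and F tables or a statistics package.

```python
import math
import statistics


def matched_pairs_t(first, second, mu_d=0.0):
    """Return (t, df) for paired samples, with t = (d_bar - mu_d) / (s_d / sqrt(n))."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    d_bar = statistics.mean(differences)
    s_d = statistics.stdev(differences)
    return (d_bar - mu_d) / (s_d / math.sqrt(n)), n - 1


def f_statistic(sample_a, sample_b):
    """Return (F, numerator df, denominator df), larger sample variance on top."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical paired measurements (for example, before and after a treatment).
t_stat, t_df = matched_pairs_t([10, 12, 9, 11], [8, 11, 9, 8])
# Hypothetical independent samples for comparing variation.
f_stat, df_num, df_den = f_statistic([1, 2, 3, 4], [2, 4, 6, 8, 10])
print(t_stat, t_df, f_stat, df_num, df_den)
```

Placing the larger sample variance in the numerator follows the card's convention, so F is at least 1 and only the right tail of the F distribution is needed.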
Linear Correlation Coefficient r: Measures strength of linear association between the two variables.
Property of r: $-1 \le r \le 1$
Correlation Requirements: Bivariate normal distribution (for any fixed value of x, the values of y are normally distributed, and for any fixed value of y, the values of x are normally distributed).
Linear Correlation Coefficient: $r = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$ or $r = \frac{\sum (z_x z_y)}{n - 1}$
Explained Variation: $r^2$ is the proportion of the variation in y that is explained by the linear association between x and y.
Hypothesis test
1. Using r as test statistic: If $|r| \ge$ critical value (from table), then there is sufficient evidence to support a claim of linear correlation. If $|r| <$ critical value, there is not sufficient evidence to support a claim of linear correlation.
2. Using t as test statistic: $t = \frac{r}{\sqrt{\frac{1 - r^2}{n - 2}}}$ with df = $n - 2$
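As an illustration of the correlation formulas above, this sketch computes r from the sums in the first formula and then the t test statistic. The function names and the paired data are hypothetical.

```python
import math


def linear_correlation(xs, ys):
    """r = (n*Sxy - Sx*Sy) / (sqrt(n*Sxx - Sx^2) * sqrt(n*Syy - Sy^2))."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


def correlation_t(r, n):
    """Return (t, df) with t = r / sqrt((1 - r^2) / (n - 2)) and df = n - 2."""
    return r / math.sqrt((1 - r ** 2) / (n - 2)), n - 2


# Hypothetical paired (x, y) sample data.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]
r = linear_correlation(x_values, y_values)
t_r, df_r = correlation_t(r, len(x_values))
print(r, r ** 2, t_r, df_r)
```

Here `r ** 2` is the explained variation: the proportion of variation in y accounted for by the linear association with x.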