Triola Statistics Series Review

Basic Terms
Statistics: Methods for planning experiments, obtaining data, organizing, summarizing, analyzing, interpreting, and drawing conclusions based on data.
Population: Collection of all elements to be studied.
Census: Data from every member of a population.
Sample: Subcollection of members from a
population.
Parameter: Numerical measurement of
characteristic of a population.
Statistic: Numerical measurement of characteristic
of a sample.
Random Sample: Every member of population
has same chance of being selected.
Simple Random Sample: Every sample of same size n has the same chance of being selected.

Probability
Rare Event Rule: If, under a given assumption, the probability of a particular observed event is very small and the observed event occurs significantly less than or significantly greater than what we typically expect with that assumption, conclude that the assumption is probably not correct.
Relative Frequency: $P(A) = \dfrac{\text{number of times } A \text{ occurred}}{\text{number of trials}}$

Random Variables
Random Variable: Variable that has a single numerical value, determined by chance, for each outcome.

Describing, Exploring, and Comparing Data
Measures of Center:
Population mean: $\mu$
Sample mean: $\bar{x} = \dfrac{\sum x}{n}$
Mean from frequency dist.: $\bar{x} = \dfrac{\sum (f \cdot x)}{n}$
Median: Middle value of data arranged in order.
Mode: Most frequent data value(s).
Midrange: $\dfrac{\text{maximum} + \text{minimum}}{2}$
Measures of Variation:
Range: maximum - minimum
Sample standard deviation: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$ or $s = \sqrt{\dfrac{n \sum x^2 - (\sum x)^2}{n(n - 1)}}$
St. dev. from frequency dist.: $s = \sqrt{\dfrac{n\left[\sum (f \cdot x^2)\right] - \left[\sum (f \cdot x)\right]^2}{n(n - 1)}}$
Sample variance: $s^2$
Population st. dev.: $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
Population variance: $\sigma^2$
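The two forms of the sample standard deviation above should agree on any data set. A minimal Python sketch, assuming a small made-up data list (the values and function names are illustrative, not from the card):

```python
import math
import statistics

def sample_sd_definition(data):
    """s = sqrt( sum (x - xbar)^2 / (n - 1) )"""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

def sample_sd_shortcut(data):
    """s = sqrt( (n * sum x^2 - (sum x)^2) / (n (n - 1)) )"""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

# Hypothetical sample values, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

s_def = sample_sd_definition(data)
s_short = sample_sd_shortcut(data)
midrange = (max(data) + min(data)) / 2
data_range = max(data) - min(data)

print(round(s_def, 3), round(s_short, 3), midrange, data_range)
```

Both functions use the n - 1 divisor, so they should also match `statistics.stdev` from the standard library.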
Distribution: Explore using frequency distribution,
histogram, dotplot, stemplot, boxplot.
Outlier: Value far away from almost all other values.
Time: Consider effects of changes in data over time.
(Use time-series graphs, control charts.)
Classical Approach: $P(A) = \dfrac{s}{n}$ (equally likely outcomes)
Probability property: $0 \le P(A) \le 1$
Complement of Event A: $P(\bar{A}) = 1 - P(A)$
Probability Distribution: Graph, table, or formula that gives the probability for each value of the random variable.
Requirements of random variable:
1. $\sum P(x) = 1$
2. $0 \le P(x) \le 1$
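The two requirements can be checked mechanically for any table of probabilities. The following is a small sketch, assuming the distribution is given as a plain list of probabilities; the helper names and the tolerance are hypothetical choices, not part of the card:

```python
import math

def is_probability_distribution(probs, tol=1e-9):
    """True when every P(x) lies in [0, 1] and the P(x) values sum to 1."""
    in_range = all(0 <= p <= 1 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

def complement(p_a):
    """P(not A) = 1 - P(A)."""
    return 1 - p_a

# Number of heads in two fair coin tosses: x = 0, 1, 2.
two_tosses = [0.25, 0.5, 0.25]
print(is_probability_distribution(two_tosses))
print(complement(0.25))
```

A list such as `[1.2, -0.2]` sums to 1 but fails the range requirement, which is why both conditions are tested.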
Addition Rule:
Disjoint Events: Cannot occur together.
If A, B are disjoint: $P(A \text{ or } B) = P(A) + P(B)$
If A, B are not disjoint: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
Multiplication Rule:
Independent Events: No event affects probability of other event.
If A, B are independent: $P(A \text{ and } B) = P(A) \cdot P(B)$
If A, B are dependent: $P(A \text{ and } B) = P(A) \cdot P(B \mid A)$, where $P(B \mid A)$ is $P(B)$ assuming that event A has already occurred.
Parameters of random variable:
$\mu = \sum [x \cdot P(x)]$
$\sigma = \sqrt{\sum [x^2 \cdot P(x)] - \mu^2}$
Expected value: $E = \sum [x \cdot P(x)]$
Normal Approximation to Binomial: Requires $np \ge 5$ and $nq \ge 5$. Use $\mu = np$ and $\sigma = \sqrt{npq}$.
Binomial Distribution: Requires fixed number of independent trials with all outcomes in two categories, and constant probability.
n: Fixed number of trials
x: Number of successes in n trials
p: Probability of success in one trial
q: Probability of failure in one trial
P(x): Probability of x successes in n trials
$P(x) = \dfrac{n!}{(n - x)!\,x!} \cdot p^x \cdot q^{n - x}$
Mean (binomial): $\mu = np$
St. dev. (binomial): $\sigma = \sqrt{npq}$
Poisson Distribution: $P(x) = \dfrac{\mu^x \cdot e^{-\mu}}{x!}$ where $e \approx 2.71828$
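As a rough illustration of the binomial and Poisson formulas, the sketch below evaluates them directly and confirms that the general parameter formulas for a random variable reproduce $\mu = np$ and $\sigma = \sqrt{npq}$. The values n = 10 and p = 0.5 and the function names are assumptions made for the example:

```python
import math

def binomial_pmf(x, n, p):
    """P(x) = n! / ((n - x)! x!) * p^x * q^(n - x)"""
    q = 1 - p
    ways = math.factorial(n) // (math.factorial(n - x) * math.factorial(x))
    return ways * p ** x * q ** (n - x)

def poisson_pmf(x, mu):
    """P(x) = mu^x * e^(-mu) / x!"""
    return mu ** x * math.exp(-mu) / math.factorial(x)

# Hypothetical binomial setting: 10 trials, success probability 0.5.
n, p = 10, 0.5
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Parameters of a random variable, computed from the table of P(x).
mean = sum(x * px for x, px in enumerate(probs))
sd = math.sqrt(sum(x ** 2 * px for x, px in enumerate(probs)) - mean ** 2)

print(binomial_pmf(5, n, p), mean, sd)
```

The same table-based `mean` and `sd` computation works for any discrete probability distribution, not only the binomial.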
The Poisson distribution is a discrete probability distribution that applies to occurrences of some event over a specified interval.

Counting
Multiplication Counting Rule: If an event can occur m ways and a second event can occur n ways, together they can occur $m \cdot n$ ways.
Factorial Rule: n different items can be arranged $n!$ different ways.
Permutations (order counts) of r items selected from n different items: ${}_nP_r = \dfrac{n!}{(n - r)!}$
Permutations when some items are identical to others: $\dfrac{n!}{n_1!\,n_2! \cdots n_k!}$
Combinations (order doesn’t count) of r items selected from n different items: ${}_nC_r = \dfrac{n!}{(n - r)!\,r!}$

Normal Distribution
Continuous random variable having bell-shaped and symmetric graph and defined by specific equation.
Standard Normal Distribution: Normal distribution with $\mu = 0$ and $\sigma = 1$.
Standard z score: $z = \dfrac{x - \mu}{\sigma}$
Central Limit Theorem: As sample size increases, sample means $\bar{x}$ approach normal distribution; $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$ so that $z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma / \sqrt{n}}$

Determining Sample Size
Proportion: $n = \dfrac{[z_{\alpha/2}]^2 \cdot 0.25}{E^2}$
Proportion ($\hat{p}$ and $\hat{q}$ known): $n = \dfrac{[z_{\alpha/2}]^2\,\hat{p}\hat{q}}{E^2}$
Mean: $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2$

Confidence Intervals (Using One Sample)
Common Critical z Values
Confidence level    Critical value
0.90                1.645
0.95                1.96
0.99                2.575
Proportion: $\hat{p} - E < p < \hat{p} + E$ where $E = z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$ and $\hat{p} = \dfrac{x}{n}$
Mean: $\bar{x} - E < \mu < \bar{x} + E$ where $E = t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$ ($\sigma$ not known) or $E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$ ($\sigma$ known)
Standard Deviation: $\sqrt{\dfrac{(n - 1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n - 1)s^2}{\chi^2_L}}$

Confidence Intervals (Using Two Samples)
Two Proportions: $(\hat{p}_1 - \hat{p}_2) - E < (p_1 - p_2) < (\hat{p}_1 - \hat{p}_2) + E$ where $E = z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$
Two Means (Independent): $(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$ where $E = t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ and df = smaller of $n_1 - 1$ and $n_2 - 1$.
Alternative Cases for Two Independent Means:
If $\sigma_1$, $\sigma_2$ unknown but assumed equal, use pooled variance $s_p^2$: $E = t_{\alpha/2}\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}$ where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$
Known $\sigma_1$ and $\sigma_2$: $E = z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
Matched Pairs: $\bar{d} - E < \mu_d < \bar{d} + E$ where $E = t_{\alpha/2}\dfrac{s_d}{\sqrt{n}}$ and df = $n - 1$

ISBN-13: 978-0-13-446509-8
ISBN-10: 0-13-446509-1

Hypothesis Testing
Hypothesis Test: Procedure for testing claim about a population characteristic.
Null Hypothesis $H_0$: Statement that value of population parameter is equal to some claimed value.
Alternative Hypothesis $H_1$: Statement that population parameter has a value that somehow differs from value in the null hypothesis.
Critical Region: All values of test statistic leading to rejection of null hypothesis.
Significance Level $\alpha$: Probability that test statistic falls in critical region, assuming null hypothesis is true.
Type I Error: Rejecting null hypothesis when it is true. Probability of type I error is significance level $\alpha$.
Type II Error: Failing to reject null hypothesis when it is false. Probability of type II error is denoted by $\beta$.
Power of Test: Probability of rejecting a false null hypothesis.

Procedure
1. Identify original claim, then state null hypothesis (with equality) and alternative hypothesis (without equality).
2. Select significance level $\alpha$.
3. Evaluate test statistic.
4. Proceed with critical value method or P-value method:
Critical Value Method of Testing Hypotheses: Uses decision criterion of rejecting null hypothesis only if test statistic falls within critical region bounded by critical value.
Critical Value: Any value separating critical region from values of test statistic that do not lead to rejection of null hypothesis.
P-value Method of Testing Hypotheses: Uses decision criterion of rejecting null hypothesis only if P-value $\le \alpha$ (where $\alpha$ = significance level).
P-value: Probability of getting value of test statistic at least as extreme as the one found from sample data, assuming that null hypothesis is true.
Left-Tailed Test: P-value = area to left of test statistic
Right-Tailed Test: P-value = area to right of test statistic
Two-Tailed Test: P-value = twice the area in tail beyond test statistic

Hypothesis Test: Right-Tailed
Significance level α    Critical value
0.05                    1.645
0.025                   1.96
0.01                    2.33
0.005                   2.575

Hypothesis Test: Left-Tailed
Significance level α    Critical value
0.05                    −1.645
0.025                   −1.96
0.01                    −2.33
0.005                   −2.575

Hypothesis Test: Two-Tailed
Significance level α    Critical value
0.05                    ±1.96
0.01                    ±2.575
0.10                    ±1.645

Choosing Between t and z for Inferences about Mean
• $\sigma$ unknown and normally distributed population: use t
• $\sigma$ unknown and n > 30: use t
• $\sigma$ known and normally distributed population: use z
• $\sigma$ known and n > 30: use z
If none of the above apply, use nonparametric method or bootstrapping.

Wording of Conclusion
Does original claim include equality?
Yes, reject $H_0$: “There is sufficient evidence to warrant rejection of the claim that … [original claim].”
Yes, fail to reject $H_0$: “There is not sufficient evidence to warrant rejection of the claim that … [original claim].”
No, reject $H_0$: “There is sufficient evidence to support the claim that … [original claim].”
No, fail to reject $H_0$: “There is not sufficient evidence to support the claim that … [original claim].”

Hypothesis Testing (One Sample)
One Proportion: Requires simple random sample, $np \ge 5$ and $nq \ge 5$, and conditions for binomial distribution.
Test statistic: $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$ where $\hat{p} = \dfrac{x}{n}$
One Mean: Requires simple random sample and either n > 30 or normally distributed population.
Test statistic: $t = \dfrac{\bar{x} - \mu_{\bar{x}}}{s / \sqrt{n}}$ (for $\sigma$ not known) where df = $n - 1$
One Standard Dev. or Variance: Requires simple random sample and normally distributed population.
Test statistic: $\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2}$ where df = $n - 1$

Hypothesis Testing (Two Proportions or Two Independent Means)
Two Proportions: Requires two independent simple random samples and $np \ge 5$ and $nq \ge 5$ for each.
Test statistic: $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}}$ where $\bar{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ and $\bar{q} = 1 - \bar{p}$ and $\hat{p}_1 = \dfrac{x_1}{n_1}$ and $\hat{p}_2 = \dfrac{x_2}{n_2}$
Two Means (independent samples): Requires two independent simple random samples with both populations normally distributed or $n_1 > 30$ and $n_2 > 30$.
The population standard deviations $\sigma_1$ and $\sigma_2$ are usually unknown.
Recommendation: do not assume that $\sigma_1 = \sigma_2$.
Test statistic (unknown $\sigma_1$ and $\sigma_2$, and not assuming $\sigma_1 = \sigma_2$): $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$ where df = smaller of $n_1 - 1$ and $n_2 - 1$

Hypothesis Testing (Alternative Cases for Two Means with Independent Samples)
Requires two independent simple random samples and either of these two conditions: Both populations normally distributed or $n_1 > 30$ and $n_2 > 30$.
Alternative case when $\sigma_1$ and $\sigma_2$ are not known, but it is assumed that $\sigma_1 = \sigma_2$: Pool variances and use test statistic
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$ where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$ and df = $n_1 + n_2 - 2$
Alternative case when $\sigma_1$ and $\sigma_2$ are both known values:
$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
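As a sketch of the P-value method applied to the one-proportion test statistic, the code below computes z and the tail area with the standard normal distribution from Python's `statistics` module. The function name, the sample counts (540 successes in 1000 trials), and the claimed proportion 0.5 are hypothetical values chosen for illustration:

```python
import math
from statistics import NormalDist

def one_proportion_z_test(x, n, p0, tail="two"):
    """Return (z, P-value) for testing a claim about a proportion.

    z = (p_hat - p) / sqrt(p q / n), with p and q taken from the null hypothesis.
    tail is "left", "right", or "two".
    """
    p_hat = x / n
    q0 = 1 - p0
    z = (p_hat - p0) / math.sqrt(p0 * q0 / n)
    area_left = NormalDist().cdf(z)
    if tail == "left":
        p_value = area_left                 # area to left of test statistic
    elif tail == "right":
        p_value = 1 - area_left             # area to right of test statistic
    else:
        p_value = 2 * min(area_left, 1 - area_left)  # twice the tail area
    return z, p_value

# Hypothetical data: 540 successes in 1000 trials, H0: p = 0.5, H1: p > 0.5.
alpha = 0.05
z, p_value = one_proportion_z_test(540, 1000, 0.5, tail="right")
reject_h0 = p_value <= alpha
print(round(z, 2), round(p_value, 4), reject_h0)
```

Here np and nq are both 500, so the requirement of at least 5 in each category is met; with smaller counts the normal approximation would not be appropriate.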
Matched Pairs
Requires simple random samples of matched pairs
and either the number of matched pairs is n 7 30 or
the pairs have differences from a population with a
distribution that is approximately normal.
$d$: Individual difference between values in a single matched pair
$\mu_d$: Population mean difference for all matched pairs
$\bar{d}$: Mean of all sample differences d
$s_d$: Standard deviation of all sample differences d
n: Number of pairs of data
Test statistic: $t = \dfrac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$ where df = $n - 1$
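The matched-pairs statistic can be computed directly from the list of differences. A minimal sketch, assuming two small made-up lists of paired measurements (the data and the function name are illustrative only):

```python
import math
import statistics

def matched_pairs_t(first, second, mu_d=0.0):
    """Return (t, df) for t = (d_bar - mu_d) / (s_d / sqrt(n)), df = n - 1."""
    d = [a - b for a, b in zip(first, second)]  # individual differences
    n = len(d)
    d_bar = statistics.mean(d)                  # mean of the differences
    s_d = statistics.stdev(d)                   # sample st. dev. of the differences
    t = (d_bar - mu_d) / (s_d / math.sqrt(n))
    return t, n - 1

# Hypothetical paired measurements (for example, before and after a treatment).
before = [10, 12, 9, 11, 13]
after = [8, 11, 9, 9, 10]

t_stat, df = matched_pairs_t(before, after)
print(round(t_stat, 3), df)
```

With only five pairs, the requirement that the differences come from an approximately normal population would need to be checked before relying on this statistic.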
Hypothesis Testing (Two Variances or Two Standard Deviations)
Requires independent simple random samples from
populations with normal distributions.
$s_1^2$: larger of the two sample variances
$n_1$: size of the sample with the larger variance
$\sigma_1^2$: variance of the population with the larger sample variance
Test statistic: $F = \dfrac{s_1^2}{s_2^2}$ where $s_1^2$ is the larger of the two sample variances and numerator df = $n_1 - 1$ and denominator df = $n_2 - 1$

Correlation
Scatterplot: Graph of paired (x, y) sample data.
Linear Correlation Coefficient r: Measures strength of linear association between the two variables.
Property of r: $-1 \le r \le 1$
Correlation Requirements: Bivariate normal distribution (for any fixed value of x, the values of y are normally distributed, and for any fixed value of y, the values of x are normally distributed).
Linear Correlation Coefficient: $r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$ or $r = \dfrac{\sum (z_x z_y)}{n - 1}$
Explained Variation: $r^2$ is the proportion of the variation in y that is explained by the linear association between x and y.
Hypothesis Test
1. Using r as test statistic: If $|r| \ge$ critical value (from table), then there is sufficient evidence to support a claim of linear correlation. If $|r| <$ critical value, there is not sufficient evidence to support a claim of linear correlation.
2. Using t as test statistic: $t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$ with df = $n - 2$
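The correlation formulas translate almost term by term into code. The sketch below uses a small made-up set of (x, y) pairs; the data and function names are assumptions for illustration:

```python
import math

def linear_correlation(xs, ys):
    """r = (n*sum(xy) - sum(x)*sum(y)) /
           (sqrt(n*sum(x^2) - sum(x)^2) * sqrt(n*sum(y^2) - sum(y)^2))"""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def correlation_t(r, n):
    """t = r / sqrt((1 - r^2) / (n - 2)), with df = n - 2."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))

# Hypothetical paired sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = linear_correlation(xs, ys)
explained = r ** 2              # proportion of variation in y explained
t = correlation_t(r, len(xs))
print(round(r, 4), round(explained, 2), round(t, 4))
```

The t value would then be compared against a critical value with df = n - 2 from a t table, which is not reproduced here.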
- Fall '16
- Steve Poole
- Statistics, Probability