Normal Distributions and Probabilities (4.10)
Often, the histogram for a set of data is mound-shaped and symmetric. Normal or
Gaussian curves may be used as a smooth approximation to such histograms. A (continuous) variable whose distribution follows a no
Inference about μ (Chapter 5)
The purpose of statistical inference is to draw conclusions about a population from a
set of data.
There are two basic types of statistical inference:
1. Confidence Intervals: making statements with some degree of confidence (thi
Measures of Center (3.4)
This handout introduces and compares several measures of the center of a distribution, such
as the sample mean, median, mode, and trimmed mean.
1. Sample Mode: The sample mode is the most frequently occurring value in the
sample.
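The measures of center named above can be compared on a small data set. The following is a minimal sketch, assuming a made-up sample with one large outlier; the data values and the `trimmed_mean` helper are hypothetical and are not taken from the handout. It uses Python's standard `statistics` module, and the trimmed mean here drops the same fraction of sorted values from each end before averaging.

```python
import statistics

def trimmed_mean(values, proportion):
    """Mean after dropping the lowest and highest `proportion` of the sorted values.

    Hypothetical helper for illustration; `proportion` should be small enough
    that at least one value remains after trimming.
    """
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values dropped from each end
    kept = ordered[k:len(ordered) - k]
    return statistics.mean(kept)

# Hypothetical sample (not from the handout) with one extreme value.
data = [2, 3, 3, 4, 5, 6, 7, 8, 9, 53]

sample_mean = statistics.mean(data)      # pulled upward by the outlier
sample_median = statistics.median(data)  # middle of the sorted values
sample_mode = statistics.mode(data)      # most frequently occurring value
sample_trimmed = trimmed_mean(data, 0.10)

print("mean:", sample_mean)
print("median:", sample_median)
print("mode:", sample_mode)
print("10% trimmed mean:", sample_trimmed)
```

In this made-up sample the single large value moves the mean well above the median, while the trimmed mean stays close to the median.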
Example:
Homework #1:
P.13 (1.2, 1.4), P.48-54 (2.2, 2.12, 2.16, 2.22, 2.29), and the two additional problems
below. Due Friday, September 6.
1. Additional Problem 1: An online poll at a popular web site asked the following:
A nationwide ban of the diet supplemen
Homework #10: P.336-352 (6.10, 6.26, 6.36, 6.60, 6.61), and additional problems 1, 2, 3, 4, & 5 below.
Additional Problem 5 is REQUIRED FOR GRADUATE STUDENTS ONLY.
Due Monday, November 25. This homework will count as 2 homeworks.
Some Notes on the Homewo
Solutions - Homework #7
1. Problem 5.3
(a) The population of interest is all face masks produced by this particular manufacturer.
(b) An answer to the question posed, namely "Is the manufacturer's claim valid?",
would involve the testing of a hypothesis. Spe
Solutions - Homework #8
1. Problem 5.38: This question should have asked for part (b) first and then part (a), so
I will do these in reverse order.
(b) Both a normal quantile plot and histogram (better for discerning shape than a
boxplot) are shown below. I
Solutions - Homework #10
1. Problem 6.10: We could run a t-test, but it would not be appropriate for three reasons.
First, the underlying distributions of Mg and Eu concentrations are likely not normal
since the standard deviations are so large relative t
Solutions - Homework #9
1. Problem 5.56
(a) A boxplot and normal quantile plot of these health care expenditures are shown
below. Based on either plot, it is clear that these data are highly right skewed with
one mild and two extreme outliers, so are not
Homework #8:
P.281-288 (5.38, 5.46, 5.64, 5.66, 5.72), & the 3 additional problems
below. Due Friday, November 1.
Some Notes on the Homework:
For Problem 5.64: the mean and standard deviation of the 15 mercury concentrations are: ȳ = 1.466
mg/m³ and
Homework #9:
P.284 (5.56), and additional problems 1, 2, and 3 below.
Additional Problem 3 is REQUIRED FOR GRADUATE STUDENTS ONLY.
Due Wednesday, November 6.
Some Notes on the Homework:
The data for Additional Problem 2 are available on the course webp
Homework #4:
P.203-208 (4.1, 4.7, 4.26, 4.34), and the four additional problems below.
Due Friday, September 27.
Some Notes on the Homework:
Organize your homework and place the problems in the order in which they were assigned.
Please DO NOT put extr
Homework #5:
P.209-219 (4.36, 4.42, 4.52, 4.111), and the five additional problems below.
Due Friday, October 11.
Some Notes on the Homework:
For all problems involving probability notation, use the notation very carefully. Come see me if you
have questi
Homework #2:
P.117-128 (3.2, 3.10, 3.11, 3.19, 3.21, 3.39), and the three additional problems
below. Due Friday, September 13.
Some Notes on the Homework:
For problems 3.2, 3.10, 3.11, and Additional Problem 1, the data sets can be found on the course
Homework #3:
P.125-137 (3.30, 3.41, 3.42, 3.45), & the additional problems below.
[Additional Problems 1-3 are required for all students, & Additional Problem 4 is
REQUIRED FOR GRADUATE STUDENTS ONLY.] Due Friday, September 20.
Some Notes on the Homework
Homework #7:
P.276-286 (5.3, 5.8, 5.12, 5.26, 5.32, 5.34, 5.65), & the 3 additional problems
below. Due Friday, October 25.
Some Notes on the Homework:
Organize your homework and place the problems in the order in which they were assigned. Please DO
NO
Conditional Probability and Independence (4.4)
Example: In many samples of heavy metals from a river near an industrial plant, it was
found that:
32%
contained toxic levels of lead,
16%
contained toxic levels of mercury,
38%
contained toxic levels of lead
Surveys and Experimental Studies to Collect Data (Chapter 2)
Chapter 2 deals with various aspects of using sampling methods to produce data. A primary
goal in sampling is to produce meaningful data; hence, one should keep in mind specific
questions when con
Probability (4.1-4.3)
Recall: A goal of statistics is to make inferences about a population based on a sample.
Since only the sample is used, there is uncertainty in such inferences.
Probability is the mathematics of uncertainty.
Example: Do you think we
Random Sampling and Sampling Distributions (4.11, 4.12)
Recall: A sample of n measurements taken from a population (of size N > n) is a random
sample if every sample of size n has the same probability of being chosen.
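The definition above can be sketched in code. This is an illustrative example only, assuming a small hypothetical population of N = 10 labeled units and a sample size of n = 3; the function name `simple_random_sample` and the seed are assumptions made for the sketch. Python's `random.sample` draws without replacement, so every subset of size n is equally likely, and `math.comb` counts how many such subsets exist.

```python
import math
import random

def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement so each size-n subset is equally likely.

    Hypothetical helper; the seed is only there to make the sketch repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population of N = 10 labeled units (not from the handout).
population = list(range(1, 11))
n = 3

sample = simple_random_sample(population, n, seed=0)

# Number of distinct samples of size n, and the chance of any one of them.
num_possible = math.comb(len(population), n)
prob_each = 1 / num_possible

print("sample:", sample)
print("possible samples of size", n, ":", num_possible)
print("probability of each:", prob_each)
```

With N = 10 and n = 3 there are 120 possible samples, so under this scheme each one is chosen with probability 1/120.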
The goal of random sampling is to av
Summarizing Data (Chapter 3)
Chapter 3 deals with methods of summarizing data both numerically and visually. The
data summary step in a statistical analysis is also known as descriptive statistics or
exploratory data analysis (EDA).
Goal of Data Summary:
Test of Significance for μ (5.4-5.6)
These notes are intended as a reference to some of the basic ideas governing tests of significance (hypothesis testing). In addition to summarizing the basic theory behind hypothesis
testing, this handout seeks to make the r
WHAT YOU NEED TO KNOW - Chapters 1-4.5
1. Know the basic steps involved in statistical methodology: collecting, summarizing, analyzing,
and presenting data.
2. Know how to identify the population and sample in a study, and what the benefits and
drawbacks of
Random Variables and Probability Distributions (4.6-4.9)
Recall from Chapter 1 that there are two basic types of random variables:
1. A categorical random variable is one which records into which of several categories
an observation falls (Ex: gender, pol
Measures of Variation (3.5, 3.6)
[Figure: three histograms, each with vertical axis labeled Frequency]
Consider the following three relative frequency histograms. All have mean and median equal
to 100 a
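The idea that data sets can share a center yet differ in spread can be illustrated numerically. The sketch below assumes three made-up samples, each with mean and median 100; these values are hypothetical and are not the data behind the handout's histograms. It reports the range and the sample standard deviation (with the n - 1 divisor, as computed by `statistics.stdev`).

```python
import statistics

# Three hypothetical samples (not the handout's data), all centered at 100.
samples = {
    "A": [98, 99, 100, 101, 102],
    "B": [80, 90, 100, 110, 120],
    "C": [0, 50, 100, 150, 200],
}

summary = {}
for name, values in samples.items():
    summary[name] = {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),      # largest minus smallest value
        "stdev": statistics.stdev(values),       # sample standard deviation (n - 1)
    }
    print(name, summary[name])
```

The mean and median agree across all three samples, so only the range and standard deviation distinguish them.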
Solutions - Homework #1
1. Problem 1.2
(a) The population of interest is the set of all possible points within the suspect area
at which radioactivity could be measured.
(b) The sample in this problem is the set of 200 points where the radioactivity measu
Solutions - Homework #2
1. Problem 3.2
(a) A pie chart would not be an appropriate graphical summary for these data because
the data for each year are numbers of car types (in thousands of vehicles) rather
than percentages. We could compute the proportion
Solutions - Homework #3
1. Problem 3.30
(a) A relative frequency histogram for these data is shown to the right.
[Figure: Histogram of Number of Trees; vertical axis: Percentage]
(b) The sample mean of these data is:
ȳ = (1/n) Σ yi
= (1/70) (7 + 8 + ... + 8)
= (1/70) (541)
=
Solutions - Homework #4
1. Problem 4.1
(a) As this announcement does not appear to be based on the frequency of occurrence
of rises and falls in beef prices, this is a subjective probability interpretation.
(b) Since the 0.998 probability appears to be ba
Solutions - Homework #5
1. Problem 4.36
(a) The life of a battery is continuous because time is measured along a continuum.
(b) Since the number of rain delays is a count and thus must be an integer value, it is
a discrete random variable.
(c) Since the t