Final Review
Linear and Multiple Regression
Example 1
For each of the following pairs of variables,
indicate whether you would expect a positive
correlation, a negative correlation, or a correlation
close to zero. Explain your choice.
a)
Exam 3 Review
Lessons 24-34
Exam 3
Hypothesis Tests (so far)
Parameter                 Hypothesis Test
p                         One-Proportion Z Test
p1 - p2                   Two-Proportion Z Test
p1, p2, ..., pk           Chi-Square Tests
μ                         One-Sample T Test
μd                        Paired T Test
μ1 - μ2                   Two-Sample T Test
μ1 = μ2 = ... = μk        A
Exam 2 Review
Lessons 12-23
Example 1
Explain to someone who has not taken a statistics
course what statistical significance means.
One answer:
A result is considered statistically significant if that
result's difference from the null hypothesi
Lesson 42
Section 6.2
Model Selection
Model Selection
The best model is not always the most
complicated. Sometimes including variables that
are not evidently important can actually reduce the
accuracy of predictions.
However, it is not always
Lesson 41
Section 6.1
Multiple Regression
Multiple Regression
The principles of simple linear regression lay the
foundation for more sophisticated regression
methods used in a wide range of challenging
settings.
Multiple regression extends sim
Lesson 21
Section 2.8
Confidence Intervals
Confidence Intervals
Recall that a point estimate is a statistic
calculated from a sample that is used to estimate a
population parameter.
For example, the sample mean, x̄, can be used to
estimate the
Lesson 24
Section 3.3
The Chi-Square Distribution
One-Way Tables
Previously, we have looked at inference for single
proportions (one group) and for difference of
proportions (two groups).
Now we will develop a method for assessing a null
model
Lesson 22
Section 3.1
Inference for Single Proportions
Trial, Success, and Failure
A single event that leads to an outcome can be
called a trial. If the trial has two possible
outcomes, e.g. heads or tails when flipping a coin,
we typically la
Lesson 25
Section 3.3
The Goodness of Fit Test
Goodness of Fit Test
Suppose that you want to determine whether
observed sample frequencies differ significantly
from expected frequencies specified in the null
hypothesis.
This test can be addres
Lesson 26
Section 3.4
Testing for Independence of
Categorical Variables
Test of Independence
We can check whether one categorical variable is
associated with another categorical variable using
a chi-square test of independence.
For this test,
Lesson 29
Section 4.2
Paired Data Inference
Textbook Prices
Are textbooks actually cheaper online? Here we
compare the price of textbooks at UCLA's
bookstore and prices at Amazon.com.
    dept     course   ucla    amazon   diff
1   Am Ind   C170     27.67   27.95    -0.28
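As a rough sketch of how the diff column can be computed for paired data, the snippet below subtracts the Amazon price from the UCLA price for the single row shown. The direction of the subtraction (ucla minus amazon) is an assumption, and it determines the sign of the result. The variable names are hypothetical.

```python
# One row from the textbook table: prices for the same book at two sellers.
ucla_price = 27.67
amazon_price = 27.95

# Assumption: diff is defined as ucla - amazon, so a negative value
# means the book was cheaper at the UCLA bookstore.
diff = round(ucla_price - amazon_price, 2)
print(diff)
```

Because each difference comes from one book priced at both sellers, inference is carried out on the differences rather than on the two price columns separately.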
Lesson 23
Section 3.2
Difference of Two Proportions
Difference of Proportions
Consider the following questions:
Can giving students a reminder cause them to
be a little thriftier?
Do blood thinners have an effect on the survival
of patients
Lesson 18
Section 2.6
Probabilities Using
the Normal Distribution
Areas Between Two Bounds
For a normal distribution, N(μ, σ), the area
(probability) between two bounds is
P(a < X < b) = normalcdf(a, b, μ, σ)
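The calculator function normalcdf can be mimicked in Python as a minimal sketch, using the standard library's statistics.NormalDist. The helper below is a hypothetical stand-in, not the calculator's own implementation, and the bounds and parameters in the usage line are chosen only for illustration.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu, sigma):
    """Area under the N(mu, sigma) curve between lower and upper."""
    dist = NormalDist(mu=mu, sigma=sigma)
    # P(lower < X < upper) = P(X < upper) - P(X < lower)
    return dist.cdf(upper) - dist.cdf(lower)

# Illustrative values only: standard normal, within one standard deviation.
area = normalcdf(-1, 1, 0, 1)
print(round(area, 4))  # prints 0.6827
```

The area between two bounds is the difference of two cumulative probabilities, which is why the helper needs only the cdf.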
Example 1
Suppose X ~ N(μ = 100, σ =
Lesson 14
Section 2.3
The Hypothesis Test Procedure
Make-Up Final
A (semi) well-known story* goes something like this:
Four students missed the final exam for their statistics
class. They went to the professor and said, "Please,
oh please, let
Lesson 12
Section 2.1
Randomization Case Study:
Gender Discrimination
Example 1
a) Suppose you flip a coin 100 times, getting 51
heads and 49 tails. Would that be evidence of an
unfair coin?
b) Suppose you flip another coin 100 times, getting
Lesson 11
Section 1.6
Categorical Data
Frequency
The first step to organizing categorical data is to
count the number of data values there are in each
category of interest.
We can organize these counts (or frequencies)
into a frequency table,
Lesson 15
Section 2.3
Randomization Case Study:
CPR Patients
Two-Sided Hypotheses
Earlier we explored whether women were
discriminated against (Lesson 12) and whether a
simple trick could make students a little thriftier
(Lesson 13). In these
Lesson 10
Section 1.6
Graphing Numerical Data
Graphs of Numerical Data
One major reason for constructing a graph of
numerical data is to display its distribution, or the
pattern of variability displayed by the data of a
variable.
Three popular
Lesson 13
Section 2.2
Randomization Case Study:
Opportunity Cost
Opportunity Cost
How rational and consistent is the behavior of the
typical American college student?
For this case study, we'll explore whether college
student consumers always
INDEPENDENT
Independent: there is no relationship between two variables
Distribution of one variable is the same for all categories of another
variable
NOT Independent: there is a relationship between two variables
Distribution of one variable is different fo
Lesson 2
Appendix A.2
Conditional Probability
Multiplication Rule (Independent)
Two events are independent if the outcome of
one does not affect the probability of the other
event.
Consider two independent events A and B with
individual probab
Lesson 5
Section 1.4
Sampling Methods
Data Collection
There are two primary types of data collection:
observational studies and experiments.
Generally, data in observational studies are
collected only by monitoring what occurs, while
experimen
Lesson 3
Sections 1.1 and 1.2
Understanding Data
So What Is Statistics?
Let us begin with the Course Description:
"Introduction to the basic principles of probability,
descriptive statistics, and inferential statistics.
Topics include properti
Lesson 4
Section 1.3
Study Beginnings
Populations and Samples
The population is the complete collection of
individuals or objects that you wish to learn about.
To study larger populations, we select a sample. The
idea of sampling is to select
Lesson 6
Section 1.5
Experiments
Experiments
Studies where the researchers assign treatments
to cases are called experiments. When this
assignment includes randomization (such as coin
flips) to decide which treatment a patient receives,
it is
Lesson 8
Section 1.6
Averages and Variation
Summarizing Data
The distribution of a variable is the overall pattern
of how often the possible values occur. For
numerical variables, three summary characteristics
of the overall distribution of th
Lesson 1
Appendix A.1
Defining Probability
Probability
Probability measures the
uncertainty that is associated
with the outcomes of a
particular random process, a
planned operation carried out
under controlled conditions.
Flipping coins or rol
The modern age has been called an age of statistics.
Statistics are now used extensively and effectively
in all fields of social life and, like the ability to
read and write, have become an essential part of
human life. Statistics prov