How do we check for independence?
Recall that two events are independent when neither event influences the other. That is, knowing
that one event has already occurred does not influence the probability that t
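A minimal sketch of this check, assuming a small sample space of equally likely outcomes (a fair six-sided die, chosen only for illustration): two events A and B are independent exactly when P(A and B) = P(A) × P(B). The helper names `prob` and `are_independent` are hypothetical.

```python
from fractions import Fraction

def prob(event, sample_space):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def are_independent(a, b, sample_space):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b, sample_space) == prob(a, sample_space) * prob(b, sample_space)

die = set(range(1, 7))          # illustrative sample space: one roll of a fair die
even = {2, 4, 6}                # P = 1/2
at_most_two = {1, 2}            # P = 1/3
at_most_three = {1, 2, 3}       # P = 1/2

# even and at_most_two: P(both) = 1/6 and 1/2 * 1/3 = 1/6, so independent
print(are_independent(even, at_most_two, die))
# even and at_most_three: P(both) = 1/6 but 1/2 * 1/2 = 1/4, so not independent
print(are_independent(even, at_most_three, die))
```

Exact fractions are used so the equality test is not affected by floating-point rounding.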
Examples
Example 1
The probability of a student getting an A in this course is 0.25 (Not True!) and the probability of
getting a B is 0.30 (again Not True!). What is the probability of getting an A or a B? Ac
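As a rough sketch of the arithmetic, assuming a student receives exactly one grade so that "A" and "B" cannot both occur, the probability of "A or B" is the sum of the two probabilities. The function name `prob_a_or_b` is hypothetical; the optional `p_both` argument covers the general case where the events can overlap.

```python
def prob_a_or_b(p_a, p_b, p_both=0.0):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B).

    For mutually exclusive events p_both is 0, so the rule
    reduces to a simple sum.
    """
    return p_a + p_b - p_both

# Values from the example: P(A) = 0.25, P(B) = 0.30, grades are mutually exclusive
p = prob_a_or_b(0.25, 0.30)
print(f"P(A or B) = {p:.2f}")
```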
Conditional Probability
In the lesson on Examining Relationships we found conditional distributions from two-way
tables [for example, to find the percentage of students who did not smoke
cigarettes given gen
General Probability Rules
Rule 1: The probability of an impossible event is zero; the probability of a certain event is one.
Therefore, for any event A, the range of possible probabilities is: 0 ≤ P(A) ≤ 1
Rule 2
Basic Principles of Statistical Design of Experiments
Example
A group of college students believe that regular consumption of a special Asian tea could benefit
the health of patients in a nearby nursing home.
Designing Experiments
Example
Suppose some group claims that drinking caffeinated coffee causes hyperactivity in college
students, ages 18 to 22. How would this group produce data to determine the validity of th
Designing Samples
The entire group of individuals about which information is wanted is called the population. It
may be somewhat abstract. The part of the population actually examined to gather information i
Cautions about Correlation and Regression
Influential Outliers
In most practical circumstances an outlier decreases the value of a correlation coefficient and
weakens the regression relationship, but it's also p
Comparing Two Quantitative Variables
As we did when considering only one variable, we begin with a graphical display.
A scatterplot is the most useful display technique for comparing two quantitative variables
Comparing Two Categorical Variables
Understand that some categorical variables exist naturally (e.g. a person's race, political party
affiliation, or class standing), while others are created by grouping a q
Finding Outliers Using IQR
Some observations within our data set may fall outside the general scope of the remaining
observations. Such observations are called outliers. To aid in determining whether any valu
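One common convention for flagging outliers with the IQR, sketched here as an assumption since the excerpt is cut off, is the 1.5 × IQR rule: any value below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR is flagged. The data values and the function name `iqr_outliers` are made up for illustration, and quartile conventions differ slightly between textbooks and software, so the fences may not match a hand calculation exactly.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    # statistics.quantiles with n=4 returns the three quartile cut points
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical data set with one value far from the rest
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
outliers = iqr_outliers(data)
print(outliers)
```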
Spread (Variability)
The word spread is used as a synonym for variability. Three simple measures of variability are:
Example of Calculating Range and Interquartile Range (IQR)
1. The range is found by subtract
Describing Distributions with Numbers
Location
The word location is used as a synonym for the middle or center of a dataset. There are two
common ways to describe this feature.
1. The mean is the usual numeri
The distribution of a variable shows its pattern of variation, as given by the values of the
variable and their frequencies. The following data set, SAT_DATA.XLS,
or SAT_DATA.MTW (data from College Board) contains the mean SAT scores for each of the
50 U
Interpreting Confidence Intervals
The formula for confidence intervals remains the same:
Sample statistic ± Multiplier × Standard error
In each of the scenarios described in this lesson, the sample statistic woul
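The general formula can be sketched as a small helper. The numbers below are hypothetical (a sample proportion of 0.60 from 100 observations, with the usual 95% z multiplier of 1.96) and are not taken from the lesson's data; the function name `confidence_interval` is likewise an illustration only.

```python
import math

def confidence_interval(statistic, multiplier, standard_error):
    """Return (lower, upper) for: statistic ± multiplier × standard error."""
    margin = multiplier * standard_error
    return statistic - margin, statistic + margin

# Hypothetical one-proportion example
p_hat = 0.60                                  # sample proportion (assumed)
n = 100                                       # sample size (assumed)
se = math.sqrt(p_hat * (1 - p_hat) / n)       # standard error of a sample proportion
low, high = confidence_interval(p_hat, 1.96, se)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The same helper applies to means or differences; only the statistic, the multiplier (z or t), and the standard error change.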
Matched Pairs for Means
Paired Data
Simply put, paired data involves taking two measurements on the same subjects, called repeated
sampling. Think of studying the effectiveness of a diet plan. You would weigh
Comparing Two Independent Proportions
Example 3
In the same survey used for Example 2, students were asked whether they think same-sex
marriage should be legal. We'll compare the proportions saying yes for mal
Comparing Two Independent Means - Unpooled and Pooled
We determine whether to apply "pooled" or "unpooled" procedures by comparing the sample
standard deviations. RULE OF THUMB: If the larger sample standard
General Ideas for Testing Hypotheses
Step 0: Assumptions
1. The samples must be independent and random samples.
2. If two proportions, then the two groups must consist of categorical responses. If two
means,
Comparing Two Groups
Previously we discussed testing means from one sample or paired data. But what about situations
where the data is not paired, such as when comparing exam results between males and females
Errors, Practicality and Power in Hypothesis Testing
Errors in Decision Making: Type I and Type II
How do we determine whether to reject the null hypothesis? It depends on the level of
significance α, which is
Hypothesis Testing for a Mean
Quantitative Response Variables and Means
We usually summarize a quantitative variable by examining the mean value. We
summarize categorical variables by considering the proporti
Hypothesis Testing for a Proportion
Ultimately we will measure statistics (e.g. sample proportions and sample means) and use them
to draw conclusions about unknown parameters (e.g. population proportion and p
Hypothesis Testing
Previously we used confidence intervals to estimate some unknown population parameter. For
example, we constructed 1-proportion confidence intervals to estimate the true population
proporti
Using Software To Calculate Confidence Intervals
Consider again the Class Survey data set (Class_Survey.MTW or Class_Survey.XLS) that
consists of student responses to a survey given last semester in a Stat200 c
Constructing confidence intervals to estimate a population mean
Previously we considered confidence intervals for 1-proportion, and the multiplier in our interval
used a z-value. But what if our variable of in
Constructing confidence intervals to estimate a population proportion
NOTE: the following interval calculations for the proportion confidence interval are dependent on
the following assumptions being satisfied
Toward Statistical Inference
Two designs for producing data are sampling and experimentation, both of which should employ
randomization. As we have already learned, one important aspect of randomization is to
Review of Sampling Distributions
In the later part of the last lesson we discussed finding the probability for a continuous random
variable that followed a normal distribution. We did so by converting the observe
Sampling Distribution of the Sample Mean, xbar
The central limit theorem states that if a large enough sample is taken (typically n > 30) then
the sampling distribution of xbar is approximately a normal distributi
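A small simulation can illustrate the idea. This is only a sketch under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a deliberately skewed choice), the sample size is 40, and 2000 samples are drawn. If the theorem holds, the sample means should center near the population mean with a spread near 1/sqrt(40), roughly 0.158.

```python
import math
import random
import statistics

random.seed(0)          # fixed seed so the run is repeatable
SAMPLE_SIZE = 40        # n > 30, in line with the usual rule of thumb
REPLICATIONS = 2000     # number of samples drawn (assumed for illustration)

def sample_mean(n):
    """Mean of n draws from an exponential population with mean 1."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

means = [sample_mean(SAMPLE_SIZE) for _ in range(REPLICATIONS)]

center = statistics.fmean(means)            # should be close to 1
spread = statistics.stdev(means)            # should be close to the value below
expected_spread = 1 / math.sqrt(SAMPLE_SIZE)

print(f"mean of sample means:  {center:.3f}")
print(f"stdev of sample means: {spread:.3f} (theory: {expected_spread:.3f})")
```

A histogram of `means` would look roughly bell-shaped even though the underlying population is strongly right-skewed.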