STAT220 Problem Set 0
1.)
Questions. What is the difference between summary(sqrt(x)) and sqrt(summary(x))?
Explain what order(x) and rank(x) produce.
summary(sqrt(x)) first takes the square root of each case and then gives a summary of these new
values. Because
STAT 220, Winter 2016
Reminder:
Homework 0 (due January 8, 2016)
Although discussion of problems and approaches is allowed, each student is responsible
for submitting their own original work. Please submit responses for exercises 1-5. Be sure to keep
your
Lec5 2. April 27, 2005
Tests of Significance
Outline:
General Procedure for Hypothesis Testing
Null and Alternative Hypotheses
Test Statistics
p-values
Interpretation of the Significance Level
Tests for a Population Mean
Interpretation of p-values
Sta
Lecture 16
General Framework of Hypothesis Testing
Yibi Huang
Department of Statistics
University of Chicago
Textbook Coverage
Lecture 16 covers section 1.8 (skip the simulation) and some of 4.3
in the text.
1
Case Study: Gender Discrimination
In 1972, a
Events, Addition Rule, General Addition Rule
1. A card is selected at random from a deck of 52 poker cards. Let A be the event that the
selected card is a King, and let B be the event that the selected card is a Queen.
(a) Are events A and B disjoint?
(b)
STAT 22000, 2017 Spring
Practice Midterm Exam
Student Name (Print): (First Name) (Last Name)
1. Do not sit directly next to another student.
2. Do not turn the page until told to do so.
3. You may use your calculator, and one letter-size formula sheet.
4.
Ch2 Probability
How can we evaluate the likeliness of an outcome?
Randomness
An event is random if outcomes are uncertain, but follow a regular distribution that is based
on a large number of repetitions.
Probability describes likeliness of an event over
Inverse Normal Calculations
Determine specific values based on a proportion of interest (work backwards)
Ex. Suppose SAT scores ~ N(500, 120^2)
How much does a student need to score to be in the top 10% of all exam takers?
Z = 1.28 = (x - 500) / 120
x = 653.6
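The table lookup can be cross-checked numerically. The following is a minimal Python sketch, assuming the N(500, 120^2) model from the example; the variable names are illustrative. It uses the unrounded 90th-percentile z-value (about 1.2816 rather than the table value 1.28), so the cutoff comes out slightly above 653.6.

```python
from statistics import NormalDist

# SAT scores modeled as Normal(mean=500, sd=120), as in the example.
sat = NormalDist(mu=500, sigma=120)

# Being in the top 10% means scoring at or above the 90th percentile.
z = NormalDist().inv_cdf(0.90)   # standard normal quantile
cutoff = sat.inv_cdf(0.90)       # equivalent to 500 + z * 120

print(round(z, 2), round(cutoff, 1))
```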
Norma
Mean and Variances of Random Variables
Recall: The sample mean x-bar is the average of a set of observations in a sample.
If X is a discrete random variable with probability mass function p(x), the expected value of
X, E[X], is
E[X] = x_1 p(x_1) + x_2 p(x_2) + ... + x_k p(x_k)
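As a small illustration of the expected-value formula, the sketch below computes E[X] for a fair six-sided die. The die is an assumed example, not one taken from the notes.

```python
from fractions import Fraction

# Hypothetical pmf: a fair six-sided die, p(x) = 1/6 for x = 1..6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# A valid pmf must sum to 1.
assert sum(pmf.values()) == 1

# E[X] is the sum over all values x of x * p(x).
expected_value = sum(x * p for x, p in pmf.items())

print(expected_value)  # 7/2
```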
R
Median tends to be more robust than the mean.
If a very large number (an outlier) is added to the set, the mean will shift, but the median will
change much less noticeably.
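A quick numerical check of this claim, using made-up data (the values below are assumptions chosen only for illustration):

```python
from statistics import mean, median

data = [1, 2, 3, 4, 5]
with_outlier = data + [1000]   # hypothetical extreme value added to the set

# The mean jumps from 3 to about 169, while the median only moves from 3 to 3.5.
print(mean(data), median(data))
print(mean(with_outlier), median(with_outlier))
```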
Measurements for spread:
Quartiles can be useful descriptors of how data are distributed (sp
Ex. A team of 3 people is randomly formed from a group of 2 managers, 12 analysts, and 20
technicians.
(a) Probability that team is composed of only analysts?
(b) Probability that both managers are on the team?
(c) Probability that 2 team members are from
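Parts (a) and (b) can be worked by counting equally likely teams with combinations. The sketch below is one way to set this up, assuming every team of 3 is equally likely; part (c) is left out because its statement is cut off above.

```python
from fractions import Fraction
from math import comb

managers, analysts, technicians = 2, 12, 20
total = managers + analysts + technicians   # 34 people
teams = comb(total, 3)                      # number of possible teams of 3

# (a) All three members are analysts.
p_all_analysts = Fraction(comb(analysts, 3), teams)

# (b) Both managers are on the team, plus one of the other 32 people.
p_both_managers = Fraction(comb(managers, 2) * comb(total - managers, 1), teams)

print(p_all_analysts, float(p_all_analysts))
print(p_both_managers, float(p_both_managers))
```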
Density curves (continued)
Where is the mean? The point that balances the distribution. (photo)
Correlation
How can we measure the relationship between two different variables?
Suppose there exist n samples, on which data are collected for variables x an
AP Stat Study Guide
Distributions Measures and Graphs
Center of distribution: mean and median
Population/parameter mean: μ (used for theoretical tests)
Sample/statistical mean: x-bar (used for real tests)
Spread of distribution: range and interquartile
Lecture 12
Binomial Distributions
Yibi Huang
Department of Statistics
University of Chicago
Outline
In Lecture 10, we will cover
Binomial distribution (3.4)
Please skip section 3.3 and 3.5.
1
Binomial distribution
Bernoulli Trials
A random trial having o
Lecture 10-11
Continuous Distributions and
Normal Distributions
Yibi Huang
Department of Statistics
University of Chicago
Outline
In Lecture 10, we will cover Section 2.5 and 3.1 in the text.
Continuous distribution (2.5)
Normal distribution (3.1)
Pleas
Lecture 5-6
Data Collection
Yibi Huang
Department of Statistics
University of Chicago
Outline
Lecture 5-6 covers mostly Section 1.1, 1.3, 1.4, 1.5 in the text.
Experiments (1.1, 1.3.4-1.3.5, 1.5)
Observational Studies (1.3.4-1.3.5, 1.4.1)
Sampling (1.3
Lecture 7-8
Probability
Yibi Huang
Department of Statistics
University of Chicago
Outline
In Lecture 7-8, we cover mostly Section 2.1-2.3 in the text.
Probability and Events (2.1)
General Addition Rule (2.1.2-2.1.3)
The Complement Rule (2.1.4-2.1.5)
C
Lecture 1&2
Exploratory Data Analysis I: Numerical Data
Yibi Huang
Department of Statistics
University of Chicago
Outline
In Lecture 1 & 2, we cover mostly Section 1.2 & 1.6 in the text.
Data and Types of Variables (1.2)
Histograms (1.6.3)
Mean and Media
Homework 1 (Due Friday, January 13, 2017)
STAT 220 Winter 2016
Reminder: Students may discuss concepts and approaches for these problems together. However,
each student is responsible for submitting solutions of their own original work. Show all work
and
Homework 2 (Due Friday, January 20, 2017)
STAT 220 Winter 2017
Reminder: Students may discuss concepts and approaches for these problems together. However,
each student is responsible for submitting solutions of their own original work. Show all work
and
STAT 220, Winter 2017
Reminder:
Homework 0 (due January 9, 2017)
Although discussion of problems and approaches is allowed, each student is responsible
for submitting their own original work. Please submit responses for exercises 1-5. Be sure to keep
your
Continuity correction
Usually, histogram values are the midpoint of a bar.
The region for k extends between k-0.5 and k+0.5
P(X <= k+0.5) and P(X <= k) are the same in the discrete case
However, the upper bound of P(x=k+0.5) includes the entire region (even under th
010917
Methods for randomization:
1. Random number generator
runif(): r = random; unif = uniform distribution;
2. Decide outcome based on number
>=1/2 accept; <1/2 reject;
3. Construct list of random numbers
Sort;
Split: top 50% in one list, bottom 50% in the other;
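The two approaches listed above can be sketched in Python, where random.random() plays the role of a single Uniform(0, 1) draw. The function names, the "accept"/"reject" labels as dictionary values, and the seed parameter are illustrative assumptions rather than part of the notes.

```python
import random

def coin_flip_assign(units, seed=None):
    """Decide each unit independently: draw >= 1/2 accepts, < 1/2 rejects."""
    rng = random.Random(seed)
    return {u: ("accept" if rng.random() >= 0.5 else "reject") for u in units}

def random_split(units, seed=None):
    """Attach a random number to each unit, sort, and split the list in half."""
    rng = random.Random(seed)
    keyed = [(rng.random(), u) for u in units]
    keyed.sort(key=lambda pair: pair[0])     # sort by the random draw only
    ordered = [u for _, u in keyed]
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]    # top 50%, bottom 50%

subjects = list(range(10))                   # hypothetical subject IDs
print(coin_flip_assign(subjects, seed=1))
print(random_split(subjects, seed=1))
```

The first method can produce groups of unequal size, while the sort-and-split method always yields two groups whose sizes differ by at most one.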
Ideal
[Figure: plot of Recessional Velocity (in km/sec) against Distance (in megaparsecs)]
The Universe began about ten billion years ago in a violent explosion; every particle
started rushing apart from every other particle in an early super-d
3. The Big Bang.
[Figure: plot of Recessional Velocity (in km/sec) against Distance (in megaparsecs)]
The Universe began about ten billion years ago in a violent explosion; every particle
started rushing apart from every other particle in
Joe Geyer
STAT220 Problem Set 1
1. (a)
nclass=100 in graph used to answer the questions
(i) The distribution is fairly symmetric with a center near 60 ms.
false
(ii) The distribution is skewed to the left.
false
(iii) The distribution is skewed to the rig
Joe Geyer
Homework 2
2014
17 Oct.
1.) (a)
z = (x - μ) / σ = (88 - 72) / 13.2 = 1.21 standard deviations from the mean
According to the standard normal distribution table, this z value corresponds to a cumulative probability of 0.88686
1 - 0.88686 = 0.11314
0.11314 * 73 (total students) = 8
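The arithmetic in this solution can be reproduced without a table. The sketch below uses the numbers from the worked answer; rounding z to two decimals is an assumption made so that the result matches a printed z-table.

```python
from statistics import NormalDist

mean, sd, score = 72, 13.2, 88   # values from the worked solution
n_students = 73

z = round((score - mean) / sd, 2)        # rounded to match a z-table lookup
upper_tail = 1 - NormalDist().cdf(z)     # P(Z > z)
expected_count = upper_tail * n_students

print(z, round(upper_tail, 5), round(expected_count, 2))
```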
Joe Geyer
Homework 3
1.)
24 Oct. 2014
a) The experimental units are the undergraduate students. There are 40 total
experimental units.
b) The driver's reaction time is the response variable being measured.
c) The treatment factor was the use of a hands-fr
1. This problem is essentially the "On Your Own" part in Lab #6. Please complete the lab and
submit answers to the following questions.
(a) Make a histogram of price, which shows the population distribution of the sale price of all
2930 homes in the data se
3.4 Triathlon times, Part I. In triathlons, it is common for racers to be placed into age and gender groups.
Friends Leo and Mary both completed the Hermosa Beach Triathlon, where Leo competed in the Men,
Ages 30 - 34 group while Mary competed in the Wome
1.16 Income and education in US counties. The scatterplot below shows the relationship between per
capita income (in thousands of dollars) and percent of population with a bachelor's degree in 3,143
counties in the US in 2010
(b) Describe the relationship
2.10 Guessing on an exam. In a multiple choice exam, there are 5 questions and 4 choices for each
question (a, b, c, d). Nancy has not studied for the exam at all and decides to randomly guess the
answers. What is the probability that: (a) the first quest
1. The 2008 General Social Survey asked, "What do you think is the ideal number of children for a family
to have?" The 678 females who responded had a median of 2, mean of 3.22, and standard deviation of
1.99.
(a) What is the point estimate of the populatio