Ch 14 (Random Variable)
Recall:
1. $\frac{x+y}{a} = \frac{x}{a} + \frac{y}{a}$, $\frac{x+y+z}{a} = \frac{x}{a} + \frac{y}{a} + \frac{z}{a}$
Definition 1 (Random variable):
A random variable X (capital X) is a variable whose values x (little x) are outcomes of a
random phenomenon.
Note: In this Chapter, we will
Some Business Rules that need to be incorporated into the Data Dictionary
Make sure you read the comments listed on the next worksheet; these comments identify additional rules/guidelines.
Constraints for the Data Dictionary
Ch. 3: Percentiles, Boxplots
& z-scores
Percentiles divide a data set into hundredths
or 100 equal parts
Data sets have 99 percentiles: P1, P2, P3, ..., P99
Quartiles divide the data set into 4ths
Q1 = 25th Percentile
Q2 = 50th Percentile = Median
Q3 = 75t
Measures of Central Tendency
We describe a data set by examining its
shape, location and dispersion. We've looked
at shape; now we look at location.
If we want to describe a set of data with one
number, what do we use?
A measure of central tendency/locatio
STA 215
Statistical Inference
Probability Distributions &
Expectation
Factorials
k! = k factorial = product of the first k
positive integers
3! = 3 * 2 * 1 = 6
10! = 10*9*8*7*6*5*4*3*2*1 = 3,628,800
0! = 1 (by Definition)
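The factorial definition can be checked with a short Python sketch. The function name `factorial` is chosen here for illustration; the standard library's `math.factorial` is used only as a cross-check.

```python
import math


def factorial(k: int) -> int:
    """Return k! as the product 1 * 2 * ... * k, with 0! = 1 by definition."""
    if k < 0:
        raise ValueError("factorial is not defined for negative integers")
    result = 1  # the empty product, which covers 0! = 1
    for i in range(2, k + 1):
        result *= i
    return result


for k in (0, 3, 10):
    # The hand-rolled product should agree with the standard library.
    assert factorial(k) == math.factorial(k)
    print(f"{k}! = {factorial(k):,}")
```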
Permutations
Permutation of r objects from a
A
Two Populations
We cannot pool the sample standard
deviations (s1 and s2) because we cannot
assume that they are two estimates of
the same population standard deviation (σ).
Thus, we use the non-pooled two sample
t-test
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
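As a minimal sketch, the non-pooled two-sample t statistic can be computed from summary statistics as below. The function name `nonpooled_t`, its parameter names, and the numbers in the example are all hypothetical and are not taken from the course material.

```python
import math


def nonpooled_t(xbar1, xbar2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Non-pooled two-sample t statistic.

    Each sample keeps its own standard deviation (s1, s2); they are not
    combined into one pooled estimate.
    """
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / standard_error


# Hypothetical summary statistics, for illustration only.
t_stat = nonpooled_t(xbar1=10.0, xbar2=8.0, s1=3.0, s2=4.0, n1=9, n2=16)
print(round(t_stat, 4))
```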
We hav
The Normal Distribution
The Coca Cola Company filling machines are
adjusted to pour 12 ounces of Coca Cola in each
12-ounce can. However, the actual amount of soda
poured into each can is not always 12 ounces; it
varies from can to can.
It has been observ
Sampling Distributions &
Estimation (σ known)
We need the characteristics of the sampling
distribution for the sample means.
Then, we can find the probability of occurrence of
any particular event.
If the sampling distribution of the random
variable is no
Proportions
What proportion of all registered voters believe
that Gov. Christie knew about BridgeGate?
What proportion of the population of
households has more than one wage earner?
ρ (rho): proportion of people/entities in the
population that have the att
Sampling Distributions
Mortality Tables permit Actuaries to
determine P(person at any particular age will
live to a specified number of years)
Insurance Companies use these probabilities to
determine life insurance premiums, annuity
payments, etc.
When n
STA 215
Professor Ochs
Problem Set #1
Due at Start of Class: 3 Feb 2015
WHERE APPROPRIATE YOU MUST SHOW YOUR WORK, ONLY PRESENTING FINAL
ANSWERS IS NOT NOW NOR WILL EVER BE ACCEPTABLE IN THIS COURSE
D
1
Problem to Turn In
1) First create the contingency table. The rows or columns may be reversed in your version, but I will use the
affiliation as rows and the choice as columns.
               | Bush | Kerry
Democrats      |   21 |   231
Republicans    |  337 |    17
Independents   |  102 |   141
Total          |  460 |   389
Ot
High-powered hurricanes are known as major hurricanes, with categories 3, 4, or 5.
From the data below which counts major hurricanes in the decade ending on the
given date, determine if there is evidence for a ch
Ch 13 (Conditional Probability and Bayes' Theorem)
Recall:
10% of $254 = $25.4 (tax in an Asian/European country)
7% of $254 = $17.78 (USA tax)
Example 1
In a specific city, about 1% of drivers drink beyond what they are legally allowed (e.g., in
100,0
Ch 18 (t-test)
Get Statistica ready. We will do things like the following:
t-test = $\frac{\text{observed} - \text{expected}}{SE}$. Assume t-test = 3.078

n    | Two-tail Probability (by A-68 t-table) | Two-tail Probability (by Statistica)
2    | 20%                                    | 20%
3    | 8%                                     | 9%
4    | 6%                                     | 5.
10   | 1.5%                                   |
100  | 0.5%                                   |
3000 | 0%                                     |
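The two-tail probabilities for t = 3.078 can also be approximated without a table or Statistica. The sketch below integrates the t density numerically with Simpson's rule, using only the Python standard library. It assumes the degrees of freedom are n − 1, and the names `t_pdf` and `two_tail_p` are illustrative.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    # Work on the log scale so large df does not overflow the gamma function.
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_tail_p(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(|T| >= |t|) via Simpson's rule on [0, |t|] (steps must be even)."""
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # area under the density between 0 and |t|
    return 1 - 2 * central_half


for n in (2, 3, 4, 10, 100, 3000):
    # Assumption: degrees of freedom = n - 1.
    print(n, f"{two_tail_p(3.078, n - 1):.4f}")
```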
Chapter 3 Trim (Mean, Standard Deviation, and Quartiles)
p. 57:
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}(x_1 + x_2 + \cdots + x_n)$
is called the mean of the data set.
Example 1:
(a) Incomes of 10 families in a community:
55, 57, 58, 59, 60, 60, 61, 62, 63, 65 (
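As a quick check, the mean of the ten incomes listed in (a) can be computed with Python's standard `statistics` module. The variable name `incomes` is illustrative, and the sample standard deviation (divisor n − 1) is included on the assumption that it is the version this chapter uses.

```python
import statistics

# The ten family incomes listed in Example 1(a).
incomes = [55, 57, 58, 59, 60, 60, 61, 62, 63, 65]

mean = statistics.mean(incomes)        # (1/n) * (x_1 + x_2 + ... + x_n)
sample_sd = statistics.stdev(incomes)  # divides by n - 1 (assumed convention)

print("n =", len(incomes))
print("mean =", mean)
print("sample standard deviation =", round(sample_sd, 3))
```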
Lecture 2
Ex. 1
In 1975, at U.C. Berkeley, the Dean of the Graduate School noticed that about
45% of male applicants were admitted and about 30% of female applicants
got in. The difference was about 15%.
This looked like a case of sex discrimination. The Dean
STA 215
Professor Ochs
Problem Set #5
Due at Start of Class: 13 Mar 2015
STA215: Statistical Inference
Section 3, 4
Course Syllabus
Textbook: De Veaux, Velleman, Bock: Intro Stats, 4th Ed.
Policy on Older Editions: Students may use earlier versions of the text so long as they take
full
STA 215
Professor Ochs
Problem Set #4
Due at Start of Class: 27 Feb 2015
STA 215
Professor Ochs
Problem Set #3
Due at Start of Class: 20 Feb 2015
STA 215
Professor Ochs
Problem Set #2
Due at Start of Class: 10 Feb 2015
1
Problem to Turn In
a) (dice) = 4^3 = 64
b) The minimum value will occur when each die shows its minimum, 1 + 1 + 2 = 4, and the maximum value when
each die has its maximum, 7 + 7 + 8 = 22. The full sample space contains only even numbers, as there is no
Sampling and Variables
Does varying one or more Explanatory
Variables (Independent Variables/Factors) have
an effect on a Response Variable (Dependent
Variable)
o Does varying the explanatory variable cause a change
in the response variable?
o Observation
Estimation & the t Distribution
Let α be some number between 0 and 1.
1 − α is the confidence level
Z_α = the z-score that has area α to its RIGHT
under the standard normal curve
Z_0.05 = the z-score that has area 0.05 to its RIGHT
under the normal curve
Z_{α/2} = the
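A z-score with a given area to its right can be looked up numerically instead of in a table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_upper` is illustrative.

```python
from statistics import NormalDist


def z_upper(alpha: float) -> float:
    """Return the z-score with area alpha to its RIGHT under the standard normal curve."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Area alpha to the right means area 1 - alpha to the left.
    return NormalDist().inv_cdf(1 - alpha)


for alpha in (0.05, 0.025, 0.005):
    print(f"area {alpha} to the right: z = {z_upper(alpha):.3f}")
```

For a two-sided interval at confidence level 1 − α, the value needed is `z_upper(alpha / 2)`.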
STA 115
Total points: 35
Quiz#1
NAME:
1. Identify the variable as Quantitative or categorical and state one possible graphical
method to display the distribution of this variable. [4]
Quant/Cat/Borderline
Display
A) Salaries of employees at TCNJ
Quant
His
t Interval
o Draw an SRS of size n from a population having unknown mean μ. A level C confidence
interval for μ is
o $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
t* is considered the upper critical value for the t(n − 1) distribution
t* is found in Table C by using the degrees of freedom and the significance level
t
Calculating the z Test Statistic
o To test the hypothesis H0: μ = μ0 based on an SRS of size n from a population with unknown
mean μ and known standard deviation σ, the z test statistic is
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
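A minimal sketch of the calculation, assuming the usual one-sample form z = (x̄ − μ0) / (σ/√n) and a two-sided alternative for the P-value. The function names and the example numbers are hypothetical and are not taken from the course material.

```python
import math
from statistics import NormalDist


def z_statistic(xbar: float, mu0: float, sigma: float, n: int) -> float:
    """One-sample z statistic when the population standard deviation sigma is known."""
    return (xbar - mu0) / (sigma / math.sqrt(n))


def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) for a standard normal Z, computed supposing H0 is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical numbers, for illustration only.
z = z_statistic(xbar=12.1, mu0=12.0, sigma=0.3, n=36)
print("z =", round(z, 3))
print("two-sided P-value =", round(two_sided_p_value(z), 4))
```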
P-Value
o The probability, computed supposing H0 is true, that the test statis
Proportions Review
Sample Proportions
o Tests and confidence intervals for a population proportion p when the data are an SRS
of size n are based on the sample proportion p̂. When n is large, p̂ has approx. the
normal distribution with mean p and standard devi