STAT 103a
Homework #7
Investigate the relationship between Cereal Variables and per capita GNP (use regression to determine factors
which predict log(GNP/capita))
1. Identify variables, response & predictors
Response variable: Log-transformed GNP per Capit
STAT 103: Homework #3
1. random: unbiased (every sample of size n has equal chance of being selected) & independent (selection
of 1 unit has no influence on the selection of other units)
systematic: take every xth unit that comes along
stratified: stratify
Examples :
PROBABILITY
Discrete
Chapter 3 in Cartoon Guide
STRONGLY RECOMMENDED
Toss a coin:
S = {H, T}.
Watch a tree for a year and see if it dies :
Probability is crucial to statistical inference
S = {Dead, Alive}.
Inferences are always expressed i
STATISTICS
We are given the following data:
Number of businesses: n = 43
Population proportion: p = 0.31
Sample proportion: p̂ = 0.19
Level of significance: α = 0.01
Null and alternative hypotheses are:
H0: p = 0.31
Ha: p < 0.31
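The one-sample z-test for a proportion set up above can be sketched in Python as a rough check. The variable names are illustrative, and the normal CDF is computed with `math.erf` as a stand-in for the z-table lookup; this is not necessarily how the original solution proceeds.

```python
import math

n = 43        # number of businesses
p0 = 0.31     # population proportion under the null hypothesis
p_hat = 0.19  # sample proportion
alpha = 0.01  # level of significance

# Standard error of the sample proportion, computed under the null hypothesis
se = math.sqrt(p0 * (1 - p0) / n)

# Test statistic
z = (p_hat - p0) / se

# Left-tailed p-value from the standard normal CDF (stands in for the z-table)
p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Reject the null hypothesis only if the p-value falls below alpha
reject_null = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

With these inputs the statistic is about -1.70, and the left-tail p-value is larger than 0.01, so this sketch does not reject the null hypothesis at the stated level.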
1. We use this data and the z-table to
Hi, I hope this helps! Good luck
Explain the trend of the natural rate of unemployment in the US.
Natural rate of unemployment is, by definition, a combination of frictional, structural, and
surplus unemployment. It is important to understand that even t
Bruno Jednacak, STAT 103, HW4
1.
(a) Statistical model: Binomial
Each individual question represents one Bernoulli Trial (only two possible outcomes
right or wrong). As there are 15 questions, this means the Bernoulli Trial is repeated
15 times (and the t
Bruno Jednacak, STAT 103, HW1
Study finds that well-designed classrooms boost
student success
New research published by Salford University has suggested that the layout, construction and
design of classrooms has a significant impact on achievements in rea
Bruno Jednacak, STAT 103, HW3
1. The sampling used are the following:
(a) Systematic sampling
Because every fifth driver is interviewed.
(b) Clustered sampling
Because population is first divided into clusters (trees) and then all individuals
(oranges) fr
Bruno Jednacak, STAT 103, HW5
1. (a) According to the poll, the proportion of Democrats that support sending ground troops is:
115/296 = 0.3885 = 38.85%
The proportion of Republicans that support sending ground troops is:
199/265 = 0.7509 = 75.09%
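The two proportions can be reproduced with a short Python sketch. The difference and the 95% interval are additions that go beyond the lines above: the interval assumes the large-sample (Wald) formula with z* = 1.96, and the variable names are illustrative.

```python
import math

# Counts from the poll: supporters / respondents in each party
dem_yes, dem_n = 115, 296
rep_yes, rep_n = 199, 265

p_dem = dem_yes / dem_n
p_rep = rep_yes / rep_n
diff = p_rep - p_dem

# Assumption: large-sample (Wald) interval with z* = 1.96 for 95% confidence
se = math.sqrt(p_dem * (1 - p_dem) / dem_n + p_rep * (1 - p_rep) / rep_n)
lower, upper = diff - 1.96 * se, diff + 1.96 * se

print(f"Democrats: {p_dem:.4f}, Republicans: {p_rep:.4f}")
print(f"Difference: {diff:.4f}, 95% CI: ({lower:.4f}, {upper:.4f})")
```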
The poll
Bruno Jednacak, STAT 103, HW8
MONKEYS PROBLEM
(a) Using MINITAB, we make a boxplot showing the distribution of correct choices by
treatment group.
[Figure: Boxplot of Score (of 20 trials); y-axis: Score (of 20 trials), x-axis: Motivation]
Bruno Jednacak, STAT 103, HW6
1. (a) In order to answer this question, we are going to use the following graph:
For the first part, we are just going to look at the scatterplot of the data and the regression line. We
have the number of powerboat registrat
Bruno Jednacak, STAT 103, HW2
1. Below is the scatterplot of the following data: cellphone use (per 100 people) and internet access
(percent of population).
[Figure: Scatterplot of Cellphone vs Internet; y-axis: Cellphone, x-axis: Internet]
Bruno Jednacak, STAT 103, HW7
Problem 2: GNP and Cereals by Country
I am going to examine the relationship between Cereal Variables and per capita GNP using
regression.
STEP 1: Identifying variables
Our response variable will be log-transformed GNP per Cap
Madison Bickel
Introduction to Statistics: Homework #1
Stat 103
1. Analysis finds California Students attend school more than U.S. Peers (L.A.
Times)
http://www.latimes.com/local/education/la-me-school-attendance20140902-story.html
Population: United State
Madison Bickel
TA: Michelle Roh
Hw #5
1. a. There was 75% support from Republicans and 30% support from Democrats.
Therefore, the poll estimated the difference in support to be 45%.
b. Using Minitab, the 95% confidence interval for the difference in propo
Sampling and Experiments
If in doubt, consult with a Statistician : you can save
immense heartache, loss of resources, etc. by checking
with an expert first!
Research Design : Some General Advice
Decide What you Want to Know : Explicitly define the
para
Example :
Brain/Body weight
relationship in
mammals. Last time
we used regression
on the logs of brain
and body weight :
Once Again the simple linear least-squares regression model :
y = b0 + b1 x
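As a minimal sketch of how b0 and b1 can be computed, the standard least-squares formulas are written out below in Python. The function name `fit_line` and the data are hypothetical; the data are chosen to lie exactly on a line and are not the brain/body weight values.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

# Hypothetical data lying exactly on y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
b0, b1 = fit_line(x, y)
print(b0, b1)
```

For a log-log fit like the one described, the same function would be applied to the logged values of both variables.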
Simple Linear Least-Squares regression assumes that
Fitted Line
Data Relationships
Today: describing relationship of two
quantitative variables :
Scatterplots
Correlation
Regression
Scatterplot Notation :
The Horizontal axis is ALWAYS
called the X axis.
The Vertical axis is ALWAYS
called the Y axis.
Associati
Probability in Practice
(The trick is knowing when to apply which probability rule!)
Probabilities of events so far only add to 73% or 81%; investigation reveals that we forgot the last category (Not at
All : 26% and 10%)
Suggestions :
Question : what is
Probability Models for Count Data
Bernoulli Random Variables
Example : Evolution vs. Creationism. A
Gallup poll conducted in 2008 (sample size n
1000 people nationwide) found that 44% of
American adults believed that humans were
created directly by God w
The Central Limit Theorem
Revisited
Two more views
Central Limit Theorem Interpretation 1
Take a sample of size n from any distribution and calculate
the sample mean. Repeat this process many times. As
long as n is large enough :
1. The histogram of all the
Example : IQ Tests.
The general population has a mean
IQ score of 100 and the population
standard deviation of scores is 16.
Hypothesis Testing
Like proof by contradiction : (think back to geometry)
Example : there are infinitely many prime numbers
(attribu
Pooled Standard Deviation Estimate
[Figure: Boxplot of illiteracy by Gender; y-axis: Rate]
Example : World Poverty Data,
2000. Compare worldwide illiteracy rates
between men and women at the 99%
confidence level for countries with a GNI per
capita of at least $500
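A pooled standard deviation estimate can be sketched as follows, assuming the usual two-sample formula that weights each sample variance by its degrees of freedom. The function name `pooled_sd` and the summary statistics are hypothetical and are not the World Poverty values.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation of two independent samples."""
    # Weight each sample variance by its degrees of freedom (n - 1)
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)

# Hypothetical summary statistics, NOT taken from the World Poverty data
sp = pooled_sd(s1=10.0, n1=30, s2=14.0, n2=30)
print(round(sp, 3))
```

Pooling in this way presumes that the two groups share a common population standard deviation, so it is worth checking that the sample standard deviations are reasonably similar first.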
Inference for Regression
(Return of the Regression!)
Theoretical Regression Model
Mean of Y is a linear function of X.
μY = β0 + β1 X
Inference for Regression answers questions like
How strong is the evidence that there is a real
correlation between two varia
Announcements
Midterm Thursday, 10/21 in OML202
Closed note, closed book. Any complicated
formulas needed will be provided.
Test for the correlation coefficient
ρ (rho or row) is the true correlation,
estimated by r, the sample correlation
Suppose we want
Introductory Statistics 09-01-11
Statistics
o Quantifying variation
o Distinguishing fact from fiction (e.g. firefighting is not the most dangerous
occupation)
o Collecting, organizing, & interpreting data
Example: Hershey Kisses & rocks
o Appeared to be
Introductory Statistics 09-06-11
Mean
o The average
o The balancing point
o Add all data together then divide by how many things you added
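The three descriptions of the mean above can be checked with a small Python sketch. The data values are hypothetical; the "balancing point" idea shows up as the deviations from the mean summing to zero.

```python
data = [2, 4, 4, 7, 13]  # hypothetical observations

# Add all data together, then divide by how many values were added
mean = sum(data) / len(data)

# Balancing point: deviations above and below the mean cancel out
deviations = [x - mean for x in data]

print(mean)
print(sum(deviations))
```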
Median
o For odd # of observations, it's the middle one
o For even # of observations, it's the one between the two mid