Lecture Notes Part 2
Density curves
Normal distribution
Scatterplot
Correlation
Least-squares regression
READING IPS Sections 1.3-2.3
Part 2: Density Curves
Descriptive statistics: We summarize only the data we have
collected
Inferential statistics: Based
STAT5301 Assignment 4
Due: September 30th, 2015
Answer all questions and submit the problems in order, making sure that the computer output and discussion
are placed together (do not put the computer output at the end of homework). Raw computer output is
Statistics 5301 Practice Midterm1 Questions
AU2012
1. Here are the 2007 health care expenditures per capita in the 35 countries with the
highest gross domestic product in 2007, in increasing order.
81    837   3323  4763
109   1035  3323  7285
233   1332  3357
286   1688
STAT 5301 Autumn 2016
Lecture 3: Simple linear regression
Yuan Zhang
August 31, 2016
The study of two variables
Previously we discussed exploratory analysis with one
variable
We studied the patterns in distributions
Today we study two variables
We will st
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
The sampling distribution of counts and proportions
The Bernoulli trial and distribution
James B. Odei
The Ohio State University
Department of Statistics
404 Cockins Hall
1958 N
Agenda: Introduction and Exploring Data
Introduction
Looking at data
Graphical summaries for data
Numerical summaries for data
Data Analysis ?
Statistics ?
We are interested in a summary of the data
A STATISTIC is any quantity whose value can be
calcu
WELCOME!
STAT 5301
Intermediate Data Analysis I
Reading
The Statistical Sleuth, 3rd Edition:
Chapter 5: Comparisons Among Several Samples
Analysis of Variance F-test.
One-way ANOVA
ANOVA
We want to test the null hypothesis that there are
no differences a
WELCOME!
STAT 5301
Intermediate Data Analysis I
Rank-sum Test
Consider the artificial data:
Group A: 11 15 14
Group B: 14 18 19 16
Suppose that we want to test whether the two groups are
different.
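As a rough illustration of the first step of a rank-sum test on the artificial data above, the sketch below pools the two groups, ranks the pooled values, and sums the ranks within each group. It assumes tied values share the average of the ranks they occupy (midranks); the notes are cut off before stating how ties are handled, so that choice and the helper name midranks are assumptions of this sketch, not taken from the slides.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


group_a = [11, 15, 14]
group_b = [14, 18, 19, 16]

# Rank the pooled sample, then split the ranks back by group.
ranks = midranks(group_a + group_b)
rank_sum_a = sum(ranks[:len(group_a)])
rank_sum_b = sum(ranks[len(group_a):])

print(rank_sum_a, rank_sum_b)
```

The two rank sums always add up to n(n+1)/2 for the pooled sample size n (here 7 * 8 / 2 = 28), which is a quick check on the ranking. The test statistic is usually taken to be the rank sum of one of the groups.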
Sign Test Procedures
The Sign Test for Matched Pairs
As an alt
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
Confidence intervals for a population mean
Building a confidence interval (CI) for
James B. Odei
The Ohio State University
Department of Statistics
404 Cockins Hall
1958 Neil A
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
Tests of significance/Hypothesis tests
Reading: The Statistical Sleuth (3ed), Chapter 1.
James B. Odei
The Ohio State University
Department of Statistics
404 Cockins Hall
1958 Ne
WELCOME!
STAT 5301
Intermediate Data Analysis I
Reading
The Statistical Sleuth, 3rd Edition:
Chapter 2:
Sections: 2.1-2.6
Two Populations
Essence of problem:
We have two samples - one from each of two
populations.
We wish to compare the populations on
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
The sampling distribution for the sample mean
The mean of the sampling distribution of the sample mean
James B. Odei
The Ohio State University
Department of Statistics
404 Cocki
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
Numerical summaries for data
(Describing distributions with numbers)
James B. Odei
The Ohio State University
Department of Statistics
404 Cockins Hall
1958 Neil Avenue
Columbus,
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
R Coding Review
What is R?
James B. Odei
The Ohio State University
Department of Statistics
404 Cockins Hall
1958 Neil Avenue
Columbus, OH 43210
Office: 435 Cockins Hall
Office H
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
Probability
James B. Odei
Chance experiments
The Ohio State University
Department of Statistics
404 Cockins Hall
1958 Neil Avenue
Columbus, OH 43210
Office: 435 Cockins Hall
Off
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
Associations and the simple linear regression model
Motivation: Forbes boiling point of water dataset
James B. Odei
The Ohio State University
Department of Statistics
404 Cockin
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
Obtaining Data
James B. Odei
The steps of statistics
The Ohio State University
Department of Statistics
404 Cockins Hall
1958 Neil Avenue
Columbus, OH 43210
Office: 435 Cockins
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
Random Variables
James B. Odei
Random variables (RVs)
The Ohio State University
Department of Statistics
404 Cockins Hall
1958 Neil Avenue
Columbus, OH 43210
Office: 435 Cockins
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
Moving towards Statistical Inference
Performing statistical inference
James B. Odei
The Ohio State University
Department of Statistics
404 Cockins Hall
1958 Neil Avenue
Columbus
Stat 5301 (Spring 2016)
Intermediate Data Analysis I
Density Curves and the Normal Distribution
James B. Odei
Density curves
The Ohio State University
Department of Statistics
404 Cockins Hall
1958 Neil Avenue
Columbus, OH 43210
O
WELCOME!
STAT 5301
Intermediate Data Analysis I
Probability: The Study of Randomness
Randomness
The value of a statistic varies from sample to sample.
A phenomenon is called random if we can't predict
individual outcomes exactly in advance. For example,
WELCOME!
STAT 5301
Intermediate Data Analysis I
Reading
The Statistical Sleuth, 3rd Edition:
Chapter 2: 2.1 - 2.6
One-Sample t-Tools and the Paired t-Test
Two-Sample Inference
Chapter 3: 3.1 - 3.4
Robustness/Resistance of the Two-Sample t-Tests
Chapter 4:
WELCOME!
STAT 5301
Intermediate Data Analysis I
Reading
The Statistical Sleuth, 3rd Edition:
Chapter 5: Comparisons Among Several Samples
Analysis of Variance F-test.
One-way ANOVA
Chapter 6: Linear Combinations and Multiple
Comparisons of Means
P
A
C
Rul
WELCOME!
STAT 5301
Intermediate Data Analysis I
Review: Power
(EXAMPLE: Body Weight)
The weights of 25 randomly chosen nine-year-old kids were examined. It is known
that the weights are normally distributed with σ = 4 pounds. The sample mean X̄ is
42 pounds.
Comparing Models
Blue/White
insects_reduced <- insects_tidy
insects_reduced$variable
# [1] blue
blue
# [11] green green
# [21] yellow yellow
# Levels: blue green
blue
blue
blue
white white white
yellow yellow
white yellow
blue
white
green
white
green
whit
WELCOME!
STAT 5301
Intermediate Data Analysis I
Reading
The Statistical Sleuth, 3rd Edition:
Chapter 5: Comparisons Among Several Samples
Analysis of Variance F-test.
One-way ANOVA
Chapter 6: Linear Combinations and Multiple
Comparisons of Means
P
A
C
Two
Rachel Woodfint
STAT 5301
Homework 2
1a.
Petal Length
The line would be straight if the distribution was normal. It does not seem reasonable to assume
that the distribution is normal.
2b.
Setosa
This line is fairly str
STAT 5301 Autumn 2016, Homework 3
Due: Monday, September 19, 2016
Question 1.
All human blood can be ABO-typed as one of O, A, B, or AB, but the distribution of
the types varies a bit among groups of people. Here is the distribution of blood types for a
r
STAT 5301 Autumn 2016
Lecture 6: Random variables
Yuan Zhang
September 14, 2016
Random variables
Definition
A random variable is the numerical outcome of a random trial
Notation: use capital letters near the end of the alphabet (X, Y, Z) to denote r.v.s
Exa
Rachel Woodfint
Stats 5301
Homework 1
1. Columbus Temperature data
1a.
The histogram for this data is skewed to the left. It is not symmetric. It looks to be
unimodal in shape. However, there is a peak around the 30-40 on the X-axis.
1b. No, because obser
AS /HN 7761
Some Examples for Review for the First Midterm
Specific details that are important to some function
o For example: glucokinase activity is allosterically regulated by F6P, not G6P or F1,6P.
The regulatory protein keeps GK sequestered in the nu