Week Graphics
Copyright ©
Vivian ew All Rights Reserved
Overview
Plotting in general
A plot is created in base R graphics by first calling a high-level
function, a function that can create an object and does not need
any other function to precede it
t
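The distinction between high-level and low-level plotting functions can be sketched as follows. This is only an illustration: the data are made up, and the plot is written to a temporary PDF file (an assumption made so the sketch also runs without a display).

```r
# Hypothetical data, invented purely for illustration.
x <- 1:10
y <- x^2

# Draw to a temporary PDF so the sketch runs without a graphics window.
out_file <- tempfile(fileext = ".pdf")
pdf(file = out_file)

# High-level call: creates a complete plot (axes, labels, points) on its own.
plot(x, y, main = "y = x^2", xlab = "x", ylab = "y")

# Low-level calls: these only add to a plot that already exists.
lines(x, y, lty = 2)          # dashed line through the points
abline(h = 50, col = "red")   # horizontal reference line at y = 50

invisible(dev.off())          # close the PDF device
```

Calling lines() or abline() before any high-level function such as plot() would raise an error, because there would be no plot to add to.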
The Book of R (Early Access), 2015 by Tilman M. Davies
2
NUMERICS, ARITHMETIC,
ASSIGNMENT, AND THE VECTOR
In its simplest role, R can function as a
mere desktop calculator. In this chapter,
I'll discuss how to use the software for arithmetic. I'll also
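A minimal sketch of R used as a calculator, with assignment and a small vector as named in the chapter title. The numbers are arbitrary and chosen only for illustration.

```r
# Basic arithmetic typed at the R prompt.
2 + 3        # addition: 5
14 %/% 6     # integer division: 2
14 %% 6      # remainder: 2
2^3          # exponentiation: 8
sqrt(9)      # square root: 3

# Assignment stores a result under a name.
x <- 3 * (4 - 1)   # x is 9

# A vector holds several values; arithmetic is applied element by element.
v <- c(1, 2, 3)
v * 2        # 2 4 6
```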
Advanced High School Statistics
Preliminary Edition
Chapter 1
Data Collection
Scientists seek to answer questions using rigorous methods and careful observations. These
observations, collected from the likes of field notes, surveys, and experiments, form the
ba
Marion J. Lane
Stats 10, Cha
Section 2A
Lab 1
Question 1
The unit of observation is one newborn baby.
There are 23 different variables.
Categorical Variables: Birthday Month, Day of the Week, Birth Day, Gender, Premature,
Low Birth Weight, Marital Status,
Marion J. Lane
004625013
Section 2A
Lab 3: Hot Hand
Question 1
About 0.43609, or 43.6%, of Kobe's attempted baskets were successful. We'll assume that the
probability Kobe makes a basket is 43.6%.
Question 2
Kobe's typical streak length is the mode of this dis
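The idea of comparing streak lengths against an independent shooter can be sketched as below. The hit probability 0.436 comes from Question 1; everything else is an assumption: the number of simulated shots, the seed, the helper name streak_lengths, and the definition of a streak as the number of hits between consecutive misses (so a streak can have length 0).

```r
set.seed(42)       # arbitrary seed, for reproducibility only
p_hit   <- 0.436   # hit probability from Question 1
n_shots <- 100     # hypothetical number of simulated shots

# Simulate an independent shooter: each shot is a hit (H) or a miss (M).
shots <- sample(c("H", "M"), size = n_shots, replace = TRUE,
                prob = c(p_hit, 1 - p_hit))

# Hypothetical helper: count the hits between consecutive misses.
streak_lengths <- function(shots) {
  # Positions of the misses, with a sentinel before the first shot
  # and one after the last shot.
  miss_pos <- c(0, which(shots == "M"), length(shots) + 1)
  diff(miss_pos) - 1
}

streaks <- streak_lengths(shots)
table(streaks)                                   # distribution of streak lengths
as.numeric(names(which.max(table(streaks))))     # its mode
```

The mode of the simulated distribution can then be compared with the typical streak length observed in the real data.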
Marion J. Lane
004625013
Stats 10 Cha
Section 2A
Lab 2: Batter Up
Question 1
The scatterplot shows that the relationship between runs and at bats is not too strong. The ability to
predict a team's runs by using at bats is relatively weak. Several observati
Emily Lauterbach
Lab 3
Question 1:
With 200 measures, the assumed population proportion (20%) for n=10 is within 12.2
percentage points.
With 200 measures each, the assumed population proportion (20%) for n=100 is within
4.07 percentage points.
With 200 m
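One way to reproduce this kind of comparison is sketched below. It assumes that "within" refers to the standard deviation of 200 simulated sample proportions, which the write-up does not state explicitly; the seed and the function names are also assumptions.

```r
set.seed(10)     # arbitrary seed, for reproducibility only
p      <- 0.20   # assumed population proportion from the write-up
n_reps <- 200    # number of simulated samples ("200 measures")

# Spread of the sample proportions across n_reps samples of size n.
sim_spread <- function(n) {
  p_hat <- rbinom(n_reps, size = n, prob = p) / n
  sd(p_hat)
}

# Theoretical standard deviation of a sample proportion.
theory_spread <- function(n) sqrt(p * (1 - p) / n)

sim_spread(10);  theory_spread(10)    # theory: about 0.126
sim_spread(100); theory_spread(100)   # theory: 0.04
```

Under this reading, the reported 12.2 and 4.07 percentage points are close to the theoretical values of about 12.6 and 4.0.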
Lab 4
Emily Lauterbach
Question 1:
 The distribution of ages for pennies is skewed to the right and unimodal. The
mean is 9.63 but the median is 7, which means that a few high values (such
as 39) are pulling the mean above the median.
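The effect of a few large values on the mean can be illustrated with a small made-up sample. The ages below are hypothetical, not the lab data; they only mimic a right-skewed shape with one large value.

```r
# Hypothetical penny ages (in years), invented for illustration.
ages <- c(1, 2, 3, 5, 7, 8, 10, 12, 39)

mean(ages)     # pulled upward by the single large value, 39
median(ages)   # 7, unaffected by how large the largest value is

# Dropping the large value moves the mean much more than the median.
mean(ages[ages < 39])
median(ages[ages < 39])
```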
Question 2:
 We s
Allison Goldstein
Stats 10 Disc. 4D
803951922
10/4/11
Lab 2: Batter Up
1. A scatter plot best shows the relationship between runs and at bats. The scatter plot displays a
moderately linear relationship between runs and at bats and from the graph, one can
Emily Lauterbach
Statistics 10
1.26.2017
Lab 1
Question 1: What is the unit of observation in the data? How many different variables are
recorded? List each variable and determine whether the variable is categorical or numerical.
There are 22 different v
Emily Lauterbach
Stats 10
Lab 2
Question 1: Does the number of at-bats predict the number of runs a team will score? Create a
graph that shows the relationship between runs and at bats from the Batting11 Collection. What
does the Graph say about the abili
Lab 1:
Question 1: What is the unit of observation in the data? How many different
variables are recorded? List each variable and determine whether the variable is
categorical or numerical.
There are 22 different variables.
Categorical: DOW, Gender, Prema
Yang, Da Eun Grace
UCID: 904453615
Brian Kim 4B
Lab 1: Baby Boom Lab Questions
Question 1:
Unit of observation in the data: each row represents a different baby
Total Variables recorded: 22
Categorical    Numerical
Bmonth         Apgar5
Bday           BirthWeight
DOW            Gest
Introduction to Statistical Methods for Life and Health Sciences
STATS 13

Winter 2017
Lauren Poston
504560368
Section 2C
Lab 3
Lab Assignment
1. The center of the histogram is at a temperature of about 98.6 degrees Fahrenheit.
Values range from about 96 up to 101, with the most common values around the
center (97.5 to 99.5). It's s
Lab 2
Lab Assignment
1. The variable that indicates whether the mother is a smoker is Habit, and the variable for the
weight of the babies is weight.
2. One numerical variable is Gained because it gives quantitative data for how much
Homework 6
1. 4.1.1
a. (B) the observational unit is the student
b. (A) the explanatory variable is whether or not they have pulled an all-nighter
c. (A) The response variable is GPA
d. The explanatory variable is catego
Homework 3
1. 2.1.37
a. H0: π = 0.25, HA: π < 0.25
b. p̂ = 3/15 = 0.2
2. 2.1.38
a. P-value = 0.46 (done in R)
b. We have very weak evidence against the null hypothesis, i.e. that the proportion
of diseased sharks is less than
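The P-value above can be checked with an exact binomial calculation. This is a sketch of one possible approach, not necessarily the method used in the homework (which may have relied on simulation); the variable names are assumptions, while the counts 3 and 15 and the null value 0.25 come from the answers above.

```r
n_sharks   <- 15     # sample size, from 3/15
n_diseased <- 3      # observed count, from 3/15
pi_null    <- 0.25   # proportion under H0

p_hat <- n_diseased / n_sharks   # observed proportion, 0.2

# Left-tailed P-value: probability of 3 or fewer successes out of 15
# when the true proportion is 0.25.
p_value <- pbinom(n_diseased, size = n_sharks, prob = pi_null)
round(p_value, 2)   # 0.46
```

A P-value this large means the observed proportion of 0.2 is quite typical of what the null model produces.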
Lab 5
1. There are 159 lizards in the sample and 15 variables. The categorical variables are:
island, island_g, habitat, id, perch_ty, and treeperch.
2. The center of the graph is near 5 g, and the histogram can be desc
Homework 5
A. Board game survey
a. The largest sample size (4000) will have a narrower confidence interval than
the next largest (1000), which in turn will be narrower than the smallest (250). This is because the
larger the sample size, th
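The relationship between sample size and interval width can be sketched numerically. The sample sizes come from the answer above; the sample proportion of 0.5 and the 95% confidence level are assumptions chosen only for illustration.

```r
n     <- c(250, 1000, 4000)   # sample sizes from the question
p_hat <- 0.5                  # hypothetical sample proportion
z     <- qnorm(0.975)         # multiplier for a 95% interval (assumed level)

# Margin of error of a one-proportion interval: z * sqrt(p_hat * (1 - p_hat) / n)
moe <- z * sqrt(p_hat * (1 - p_hat) / n)

data.frame(n = n, margin_of_error = round(moe, 4))
```

Each fourfold increase in sample size halves the margin of error, because the sample size enters through a square root.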
Lab 4
1. Random sample of 5: mean = 69.923, SD = 1.918
Random sample of 10: mean = 69.224, SD = 2.861
Random sample of 100: mean = 69.130, SD = 2.712
a. The first histogram (sample size = 5) looks most like a normal dist
Homework 4
1. Shark survey
a. Theory predicts that the mean of the null distribution will be 0.25. The mean I got
from my simulation was 0.249813.
b. Theory predicts that the SD will be √[(0.25*0.75)/15] = 0.111803.
c. Th
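The theoretical values in parts a and b can be compared with a simulated null distribution, roughly as follows. The null proportion 0.25 and sample size 15 come from the answers above; the number of repetitions and the seed are assumptions, so the simulated values will not match the reported 0.249813 exactly.

```r
set.seed(2017)       # arbitrary seed, for reproducibility only
pi_null <- 0.25      # proportion under the null hypothesis
n       <- 15        # sample size
n_reps  <- 10000     # hypothetical number of simulated samples

# Theory: the null distribution of the sample proportion has
# mean pi_null and SD sqrt(pi_null * (1 - pi_null) / n).
sd_theory <- sqrt(pi_null * (1 - pi_null) / n)
sd_theory            # about 0.111803

# Simulation: draw n_reps sample proportions under the null.
p_hat_sim <- rbinom(n_reps, size = n, prob = pi_null) / n
mean(p_hat_sim)      # should be close to 0.25
sd(p_hat_sim)        # should be close to sd_theory
```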
Background Exam
Yingdi Zheng
September 22, 2016
library(gdata)
# gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.
#
# gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.
#
# Attaching package: 'gdata'
# The following object is
For r > .8, it is a strong linear correlation. For .5 < r < .8, it is a
moderate correlation. For r < .5, we'll forget about it.
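This rule of thumb can be written as a small helper. The function name classify_r is hypothetical, and two choices are assumptions not stated in the notes: the absolute value of r is used so that strong negative correlations also count as strong, and values exactly equal to .8 or .5 fall into the lower category.

```r
# Hypothetical helper: label the strength of a correlation coefficient.
classify_r <- function(r) {
  strength <- abs(r)   # assumption: judge strength by magnitude
  ifelse(strength > 0.8, "strong",
         ifelse(strength > 0.5, "moderate", "weak"))
}

classify_r(c(0.9, 0.6, 0.2))   # "strong" "moderate" "weak"

# Example with R's built-in mtcars data: car weight versus fuel economy.
r <- cor(mtcars$wt, mtcars$mpg)   # about -0.87
classify_r(r)                     # "strong"
```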
Steps: Articulate Theory/Hypotheses → Collect Data → Analyze
Collected Data → Repeat, ask it differently, collect more evidence,
and per