Database Management
with R and SQL
(optional, Stat 20)
What's a Database?
It's like a container that stores data
organized in tables, and it may also include
other structures related to those tables.
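As a minimal sketch of that definition, the snippet below builds a tiny database holding one table plus one related structure (an index), then queries it with SQL. Python's built-in sqlite3 module is used here only as a convenient host for the SQL statements; it is an assumption, not the course's own tooling, and the `students` table, its columns, and its rows are hypothetical.

```python
import sqlite3

# An in-memory database: one container for tables and related structures.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A table (hypothetical schema, for illustration only).
cur.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, section TEXT)"
)
# A structure related to that table: an index on one of its columns.
cur.execute("CREATE INDEX idx_students_section ON students (section)")

# Made-up rows, purely illustrative.
cur.executemany(
    "INSERT INTO students (name, section) VALUES (?, ?)",
    [("Ana", "2A"), ("Ben", "2C"), ("Cy", "2A")],
)
conn.commit()

# SQLite's catalog lists every object stored in the database.
objects = cur.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

# Query the table with SQL.
section_2a = cur.execute(
    "SELECT name FROM students WHERE section = ? ORDER BY name", ("2A",)
).fetchall()

print(objects)
print(section_2a)
conn.close()
```

The catalog query shows that the database holds both the table and the index, which is the sense in which it is more than a single table.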
Wit
The Book of R (Early Access), 2015 by Tilman M. Davies
2
NUMERICS, ARITHMETIC,
ASSIGNMENT, AND THE VECTOR
In its simplest role, R can function as a
mere desktop calculator. In this chapter,
I'll di
Advanced High School Statistics
Preliminary Edition
Chapter 1
Data Collection
Scientists seek to answer questions using rigorous methods and careful observations. These
observations collected from the lik
Lab 1
Stats 10, Cha
Section 2A
Lab 1
Question 1
The unit of observation is one newborn baby.
There are 23 different variables.
Categorical Variables: Birthday Month, Day of the Week, Birth Day
Marion J. Lane
004625013
Section 2A
Lab 3: Hot Hand
Question 1
About 0.43609 or 43.6% of Kobe's attempted baskets were successful. We'll assume that the
probability Kobe makes a basket is 43.6%.
Questio
Marion J. Lane
004625013
Stats 10 Cha
Section 2A
Lab 2: Batter Up
Question 1
The scatterplot shows that the relationship between runs and at bats is not too strong. The ability to
predict a team's runs
Lab 3
Question 1:
With 200 measures, the assumed population proportion (20%) for n=10 is within 12.2
percentage points.
With 200 measures each, the assumed population proportion (20%)
Lab 4
Question 1:
The distribution of ages for pennies is skewed to the right and unimodal. The
mean is 9.63 but the median is 7, which means that there are higher values (such
as 39
Allison Goldstein
Stats 10 Disc. 4D
803951922
10/4/11
Lab 2: Batter Up
1. A scatter plot best shows the relationship between runs and at bats. The scatter plot displays a
moderately linear relationshi
Emily Lauterbach
Statistics 10
1.26.2017
Lab 1
Question 1: What is the unit of observation in the data? How many different variables are
recorded? List each variable and determine whether the variable
Stats 10
Lab 2
Question 1: Does the number of at bats predict the number of runs a team will score? Create a
graph that shows the relationship between runs and at bats from the Battin
Lab 1:
Question 1: What is the unit of observation in the data? How many different
variables are recorded? List each variable and determine whether the variable is
categorical or numerical.
There are 2
Yang, Da Eun Grace
UCID: 904453615
Brian Kim 4B
Lab 1: Baby Boom Lab Questions
Question 1:
Unit of observation in the data: each row represents a different baby
Total Variables recorded: 22
Categorical
Introduction to Statistical Methods for Life and Health Sciences
STATS 13

Winter 2017
Lauren Poston
504560368
Section 2C
Lab 3
Lab Assignment
1. The center of the histogram is at a temperature of about 98.6 degrees Fahrenheit.
Variability ranges from about 96 up to 101, with the
Introduction to Statistical Methods for Life and Health Sciences
STATS 13

Winter 2017
Lauren Poston
504560368
Section 2C
Lab 2
Lab Assignment
1. The variable that indicates whether the mother is a smoker is Habit, and the variable for
the babies' weight is weight.
2. One numerical variable is
Introduction to Statistical Methods for Life and Health Sciences
STATS 13

Winter 2017
Lauren Poston
504560368
Section 2C
Homework 6
1. 4.1.1
a. (B) the observational unit is the student
b. (A) the explanatory variable is whether or not they have pulled an all-nighter
c. (A) The respons
Introduction to Statistical Methods for Life and Health Sciences
STATS 13

Winter 2017
Lauren Poston
504560368
Section 2C
Homework 3
1. 2.1.37
a. H0: π = 0.25, HA: π < 0.25
b. p̂ = 3/15 = 0.2
2. 2.1.38
a. p-value = 0.46 (done in R)
b. We have very weak evidence against the null hypothesis, i
Introduction to Statistical Methods for Life and Health Sciences
STATS 13

Winter 2017
Lauren Poston
504560368
Section 2C
Lab 5
1. There are 159 lizards in the sample and 15 variables. The categorical variables are:
island, island_g, habitat, id, perch_ty, and treeperch.
2. The center
Introduction to Statistical Methods for Life and Health Sciences
STATS 13

Winter 2017
Lauren Poston
504560368
Section 2C
Homework 5
A. Board game survey
a. The largest sample size (4000) will have a tighter confidence interval compared
to the next largest (1000) compared to the smalles
Introduction to Statistical Methods for Life and Health Sciences
STATS 13

Winter 2017
Lauren Poston
504560368
Section 2C
Lab 4
1. Random sample of 5: mean = 69.923, SD = 1.918
Random sample of 10: mean = 69.224, SD = 2.861
Random sample of 100: mean = 69.130, SD = 2.712
a. The first hi
Introduction to Statistical Methods for Life and Health Sciences
STATS 13

Winter 2017
Lauren Poston
504560368
Section 2C
Homework 4
1. Shark survey
a. Theory predicts that the mean of the null distribution will be 0.25. The mean I got
from my simulation was 0.249813.
b. Theory predicts
If r > 0.8, it is a strong linear correlation. If 0.5 < r < 0.8, it is a
moderate correlation. If r < 0.5, we'll forget about it.
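The rule of thumb above can be sketched as a small function. This is an illustrative Python sketch, not part of the original notes: the function name and the returned labels are hypothetical, and because the rule as stated does not say what happens when r is exactly 0.5 or 0.8, those cases are flagged rather than guessed.

```python
def correlation_strength(r: float) -> str:
    """Label a correlation coefficient using the rule of thumb in the notes.

    The thresholds are applied to r exactly as stated; whether the rule is
    meant for the absolute value of r is not specified in the notes.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if r > 0.8:
        return "strong"
    if 0.5 < r < 0.8:
        return "moderate"
    if r < 0.5:
        return "ignore"  # "we'll forget about it"
    # r is exactly 0.5 or 0.8: the stated rule does not cover these values.
    return "unspecified"


for value in (0.9, 0.6, 0.2, 0.8):
    print(value, correlation_strength(value))
```

To treat strong negative relationships the same way as strong positive ones, a caller could pass `abs(r)` instead of `r`; that is a design choice the notes leave open.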
Steps: Articulate Theory/Hypotheses → Collect Data → Analyze
Collected Data → Rep
