1 Statistics and samples
1.1 What is statistics?
Biologists study the properties of living things. Measuring these properties is a challenge, though, because no two individuals from the same biologica
[Figure: monthly deaths by cause (infectious disease, wounds, other causes), with months running from April 1854 into 1855]
Displaying data
The human eye
Appendix 3. Statistical tables
This appendix gives numerical values for a few of the most commonly used probability
distributions. More can be found in references such as Rohlf and Sokal, Biostatistic
Goals of experiments
• Eliminate bias
• Reduce sampling error (increase precision and power)
Controls
• A group which is identical to the experimental treatment in all respects aside from the treatmen
The normal distribution is very
common in nature
Normal distribution
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
[Figure: normal density curve with y-axis ticks from 0.1 to 0.4; example: Human body temperature; x-axis: Measurement]
A normal distribution is fully
des
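The density formula for the normal distribution can be evaluated directly. The following is a minimal sketch, not part of the original slides; the function name `normal_pdf` and its default arguments are illustrative assumptions.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    variance = sigma ** 2
    coefficient = 1.0 / math.sqrt(2.0 * math.pi * variance)
    exponent = -((x - mu) ** 2) / (2.0 * variance)
    return coefficient * math.exp(exponent)

# Height of the standard normal curve (mu = 0, sigma = 1) at its centre.
peak = normal_pdf(0.0)
print(round(peak, 4))
```

With the default arguments the curve peaks just under 0.4, which agrees with the top tick of the plotted density.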
Describing data
Two common descriptions of data
• Location (or central tendency)
• Width (or spread)
Measures of location
Mean
Median
Mode
Mean
$$\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n}$$
n is the size of the sample.
Mean
Median
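The three measures of location can be computed in a few lines. This is a sketch under stated assumptions: the data values are hypothetical, and `sample_mean` is an illustrative helper that follows the definition of the mean given above.

```python
import statistics

def sample_mean(values):
    """Sum of the observations divided by the sample size n."""
    n = len(values)
    return sum(values) / n

measurements = [3, 5, 5, 6, 8, 9]  # hypothetical data, not from the course

mean = sample_mean(measurements)            # sum is 36, n is 6
median = statistics.median(measurements)    # average of the two middle values
mode = statistics.mode(measurements)        # most frequent value

print(mean, median, mode)  # 6.0 5.5 5
```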
Estimating with uncertainty
Chapter 4
Sample size 10 from Normal distribution with $\mu = 13$ and $\sigma^2 = 16$
[Figure: histogram of the sample; y-axis: Frequency; x-axis: X; $\bar{X} = 13.5$, $s^2 = 12.1$]
A third sample of 10 from t
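Repeated sampling of this kind is easy to simulate. The sketch below draws three samples of 10 from a normal distribution with mean 13 and variance 16 (standard deviation 4) and summarises each one. The seed and the helper name `draw_sample` are assumptions made for illustration, so the numbers will not match those on the slides.

```python
import random
import statistics

def draw_sample(rng, n=10, mu=13.0, sigma=4.0):
    """Draw n values from a normal distribution (sigma = sqrt(16) = 4)."""
    return [rng.gauss(mu, sigma) for _ in range(n)]

rng = random.Random(300)  # arbitrary seed (assumption) so the run is repeatable
summaries = []
for _ in range(3):
    sample = draw_sample(rng)
    # statistics.variance uses the n - 1 denominator (sample variance).
    summaries.append((statistics.mean(sample), statistics.variance(sample)))

for mean, variance in summaries:
    print(f"sample mean = {mean:.1f}, sample variance = {variance:.1f}")
```

Each sample gives a somewhat different mean and variance, which is the sampling error that estimation has to account for.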
Final exam
One variable: Which test?
2.5 hours allotted
Chapters 1-17, 19, 20
Excluding sections 9.3, 9.6, 14.7, 15.3,
15.6, confidence interval for r
Bring: Calculator (not programmable),
pen or pe
Some tests are designed to examine the relationship between two or more variables.
Others are for a single variable.
1) Am I looking at one variable, or the relationship between two or more variables?
2)
Biology 300 Review Session
Pay attention to knowing which test to use in which situation. Make flashcards, and
understand what each test is, what it does, when it's used, what the assumptions are, etc.
Bi
KEY: MID-TERM BIOL 300: October 2010
For all statistical tests, make sure that you clearly state your
hypotheses. Unless otherwise stated, assume $\alpha = 0.05$. Show your
work. Be as precise as possible abou
Name:
TA's name:
Student number:
MID-TERM BIOL 300: October 2009
For all statistical tests, make sure that you clearly state your
hypotheses. Unless otherwise stated, assume $\alpha = 0.05$. Show your
work. Be
MID-TERM BIOL 300: October 2008
For all statistical tests, make sure that you clearly state your
hypotheses. Unless otherwise stated, assume $\alpha = 0.05$. Show your
work. Be as precise as possible about P-v
MID-TERM BIOL 300: October 2007
For all statistical tests, make sure that you clearly state your hypotheses. Unless otherwise
stated, assume $\alpha = 0.05$. Show your work. Be as precise as possible about P-v
Comparing means
Paired vs. 2-sample comparisons
• Tests with one categorical and one numerical variable
• Goal: to compare the mean of a numerical variable for different groups.
Paired comparisons all
Regression
Correlation vs. regression
• Predicts Y from X
• Linear regression assumes that the relationship between X and Y can be described by a line
Regression assumes:
• Random sample
• Y is normal
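Predicting Y from X with a line can be sketched as follows. The slides do not say how the line is fitted; this example assumes ordinary least squares, and the function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x (assumed method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares of x about their means.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [0, 1, 2, 3]  # hypothetical explanatory values
ys = [1, 3, 5, 7]  # hypothetical responses lying exactly on y = 1 + 2x

slope, intercept = fit_line(xs, ys)
predicted = intercept + slope * 4  # prediction of Y at X = 4
print(slope, intercept, predicted)  # 2.0 1.0 9.0
```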
Publication bias
Researcher and statistician error
Papers are more likely to be published if P < 0.05
~8% of biomedical papers have substantial statistical flaws
This causes a bias in the science report
Analysis of variance (ANOVA)
Comparing the means of more than two groups
Null hypothesis for simple ANOVA
• H0: Variance among groups = 0
OR
• H0: $\mu_1 = \mu_2 = \mu_3$
[Figure: distributions of groups X1, X2, X3; label: Not all $\mu$'s equal]
HA: at least one
popul
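One way to see how ANOVA compares variation among groups with variation within groups is to compute the F ratio by hand. This is a sketch using the standard single-factor ANOVA formulas, not code from the course; the function name `anova_f` and the data are hypothetical.

```python
def anova_f(groups):
    """F ratio: mean square among groups divided by mean square within groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_groups = sum(len(group) * (mean - grand_mean) ** 2
                    for group, mean in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_error = sum((value - mean) ** 2
                   for group, mean in zip(groups, group_means)
                   for value in group)

    df_groups = len(groups) - 1
    df_error = len(all_values) - len(groups)
    return (ss_groups / df_groups) / (ss_error / df_error)

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical measurements
f_ratio = anova_f(groups)
print(round(f_ratio, 6))  # 13.0
```

When all group means are identical the numerator is zero, so F is 0; larger F values mean more variation among groups relative to variation within them.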
Writing a Lab Report
Format
Include a descriptive title
Times New Roman; 12 point font
Double spaced
Figures need to be legible (don't make them too small)
Don't go over the page limit (which inclu
Inference about means
$\mu = 67.4$
$\sigma = 3.9$
Because Y is normally distributed, we can convert
its distribution to a standard normal distribution:
$\bar{Y}$ is normally distributed
$\mu_{\bar{Y}} = \mu = 67.4$
$\sigma_{\bar{Y}} =$
whenever:
Y is n
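Converting a normally distributed value to the standard normal scale uses Z = (Y - mean) / standard deviation. The sketch below reads the garbled values above as a mean of 67.4 and a standard deviation of 3.9, which is an interpretation; the observation 71.3 is hypothetical.

```python
from statistics import NormalDist

MU = 67.4     # population mean, as read from the text (interpretation)
SIGMA = 3.9   # population standard deviation, as read from the text (interpretation)

def z_score(y, mu=MU, sigma=SIGMA):
    """Standard normal deviate for an observation y."""
    return (y - mu) / sigma

y = 71.3  # hypothetical observation, one standard deviation above the mean
z = z_score(y)

# Upper-tail probability from the standard normal distribution ...
upper_tail = 1.0 - NormalDist().cdf(z)
# ... matches the same probability computed on the original scale.
direct = 1.0 - NormalDist(MU, SIGMA).cdf(y)

print(round(z, 2), round(upper_tail, 4))
```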
Final exam
2.5 hours allotted
Chapters 1-17, 19
Excluding sections 9.2, 9.5, 14.7, 15.3,
15.6, confidence interval for r
Bring: Calculator (not programmable),
pen or pencil, UBC ID
You will be give
Assumptions of t-tests
• Random sample(s)
• Populations are normally distributed
• (for 2-sample t) Populations have equal variances
Detecting deviations from normality: by histogram
[Figure: histogram; y-axis: Frequency; x-axis: Biomass]
Proportions
Example:
2092 adult passengers on the Titanic; 654 survived
Proportion of survivors = $654/2092 \approx 0.3$
A proportion is the fraction of individuals
having a particular attribute.
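The Titanic example can be checked with a short calculation. The counts come from the text; the function name `proportion` is an illustrative assumption.

```python
def proportion(with_attribute, total):
    """Fraction of individuals having a particular attribute."""
    if total <= 0:
        raise ValueError("total must be positive")
    return with_attribute / total

survivors = 654   # adult survivors, from the text
adults = 2092     # adult passengers, from the text

p_survived = proportion(survivors, adults)
print(round(p_survived, 3))  # 0.313
```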
Probability
Sir Francis Galton
The history of statistics has its
roots in biology
Inventor of fingerprints,
study of heredity of quantitative traits
Regression & correlation
Karl Pearson
Polymath
Studied genetics
Hypothesis testing
Hypothesis testing asks how unusual it is to get data that differ from the null hypothesis.
Hypothesis testing in a nutshell
We want to know something
about this population, say, are
I
Discrete distribution
Fitting probability models to
frequency data
A probability distribution describing a
discrete numerical random variable
For example,
• Number of heads from 10 flips of a coin
• N
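The coin-flip example can be written out as a discrete probability distribution. The sketch assumes a fair coin and independent flips, so the number of heads follows a binomial distribution; neither assumption is stated in the text, and the function name `binomial_pmf` is illustrative.

```python
import math

def binomial_pmf(k, n=10, p=0.5):
    """Probability of exactly k heads in n independent flips (p = chance of heads)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Probability of each possible number of heads, 0 through 10.
distribution = {k: binomial_pmf(k) for k in range(11)}

for k, probability in distribution.items():
    print(k, round(probability, 4))
```

The probabilities across all eleven outcomes sum to 1, and 5 heads is the single most likely outcome under these assumptions.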