BIO4158
Shareef Akbari
DECEMBER 21ST, 2016
UNIVERSITY OF OTTAWA
INTRODUCTION I ........ 1
SLIDES ........ 1
Types of Studies ........ 9
Scientific Method ........ 10
How Every Statistical Test Works ........ 16
NOTES ........

Lecture 3
1) Announcements & feedback
2) Measures of dispersion
3) Gaussian & standard normal distribution
4) Confidence interval of a mean
5) Central limit theorem
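The confidence-interval topic above can be sketched in a few lines. This is an illustration in Python (the course labs use R); the measurements are made-up example values, not course data:

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 10 measurements (not course data)
x = np.array([4.1, 5.0, 3.8, 4.6, 5.2, 4.4, 4.9, 4.0, 4.7, 4.3])

n = len(x)
mean = x.mean()
se = x.std(ddof=1) / np.sqrt(n)          # standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)    # two-tailed 95% critical value

ci_low, ci_high = mean - t_crit * se, mean + t_crit * se
print(f"mean = {mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The t-distribution (not the normal) is used because the population SD is estimated from the sample; by the central limit theorem, the sampling distribution of the mean approaches normality as n grows even when the raw data are not normal.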
Announcements
Teaching assistants & office hours
(TAs will be in the 3rd floor lobby b

I don't really understand how the biological question only
relates to a subset of model terms (the interaction
terms) in log-linear models. Wouldn't that be biased?
Internal vs. external hypothesis
Internal hypothesis concerns the lack of an interaction
W

Multi-way (aka multi-classification)
ANOVA - part 2
Fixed effects
All levels of interest are included; results cannot be
extrapolated beyond these levels
Objective is to make comparisons of the dependent variable
among levels of the factor (i.e. compa

Lecture 6
1. Power
2. Bayesian logic & False Discovery Rate
(Some of) Your questions
I don't understand why you would see a small P
value with a large effect observed in tiny samples
Wording in chp. 18 summary is not ideal. P depends on
sample size and

2. A student in BIO4900 fit a simple linear regression and obtained the
diagnostic plots shown below. Should the student be concerned with
potential violations of any assumptions of the model they have fit? If so,
explain which ones and why.
[Figure: regression diagnostic plots, including a Normal Q-Q panel]

Multi-way (aka multi-classification)
ANOVA - part 3
Questions
Omnibus tests are two-tailed (non-directional)
If you have directional predictions you deal with these in your
follow-up comparisons
Multi-way ANOVAs can accommodate any # of factors (i.e.

Still don't get the difference between ANOVA and ANCOVA?
Can both the dependent and independent variables be either continuous or
categorical (factors)? Our dependent variable is always continuous (see below)
while independent variables are a bit different.
Which mode

Multiple linear regression
Part 2
Model comparison by F-Test
Tests whether the addition (removal) of terms significantly increases
(decreases) model fit
Works via a model comparison just like we discussed for t-tests,
regression & ANOVA
The reduced mod
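The reduced-vs-full comparison above can be sketched directly from residual sums of squares. A minimal illustration in Python (the labs use R; data and seed are made up, and x2 is constructed to have no true effect):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 50
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 2.0 + 1.5 * x1 + rng.normal(size=n)    # x2 truly has no effect on y

def rss(X, y):
    """Residual sum of squares from an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

X_red = np.column_stack([np.ones(n), x1])        # reduced model: intercept + x1
X_full = np.column_stack([np.ones(n), x1, x2])   # full model: adds x2

rss_red, rss_full = rss(X_red, y), rss(X_full, y)
df_red, df_full = n - 2, n - 3                   # residual degrees of freedom

# F compares the drop in RSS (per added term) to the full model's error variance
F = ((rss_red - rss_full) / (df_red - df_full)) / (rss_full / df_full)
p = stats.f.sf(F, df_red - df_full, df_full)
print(f"F = {F:.3f}, P = {p:.3f}")
```

Adding a term can never increase the RSS; the F-test asks whether the decrease is larger than expected by chance alone.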

Lab: Two-sample comparisons
1. Randomization
2. Bootstrap
3. Wilcoxon signed-rank test
4. Tests for equality of variance
5. One vs. two-tailed tests
6. Midterm information
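The randomization approach in item 1 can be sketched as a permutation test of a two-sample difference in means. This is Python for illustration (the lab itself is in R), with made-up data:

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical two-sample data (not the lab's data)
a = np.array([5.1, 4.8, 6.0, 5.5, 5.9, 4.9])
b = np.array([4.2, 4.5, 4.0, 4.8, 4.1, 4.4])

obs = a.mean() - b.mean()                 # observed difference in means
pooled = np.concatenate([a, b])

n_perm = 10_000
count = 0
for _ in range(n_perm):
    perm = rng.permutation(pooled)        # reshuffle group labels
    diff = perm[:len(a)].mean() - perm[len(a):].mean()
    if abs(diff) >= abs(obs):             # two-tailed: either direction counts
        count += 1

p = (count + 1) / (n_perm + 1)            # add-one correction avoids P = 0
print(f"observed diff = {obs:.2f}, permutation P = {p:.4f}")
```

Note the logic: under H0 the group labels are exchangeable, so the null distribution is built by reshuffling labels; bootstrapping instead resamples *with* replacement to estimate a CI rather than a P-value.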
Issues with reading
Permutation tests
Bootstrapping
What test to use when
What's t

One-way analysis of variance (ANOVA) 2
1. Controlling family-wise error rate
2. Follow-up tests and multiple comparisons
3. Some questions
4. ANOVA in R
5. Non-parametric alternatives
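Item 1 (controlling family-wise error rate) can be sketched with the two classic P-value adjustments. A minimal Python illustration (the course uses R, where `p.adjust()` does this; the raw P-values below are hypothetical):

```python
import numpy as np

def bonferroni(pvals):
    """Bonferroni: multiply each P by the number of tests (capped at 1)."""
    p = np.asarray(pvals, dtype=float)
    return np.minimum(p * len(p), 1.0)

def holm(pvals):
    """Holm step-down: less conservative than Bonferroni, still controls FWER."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)                 # test the smallest P first
    adj = np.empty(m)
    running_max = 0.0
    for rank, idx in enumerate(order):
        # multiplier shrinks from m down to 1 as we step through ranked P-values
        running_max = max(running_max, (m - rank) * p[idx])
        adj[idx] = min(running_max, 1.0)  # enforce monotonicity and cap at 1
    return adj

raw = [0.001, 0.013, 0.04, 0.2]           # hypothetical follow-up comparisons
print("Bonferroni:", bonferroni(raw))
print("Holm:      ", holm(raw))
```

Holm's adjusted P-values are never larger than Bonferroni's, which is why it is usually preferred when FWER control is the goal.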
[Figure: frequency distributions for the Control, Experimental (N), and Experimental (P) groups]
ANOVA tells us w

Lecture 4
1. What's a p-value?
2. One vs. two-tailed testing
3. Relationship between CIs and P-values
4. P-hacking
Issues with readings
I still find the definition of p-values unclear
One vs two-tailed P-values (this is different from paired vs. unpaired
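The one- vs. two-tailed distinction can be made concrete with a one-sample t-test. A Python sketch (the labs use R; the data and the null value mu = 0 are made up):

```python
import numpy as np
from scipy import stats

# Hypothetical one-sample t-test against mu = 0 (not course data)
x = np.array([0.8, 1.2, 0.3, 1.5, 0.9, 1.1, 0.5, 1.4])
t, p_two = stats.ttest_1samp(x, popmean=0.0)   # two-tailed by default

# One-tailed P for the directional prediction "mean > 0":
# halve the two-tailed P when the effect is in the predicted direction
p_one = p_two / 2 if t > 0 else 1 - p_two / 2

print(f"t = {t:.2f}, two-tailed P = {p_two:.4f}, one-tailed P = {p_one:.4f}")
```

The direction must be predicted *before* looking at the data; halving P after seeing which way the effect went is a form of p-hacking.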

Multi-way (aka multi-classification)
ANOVA
1. Introduction: Part 1 of the course
Confidence intervals, statistical hypotheses, p-values,
effect size, and power
Multiple comparisons
Frequentist vs. Bayesian approach
Effect size and graphical display

Correlation and regression 2
1. CI of estimates
2. Outliers
3. Multiple testing
4. Doing Lab 3: Linear regression in R
5. Testing assumptions
6. Bootstrapping
7. Power in linear regression
Main issues with the lab manual reading
Problems with R code
B

Correlation and regression
General Difficulties
I don't understand how a correlation is different from
a linear regression.
R2 - What is it and what is it used for?
df in regression
Assumptions in regression/correlation
Least squares
Comparing mod
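The correlation-vs-regression question above has a compact numerical answer: correlation is symmetric in x and y, regression predicts one from the other, and in simple regression R² is exactly r². A Python sketch with simulated data (the labs use R):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=40)
y = 0.8 * x + rng.normal(scale=0.5, size=40)   # simulated, not course data

r = np.corrcoef(x, y)[0, 1]                # correlation: symmetric in x and y
slope = r * y.std(ddof=1) / x.std(ddof=1)  # OLS slope: directional, predicts y from x
intercept = y.mean() - slope * x.mean()

resid = y - (intercept + slope * x)
r2 = 1 - resid.var(ddof=0) / y.var(ddof=0)  # proportion of variance explained

print(f"r = {r:.3f}, slope = {slope:.3f}, R^2 = {r2:.3f}")
```

The identities slope = r·(sy/sx) and R² = r² hold only for simple (one-predictor) regression; with several predictors R² generalizes but the symmetry with correlation is lost.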

Assignment 3, Q. 4
Compute the mean warming rate in the North and in the South by
averaging the rates calculated separately by station in Q2 (i.e. not
pooling stations as done in Q3) (1 point). Calculate a 95% CI for
each of these means based on the 3 rep

General Linear Models (GLM)
Some (G)LM procedures
*either categorical or treated as a categorical variable.
Note: rather than binning a continuous variable into discrete classes, it is always
preferable to treat it as continuous and fit a regression bec

Analyzing frequencies: Contingency
tables and log-linear models
Counts (i.e. absolute frequencies)
Count data are common in biology
Counts are discrete, non-negative values on a ratio scale (often called
non-negative integers: 1, 2, 3, 4, etc.), and the variance in counts tends to
increase with the mean
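The mean-variance relationship for counts is easy to see by simulation: for Poisson counts the variance equals the mean, so both grow together. A Python sketch (hypothetical simulation, not course data):

```python
import numpy as np

rng = np.random.default_rng(7)
# For Poisson-distributed counts, variance = mean, so sample variance
# tracks the sample mean as the rate parameter lambda increases
for lam in [1, 5, 20]:
    counts = rng.poisson(lam=lam, size=100_000)
    print(f"lambda = {lam:2d}: mean = {counts.mean():.2f}, var = {counts.var():.2f}")
```

This mean-variance coupling is why ordinary least squares (which assumes constant variance) is a poor default for count data, motivating log-linear / Poisson-family models.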

Lecture 2
1. Why do we need statistics (the Scientific method)
2. Probability, parameters, and statistics
3. Confidence intervals
4. Models
5. Translating biological questions into statistical models and
   hypotheses
Comments about your feedback
In general: very good, valuable to me and to you (even if you

Multiple linear regression
Issues
R2 vs. adjusted R2
Multicollinearity
Model selection: eliminating variables/automatic vs.
manual/model comparison
Chp 38 was not required (check updated syllabus)
Multiple comparisons
Dummy/binary variables
SS
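The R² vs. adjusted R² issue above comes down to a degrees-of-freedom penalty: R² never decreases when terms are added, while adjusted R² can. A Python sketch with simulated data where only one of four predictors matters (the labs use R):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 30, 4                              # 30 observations, 4 predictors
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = 1.0 + X[:, 1] + rng.normal(size=n)    # only the first predictor matters

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
ss_res = np.sum(resid ** 2)               # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)      # total sum of squares

r2 = 1 - ss_res / ss_tot
# adjusted R^2 divides each SS by its degrees of freedom, penalizing extra terms
adj_r2 = 1 - (ss_res / (n - p - 1)) / (ss_tot / (n - 1))

print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```

Because adjusted R² uses mean squares rather than raw sums of squares, adding a useless predictor can lower it, which is why it is the better metric when comparing models with different numbers of terms.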

Multiple linear regression
Part 2
Regression coefficients
Consider 3 variables: Y, X1 and X2
We want to build a regression model that predicts Y and are
especially interested in its relationship to X1.
Two options:
1) Ignore X2. Model: Y = β0 + β1X1 + error
2
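The consequence of the two options can be shown by simulation: when X1 and X2 are correlated and both affect Y, ignoring X2 biases the estimated X1 coefficient. A Python sketch with made-up coefficients (true effect of X1 set to 1.0):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200
x2 = rng.normal(size=n)
x1 = 0.7 * x2 + rng.normal(size=n)        # X1 and X2 are correlated
y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)   # both truly affect Y

def ols(X, y):
    """Ordinary least-squares coefficient estimates."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

b_ignore = ols(np.column_stack([np.ones(n), x1]), y)       # option 1: drop X2
b_adjust = ols(np.column_stack([np.ones(n), x1, x2]), y)   # option 2: include X2

print(f"slope for X1, ignoring X2:  {b_ignore[1]:.2f}")    # absorbs X2's effect
print(f"slope for X1, adjusting X2: {b_adjust[1]:.2f}")    # near the true 1.0
```

When X2 is omitted, the X1 coefficient absorbs part of X2's effect (omitted-variable bias), so the two options answer different questions about X1's relationship with Y.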

Multi-way (aka multi-classification)
ANOVA - part 2
Reading
Dubey & Shine (2011)
Dependent variable:
Independent variables:
Type of analysis:
Model:
H0, F, P and inference:
Dubey & Shine (2011)
Error bars are mean ± 1 SE.
Question: Can they b

Lecture 5
1. What's a p-value?
2. One vs. two-tailed testing
3. Relationship between CIs and P-values
4. P-hacking
Where we are (in lectures/readings)
The scientific method and the role of statistics in it
Probability, parameters and statistics (and their
