Stat 305 Week 8, Monday. October 24, 2016
(Pearson) Correlation and regression.
Hypothesis testing for correlation and regression
 Installing R and inputting data.
 Different tools for R: Notepad++ and RStudio.
 Basic commands: mean(), sd(), t.test()
Five lecture hours (+2 recorded) remain.
Left to do:
 Logistic regression with multiple predictors (bird houses!)
 Survival analysis
Kaplan-Meier method
Cox proportional hazards test
Log-rank test
(if time) Time censoring
 Practice final
Week 3 Supplemental: The Odds.
Odds
Odds are a lot like probability, but are calculated differently.
Probability of event = (Times event occurs) / (Times anything occurs)
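As a rough sketch of how the two calculations differ: probability divides the event count by every occurrence, while odds divide it by the occurrences where the event did not happen. The counts below (20 events out of 100) are made up for illustration, and the function names are hypothetical, not from the course notes.

```python
from fractions import Fraction

def probability(event_count, total_count):
    # Times the event occurs, over times anything occurs.
    return Fraction(event_count, total_count)

def odds(event_count, total_count):
    # Times the event occurs, over times it does NOT occur.
    return Fraction(event_count, total_count - event_count)

# Hypothetical counts: the event happened 20 times out of 100.
p = probability(20, 100)
o = odds(20, 100)
print("probability:", p)
print("odds:", o)
```

Exact fractions are used so the two results can be compared without rounding.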
Stat 305 PRACTICE Midterm 2
PRACTICE Problem 1: Consider the following regression output, which comes from data of
100 local chess players of near average skill.
Chess_Skill: The skill of the player, in Elo rating.
Games_Played: The number of registered games played
Stat 305 Midterm 2
Wednesday, November 16, 2016
You have exactly 50 minutes to complete this exam.
This test has 7 pages including this one, and a table.
Stat 305 Midterm 1  PRACTICE MIDTERM
Wednesday, October 12, 2016
You have exactly 50 minutes to complete this exam.
This test has 6 pages including this one, and tables.
Stat 305, Fall 2016 Assignment 3 KEY
Question 19.6, as is.
Question 19.7,
Question 19.8.
19.8a, 19.8g
19.8b
Holding apgar score constant, the systolic blood pressure of infants increases by 1.1848 for
each additional unit
Final exam list of priorities
Midterm 1 (20%)

Probability: A or B, A and B, Bayes' law
Sensitivity vs specificity, ROC curve
Midterm 2 (40%)

Regression
o Correlation
o Simple regression
transformation
o Multiple regression
Interactions, polynomials,
Stat 305 Practice Final
Problem 1
Consider the multiple linear regression outlined in this R output.
The response is birth weight in grams (BWT), the explanatory variables are age in years
(AGE), smoking status (SMOKE, 2 categories), weight in pounds of the mother
Stat 305, Fall 2016 Assignment 2,
Due in the drop box near the Stats Workshop at 4:30pm on November 3, 2016.
! SEE THE R CODE AT THE END OF THIS ASSIGNMENT !
All of these questions either come directly from Chapters 18 of Principles of Biostatistics 2nd Edition
STAT 305 Assignment 4 Solution
Due: Nov 18, 16:30, 2015
Note: There are 2 big questions in total. No further updates.
Updated on Nov 11, 2015.
1. (2 marks) Short answers:
(a) (1 mark) Briefly, what does a positive association between two quantitative
variables mean?
Stat 305 Week 9 Monday
Multiple regression and R-squared.
Multiple regression: collinearity, perturbations,
correlation matrix
Stat 302 Notes. Week 7, Hour 1, Page 1 / 59
First observe this data from the 2011-12 National Hockey
League season.
We are going
Stat 305 Week 9 Wednesday
How to get a confidence interval of a slope.
Variance inflation factors
Polynomial fits
Stat 305 Notes. Week 9 Wednesday, Page 1 / 43
How to get a confidence interval of a slope.
Consider this output again, from the blood pressure example
Stat 305 Week 7, Hour 1. October 18, 2016
Correlation vs association,
Pearson's r,
nonlinearity,
Spearman rank correlation,
Stat 305 Notes. Week 7, Hour 1, Page 1 / 39
Correlation vs association
Association refers to any sort of trend between any
With linear regression, we make the assumption that Y
increases or decreases with X, but does so at the same rate
for every X.
Stat 302 Notes. Week 7, Hour 3, Page 1 / 13
This is not always the case. Sometimes a transformation, like
the log transform can help.
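A minimal sketch of why the log transform can help, assuming made-up data where Y triples for every unit of X. On the original scale the rate of increase changes with X, but log(Y) rises by the same amount for every X, so a straight line fits. The data and the helper name `least_squares` are hypothetical and not from the course notes.

```python
import math

def least_squares(x, y):
    # Ordinary least squares for one predictor: returns (intercept, slope).
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data: Y = 2 * 3^X, so Y does not change at a constant rate.
x = [0, 1, 2, 3, 4]
y = [2 * 3 ** xi for xi in x]

# After the log transform the relationship is exactly linear.
log_y = [math.log(yi) for yi in y]
intercept, slope = least_squares(x, log_y)
print("intercept on log scale:", intercept)
print("slope on log scale:", slope)
```

In this constructed case the fitted slope is log(3) and the intercept is log(2), which recovers the multiplicative pattern in the original data.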
STAT 305 Assignment 3 Solution
1. (Ch. 15, # 10; 4 marks)
In a survey conducted in Italy, physicians with different specialities were questioned regarding the surgical treatment of early breast cancer. In particular,
they were asked whether they would recommend surgery
STAT 305 Assignment 2 Solution
To TAs: It's OK if students have slightly different results because of different
rounding.
1. (9 marks) A study was conducted to investigate the relationship between
maternal smoking during pregnancy and the presence of congenital malformations
Stat 305 Week 4, Hour 3. September 28, 2016
Calculating the odds ratio, the criss-cross method.
Yates' Continuity Correction, and Fisher's Exact Test
Odds ratio for 3 or more conditions. (2 x N tables)
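A small sketch of the odds ratio for a 2 x 2 table, assuming the criss-cross method refers to the usual cross-product form (a*d)/(b*c). The counts and the function name are hypothetical, chosen only to show that the cross-product agrees with dividing one group's odds by the other's.

```python
from fractions import Fraction

def odds_ratio(a, b, c, d):
    # 2 x 2 table layout (an assumption for this sketch):
    #              outcome yes   outcome no
    #   group 1         a             b
    #   group 2         c             d
    # Cross-product form: multiply along the diagonals and divide.
    return Fraction(a * d, b * c)

# Made-up counts for illustration.
a, b, c, d = 30, 70, 10, 90

cross_product = odds_ratio(a, b, c, d)
ratio_of_odds = Fraction(a, b) / Fraction(c, d)
print("cross-product odds ratio:", cross_product)
print("odds in group 1 / odds in group 2:", ratio_of_odds)
```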
Stat 305 Notes. Week 4, Hour 3, Page 1 / 36
Consider this
Stat 305 Week 11
How NOT to handle binary response variables.
Odds, Log Odds, and the Logit Function
Examples of LOGISTIC REGRESSION
Logistic regression with a dummy variable
Logistic regression with a numeric variable
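For the odds, log odds, and logit topics listed above, a short sketch using the standard definitions: odds = p / (1 - p), and the logit is the log of the odds. The function names are hypothetical and not taken from the course notes.

```python
import math

def odds_from_probability(p):
    # Odds corresponding to a probability p, for 0 <= p < 1.
    return p / (1 - p)

def logit(p):
    # Log odds: maps probabilities in (0, 1) onto the whole real line.
    return math.log(odds_from_probability(p))

def inverse_logit(log_odds):
    # Converts log odds back to a probability.
    return 1 / (1 + math.exp(-log_odds))

for p in (0.2, 0.5, 0.8):
    print(p, odds_from_probability(p), logit(p))
```

A probability of 0.5 corresponds to odds of 1 and log odds of 0, which is why a log odds of 0 is the natural "no preference" point on the logit scale.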
Stat 305 Notes. Week 10, Page 1 / 7
Graph 1. Children per woman (total fertility) vs. Income per person (GDP/capita, PPP$ inflation adjusted)
for 1960-2015.
From the graph shown in Gapminder, there is a strong negative relationship between the two variables.
Breaking this down.
Intercept: Response when all explanatory variables = 0
(including dummy variables).
Therefore,
Intercept: Response at the baseline situation (ALL the
baseline situations).
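To illustrate the point above, a small sketch with made-up coefficients: when every explanatory variable, including every dummy variable, is set to 0, the predicted response is the intercept. The coefficient values and variable names are hypothetical and do not come from any output in the course.

```python
def predict(intercept, coefficients, values):
    # Linear predictor: intercept plus the sum of coefficient * value.
    return intercept + sum(coefficients[name] * values[name] for name in coefficients)

# Hypothetical fitted model: one numeric predictor and one dummy variable.
intercept = 3000.0
coefficients = {"numeric_x": 10.0, "dummy_group": -250.0}

# Baseline situation: numeric predictor at 0 and the dummy at its baseline level.
baseline = predict(intercept, coefficients, {"numeric_x": 0, "dummy_group": 0})

# Same numeric value, but in the non-baseline group.
other_group = predict(intercept, coefficients, {"numeric_x": 0, "dummy_group": 1})

print("baseline prediction:", baseline)
print("non-baseline prediction:", other_group)
```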
Stat 305 Notes. Week 12, Page 1 / 24
Response is log-odds, so.
Chapter 19 Multiple Regression
planatory variables, how would you decide which variables to include in a multiple regression model and which to leave out?
6. How
Week 13, Wednesday
Left to do:
 Some clarifications
 Survival analysis
Kaplan-Meier method, confidence intervals
Cox proportional hazards test (Monday)
Log-rank test
 Practice final (Monday? and recorded)
Stat 305 Notes. Week 13 Wednesday, Page 1 / 34
Some