Introduction to Probability and Statistics in Biology and Public Health
Fall 2015
In the 1954 classic How to Lie with Statistics, Darrell Huff and the illustrator Irving Geis use the
following humorous sketch to illustrate how averages can be misleading summaries of
skewed distributions:
These are the
Spring 2013
Analysis of Variance
This may be difficult but try your best. We will go over it in class. Thanks.
1. (I know this problem is annoying, but I think it is important you try at least one of
these by hand so you know what i
Spring 2013
HOMEWORK # 2
1. A medical researcher in India obtained blood samples from 31 young children, all
of whom were infected with malaria. The following data, listed in increasing
order, are the number of malarial parasites found in 1 m
Spring 2013
PAIRED ANALYSES AND MANN-WHITNEY U-TEST
1. Here are the data for an in-class experiment. The simple idea was that eating a
banana while studying may improve your short-term memory. In our experiment,
the memory test
CATEGORICAL DATA ANALYSIS
1. As part of a study of environmental influences on sex determination in the fish
Menidia, eggs from a single mating were divided into two groups and raised in either a
warm or a cold environment. It w
PROBLEM SET # 4
1. (Edition 3, 5.4) A new treatment for AIDS is to be tested in a clinical trial on 15
patients. The proportion p hat who respond to the treatment will be used as an
estimate of the proportion p of potential respo
REGRESSION AND CORRELATION
1. (Edition 3, 12.8, p. 538) The rowan is a tree that grows in a wide range of
altitudes. To study how the tree adapts to its varying habitats, researchers
collected twigs with attached buds from 12 tr
Problems and Answers for Chapter 5
Introduction to Hypothesis Testing
1. Explain why most researchers are more comfortable rejecting H0 than accepting it. Use a
probability argument in your explanation.
1. When H0 is rejected, the probability of making a
FINAL EXAMINATION
1. Ellen Davis Jones studied reminiscence therapy for older women with
depression. She studied 15 women 60 years or older residing for three months or
longer in an assisted living long-term care
PROBLEM SET # 5
1. In evaluating a forage crop, it is important to measure the concentration of various
constituents of plant tissue. In a study of the reliability of such measurements, a
batch of alfalfa was dried, ground, and pas
Hour Exam
1. The following are data on the weights (in lbs) of 15 trout caught in
the Geneva Lakes trout derby in 1994.
Weights
2.26 3.57 7.86 2.45 1.85 3.88 4.60 4.90 3.60 3.89 2.14
1.52 2.83 1.84 2.12
A. Without using
PROBLEM SET # 6
1. Some soap manufacturers sell special antibacterial soaps. However, one might
expect ordinary soap to also kill bacteria. To investigate this, a researcher
prepared a solution from an ordinary, nonantibiotic so
Biostatistics Spring 2013
Answers to Practice Midterm
1. Q1 = 2.12, Q3 = 3.89, median = 2.83, IQR = 1.77; 7.86 is an outlier because it is
more than 1.5 * IQR above Q3.
2. A. Pr (Type A) x Pr (Rh+) x Pr (Type MN) = 0.30 x 0.85 x 0.2 = 0.051.
B. Since trai
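The quartiles in answer 1 can be checked with a short sketch. It assumes the answer refers to the 15 trout weights listed under the Hour Exam entry, and it assumes the convention of taking Q1 and Q3 as the medians of the lower and upper halves with the overall median excluded. Other quartile conventions can give slightly different values. The function name `quartiles` is only illustrative.

```python
from statistics import median

# Trout weights (lbs) as listed under the Hour Exam entry.
weights = [2.26, 3.57, 7.86, 2.45, 1.85, 3.88, 4.60, 4.90, 3.60, 3.89, 2.14,
           1.52, 2.83, 1.84, 2.12]

def quartiles(values):
    """Return (Q1, median, Q3) using medians of the lower and upper halves.

    Assumed convention: when n is odd, the overall median is left out of
    both halves.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)

q1, med, q3 = quartiles(weights)
iqr = q3 - q1

# 1.5 * IQR rule: flag points beyond the fences.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [w for w in weights if w < lower_fence or w > upper_fence]

print(f"Q1={q1}, median={med}, Q3={q3}, IQR={iqr:.2f}")
print(f"upper fence={upper_fence:.3f}, outliers={outliers}")
```

Under these assumptions the upper fence is 3.89 + 1.5 × 1.77 = 6.545, so only the largest weight in the list is flagged.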
Statistics 13 Homework 3
Chapter 4 Homework Solutions [9] The serum cholesterol levels of 17-year-olds follow a normal distribution with mean 176 mg/dL and standard deviation 30 mg/dL. What percentage of 17-year-olds have serum cholesterol values: [9a]
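Questions of this kind reduce to evaluating the normal cumulative distribution function. The sketch below uses the mean and standard deviation stated in the problem; the cutoff of 206 is a hypothetical value chosen only for illustration, since the actual cutoffs in parts [9a] onward are not shown here.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

MEAN, SD = 176.0, 30.0   # from the problem statement
cutoff = 206.0           # hypothetical cutoff, one SD above the mean

z = (cutoff - MEAN) / SD
pct_above = 100.0 * (1.0 - normal_cdf(cutoff, MEAN, SD))
pct_below = 100.0 * normal_cdf(cutoff, MEAN, SD)

print(f"z = {z:.2f}")
print(f"percent above {cutoff}: {pct_above:.2f}")
print(f"percent below {cutoff}: {pct_below:.2f}")
```

The percentage between two cutoffs is the difference of the two `normal_cdf` values, times 100.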
Math 265 Homework 8 DUE Thursday April 15th
1. If a woman takes an early pregnancy test, she will either test positive, meaning that the test says she is
pregnant, or test negative, meaning that the test says that she is not pregnant. Suppose that is a wom
Pearson correlation coefficient r and the slope b
What does correlation tell us?
As Baldi and Moore say on page 74:
A linear relationship is strong if the points lie close to a straight line,
and weak if they are widely scattered about a line.
And they ad
outline, part 2
Wednesday March 2
recap: distribution of X values vs. distribution of
Central Limit Theorem
problems using the normal curve
introduction to confidence interval for a mean
assumptions for using the t distribution
assessing the dist
always plot the data
When the calculations for regression and correlation had to be done by hand, or with the aid of
very simple calculators, investigators (or their graduate students) became very familiar with the
values of each observation and the overa
using the fitted values from a regression
to estimate the mean outcome for the population with a given X
and to predict the outcome for an individual with a given X;
confidence intervals for estimation and prediction
The reason these questions can be tric
fitting a line and the principle of least squares
How shall we use the data to estimate the coefficients in the regression model, the intercept and the slope?
Since the purpose of the model is to understand
how the mean of Y varies as a function of X
The usual criteria for choo
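A minimal sketch of the least-squares principle named in the heading: the fitted intercept and slope are the values that minimize the sum of squared vertical distances between the observed Y values and the line. The data points and the function names `fit_line` and `sse` are made up for illustration and do not come from the notes.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sse(intercept, slope, xs, ys):
    """Sum of squared residuals for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up illustrative data (not from the notes).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
best = sse(b0, b1, xs, ys)

print(f"intercept={b0:.3f}, slope={b1:.3f}, SSE={best:.4f}")
# Any other line gives a larger sum of squared residuals.
print(f"SSE with slope nudged by 0.1: {sse(b0, b1 + 0.1, xs, ys):.4f}")
```

Nudging either coefficient away from the fitted value increases the sum of squared residuals, which is what "least squares" means in practice.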