Comparing Populations
Proportions and means
Most studies will have more than one population.
Example: The Salk vaccine trial (1954)
A large study to determine if the Salk vaccine
was effective in reducing the incidence of polio.
Two populations:
1. Individua
Linear Regression
Hypothesis testing and Estimation
Assume that we have collected data on two
variables X and Y. Let
(x1, y1), (x2, y2), (x3, y3), ..., (xn, yn)
denote the pairs of measurements on the
two variables X and Y for n cases in a sample
(or populatio
Multivariate Data
Descriptive techniques for
Multivariate data
In most research situations data is collected
on more than one variable (usually many
variables)
Graphical Techniques
The scatter plot
The two dimensional Histogram
The Scatter Plot
For two
The following data gives the results on n = 100 performances of the Olympic decathlon. The
measurements taken for each performance were:
1. x1 (100m)
2. x2 (Long jump)
3. x3 (Shot put)
4. x4 (High j
The data for this assignment were collected from n = 1800 women aged 20-45.
The following variables were measured
1. age
2. BP = systolic Blood pressure
3. BrthPl = {yes, no} (whether subject
In the following experiment the researcher was interested in the side effects of radiation therapy
on patients suffering from cancerous lesions. In particular he was interested in the effect of radiation
therap
1. In the following study the researcher was interested in determining how systolic
and diastolic blood pressure affected heart recovery rate. Subjects (n = 45) in
the experiment exercised on a stationary bike unt
1. In the following study, 20 manuscripts of an unknown author were found. Some of the evidence proved that
these unknown manuscripts were either written by author A or author B. To decide between these two
Student's t-test
Recall: The z-test for means
The Test Statistic
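\[
z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \approx \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
\]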
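As a minimal sketch of the test statistic above, the following Python snippet computes z from a sample mean, a hypothesized mean, the sample standard deviation and the sample size. The function name `z_statistic` and the numbers passed to it are hypothetical, chosen only for illustration.

```python
import math

def z_statistic(xbar, mu0, s, n):
    """Large-sample z statistic for testing H0: mu = mu0.

    Uses s in place of sigma, as in the last form of the statistic.
    """
    standard_error = s / math.sqrt(n)
    return (xbar - mu0) / standard_error

# Hypothetical values, not taken from the notes
z = z_statistic(xbar=10.5, mu0=10.0, s=2.0, n=64)
print(z)
```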
Comments
The sampling distribution of this statistic is the
standard Normal distribution
The replacement of σ by s leaves this
distribution unchanged only if the sample
Numerical Measures
Measures of Central Tendency (Location)
Measures of Non Central Location
Measure of Variability (Dispersion,
Spread)
Measures of Shape
Measures of Central Tendency
(Location)
Mean
Median
Mode
Central Location
Comparing k Populations
Means: One-way Analysis of
Variance (ANOVA)
The F test for comparing k means
Situation
We have k normal populations
Let μ_i and σ_i denote the mean and standard
deviation of population i,
i = 1, 2, 3, ..., k.
Note: we assume that the stand
Multivariate Data
Summary
Linear Regression and
Correlation
Pearson's correlation coefficient r.
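\[
r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]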
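The definition of r can be sketched directly in Python by computing S_xy, S_xx and S_yy as sums of deviations from the means. The function name `pearson_r` and the small data set are hypothetical and serve only to illustrate the formula.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed as S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    s_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    s_xx = sum((x - xbar) ** 2 for x in xs)
    s_yy = sum((y - ybar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data, not taken from the notes
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))
```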
Slope and Intercept of the Least Squares line
Slope
\[
b = \frac{S_{xy}}{S_{xx}}
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sum_{i=1}^{n} (x_i - \bar{x})^2}
\]
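A short Python sketch of the slope computation follows. The intercept formula is not visible in this excerpt, so the code assumes the usual least squares relation a = ȳ − b·x̄. The function name `least_squares_line` and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line.

    Slope: b = S_xy / S_xx.
    Intercept: a = ybar - b * xbar (assumed standard formula,
    not shown in the excerpt).
    """
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    s_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    s_xx = sum((x - xbar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = ybar - b * xbar
    return a, b

# Hypothetical data, not taken from the notes
a, b = least_squares_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(a, b)
```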
Sampling Theory
Determining the distribution of Sample
statistics
Sampling Theory
sampling distributions
Note: It is important to recognize the dissimilarity
(variability) we should expect to see in various
samples from the same population.
It is importan
Stats 245 Assignment 2 - Solutions
For each of the three drugs A, B and C
a) Compute the mean and the standard deviation.
Drug A
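\[
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{209.2}{20} = 10.46, \qquad
s = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}}
  = \sqrt{\frac{2431.72 - \frac{(209.2)^2}{20}}{19}} = 3.58
\]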
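The Drug A arithmetic can be checked from the summary figures alone (n = 20, sum of x = 209.2, sum of x squared = 2431.72). The sketch below applies the computational formula for s; the variable names are illustrative.

```python
import math

# Summary figures quoted in the Drug A solution
n = 20
sum_x = 209.2
sum_x2 = 2431.72

mean = sum_x / n
# Computational formula: s^2 = (sum x^2 - (sum x)^2 / n) / (n - 1)
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
s = math.sqrt(variance)

print(round(mean, 2), round(s, 2))
```

Rounded to two decimals, this reproduces the mean of 10.46 and the standard deviation of 3.58 given in the solution.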
Drug B
n
x=
x
i