Midterm Review, parts I and II
[The reviews are not necessarily indicative of the material on the midterm and are meant to be used solely as guides
once you have read the book and done the homework problems and labs. The midterm will inevitably contain a
different proportion of coverage of topics, as well as material different from that presented here. However, we believe
that you may find this review helpful. Homework problems are the most useful preparation for your exams.]
Statistics
are used to make inferences about populations based on a sample of that population.
The sample must be random and independent in order to assume that it is representative of the
population.
Random: each possible sample of a given size that could be drawn from the population has the
same probability of being drawn
Independent: choice of one individual for the sample does not affect the probability of another
being chosen
Descriptive statistics:
Mean (arithmetic): μ (population mean), x̄ (sample mean); the average
x̄ = Σxᵢ / n   or, for data in tabular form, x̄ = ΣFᵢxᵢ / ΣFᵢ
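The two forms of the mean can be checked with a short sketch; the data here are made up for illustration.

```python
# Sample mean from raw data and from a frequency table (values x_i with counts F_i).
raw = [2, 4, 4, 6, 9]
mean_raw = sum(raw) / len(raw)  # Σx_i / n = 25 / 5 = 5.0

# Tabular form: value -> frequency
table = {2: 1, 4: 2, 6: 1, 9: 1}
mean_tab = sum(x * f for x, f in table.items()) / sum(table.values())
```

Both forms give the same result, since the frequency table is just a compact way of listing repeated values.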
Median
(M): the value in the middle, below and above which 50% of observations lie
Mode:
the most frequently occurring value in a dataset
Range:
difference between the largest and smallest items in a sample
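The median, mode, and range can all be computed with Python's standard library; the dataset below is invented for illustration.

```python
import statistics

data = [3, 7, 7, 2, 9, 4]
med = statistics.median(data)   # sorted: 2, 3, 4, 7, 7, 9 -> (4 + 7) / 2 = 5.5
mode = statistics.mode(data)    # most frequent value: 7
rng = max(data) - min(data)     # 9 - 2 = 7
```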
Sum of squares (SS): the sum of the squared deviations from the mean of a dataset
SS = Σ(xᵢ – x̄)²
Variance: the average squared deviation from the mean.
Population variance is designated σ²; the sample variance is s².
s² = Σ(xᵢ – x̄)² / (n – 1)   or, for data in tabular form, s² = ΣFᵢ(xᵢ – x̄)² / (ΣFᵢ – 1)
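A minimal sketch of the sum of squares and the sample variance, in both the raw and tabular forms (toy data):

```python
data = [2, 4, 4, 6, 9]
n = len(data)
xbar = sum(data) / n                     # 5.0
ss = sum((x - xbar) ** 2 for x in data)  # Σ(x_i - x̄)² = 9 + 1 + 1 + 1 + 16 = 28
s2 = ss / (n - 1)                        # sample variance = 28 / 4 = 7.0

# Tabular form: value -> frequency F_i
table = {2: 1, 4: 2, 6: 1, 9: 1}
nf = sum(table.values())
xbar_t = sum(x * f for x, f in table.items()) / nf
s2_t = sum(f * (x - xbar_t) ** 2 for x, f in table.items()) / (nf - 1)
```

Dividing by n – 1 rather than n is what makes s² an unbiased estimate of the population variance σ².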
Standard deviation: σ (population), s (sample)
s = √[ Σ(xᵢ – x̄)² / (n – 1) ]
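The sample standard deviation is just the square root of the sample variance; a quick check with the same toy data:

```python
import math

data = [2, 4, 4, 6, 9]
xbar = sum(data) / len(data)
# s = sqrt(Σ(x_i - x̄)² / (n - 1)) = sqrt(28 / 4) = sqrt(7)
s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (len(data) - 1))
```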
Parameters vs. statistics:
Parameters are the true values for the population (e.g. μ, σ², σ).
Statistics are the estimated values based on the sample (e.g. x̄, s², s) and will vary from sample to
sample.
Probability: the true relative frequency of an occurrence (given an infinite number of trials)
Probability of mutually exclusive events occurring in the same experiment:
P(A or B) = P(A) + P(B)
Probability of independent events occurring in different experiments:
P(A and B) = P(A) * P(B)
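A small worked example of the two rules, using a fair six-sided die (numbers chosen for illustration):

```python
# Rolling a 1 and rolling a 2 on the same roll are mutually exclusive,
# so P(1 or 2) = P(1) + P(2).
# Rolls of two separate dice are independent,
# so P(1 on first and 1 on second) = P(1) * P(1).
p1, p2 = 1 / 6, 1 / 6
p_or = p1 + p2    # 1/3
p_and = p1 * p2   # 1/36
```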
P-value: the probability of randomly obtaining a value as extreme as or more extreme than the one
observed in a data set. “Extreme” can be on the large side, small side, or both, depending on
whether one conducts a one-tailed or a two-tailed test.
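As a toy illustration of an exact p-value (not from the text): suppose we observe 8 heads in 10 flips of a coin we suspect is biased toward heads. Under H₀ (fair coin), the one-tailed p-value is the binomial probability of 8 or more heads.

```python
from math import comb

n, k = 10, 8
# One-tailed: P(X >= 8) under a fair coin = (C(10,8) + C(10,9) + C(10,10)) / 2^10
p_one = sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n  # 56 / 1024
# Two-tailed: by the symmetry of the fair-coin distribution, double it.
p_two = 2 * p_one
```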
χ² goodness of fit: tests how well a set of observations conforms to a theoretical distribution.
1-sample test (i.e. one variable tested)
H₀: data are drawn from a given distribution (e.g. uniform, binomial, Poisson)
Degrees of freedom: (# categories – 1) – (# parameters estimated from data)
χ² = Σ (O – E)² / E
where O is the observed value, and E is the expected value
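A minimal goodness-of-fit sketch with invented counts: 60 die rolls tested against H₀ that the die is uniform (expected count 10 per face).

```python
observed = [5, 8, 9, 8, 10, 20]
expected = [sum(observed) / 6] * 6  # 10 each under H0: uniform
# χ² = Σ (O - E)² / E = (25 + 4 + 1 + 4 + 0 + 100) / 10 = 13.4
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1              # 6 categories, no parameters estimated -> df = 5
```

The computed χ² would then be compared against the χ² distribution with 5 degrees of freedom to obtain a p-value.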
χ² contingency table: tests for associations between variables
H₀: categories are independent
Expected values for a particular cell in the table are calculated as:
(column total × row total) / grand total
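The expected-value rule can be sketched for a small 2×2 table of made-up counts:

```python
table = [[10, 20],
         [30, 40]]
row_totals = [sum(row) for row in table]        # [30, 70]
col_totals = [sum(col) for col in zip(*table)]  # [40, 60]
grand = sum(row_totals)                         # 100
# E[i][j] = (row total i * column total j) / grand total
expected = [[r * c / grand for c in col_totals] for r in row_totals]
# expected = [[12.0, 18.0], [28.0, 42.0]]
```

The χ² statistic is then computed from these expected cells exactly as in the goodness-of-fit formula, with df = (# rows – 1) × (# columns – 1).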
Winter '04, Evans