Lecture 11 Notes, Nonparametric Statistics
Nonparametric methods do not depend on the population fitting any particular type of distribution
(e.g., normal). They make fewer assumptions and apply more broadly, at the
expense of a less powerful test (needing more observations to draw
Lecture 10 Notes, Hypothesis Testing
A (parametric) hypothesis is a statement about one or more population
parameters. This hypothesis can be tested using a hypothesis test.
A hypothesis test consists of:
1. Two comple
Lecture 9 Notes, Intervals
1 Interval Estimation
Interval estimation is another approach for estimating a parameter $\theta$. Interval estimation
consists of finding a random interval that contains the true parameter with probability
$(1 - \alpha)$. Such an interval is ca
Lecture 8 Notes, Estimation
A pmf/pdf can be equivalently written as $f_X(x)$ or $f_X(x \mid \theta)$, where $\theta$ represents the
constants that fully define the distribution. For example, if X is a normally distributed RV,
the constants $\mu$ and $\sigma^2$ will f
Lecture 2 Notes, Discrete and Random Variables
A random variable is a variable with unknown numerical value that can take on, or
represent, any possible element from a sample space. The elements of a sample
Lecture 6 Notes, Normal Distribution
1 Discrete Distributions
Let the RV X be the total number of successes in a sample of n elements drawn
from a population of N elements with a total number of M successes. Then, the
Lecture 7 Notes, Random Sampling
Let $X_1, \ldots, X_n$ be mutually independent RVs such that $f_{X_i}(x) = f_{X_j}(x)$ for all $i \neq j$. Denote $f_{X_i}(x) = f(x)$. Then the collection
$X_1, \ldots, X_n$ is called a random sample of size n from the population $f(x)$.
Lecture 5 Notes, Vectors
1 Function of a Random Variable (Univariate Model)
Let X be a discrete random variable with pmf $f_X(x)$. Define a new random variable Y
as a function of X, $Y = r(X)$. The pmf of Y, $f_Y(y)$, is derived as follows:
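A standard way to obtain the pmf of Y is to add up $f_X(x)$ over every x that r maps to the same y. The sketch below illustrates that idea. The function name `pmf_of_function`, the uniform pmf on {-2, ..., 2}, and the choice r(x) = x^2 are assumptions made for illustration and do not come from the notes.

```python
from collections import defaultdict
from fractions import Fraction


def pmf_of_function(pmf_x, r):
    """Return the pmf of Y = r(X) for a discrete X.

    pmf_x maps each value x to P(X = x). Every x with the same image r(x)
    contributes its probability to that image.
    """
    pmf_y = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for x, p in pmf_x.items():
        pmf_y[r(x)] += p
    return dict(pmf_y)


# Hypothetical example: X uniform on {-2, -1, 0, 1, 2}, Y = X^2.
pmf_x = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}
pmf_y = pmf_of_function(pmf_x, lambda x: x * x)

for y in sorted(pmf_y):
    print(y, pmf_y[y])
```

Because both -2 and 2 map to 4 (and both -1 and 1 map to 1), those values of Y each collect probability 2/5, while 0 keeps probability 1/5.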
Lecture 1 Notes, Set and Probability
Definitions and Theorems
1. Experiment: any action or process whose outcome is subject to uncertainty.
2. Sample Space: collection of all possible outcomes (or elements) of the
experiment (set S). [Fin
Lecture 4 Notes, Expectations
1 Expected Value
1.1 Univariate Model
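Let X be a RV with pmf/pdf $f(x)$. The expected or mean value of X, denoted $E(X)$
or $\mu_X$, is defined as:
$E(X) = \mu_X = \sum_{x} x\, f(x)$ (X discrete)
$E(X) = \mu_X = \int_{-\infty}^{\infty} x\, f(x)\, dx$ (X continuous)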
Lecture 3 Notes, Multivariable Distribution
1 Multiple Random Variables
Many experiments deal with more than one source of uncertainty. For these cases a random
vector must be defined to contain the multiple random variables we
Lecture 10 Notes, Regression
Regression analysis allows us to estimate the relationship of a response
variable to a set of predictor variables. Let
$x_1, x_2, \ldots, x_n$
be settings of x chosen by the investigator and
$y_1, y_2, \ldots, y_n$
be the corresponding values of the res
Lecture 2 Notes, Data
A population is a collection of objects, items, humans/animals (units) about
which information is sought.
A sample is a part of the population that is observed.
A parameter is a numerical characteristic
Lecture 9 Notes, Two-Sample Inference
Independent Samples Design:
There are a few different ways we can do an experiment. In an independent samples design,
we have an independent sample from each population. The data from the two groups are independent.
Lecture 6 Notes, Inference
Statistical Inference is the process of making conclusions using data that is subject to random variation.
$\mathrm{Bias}(\hat{\theta}) := E(\hat{\theta}) - \theta$, where $\theta$ is the true parameter value and $\hat{\theta}$ is an estimate of it
computed from data.
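To make the bias definition concrete, here is a minimal sketch that computes the bias exactly by enumerating every possible sample. The fair-coin population coded as 0/1, the sample size of 3, and the helper names `exact_bias`, `mean`, and `var_divide_by_n` are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction
from itertools import product


def exact_bias(estimator, values, n, true_value):
    """Bias = E(estimate) - true value.

    The expectation is taken over all samples of size n drawn independently
    from equally likely population values, so every sample has equal weight.
    """
    samples = list(product(values, repeat=n))
    expected = sum(Fraction(estimator(s)) for s in samples) / len(samples)
    return expected - true_value


def mean(sample):
    return Fraction(sum(sample), len(sample))


def var_divide_by_n(sample):
    # Variance estimate that divides by n rather than n - 1.
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)


values = (0, 1)  # fair coin: true mean 1/2, true variance 1/4
n = 3
bias_mean = exact_bias(mean, values, n, Fraction(1, 2))
bias_var = exact_bias(var_divide_by_n, values, n, Fraction(1, 4))
print(bias_mean, bias_var)
```

In this setup the sample mean has bias 0, while the divide-by-n variance estimate has bias -1/12, which matches the general value $-\sigma^2/n$ for that estimator.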
Mean-Squared Error (MSE)
Lecture 4 Notes, Central Limit
Let $X_1, X_2, \ldots, X_n$ be a random sample drawn from any distribution with a finite mean $\mu$ and
variance $\sigma^2$. As $n \to \infty$, the distribution of:
$\frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
converges to the distribution $N(0, 1)$. In other words,
Note 1: What
Lecture 5 Notes, Confidence Intervals
Instead of reporting a point estimator, that is, a single value, we want to report a
confidence interval [L, U] where:
$P\{L \le \theta \le U\} = 1 - \alpha$,
the probability of the true value being within [L, U] is pretty large.
Here, [L, U]
Lecture 1 Notes, Probability
A probability space, defined by Kolmogorov (1903-1987), consists of:
A set of outcomes S, e.g.,
for the roll of a die, S = {1, 2, 3, 4, 5, 6},
for the roll of two dice, S = {(1, 1), (2, 1), ..., (6, 6)},
temperature on Monday
Lecture 8 Notes, Single Sample Inference
You already know that for a large sample, you can invoke the CLT, so:
$\bar{X} \approx N(\mu, \sigma^2/n)$.
Also, for a large sample, you can replace an unknown $\sigma$ by s.
You know how to do a hypothesis test for the mean, either:
calculate z-statistic a
Lecture 3 Notes, Numerical Data
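Sample mean: $\bar{x} := \frac{1}{n} \sum_{i=1}^{n} x_i$
Sample median: order the data values $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$, so then
median := $\tilde{x}$ := $x_{((n+1)/2)}$ if n is odd, and
$\frac{1}{2}[x_{(n/2)} + x_{(n/2+1)}]$ if n is even.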
Mean and median can be very different: 1, 2, 3, 4, 500.
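The gap between the two summaries can be checked directly on the data set above. This is a small sketch using Python's standard `statistics` module; the variable names are chosen here for illustration.

```python
from statistics import mean, median

data = [1, 2, 3, 4, 500]

sample_mean = mean(data)      # sum of the values divided by n = 510 / 5
sample_median = median(data)  # middle value of the sorted data (n is odd)

print(sample_mean, sample_median)
```

The single large value 500 pulls the mean up to 102, while the median stays at 3, the middle of the ordered values.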