Stat 411 – Lecture Notes
Statistics and sampling distributions *†‡
Ryan Martin
Spring 2012

* Version: January 11, 2012
† Please do not distribute these notes without the author’s consent ( [email protected] )
‡ These notes are meant to supplement in-class lectures. The author makes no guarantees that these notes are free of typos or other, more serious errors.

1 Introduction

Statistics is closely related to probability theory, but the two fields have entirely different goals. Recall, from Stat 401, that a typical probability problem starts with some assumptions about the distribution of a random variable (e.g., that it’s binomial), and the objective is to derive some properties (probabilities, expected values, etc.) of said random variable based on the stated assumptions. The statistics problem goes almost completely the other way around. Indeed, in statistics, a sample from a given population is observed, and the goal is to learn something about that population based on the sample. In other words, the goal in statistics is to reason from sample to population, rather than from population to sample as in the case of probability. So while the two things, probability and statistics, are closely related, there is clearly a sharp difference. One could even make the case that a statistics problem is actually more challenging than a probability problem because it requires more than just mathematics to solve. (The point here is that, in a statistics problem, there’s simply too much information missing about the population to be able to derive the answer via the deductive reasoning of mathematics.)

The goal of Stat 411 is to develop the mathematical theory of statistics, mostly building on multivariate calculus and probability theory at the level of Stat 401.

To understand the goal a bit better, let’s start with some notation. Let $X_1, \ldots, X_n$ be a random sample (independent and identically distributed, iid) from a distribution with cumulative distribution function (CDF) $F(x)$. The CDF admits a probability mass function (PMF) $p(x)$ in the discrete case and a probability density function (PDF) $f(x)$ in the continuous case. One can imagine that $p(x)$ or $f(x)$ characterizes the population from which $X_1, \ldots, X_n$ is sampled. Typically, there is something about this population that is unknown; otherwise, there’s not much point in sampling from it. For example, if the population in question is of registered voters in Cook County, then one might be interested in the unknown proportion that would vote Democrat in the upcoming election. The goal would be to “estimate” this proportion from a sample. But the point here is that the population/distribution of interest is not completely known. Mathematically, we handle this by introducing a quantity $\theta$, taking values in some $\Theta \subseteq \mathbb{R}^d$, $d \geq 1$, and weakening the initial assumption by saying that the distribution in question has PMF or PDF of the form $p_\theta(x)$ or $f_\theta(x)$ for some $\theta \in \Theta$. That is, the statistician believes that
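
As a rough illustration of the voter example above, the following Python sketch treats each sampled voter as an independent Bernoulli($\theta$) draw and estimates $\theta$ by the sample proportion. The value 0.6, the sample size, the seed, and the function names are hypothetical choices made for illustration; none of them come from the notes, and the sample proportion is just one natural estimate, not necessarily the one the notes go on to develop.

```python
import random


def draw_sample(theta, n, rng):
    """Draw n iid Bernoulli(theta) observations (1 = would vote Democrat, 0 = otherwise)."""
    return [1 if rng.random() < theta else 0 for _ in range(n)]


def estimate_proportion(sample):
    """Sample proportion, used here as an estimate of the unknown theta."""
    return sum(sample) / len(sample)


rng = random.Random(411)  # fixed seed so the run is reproducible
true_theta = 0.6          # hypothetical value; in practice the statistician does not know it

# Probability direction: theta is assumed known and a sample is generated from it.
sample = draw_sample(true_theta, n=1000, rng=rng)

# Statistics direction: only the sample is observed, and theta is estimated from it.
theta_hat = estimate_proportion(sample)
print(f"estimate of theta from n={len(sample)} observations: {theta_hat:.3f}")
```

The two commented steps mirror the contrast drawn in the text: generating the sample reasons from population to sample, while computing `theta_hat` reasons from sample back to population.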