Statistical Data Mining ORIE 474 Fall 2007 Tatiyana V. Apanasovich 09/10/07 Data Analysis & Sampling Methods

4.2 Uncertainty
Frequentist view:
  Objective concept
  Probability of an event = limiting proportion of times that the event would occur in repetitions of essentially identical situations
  Relatively restrictive concept, i.e. not applicable to one-time events
Bayesian view:
  Subjective concept
  Explicit characterization of all forms of uncertainty in a data analysis, incl. uncertainty about parameters, model structures, forecasts, etc.
  Very flexible concept
The calculus is the same for both concepts, but the underlying interpretation and the conclusions drawn differ
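Illustration (not from the original slides): the frequentist definition can be sketched by simulating repeated coin tosses and tracking the running proportion of heads. The function name `running_proportion`, the fair-coin probability 0.5, and the fixed seed are assumptions chosen for this example only.

```python
import random


def running_proportion(n_tosses, p_heads=0.5, seed=0):
    """Return the proportion of heads after each of n_tosses simulated tosses."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    heads = 0
    proportions = []
    for i in range(1, n_tosses + 1):
        # rng.random() is uniform on [0, 1), so this is heads with probability p_heads
        if rng.random() < p_heads:
            heads += 1
        proportions.append(heads / i)
    return proportions


props = running_proportion(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    # The proportion should settle near p_heads as n grows
    print(n, round(props[n - 1], 4))
```

The early proportions can be far from 0.5; only the long-run behavior matches the frequentist notion of probability, which is why the concept does not apply to one-time events.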
4.3 Random Variables
Random variable X:
  Mapping from a property of objects to a variable that can take on one of a set of possible values
  Mapping process appears to the observer to have some element of unpredictability to it
  Ex: coin toss
  Possible values for X: domain of X
Review appendix of textbook for basic properties

Multivariate Random Variables
X = (X_1, …, X_p): set of random variables
Joint density function: f(x) = f(X_1 = x_1, …, X_p = x_p) = f(x_1, …, x_p)
Marginal density: f(x_1)
Conditional density: f(x_1 | x_2)
X and Y are independent iff f(x | y) = f(x)
X is conditionally independent of Y given Z iff p(x, y | z) = p(x | z) p(y | z)
First-order Markov property
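Illustration (not from the original slides): a minimal sketch of marginal, conditional, and conditionally independent distributions for three binary variables. The probability tables are made-up numbers, and the joint is built as p(z) p(x|z) p(y|z) so that X and Y are conditionally independent given Z by construction. Independence is checked in the equivalent product form p(x, y) = p(x) p(y), which matches f(x | y) = f(x) whenever p(y) > 0.

```python
from itertools import product

# Hypothetical probability tables, chosen only for illustration
p_z = {0: 0.5, 1: 0.5}
p_x_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # indexed [z][x]
p_y_given_z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.1, 1: 0.9}}  # indexed [z][y]

# Joint distribution over outcomes (x, y, z)
joint = {
    (x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
    for x, y, z in product((0, 1), repeat=3)
}


def marginal(joint, keep):
    """Sum the joint over every position not listed in `keep`."""
    out = {}
    for outcome, prob in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out


def close(a, b, tol=1e-12):
    return abs(a - b) <= tol


p_x, p_y, p_zm = marginal(joint, (0,)), marginal(joint, (1,)), marginal(joint, (2,))
p_xy, p_xz, p_yz = marginal(joint, (0, 1)), marginal(joint, (0, 2)), marginal(joint, (1, 2))

# Marginal independence: p(x, y) == p(x) p(y) for all x, y
marginally_independent = all(
    close(p_xy[(x, y)], p_x[(x,)] * p_y[(y,)])
    for x, y in product((0, 1), repeat=2)
)

# Conditional independence: p(x, y | z) == p(x | z) p(y | z) for all x, y, z
conditionally_independent = all(
    close(
        joint[(x, y, z)] / p_zm[(z,)],
        (p_xz[(x, z)] / p_zm[(z,)]) * (p_yz[(y, z)] / p_zm[(z,)]),
    )
    for x, y, z in product((0, 1), repeat=3)
)

print("X, Y independent:", marginally_independent)
print("X, Y independent given Z:", conditionally_independent)
```

With these tables X and Y depend on each other through Z, so they are conditionally independent given Z without being marginally independent.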
4.4. Samples & Statistical Inference
[Diagram: Statistical Inference, with boxes labeled "Probability Model" and "Data"]

Statistical Inference
On the basis of a sample:
  Makes statements about the population structure
  Estimates the size of these structures
  Expresses our confidence in these estimates
Must have some structure in mind
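Illustration (not from the original slides): the three steps above can be sketched with a synthetic population. Everything here is an assumption made for the example: the normally distributed population, the sample size of 400, the population mean as the "structure" of interest, and the normal-approximation 95% interval (multiplier 1.96) as the expression of confidence.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Synthetic population; in practice the population is not fully observed
population = [rng.gauss(50.0, 10.0) for _ in range(100_000)]

# Draw a simple random sample without replacement
sample = rng.sample(population, 400)

# Estimate the population mean from the sample
mean_hat = statistics.fmean(sample)

# Standard error of the sample mean, using the sample standard deviation
std_err = statistics.stdev(sample) / math.sqrt(len(sample))

# Approximate 95% confidence interval under a normal approximation
ci_low = mean_hat - 1.96 * std_err
ci_high = mean_hat + 1.96 * std_err

print(f"estimate: {mean_hat:.2f}")
print(f"approx. 95% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"population mean (normally unknown): {statistics.fmean(population):.2f}")
```

The structure assumed in advance (here, a single population mean) determines what is estimated; a different assumed structure would call for a different estimator.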
A. Desirable Properties of Estimators

