# stat609-34 - Chapter 6 Principles of Data Reduction Lecture...


Chapter 6. Principles of Data Reduction

Lecture 34: Sufficiency

UW-Madison (Statistics) Stat 609 Lecture 34 2014

## Data reduction

We consider a sample $X = (X_1, \ldots, X_n)$, $n > 1$, from a population of interest. Each $X_i$ may be a vector, and $X$ need not be a random sample, although most of the time we consider a random sample. Assume the population is indexed by $\theta$, an unknown parameter vector.

Let $\mathcal{X}$ be the range of $X$. Let $x$ be an observed data set, a realization of $X$. We want to use the information about $\theta$ contained in $x$.

The whole $x$ may be hard to interpret, and hence we summarize the information by using a few key features (statistics), for example the sample mean, the sample variance, and the largest and smallest order statistics.

Let $T(X)$ be a statistic. For $T$, if $x \neq y$ but $T(x) = T(y)$, then $x$ and $y$ provide the same information and can be treated as the same.
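To make the idea of treating $x$ and $y$ as the same whenever $T(x) = T(y)$ concrete, here is a minimal Python sketch. The setting is an assumption chosen for illustration and is not taken from the lecture: three binary observations with $T(x) = x_1 + x_2 + x_3$. The helper name `partition_by_statistic` is hypothetical.

```python
from collections import defaultdict
from itertools import product


def partition_by_statistic(sample_space, statistic):
    """Group sample points x by the value t = statistic(x)."""
    blocks = defaultdict(list)
    for x in sample_space:
        blocks[statistic(x)].append(x)
    return dict(blocks)


# Assumed toy setting: n = 3 binary observations, T(x) = x_1 + x_2 + x_3.
sample_space = list(product([0, 1], repeat=3))
blocks = partition_by_statistic(sample_space, sum)

# Every point in a block has the same value of T, so a user of T alone
# cannot tell the points in one block apart.
for t in sorted(blocks):
    print(t, blocks[t])
```

The eight sample points collapse into four groups, one for each possible value of the sum, which is the sense in which a statistic reduces the data.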
$T$ partitions $\mathcal{X}$ into sets $A_t = \{x : T(x) = t\}$, $t \in \mathcal{T}$ (the range of $T$). All points in $A_t$ are treated the same if we are interested in $T$ only. Thus, $T$ provides a data reduction.

We wish to reduce the data as much as we can, but not lose any information about $\theta$ (or at least not lose any important information).

## Sufficiency

A sufficient statistic for $\theta$ is a statistic that captures all the information about $\theta$ contained in the sample. Formally, we have the following definition.

Definition 6.2.1 (sufficiency). A statistic $T(X)$ is sufficient for $\theta$ if the conditional distribution of $X$ given $T(X) = T(x)$ does not depend on $\theta$.

Sufficiency depends on the parameter of interest.
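The definition can be checked numerically in a small case. The sketch below assumes a model that is not stated in the text above: $X_1, \ldots, X_n$ i.i.d. Bernoulli($\theta$) with $T(X) = \sum_i X_i$. It computes the conditional distribution of $X$ given $T(X) = t$ exactly (using fractions) for two arbitrary values of $\theta$ and compares them. The function name `conditional_given_sum` is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def conditional_given_sum(theta, n, t):
    """Return P(X = x | sum(X) = t) for every x with sum(x) = t.

    Assumes X_1, ..., X_n are i.i.d. Bernoulli(theta) with 0 < theta < 1.
    """
    def joint(x):
        s = sum(x)
        return theta ** s * (1 - theta) ** (n - s)

    block = [x for x in product([0, 1], repeat=n) if sum(x) == t]
    p_t = sum(joint(x) for x in block)  # P(T = t)
    return {x: joint(x) / p_t for x in block}


n, t = 4, 2
cond_a = conditional_given_sum(Fraction(1, 5), n, t)
cond_b = conditional_given_sum(Fraction(7, 10), n, t)

# The two conditional distributions agree, and each point in the block
# {x : sum(x) = t} has probability 1 / C(n, t).
print(cond_a == cond_b)
print(set(cond_a.values()) == {Fraction(1, comb(n, t))})
```

Agreement at two parameter values is only a sanity check, not a proof. In this assumed model the algebra shows the same thing for every $\theta$ in $(0, 1)$, since $\theta^t (1-\theta)^{n-t}$ cancels between the joint probability and $P(T = t)$.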
To show $T$ is sufficient for $\theta$