STAT 225 Lecture 4 Notes

Sample Statistics, Box Plots, and Describing the Shape of Distributions

Motivation: As previously mentioned, a researcher is often interested in studying some underlying population. To learn something about this population, a random sample is drawn. If the sampling is conducted correctly, the information obtained from the sample can be extended to the whole population.

Plots such as histograms for numeric variables and bar charts for categorical variables are a very useful first step in the analysis, but the researcher can only get so far by looking at them. When working with numeric variables, the researcher will often want to estimate several key parameters which describe certain characteristics of the population. The next step in the analysis is therefore to calculate several statistics which describe the observed sample. These statistics often serve as very good estimates of the underlying parameters of interest.

Defn: A parameter is a number which describes a population. This number is fixed and almost always unknown.

Defn: A statistic is any function of the observed data from a study or experiment. Since the value of a statistic is not known until the random sample is observed, it is itself random, i.e. a statistic is random because it depends on which random sample is observed.

Measures of Location

Defn: Every reasonable underlying population will have some well-defined central value which we will refer to as the population mean, typically denoted by $\mu$.

Defn: The sample mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
o Where $n$ = the total number of observations
o $x_i$ denotes the $i$th individual sample observation, hence $i = 1, \dots, n$.
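The definitions above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the population (the integers 1 through 100), the sample size of 10, the seed, and the names `sample_mean`, `population`, `sample_a`, and `sample_b` are all made up for this example and do not come from the lecture. The point is that the parameter $\mu$ is one fixed number, while the statistic $\bar{x}$ depends on which random sample happens to be drawn.

```python
import random


def sample_mean(xs):
    """Return x-bar = (1/n) * (x_1 + ... + x_n) for a non-empty sample."""
    n = len(xs)
    if n == 0:
        raise ValueError("the sample mean is undefined for an empty sample")
    return sum(xs) / n


# Hypothetical population (an assumption for illustration only).
# In practice the population is usually too large to observe in full,
# which is why mu is almost always unknown.
population = list(range(1, 101))

# The parameter mu: fixed, because it describes the whole population.
mu = sample_mean(population)

# Two random samples of size n = 10 drawn without replacement.
rng = random.Random(225)  # seeded only so the run is repeatable
sample_a = rng.sample(population, 10)
sample_b = rng.sample(population, 10)

# The statistic x-bar: its value depends on the sample that was observed.
print("mu          =", mu)
print("x-bar for A =", sample_mean(sample_a))
print("x-bar for B =", sample_mean(sample_b))
```

The two sample means will generally differ from each other and from $\mu$, yet each serves as an estimate of $\mu$. Changing the seed changes the samples, and hence the statistics, but never the parameter.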

