when we have a data set that contains an outlier, it is often better to use the median to describe the center, rather than the mean.

Example D

In 2005, the CEO of Yahoo, Terry Semel, was paid almost \$231,000,000 (see 2005/rank.html). This is certainly not typical of what the average worker at Yahoo could expect to make. Instead of using the mean salary to describe how Yahoo pays its employees, it would be more appropriate to use the median salary of all the employees.

You will often see medians used to describe the typical value of houses in a given area, as the presence of a very few extremely large and expensive homes could make the mean appear misleadingly large.

On the Web

Java Applets helpful to understand the relationship between the mean and the median: lane/stat_sim/descriptive/index.html

Vocabulary

When examining a set of data, we use descriptive statistics to provide information about where the data are centered:

The mode is a measure of the most frequently occurring number in a data set and is most useful for categorical data and data measured at the nominal level.

The mean and median are two of the most commonly used measures of center.

The mean, or average, is the sum of the data points divided by the total number of data points in the set. In a data set that is a sample from a population, the sample mean is denoted by x̄. The population mean is denoted by μ.

The median is the numeric middle of a data set. If there are an odd number of data points, this middle value is easy to find. If there is an even number of data values, the median is the mean of the middle two values.

An outlier is a number that has an extreme value when compared with most of the data. The median is resistant. That is, it is not affected by the presence of outliers.
The mean is not resistant, and therefore, the median tends to be a more appropriate measure of center to use in examples that contain outliers.

Because the mean is the numerical balancing point for the data, it is an extremely important measure of center that is the basis for many other calculations and processes necessary for making useful conclusions about a set of data.
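The vocabulary above can be tried out with a short Python sketch using the standard library's statistics module. The salary figures and the color list are hypothetical values invented for illustration, not real Yahoo data; only the outlier is modeled on the figure quoted in Example D.

```python
from statistics import mean, median, mode

# Hypothetical salaries in dollars (invented for illustration, not real data).
salaries = [42_000, 48_000, 51_000, 55_000, 60_000]

# The same data with one extreme value added, modeled on the figure in Example D.
salaries_with_outlier = salaries + [231_000_000]

# Measures of center without the outlier.
mean_without = mean(salaries)      # 51200
median_without = median(salaries)  # 51000 (odd count: the middle value)

# Measures of center with the outlier.
mean_with = mean(salaries_with_outlier)      # roughly 38.5 million
median_with = median(salaries_with_outlier)  # 53000 (even count: mean of 51000 and 55000)

# The mean is the balancing point: deviations from it sum to zero.
deviation_sum = sum(s - mean_without for s in salaries)

# The mode suits categorical data (hypothetical survey answers).
favorite = mode(["red", "blue", "red", "green"])

print("mean without outlier:  ", mean_without)
print("median without outlier:", median_without)
print("mean with outlier:     ", mean_with)
print("median with outlier:   ", median_with)
print("sum of deviations:     ", deviation_sum)
print("mode of colors:        ", favorite)
```

In this sketch, a single extreme salary moves the mean from 51,200 to tens of millions of dollars, while the median shifts only from 51,000 to 53,000. That is the sense in which the median is resistant and the mean is not.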

