ISOM 111 L7-L8, Fall 2010
Homework 1 Solutions
An insurance agency is examining the dollar amount of claims from clients who have homeowners
insurance. For the 900 people who filed claims, the five-number summary of the amount is:
($8800, $8850, $8900, $9100, $9940).
(a) Would the histogram displaying the data for the 900 claims be nearly bell-shaped? If so, explain
how the summary indicates this. If not, determine if the data is skewed left or skewed right, and
explain how the summary indicates this.
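Solution: No, the data is skewed to the right. The median ($8900) is much closer to the 1st
quartile ($8850, a gap of $50) than it is to the 3rd quartile ($9100, a gap of $200), and the
maximum ($9940, a gap of $1040) is much farther from the median than the minimum is ($8800, a
gap of $100).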
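The comparison of gaps in part (a) can be sketched in a few lines of Python. The function name skew_direction and its decision rule are illustrative assumptions, not a standard test: it only reports a direction when the two quartile gaps and the two tail gaps agree.

```python
def skew_direction(minimum, q1, median, q3, maximum):
    """Rough skew check from a five-number summary (illustrative heuristic)."""
    lower_box = median - q1        # distance from Q1 up to the median
    upper_box = q3 - median        # distance from the median up to Q3
    lower_tail = median - minimum  # distance from the minimum up to the median
    upper_tail = maximum - median  # distance from the median up to the maximum

    if upper_box > lower_box and upper_tail > lower_tail:
        return "right"
    if upper_box < lower_box and upper_tail < lower_tail:
        return "left"
    return "no clear skew"


# Five-number summary from the problem statement.
print(skew_direction(8800, 8850, 8900, 9100, 9940))
```

For the claims data both upper gaps (200 and 1040) exceed the corresponding lower gaps (50 and 100), which is why the sketch reports a right skew.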
(b) Would the boxplot for the data indicate any outliers? Explain why or why not.
Solution: Yes. IQR = 9100 - 8850 = 250, so 1.5 x IQR = 375. The inner fences will be
8850 - 375 = 8475 and 9100 + 375 = 9475 respectively. With a maximum of 9940, which lies above
the upper fence, there is at least one high outlier. (There will be no low outliers because the
minimum is 8800, which is within the inner fences.)
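The fence calculation in part (b) can be checked with a short Python sketch. It uses only the summary values given in the problem; the variable names are illustrative.

```python
# Values taken from the five-number summary in the problem.
minimum, q1, q3, maximum = 8800, 8850, 9100, 9940

iqr = q3 - q1                 # interquartile range
step = 1.5 * iqr              # distance from each quartile to its inner fence
lower_fence = q1 - step
upper_fence = q3 + step

# A boxplot flags values outside the inner fences as outliers.
has_low_outlier = minimum < lower_fence
has_high_outlier = maximum > upper_fence

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print(f"low outlier: {has_low_outlier}, high outlier: {has_high_outlier}")
```

Because only the summary is available, the sketch can confirm that at least one high outlier exists but not how many of the 900 claims lie above the upper fence.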
A drug manufacturer has hundreds of sales representatives all over the United States. A
histogram for yearly sales totals for each representative is roughly bell shaped and symmetric except
for 4 high outliers corresponding to representatives in Boston, MA. Their sales totals are at least
$60,000 greater than the next highest total. One analyst suggests dropping these 4 totals from the
data to get a better summary of the sales across all regions of the country.
(a) If the outliers were to be dropped, which measure of central tendency of the data set would be
affected the most – the mean, the median, or the mode? Explain why.
Solution: The mean. The median is based on the order of the data, so dropping 4 high values
out of hundreds will most likely have little effect on this measure. The mode is the most
frequent data value, and it is unlikely that any of these 4 outliers would represent the mode.
The mean, by contrast, depends on the magnitude of every data value, so the high values pull it
to the right, away from the median. Eliminating these values will bring it back in line with
the median, so the mean is the measure that is affected the most.
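A small numerical illustration of this argument, using Python's statistics module. The sales figures below are made up for the sketch (in thousands of dollars) and are not the manufacturer's data; they are chosen to be symmetric around 70, with 4 high values at least 60 above the next highest total.

```python
from statistics import mean, median, mode

# Hypothetical yearly sales totals, in thousands of dollars (not real data).
typical = [54, 58, 62, 62, 66, 66, 66, 70, 70, 70, 70,
           74, 74, 74, 78, 78, 82, 86]
outliers = [150, 155, 160, 165]   # each at least 60 above the next highest (86)
everyone = typical + outliers

# How far each measure moves when the 4 outliers are dropped.
mean_shift = abs(mean(everyone) - mean(typical))
median_shift = abs(median(everyone) - median(typical))
mode_shift = abs(mode(everyone) - mode(typical))

print(f"mean shift:   {mean_shift:.2f}")
print(f"median shift: {median_shift:.2f}")
print(f"mode shift:   {mode_shift:.2f}")
```

In this toy sample the mean moves by far the most. With hundreds of representatives instead of 22, the median would move even less than it does here, since 4 values are a smaller share of the ordered data.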
(b) The high outliers are dropped from the data and the mean is determined to be $70,000 with
a standard deviation of $8,000. For future analysis, management would like to be able to identify
sales amounts that are “unusually low”, which they defined as being among the lowest 2.5% of
all sales amounts. Using the Empirical Rule for this data, what amount should be considered the
cut-off for sales amounts being classified as unusually low?
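Solution: By the Empirical Rule, roughly 95% of the amounts fall within 2 standard deviations
of the mean. That leaves about 5% in the two tails, or roughly 2.5% in each tail by symmetry.
So roughly 2.5% of the amounts will fall below the value Mean - 2SD, or in this case:
70000 - 2*8000 = $54,000.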
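The cut-off can be reproduced in Python, and compared against an exact normal model as a sanity check. Treating the sales totals as exactly normal is an assumption made only for this comparison; the problem itself asks for the Empirical Rule answer.

```python
from statistics import NormalDist

mean_sales = 70_000   # mean after dropping the outliers (from the problem)
sd_sales = 8_000      # standard deviation (from the problem)

# Empirical Rule: about 2.5% of values lie below mean - 2 SD.
cutoff = mean_sales - 2 * sd_sales

# Assumption: model the totals as exactly normal to see how close the rule is.
model = NormalDist(mu=mean_sales, sigma=sd_sales)
exact_cutoff = model.inv_cdf(0.025)     # exact 2.5th percentile of the model
share_below_cutoff = model.cdf(cutoff)  # model share below the rule's cut-off

print(f"Empirical Rule cut-off: ${cutoff:,}")
print(f"Normal-model 2.5th percentile: ${exact_cutoff:,.0f}")
print(f"Normal-model share below ${cutoff:,}: {share_below_cutoff:.4f}")
```

Under the normal model the exact 2.5th percentile sits at about 1.96 standard deviations below the mean rather than 2, so the two cut-offs differ by a few hundred dollars; the Empirical Rule value of $54,000 is the intended answer.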