The interquartile range (IQR) is the distance between the first and third quartiles: IQR = Q3 − Q1. Call an observation a suspected outlier if it falls more than 1.5 × IQR above the third quartile or below the first quartile.

Standard deviation: Measure of the spread about the mean of a distribution. It is the square root of the variance, that is, the square root of an average of the squares of the deviations of the observations from their mean.

Variance: Measure of the spread about the mean of a distribution. It is an average of the squares of the deviations of the observations from their mean (the sum of the squared deviations divided by n − 1), and it is equal to the standard deviation squared.

The number n − 1 is called the degrees of freedom of the variance or standard deviation.

* s measures spread about the mean and should be used only when the mean is chosen as the measure of center.
* s is always zero or greater than zero. s = 0 only when there is no spread, which happens only when all observations have the same value. Otherwise, s > 0. As the observations become more spread out about their mean, s gets larger.
* s has the same units of measurement as the original observations. For example, if you
The five-number summary is usually better than the mean and standard deviation for describing a skewed distribution or a distribution with strong outliers. Use x̄ (the mean) and s only for reasonably symmetric distributions that are free of outliers. A graph gives the best overall picture of a distribution. Numerical measures of center and spread report specific facts about a distribution, but they do not describe its entire shape. For example, numerical summaries do not disclose the presence of multiple peaks or clusters. The symbol for the population standard deviation is σ, the lowercase Greek letter pronounced "sigma". "s" is the symbol for the sample standard deviation.
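
The 1.5 × IQR rule and the n − 1 definitions of variance and standard deviation can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the data values are made up for the example, and the quartiles come from the standard library's `statistics.quantiles` with `method="inclusive"`, which is one of several quartile conventions. Textbooks and software may compute Q1 and Q3 slightly differently, so fences can differ a little on small data sets.

```python
import math
import statistics


def iqr_outliers(data):
    """Return (Q1, Q3, IQR, suspected outliers) using the 1.5 x IQR rule.

    Assumption: quartiles use the 'inclusive' interpolation method of
    statistics.quantiles; other conventions give slightly different values.
    """
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr   # anything below this is a suspected outlier
    high_fence = q3 + 1.5 * iqr  # anything above this is a suspected outlier
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return q1, q3, iqr, outliers


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_std(data):
    """Sample standard deviation s: the square root of the variance."""
    return math.sqrt(sample_variance(data))


# Made-up observations with one value far above the rest.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]
q1, q3, iqr, outliers = iqr_outliers(values)
print(q1, q3, iqr, outliers)       # 3.25 7.75 4.5 [40]

print(sample_variance([1, 2, 3, 4, 5]))  # 2.5
print(sample_std([3, 3, 3]))             # 0.0, no spread at all
```

The last call illustrates the point that s = 0 only when every observation has the same value. The hand-written `sample_variance` should agree with `statistics.variance`, which also divides by n − 1.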
