Outlier_points - Outlier Points Author: John M. Cimbala,...

Outlier Points
Author: John M. Cimbala, Penn State University
Latest revision: 14 September 2007

Introduction

Sometimes in an experiment, some data points appear to be "questionable." Questionable data points are called outliers: data points that are not consistent with the rest of the data. Outliers should never be discarded without proper statistical justification. Here, we discuss statistical methods that help us decide whether to keep or discard suspected outliers. We discuss two types of outliers: (1) outliers in a sample of a single variable ($x$), and (2) outliers in a set of data pairs ($y$ vs. $x$).

Outliers in a sample of a single variable

Consider a sample of $n$ measurements of a single variable $x$, i.e., $x_1, x_2, \ldots, x_n$. It is helpful to arrange the $x$ values in increasing order, so that the outliers are easily spotted (typically the first or last data point(s) are suspected, since they are the lowest and highest values of $x$ in the sample, respectively). The modified Thompson tau technique is a statistical method for deciding whether to keep or discard suspected outliers in a sample of a single variable. Here is the procedure:

o The sample mean $\bar{x}$ and the sample standard deviation $S$ are calculated in the usual fashion.
o For each data point, the absolute value of the deviation is calculated as $\delta_i = |d_i| = |x_i - \bar{x}|$.
o The data point most suspected as a possible outlier is the data point with the maximum value of $\delta_i$.
o The value of the modified Thompson $\tau$ (Greek letter tau) is looked up in a table as a function of $n$, the number of points in the sample. A table of the modified Thompson $\tau$ is provided below:
o We determine whether to reject or keep this suspected outlier, using the following simple rules:
   - If $\delta_i > \tau S$, reject the data point. It is an outlier.
   - If $\delta_i \le \tau S$, keep the data point. It is not an outlier.
o With the modified Thompson $\tau$ technique, we consider only one suspected outlier at a time, namely the data point with the largest value of $\delta_i$. If that data point is rejected as an outlier, we remove it and start over. In other words, we calculate a new sample mean and a new sample standard deviation, and search for more outliers. This process is repeated until no more outliers are found.
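The procedure above can be sketched in Python. This is a minimal sketch rather than the author's implementation: the function name `reject_outliers` and the dictionary `TAU_BY_N` are hypothetical, and the $\tau$ values listed are assumptions standing in for the table. They were computed from the commonly cited expression $\tau = t_{\alpha/2}(n-1) / (\sqrt{n}\,\sqrt{n-2+t_{\alpha/2}^2})$ with $\alpha = 0.05$ and $n-2$ degrees of freedom, and should be checked against the author's table before being relied on.

```python
from statistics import mean, stdev

# ASSUMPTION: tau values for n = 3..10, computed from the commonly cited
# formula (95% two-sided Student t, n - 2 degrees of freedom). They stand
# in for the lookup table and are not copied from the source document.
TAU_BY_N = {
    3: 1.1511, 4: 1.4250, 5: 1.5712, 6: 1.6563,
    7: 1.7110, 8: 1.7491, 9: 1.7770, 10: 1.7984,
}


def reject_outliers(data, tau_by_n=TAU_BY_N):
    """Apply the one-at-a-time rejection rule until no point is rejected.

    Returns (kept, rejected). Stops early if the current sample size has
    no tau entry in tau_by_n.
    """
    kept = list(data)
    rejected = []
    while len(kept) in tau_by_n:
        x_bar = mean(kept)          # sample mean
        s = stdev(kept)             # sample standard deviation (n - 1)
        deltas = [abs(x - x_bar) for x in kept]
        # Only the single most suspect point is tested on each pass.
        i_max = max(range(len(kept)), key=deltas.__getitem__)
        if deltas[i_max] > tau_by_n[len(kept)] * s:
            rejected.append(kept.pop(i_max))   # outlier: remove, start over
        else:
            break                               # keep it: no more outliers
    return kept, rejected


# Hypothetical sample with one suspicious reading.
kept, rejected = reject_outliers([10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 15.0])
print(kept, rejected)
```

Recomputing the mean and standard deviation inside the loop matters: a single extreme point inflates $S$, so a second suspect can only be judged fairly after the first has been removed.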
Example:
Given: Ten values of variable $x$ are measured. The data have been arranged in increasing order for convenience: 48.9, 49.2, 49.2, 49.3, 49.3, 49.8, 49.9, 50.1, 50.2, and 50.5.
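As an illustration, the ten values given here can be run through a single pass of the modified Thompson tau procedure. The sketch below assumes $\tau = 1.7984$ for $n = 10$; that value is not stated in the text shown here and comes from the commonly cited formula at 95% confidence, so treat it as an assumption. The variable names are illustrative only.

```python
from statistics import mean, stdev

# The ten measured values from the example, in increasing order.
x = [48.9, 49.2, 49.2, 49.3, 49.3, 49.8, 49.9, 50.1, 50.2, 50.5]

# ASSUMPTION: modified Thompson tau for n = 10 (not given in the text here).
TAU_N10 = 1.7984

x_bar = mean(x)                                   # sample mean
s = stdev(x)                                      # sample standard deviation
suspect = max(x, key=lambda v: abs(v - x_bar))    # point with largest deviation
delta_max = abs(suspect - x_bar)
is_outlier = delta_max > TAU_N10 * s              # reject only if delta > tau * S

print(f"mean = {x_bar:.2f}, S = {s:.4f}")
print(f"suspect = {suspect}, delta = {delta_max:.2f}, tau*S = {TAU_N10 * s:.4f}")
print("reject" if is_outlier else "keep")
```

Under that assumed $\tau$, the most suspect point is 50.5, with a deviation of about 0.86 from the mean of 49.64. This is below $\tau S \approx 0.95$, so the sketch keeps the point and no further passes are needed.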

This note was uploaded on 04/05/2008 for the course ME 345 taught by Professor Staff during the Spring '08 term at Pennsylvania State University, University Park.

