EMSE 171/271: DATA ANALYSIS for Engineers and Scientists
Session 1: Exploratory Data Analysis, Probability Calculus, Random Variables
Lecture Notes by: J. René van Dorp (1)
www.seas.gwu.edu/~dorpjr

(1) Department of Engineering Management and Systems Engineering, School of Engineering and Applied Science, The George Washington University, 1776 G Street, N.W. Suite 110, Washington, D.C. 20052. E-mail: [email protected]

EMSE 171/271 - SPRING 2010, J.R. van Dorp, 10/12/05, [email protected]

STATISTICAL REVIEW
Exploratory Data Analysis

Example 1: The tragedy that befell the space shuttle Challenger and its astronauts in 1986 led to a number of studies to investigate the reasons for mission failure. Attention quickly focused on the behavior of the rocket engine's O-rings. Here are the data, consisting of observations on x = O-ring temperature (°F) for each test firing or actual launch of the shuttle rocket engine (Presidential Commission on the Space Shuttle Challenger Accident, Vol. 1, 1986: 129-131).

84 49 61 40 83 67 45 66 70 69 80 58
68 60 67 72 73 70 57 63 70 78 52 67
53 67 75 61 70 81 76 79 75 76 58 31

Without any organization, it is difficult to get a sense of what a typical or representative temperature might be, whether the values are highly concentrated about a typical value or quite spread out, whether there are any gaps in the data, what percentage of the values are in the 60's, and so on.

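
The questions raised above become easy to answer once the values are sorted. The following is a minimal sketch in Python, not part of the original notes; the variable names (`temps`, `ordered`, `in_sixties`) are illustrative choices, and only the standard library is assumed.

```python
from statistics import median

# O-ring temperatures (deg F) as listed in Example 1, 36 observations.
temps = [84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58,
         68, 60, 67, 72, 73, 70, 57, 63, 70, 78, 52, 67,
         53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31]

ordered = sorted(temps)          # organizing the data is the first step
n = len(ordered)
in_sixties = sum(1 for t in ordered if 60 <= t <= 69)

print("n =", n)
print("smallest, largest =", ordered[0], ordered[-1])
print("median =", median(ordered))
print("percent in the 60's =", round(100 * in_sixties / n, 1))
```

Sorting alone already reveals the range and the middle of the data; the displays on the following slides add a picture of the shape.
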
A MINITAB stem-and-leaf display

    COUNTS    STEM    LEAVES
       1        3     1
       1        3
       2        4     0
       4        4     59
       6        5     23
       9        5     788
      13        6     0113
      (7)       6     6777789     <- contains median
      16        7     000023
      10        7     556689
       4        8     0134

The counts are cumulative from the top of the display down to the median row and cumulative from the bottom up to it; the count in parentheses belongs to the row that contains the median.

Gives a feel of the distribution shape without loss of data.
Reasonable breakpoints are in units of tens; some modifications are possible.
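
As an illustration of how such a display is built, here is a minimal sketch in Python. It is not MINITAB's algorithm; the function name `stem_and_leaf` is hypothetical, and the choice to split each stem into a low half (leaves 0-4) and a high half (leaves 5-9) is an assumption made to mirror the display above. The cumulative counts column is omitted.

```python
from collections import defaultdict

# O-ring temperatures (deg F) from Example 1.
temps = [84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58,
         68, 60, 67, 72, 73, 70, 57, 63, 70, 78, 52, 67,
         53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31]

def stem_and_leaf(values):
    """Return (stem, leaves) rows, two rows per stem: leaves 0-4, then 5-9."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)            # tens digit, units digit
        rows[(stem, 0 if leaf < 5 else 1)].append(leaf)
    first, last = min(rows), max(rows)
    out = []
    stem, half = first
    while (stem, half) <= last:               # include empty rows to show gaps
        leaves = "".join(str(d) for d in rows.get((stem, half), []))
        out.append((stem, leaves))
        stem, half = (stem, 1) if half == 0 else (stem + 1, 0)
    return out

for stem, leaves in stem_and_leaf(temps):
    print(f"{stem:2d} | {leaves}")
```

Keeping the empty row for stem 3 matters: it is what makes the gap between 31 and 40 visible.
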

A MINITAB histogram with 10 cells of equal width

[Figure: Histogram of O-Ring Temperature. Horizontal axis: x (ticks from 36 to 84); vertical axis: Frequency (0 to 7).]

Data are put into cells, and the frequency of each cell is displayed graphically as a rectangle centered at the midpoint of the cell. Cell definitions and their number are at the discretion of the modeler, with the rule of thumb that

    number of cells ≈ √(number of observations)
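
To make the binning step concrete, the sketch below applies the square-root rule of thumb to the 36 temperatures and counts the observations per cell. It is an illustration only: the helper `equal_width_counts` is a hypothetical name, and the cells here span exactly from the smallest to the largest observation, which is an assumption and not how MINITAB chose the cells in the figure.

```python
import math

# O-ring temperatures (deg F) from Example 1.
temps = [84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58,
         68, 60, 67, 72, 73, 70, 57, 63, 70, 78, 52, 67,
         53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31]

def equal_width_counts(values, n_cells):
    """Split [min, max] into n_cells equal cells and count values per cell."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_cells
    counts = [0] * n_cells
    for v in values:
        # The largest value is placed in the last cell rather than a new one.
        i = min(int((v - lo) / width), n_cells - 1)
        counts[i] += 1
    return lo, width, counts

k = round(math.sqrt(len(temps)))      # rule of thumb: sqrt(36) = 6 cells
lo, width, counts = equal_width_counts(temps, k)

for i, c in enumerate(counts):
    left = lo + i * width
    print(f"[{left:5.1f}, {left + width:5.1f})  {'*' * c}")
```

The rule gives 6 cells here, fewer than the 10 and 8 used in the MINITAB histograms, which is consistent with the number of cells being left to the modeler's discretion.
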
Cells need not all have the same width (but they usually do, since equal widths generally represent the distribution best). Changing the cell definitions may change the shape of the histogram.

A MINITAB histogram with 8 cells of equal width

[Figure: Histogram of O-Ring Temperature. Horizontal axis: x (ticks from 32 to 88); vertical axis: Frequency (0 to 10).]
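
The effect of the cell definitions can be checked numerically. The sketch below counts the same 36 temperatures under two sets of cells: 10 cells of width 6 with midpoints 30, 36, ..., 84, and 8 cells of width 8 with midpoints 32, 40, ..., 88. These cell edges are an assumption chosen to be consistent with the axis ranges of the two histograms; the notes do not state them, and MINITAB's treatment of values on a cell boundary may differ. The helper name `cell_counts` is hypothetical.

```python
# O-ring temperatures (deg F) from Example 1.
temps = [84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58,
         68, 60, 67, 72, 73, 70, 57, 63, 70, 78, 52, 67,
         53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31]

def cell_counts(values, start, width, n_cells):
    """Count integer values in cells [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_cells
    for v in values:
        i = (v - start) // width
        if 0 <= i < n_cells:          # ignore anything outside the cells
            counts[i] += 1
    return counts

# Assumed cells: width 6 starting at 27, and width 8 starting at 28.
ten_cells = cell_counts(temps, start=27, width=6, n_cells=10)
eight_cells = cell_counts(temps, start=28, width=8, n_cells=8)

print("10 cells:", ten_cells)
print(" 8 cells:", eight_cells)
```

Under these assumed cells, the 10-cell version shows an empty cell and a flat top, while the 8-cell version shows a single peak, so the same data suggest somewhat different shapes.
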

With enough data, a histogram approximates the form of the underlying distribution.
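
A small simulation can illustrate this statement. The sketch below is an assumption-laden example, not taken from the notes: it draws samples from a standard normal distribution (chosen only for illustration), scales the histogram so its total area is 1, and reports the largest gap between the histogram heights and the normal density at the cell midpoints. The function names, the seed, and the cell layout (16 cells on [-4, 4)) are all hypothetical choices.

```python
import math
import random

def histogram_density(sample, lo, hi, n_cells):
    """Relative frequency per unit width for equal cells on [lo, hi)."""
    width = (hi - lo) / n_cells
    counts = [0] * n_cells
    for x in sample:
        if lo <= x < hi:
            counts[min(int((x - lo) / width), n_cells - 1)] += 1
    return [c / (len(sample) * width) for c in counts]

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def max_gap(n, seed=1):
    """Largest |histogram height - density| over the cell midpoints."""
    rng = random.Random(seed)
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    lo, hi, n_cells = -4.0, 4.0, 16
    width = (hi - lo) / n_cells
    heights = histogram_density(sample, lo, hi, n_cells)
    mids = [lo + (i + 0.5) * width for i in range(n_cells)]
    return max(abs(h - normal_pdf(m)) for h, m in zip(heights, mids))

for n in (36, 1000, 100000):
    print(n, round(max_gap(n), 3))
```

With only 36 observations, as in the O-ring example, the histogram is a rough guide to shape; the gap typically shrinks as the sample size grows.
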
