“Numerical quantities focus on expected values, graphical summaries on unexpected values.” –John Tukey

Chapter 2: Summarizing Data

Preview:
• Displaying stationary data distributions
  o Bar charts, needle plots, frequency histograms
  o Analysis of same
  o Causes of common patterns
• Summary measures for stationary data distributions and when each is appropriate
• Boxplots and outliers
• Resistant summary measures

What’s the IDEA?

The graphs and measures presented in this chapter are meant to summarize the pattern of variation of data from stationary processes, and may not make sense in other settings. Therefore, data should always be checked for stationarity before using them.

• Variable: Name of what is being counted, measured or observed.
• Variable Types
  o Quantitative/Categorical
  o Discrete/Continuous

Displaying Data Distributions

Example: A company manufactures knobs for appliances (washing machines, dishwashers, etc.). In one step of the manufacturing process for a certain knob, the spindle diameter, having a nominal value of 3mm, is measured. If it is too small (below 2.9mm), the part is rejected. If it is too large (above 3.1mm), the part is sent back for reworking. Otherwise, it is accepted. Here are data on 12 parts:

Diameter (mm)   Action¹
2.8             X
2.9             A
3.2             R
3.0             A
3.0             A
3.0             A
2.9             A
2.7             X
2.9             A
3.2             R
3.3             R
3.1             A

¹ Accept (A), Rework (R), Reject (X).

In these data, the variable diameter is quantitative and continuous, action is categorical. Suppose we create a new variable, count, which counts how many of the knobs are accepted (7), reworked (3), or rejected (2). Then count is quantitative and discrete.

Use a Bar chart for categorical data. Figure 1 shows a bar chart of action.

[Figure 1: Bar chart of action. Horizontal axis: action (A, R, X); vertical axis: Frequency.]
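The accept/rework/reject rule and the count variable from the knob example can be sketched in Python. This is only an illustration under stated assumptions: the names `classify`, `LOWER_LIMIT_MM` and `UPPER_LIMIT_MM` are made up here, and the text bar chart is a rough stand-in for Figure 1, not a reproduction of it.

```python
from collections import Counter

# Limits taken from the example; the constant names are hypothetical.
LOWER_LIMIT_MM = 2.9   # below this the part is rejected
UPPER_LIMIT_MM = 3.1   # above this the part is sent back for reworking


def classify(diameter_mm):
    """Return the action code for one spindle diameter: X, R or A."""
    if diameter_mm < LOWER_LIMIT_MM:
        return "X"  # Reject
    if diameter_mm > UPPER_LIMIT_MM:
        return "R"  # Rework
    return "A"      # Accept


# The 12 diameters from the example, in the order listed.
diameters = [2.8, 2.9, 3.2, 3.0, 3.0, 3.0, 2.9, 2.7, 2.9, 3.2, 3.3, 3.1]

# action is categorical; count is quantitative and discrete.
actions = [classify(d) for d in diameters]
count = Counter(actions)

# A text bar chart of action: one '#' per part.
for action in ("A", "R", "X"):
    print(f"{action} {'#' * count[action]} ({count[action]})")
```

The limits are compared as plain floating-point numbers, which works here because the data are entered with the same one-decimal values as the limits. With real measurements one would probably want to round first or compare in integer units.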
You can use a Needle Plot for a small number (say 20 or fewer) of quantitative observations. Figure 2 shows a needle plot of the diameters of the 12 knobs.

[Figure 2: Needle plot of knob spindle diameters.]

Use a Frequency Histogram for a larger number of quantitative observations. Figure 3 shows a frequency histogram of the heights in cm of 105 high school students.

[Figure 3: Frequency histogram of the heights in cm of 105 high school students. Horizontal axis: HEIGHT (140 to 204 cm); vertical axis: Frequency.]

Analyzing Frequency Histograms

Different choices of interval locations and widths can make frequency histograms for the same set of data look very different. Have a look at the following two histograms based on the same data:

[Two frequency histograms of the same data; the figures are not recoverable from the extracted text.]

...
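The effect of interval location and width can be seen numerically. The sketch below does not use the data behind the two histograms above, which are not available here. As an assumption for illustration, it reuses the 12 knob diameters from the earlier example, stored in tenths of a millimetre so that the interval edges are exact integers. The helper `histogram_counts` is a hypothetical name.

```python
def histogram_counts(values, start, width, n_bins):
    """Count values in n_bins intervals [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        idx = (v - start) // width   # index of the interval containing v
        if 0 <= idx < n_bins:        # values outside all intervals are ignored
            counts[idx] += 1
    return counts


# Knob diameters in tenths of a mm (2.8 mm -> 28), avoiding float edge issues.
diameters_tenths = [28, 29, 32, 30, 30, 30, 29, 27, 29, 32, 33, 31]

# Same width (0.2 mm), different interval locations.
counts_a = histogram_counts(diameters_tenths, start=27, width=2, n_bins=4)
counts_b = histogram_counts(diameters_tenths, start=26, width=2, n_bins=4)

# Same starting point as counts_a, wider intervals (0.3 mm).
counts_c = histogram_counts(diameters_tenths, start=27, width=3, n_bins=3)

for label, counts in (("a", counts_a), ("b", counts_b), ("c", counts_c)):
    print(label, counts)
```

With these choices the first set of intervals shows a single tall interval, while shifting the intervals by 0.1 mm spreads the same 12 observations more evenly, so the apparent shape depends on the intervals as well as on the data.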
 Spring '08
 Petrucelli
 Histograms, Mean, summary measures
