# Chapter 2: Methods for Describing Sets of Data

## 2.1 Describing Qualitative Data

Definition 2.1 (Class): A class is one of the categories into which qualitative data can be classified.

Definition 2.2 (Class Frequency): The class frequency is the number of observations in the data set falling in a particular class.

Definition 2.3 (Class Relative Frequency): The class relative frequency (RF) is the class frequency divided by the total number of observations, $n$, in the data set. That is,

$$\text{class relative frequency} = \frac{\text{class frequency}}{n}$$

Definition 2.4 (Class Percentage): The class percentage is the class relative frequency multiplied by 100. That is,

$$\text{class percentage} = \text{class relative frequency} \times 100$$

See Example 2.1, page 35.

## 2.2 Graphical Methods for Describing Quantitative Data

Data can be described both graphically and numerically. This section discusses graphical representations of data. To describe and summarize a data set and to detect patterns in it, one can use the following three graphical methods.

1. Dot plots: The dot plot condenses the data by grouping all equal values together in the plot.
2. Stem-and-leaf displays (stem plots): The stem-and-leaf display condenses the data by grouping all values with the same stem together in the graph.

3. Histograms: The histogram condenses the data by grouping similar data values into the same class in the graph.

Since most statistical software packages can construct these plots, we focus on their interpretation rather than their construction.

Suppose a financial analyst is interested in the amount of resources spent by computer hardware and software companies on research and development (R&D). She samples 50 of these high-technology firms and calculates the amount each spent last year on R&D as a percentage of its total revenue. The results (see Table 2.2, page 42) are as follows:

13.5 8.4 10.5 9.0 9.2 9.7 6.6 10.6 10.1 7.1
8.0 7.9 6.8 9.5 8.1 13.5 9.9 6.9 7.5 11.1
8.2 8.0 7.7 7.4 6.5 9.5 8.2 6.9 7.2 8.2
9.6 7.2 8.8 11.3 8.5 9.4 10.5 6.9 6.5 7.5
7.1 13.2 7.7 5.9 5.2 5.6 11.7 6.0 7.8 6.5

The ordered data are as follows:

5.2 5.6 5.9 6.0 6.5 6.5 6.5 6.6 6.8 6.9
6.9 6.9 7.1 7.1 7.2 7.2 7.4 7.5 7.5 7.7
7.7 7.8 7.9 8.0 8.0 8.1 8.2 8.2 8.2 8.4
8.5 8.8 9.0 9.2 9.4 9.5 9.5 9.6 9.7 9.9
10.1 10.5 10.5 10.6 11.1 11.3 11.7 13.2 13.5 13.5

1. Dot plots (Figure 2.8, page 42).
2. Stem-and-leaf displays (stem plots) (Figure 2.9, page 43). S-Plus produces the following stem plot.

N = 50, Median = 8.05, Quartiles = 7.1, 9.6. The decimal point is at the colon, shown here as the boundary between the stem and leaf columns.

| Stem | Leaves |
|---:|:---|
| 5 | 269 |
| 6 | 055568999 |
| 7 | 11224557789 |
| 8 | 001222458 |
| 9 | 02455679 |
| 10 | 1556 |
| 11 | 137 |
| 12 | |
| 13 | 255 |
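The grouping behind the stem-and-leaf display above can be sketched in a few lines of Python. This is only an illustration, not the algorithm S-Plus uses: the names `rd_percent` and `stem_and_leaf` are hypothetical, and the sketch assumes every value has exactly one decimal place, so the integer part serves as the stem and the tenths digit as the leaf.

```python
from collections import defaultdict

# R&D spending as a percentage of revenue for the 50 sampled firms (Table 2.2).
rd_percent = [
    13.5, 8.4, 10.5, 9.0, 9.2, 9.7, 6.6, 10.6, 10.1, 7.1,
    8.0, 7.9, 6.8, 9.5, 8.1, 13.5, 9.9, 6.9, 7.5, 11.1,
    8.2, 8.0, 7.7, 7.4, 6.5, 9.5, 8.2, 6.9, 7.2, 8.2,
    9.6, 7.2, 8.8, 11.3, 8.5, 9.4, 10.5, 6.9, 6.5, 7.5,
    7.1, 13.2, 7.7, 5.9, 5.2, 5.6, 11.7, 6.0, 7.8, 6.5,
]


def stem_and_leaf(values):
    """Return {stem: leaves} where leaves is a string of sorted tenths digits.

    Assumes one-decimal data: stem = integer part, leaf = tenths digit.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        # Convert to integer tenths first so float rounding cannot shift a digit.
        stem, leaf = divmod(round(v * 10), 10)
        leaves[stem].append(leaf)
    # Keep empty stems so gaps in the data remain visible in the display.
    return {
        stem: "".join(str(d) for d in leaves.get(stem, []))
        for stem in range(min(leaves), max(leaves) + 1)
    }


plot = stem_and_leaf(rd_percent)
for stem, leaf_string in plot.items():
    print(f"{stem:>2} : {leaf_string}")
```

Run on the R&D data, the sketch reproduces the rows of the display above, including the empty stem at 12. Keeping empty stems is a deliberate choice: dropping them would hide the gap between the bulk of the data and the three firms above 13 percent.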
3. Histograms (Figure 2.10, page 44). Using S-Plus, we obtain the following histogram for the R&D data.

[Figure: histogram of the R&D data (axis label: RAD).]

Discussion of the dot plot, stem plot, and histogram: A histogram provides a good visual description of a data set, but it cannot identify individual measurements. In contrast, each of the original measurements is visible to some extent in a dot plot or stem plot. The stem plot arranges the data in ascending order, so it is easy to locate individual measurements. However, it can become cumbersome for a very large data set, which diminishes the usefulness of the visual display. The histogram is very useful for describing large data sets when the overall shape of the distribution of measurements is more important than the identification of individual measurements. For large data set use histogram and small

