Lessons in Business Statistics
Prepared By P.K. Viswanathan

Chapter 2: Classifying Data to Convey Meaning

Introduction
When managers are bewildered by a plethora of data that makes no sense on the surface, they look for methods of classifying the data so that it conveys meaning. The idea here is to help them draw the right conclusions. This chapter provides the nitty-gritty of arranging data into information.

1) Meaning and Example of Raw Data
Meaning of Raw Data:
Raw data are numbers and facts in the original format in which they were collected. You need to convert raw data into information for managerial decision making.

Example of Raw Data:
Assume that the weekly sales of a product in a region over the past year are as follows (figures in '000 units):
52 61 59 55 63 70 59 77 81
83 69 91 73 83 90 81 77 77
74 65 56 77 64 49 60 52 50
45 42 46 39 29 38 41 43 23
26 27 22 29 31 29 31 30 30
29 40 44 45 46 52 53
Suppose you present this set of data as it is to the General Manager (Sales). At best, it will bore him.

Information is Key
Large masses of raw data tend to bewilder you so much that the overall patterns are obscured. You cannot see the wood for the trees. This implies that the raw data must be processed to give you useful information.

Raw Data -> Process -> Information

2) Frequency Distribution
In simple terms, a frequency distribution is a summarized table in which raw data are arranged into classes and frequencies. Classes represent categories or groupings, each with a lower limit and an upper limit. Classes are formed following certain guidelines. Against each class, you count and record the number of observations that fall into it. When you do this for all the classes in a given data analysis problem, the result is a frequency distribution.
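The counting step can be sketched in a few lines of Python. This is a minimal illustration and not part of the original lesson: it uses the weekly sales figures from the raw data example, and the classes (width 10, starting at 20, lower limit included and upper limit excluded) as well as the names `weekly_sales`, `classes` and `tally` are assumptions chosen for the example.

```python
# Weekly sales in '000 units, copied from the raw data example.
weekly_sales = [
    52, 61, 59, 55, 63, 70, 59, 77, 81,
    83, 69, 91, 73, 83, 90, 81, 77, 77,
    74, 65, 56, 77, 64, 49, 60, 52, 50,
    45, 42, 46, 39, 29, 38, 41, 43, 23,
    26, 27, 22, 29, 31, 29, 31, 30, 30,
    29, 40, 44, 45, 46, 52, 53,
]

# Assumed classes: 20-30, 30-40, ..., 90-100.
# A value belongs to a class if lower <= value < upper.
classes = [(lower, lower + 10) for lower in range(20, 100, 10)]


def tally(values, classes):
    """Count how many values fall into each (lower, upper) class."""
    return {
        (lower, upper): sum(1 for v in values if lower <= v < upper)
        for lower, upper in classes
    }


frequency_table = tally(weekly_sales, classes)
for (lower, upper), count in frequency_table.items():
    print(f"{lower:>3} to under {upper:<4}{count:>3}")
```

Every observation lands in exactly one class, so the frequencies add up to the 52 weeks in the data set.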
A frequency distribution focuses on classifying raw data into information. It is the most widely used data reduction technique in descriptive statistics. When you are looking for a pattern that would help you understand the characteristic you are measuring in a problem situation, the frequency distribution comes to your rescue.

Guidelines for Constructing a Frequency Distribution Table
1) Identify the Minimum Value (Min) and Maximum Value (Max) in the given data set. Calculate Range = Max - Min.
2) Decide on the number of classes you would like to have. The number of classes can be determined as the square root of the number of observations in the data set. For any problem, it is also recommended that you have not less than 5 classes and not more than 15 classes.
3) Determine the Width of the Class Interval = Range / Number of Classes.
4) Formulate the boundaries of the classes in such a manner that they include all the observations in the data set. Avoid overlapping of classes. Once the class boundaries are ready, all you need to do is tally the number of observations in each class.

3) HISTOGRAM
A histogram (also known as a frequency histogram) is a snapshot of the frequency distribution. It is a graphical representation of the frequency distribution in which the X-axis represents the classes and the Y-axis represents the frequencies. Rectangular bars are constructed at the boundaries of each class with heights proportional to the frequency.
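As a rough sketch of how the guidelines and the histogram fit together, the following Python example applies the four steps to the weekly sales figures from section 1 and prints a text histogram with the bars drawn sideways. The function names, the rounding of the square root to the nearest whole number, the rounding up of the class width, and the convention that a class includes its lower limit but not its upper limit are all assumptions made for this illustration.

```python
import math

# Weekly sales in '000 units, copied from the raw data example in section 1.
weekly_sales = [
    52, 61, 59, 55, 63, 70, 59, 77, 81,
    83, 69, 91, 73, 83, 90, 81, 77, 77,
    74, 65, 56, 77, 64, 49, 60, 52, 50,
    45, 42, 46, 39, 29, 38, 41, 43, 23,
    26, 27, 22, 29, 31, 29, 31, 30, 30,
    29, 40, 44, 45, 46, 52, 53,
]


def build_classes(values):
    """Apply the guidelines: range, number of classes, width, boundaries."""
    low, high = min(values), max(values)
    data_range = high - low                              # step 1
    # Step 2: square root rule, kept within the recommended 5 to 15 classes.
    n_classes = min(15, max(5, round(math.sqrt(len(values)))))
    # Step 3: round the width up so the classes cover the whole range.
    width = math.ceil(data_range / n_classes)
    # Step 4: non-overlapping classes starting at the minimum value.
    classes = [(low + i * width, low + (i + 1) * width) for i in range(n_classes)]
    # Add one more class if the maximum sits exactly on the last upper limit.
    if high >= classes[-1][1]:
        classes.append((classes[-1][1], classes[-1][1] + width))
    return classes


def count_per_class(values, classes):
    """Tally observations with lower <= value < upper for each class."""
    return [sum(1 for v in values if lower <= v < upper) for lower, upper in classes]


classes = build_classes(weekly_sales)
counts = count_per_class(weekly_sales, classes)

# One bar per class; the bar length is proportional to the frequency.
for (lower, upper), count in zip(classes, counts):
    print(f"{lower:>3}-{upper:<3}| {'#' * count} {count}")
```

With 52 observations this gives 7 classes of width 10, running from 22 to 92.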
Role of Histogram in Practice
A histogram depicts the pattern of the distribution emerging from the characteristic being measured. If the pattern is symmetrical and bell-shaped, then it reflects the normal distribution curve. In quality control parlance, the system is then stable: only chance causes are present and assignable causes are absent.

Histogram Example
The inspection records of a hose assembly operation revealed a high level of rejection. An analysis of the records showed that "leaks" were a major contributing factor to the problem. It was decided to investigate the hose clamping operation. The hose clamping force (torque) was measured on twenty-five assemblies (figures in foot-pounds). The data are given below. Draw the frequency histogram and comment.
8 13 15 10 16
11 14 11 14 20
15 16 12 15 13
12 13 16 17 17
14 14 14 18 15

Histogram Example Solution
You will notice that the Range is 20 - 8 = 12. You take the number of classes as 5 (note that the square root of the number of observations is √25 = 5). The width of the class is Range / Number of Classes = 12/5 = 2.4. Round it to 3. You can now form the boundaries of the classes, starting with 8 and then successively incrementing the lower limit of each class by 3 until all the classes are formed. Tally the number of observations under each class. This would give you the following table of frequency distribution.

Class    Frequency
8-11         2
11-14        7
14-17       12
17-20        3
20-23        1

[Figure: Histogram for the Example. X-axis: Classes (8-11, 11-14, 14-17, 17-20, 20-23); Y-axis: Frequency; bar heights 2, 7, 12, 3, 1.]

Looking at the histogram, it is easy for you to see that the pattern does not show a bell-shaped curve. The bars adjacent to the class 14-17 cause some distortion to normality. It is also evident that the average is in the range 14 to 17. Corrective action is needed. However, before taking any action, you must be cautious about the fact that the sample size here is only 25 observations. Take more measurements and draw the histogram again before taking corrective steps.

Microsoft Excel and Histogram
The Microsoft Excel Chart Wizard allows you to create a variety of charts
for numerical as well as categorical data. The histogram pictured in the previous slide is an output from Chart Wizard.

There is also a powerful add-in supplied with Microsoft Excel called "Data Analysis" in the Tools menu. It has a variety of analysis tools, which include Histogram, Cumulative Distribution, Frequency Distribution, Descriptive Statistics, Pareto Chart and many others.

Please familiarize yourself with these in Excel at the earliest so that you can function as a manager taking information-based decisions. The power of Excel spreadsheet software is amazing.

4) Cumulative Frequency Distribution
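A cumulative frequency distribution is a type of frequency distribution that shows how many observations lie above or below the boundaries of the classes. You can formulate the following from the previous example of hose clamping force (torque), where each cumulative frequency counts the observations below the upper boundary of its class: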
Class    Frequency   Relative    Cumulative   Cumulative Relative
                     Frequency   Frequency    Frequency
8-11         2         0.08          2            0.08
11-14        7         0.28          9            0.36
14-17       12         0.48         21            0.84
17-20        3         0.12         24            0.96
20-23        1         0.04         25            1.00
Total       25         1.00

Ogive Curve
The ogive curve is a graphical representation of the cumulative frequency distribution using numbers or percentages. In this pictorial representation, the "less than" values are on the X-axis and the cumulative frequencies, in numbers or percentages, are on the Y-axis. A line graph in the form of a curve is plotted connecting the cumulative frequencies corresponding to the upper boundaries of the classes. Today, this ogive graph is elegantly and efficiently obtained as output from Chart Wizard or Data Analysis in the Tools menu of Microsoft Excel. The ogive graph for the present torque example, obtained from Microsoft Excel, is given below:

[Figure: Cumulative Distribution (Ogive Curve) for the Example. X-axis: Torque (less than value) at 11, 14, 17, 20, 23; Y-axis: Cumulative Frequency; plotted values 2, 9, 21, 24, 25.]
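The cumulative table and the ogive points for the torque example can also be computed directly. The sketch below is illustrative Python, not the Excel procedure described above; the variable names are hypothetical, and each class is assumed to include its lower boundary and exclude its upper boundary, an assumption that reproduces the frequencies in the table.

```python
from itertools import accumulate

# Hose clamping torque in foot-pounds, copied from the histogram example.
torque = [
    8, 13, 15, 10, 16,
    11, 14, 11, 14, 20,
    15, 16, 12, 15, 13,
    12, 13, 16, 17, 17,
    14, 14, 14, 18, 15,
]

# Classes from the worked solution: start at 8, width 3.
classes = [(8, 11), (11, 14), (14, 17), (17, 20), (20, 23)]

# Assumed convention: lower <= value < upper.
frequencies = [sum(1 for t in torque if lo <= t < hi) for lo, hi in classes]
total = sum(frequencies)

relative = [freq / total for freq in frequencies]
cumulative = list(accumulate(frequencies))           # running totals
cumulative_relative = [cum / total for cum in cumulative]

# Ogive points: cumulative frequency plotted against each upper boundary.
ogive_points = [(hi, cum) for (_, hi), cum in zip(classes, cumulative)]

print(f"{'Class':<8}{'Freq':>6}{'Rel':>8}{'Cum':>6}{'CumRel':>8}")
rows = zip(classes, frequencies, relative, cumulative, cumulative_relative)
for (lo, hi), freq, rel, cum, cum_rel in rows:
    label = f"{lo}-{hi}"
    print(f"{label:<8}{freq:>6}{rel:>8.2f}{cum:>6}{cum_rel:>8.2f}")
```

Plotting `ogive_points` as a line graph in any charting tool gives the "less than" ogive: the curve rises from 2 at a torque of 11 to 25 at a torque of 23.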