Lecture2014Ch2Fa07

# Lecture2014Ch2Fa07 - Chapter 2 Organizing and Summarizing...


Chapter 2 – Organizing and Summarizing Data

Definition: When data are in their original form, as collected, they are called raw data.

We want to be able to visualize the characteristics of a data set; hence we construct graphical representations of the data. In order to do so, we must look at the frequency of occurrence of data values.

Definition: A categorical frequency distribution, used for categorical (qualitative) data, is a table listing the categories, together with the frequency of occurrence of each category in the observed data.

Definition: The frequency for a category is the number of data values falling in that category. The relative frequency for a category is the fraction, proportion, or percentage of the data values that fall within that category.

Example: The following table shows data on class rank of students receiving financial aid at a small 4-year college.

| College Class Rank | Frequency | Relative Frequency |
|---|---|---|
| Fr | 18 | 18/40 = 0.45 = 45% |
| So | 12 | 12/40 = 0.30 = 30% |
| Jr | 6 | 6/40 = 0.15 = 15% |
| Sr | 4 | 4/40 = 0.10 = 10% |

Often, when the data are numeric, there are too many different data values for a listing of the raw data to be of use in seeing the characteristics of the data. It is common to divide the interval of values of the data into a relatively small number of subintervals, called classes, and to tabulate the data using the frequencies. Each frequency is the number of occurrences of data values in one of the classes.

Definition: A grouped frequency distribution is the organizing of raw data in table form, using classes and frequencies.

Definition: The largest data value that can be included in a class is the upper class limit for that class; the smallest data value that can be included is the lower class limit.

Definition: The class width is the difference between the upper class limit of one class and the upper class limit of the next-higher class.
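As an illustration, the relative frequencies in the class-rank example can be reproduced with a few lines of Python. This is only a minimal sketch: the frequencies are taken from the table in the notes, while the variable names (`frequencies`, `relative`) are chosen here for illustration.

```python
from fractions import Fraction

# Frequencies copied from the class-rank table in the notes.
frequencies = {"Fr": 18, "So": 12, "Jr": 6, "Sr": 4}

# Total number of data values (students receiving financial aid).
total = sum(frequencies.values())

# Relative frequency = category frequency / total number of data values.
relative = {rank: Fraction(count, total) for rank, count in frequencies.items()}

# Print each category as a fraction, a proportion, and a percentage.
for rank, count in frequencies.items():
    share = float(relative[rank])
    print(f"{rank}: {count}/{total} = {share:.2f} = {share:.0%}")
```

Because every data value falls in exactly one category, the relative frequencies should add up to 1 (100%), which is a quick check on any table of this kind.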
Definition: The cumulative frequency for a class is the count of all observed data values in that class or in lower classes.

Rules for constructing a frequency distribution:

1) The number of classes should be between 5 and 20; 5 for small data sets, 20 for large data sets. "Small" means roughly 25 to 30 observations; "large" means around 1000 or more observations.
2) An observed data value must be in one, and only one, class. This means that the classes must be non-overlapping, or mutually exclusive.
3) The classes must be continuous; even if there are no observed data values in a given class, that class must be included, with a frequency value of 0.
4) The classes must be exhaustive; i.e., together they must include all of the data.
5) The classes must be equal in width.
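The definitions and rules above can be sketched in Python as follows. The data values and class limits here are hypothetical (they do not come from the notes), and the helper `grouped_distribution` is a name invented for this illustration. The sketch assumes integer-valued data, so a class with lower limit 50 and width 10 has upper limit 59.

```python
# Hypothetical integer-valued data (not from the notes), e.g. quiz scores.
data = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 79, 95, 98]

# Assumed classes: five classes of equal width 10 with lower limits 50..90.
lower_limits = [50, 60, 70, 80, 90]
class_width = 10

def grouped_distribution(values, lower_limits, width):
    """Return rows of (lower limit, upper limit, frequency, cumulative frequency)."""
    rows = []
    cumulative = 0
    for lower in lower_limits:
        upper = lower + width - 1  # upper class limit for integer data
        # Frequency: number of data values falling in this class.
        frequency = sum(1 for v in values if lower <= v <= upper)
        # Cumulative frequency: count in this class or in lower classes.
        cumulative += frequency
        rows.append((lower, upper, frequency, cumulative))
    return rows

table = grouped_distribution(data, lower_limits, class_width)
for lower, upper, frequency, cumulative in table:
    print(f"{lower}-{upper}: frequency {frequency}, cumulative {cumulative}")

# Exhaustive, mutually exclusive classes count every value exactly once,
# so the last cumulative frequency equals the number of observations.
assert table[-1][3] == len(data)
```

Note that the 80-89 class contains no observations in this made-up data set but is still listed with a frequency of 0, as rule 3 requires.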

Procedure for constructing a grouped frequency distribution:

1) Find the range by subtracting the lowest value of the data from the highest.
2) Select the number of classes desired (between 5 and 20).

## This note was uploaded on 07/28/2011 for the course STA 2014 taught by Professor Staff during the Fall '10 term at University of Florida.
