“JUST THE MATHS”
UNIT NUMBER 18.1
STATISTICS 1
(The presentation of data)
by
A.J.Hobson

18.1.1 Introduction
18.1.2 The tabulation of data
18.1.3 The graphical representation of data
18.1.4 Exercises
18.1.5 Selected answers to exercises

UNIT 18.1 - STATISTICS 1 - THE PRESENTATION OF DATA

18.1.1 INTRODUCTION

(i) The collection of numerical information often leads to large masses of data which, if they are to be understood or presented effectively, must be summarised and analysed in some way. This is the purpose of the subject of “Statistics”.

(ii) The source from which a set of data is collected is called a “population”. For example, a population of 1000 ball-bearings could provide data relating to their diameters.

(iii) Statistical problems may be either “descriptive problems” (in which all the data are known and can be analysed) or “inference problems” (in which data collected from a “sample” population are used to infer properties of a larger population). For example, the annual pattern of rainfall over several years in a particular place could be used to estimate the rainfall pattern in other years.

(iv) The variables measured in a statistical problem may be either “discrete” (in which case they may take only certain values) or “continuous” (in which case they may take any values within the limits of the problem itself). For example, the number of students passing an examination from a particular class of students is a discrete variable; but the diameter of ball-bearings from a stock of 1000 is a continuous variable.

(v) Various methods are seen in the commercial presentation of data but, in this series of Units, we shall be concerned with just two methods, one of which is tabular and the other graphical.

18.1.2 THE TABULATION OF DATA

(a) Ungrouped Data

Suppose we have a collection of measurements given by numbers. Some may occur only once, while others may be repeated several times.

If we write down the numbers as they appear, the processing of them is likely to be cumbersome. Data recorded in this way are known as “ungrouped (or raw) data”, as, for example, in the following table, which shows rainfall figures (in inches), for a certain location, in specified months over a 90-year period:
TABLE 1 - Ungrouped (or Raw) Data

18.6  13.8  10.4  15.0  16.0  22.1  16.2  36.1  11.6   7.8
22.6  17.9  25.3  32.8  16.6  13.6   8.5  23.7  14.2  22.9
17.7  26.3   9.2  24.9  17.9  26.5  26.6  16.5  18.1  24.8
16.6  32.3  14.0  11.6  20.0  33.8  15.8  15.2  24.0  16.4
24.1  23.2  17.3  10.5  15.0  20.2  20.2  17.3  16.6  16.9
22.0  23.9  24.0  12.2  21.8  12.2  22.0   9.6   8.0  20.4
17.2  18.3  13.0  10.6  17.2   8.9  16.8  14.2  15.7   8.0
17.7  16.1  17.8  11.6  10.4  13.6   8.4  12.6   8.1  11.6
21.1  20.5  19.8  24.8   9.7  25.1  31.8  24.9  20.0  17.6

(b) Ranked Data

A slightly more convenient method of tabulating a collection of data would be to arrange them in rank order, so making it easier to see how many times each number appears. This

