# notes_chapter3.pdf - Chapter 3 Numerical summaries for data...


## 3.1 Introduction

So far we have only considered graphical methods for presenting data. These are always useful starting points. As we shall see, however, for many purposes we might also require *numerical* methods for summarising data: perhaps one or two numbers can summarise the key information about location and variability in the data. Before we introduce some ways of summarising data numerically, let us first think about some notation.

## 3.2 Mathematical notation

Before we can talk more about numerical techniques we first need to define some basic notation. This will allow us to generalise all situations with a simple shorthand.

Very often in statistics we replace actual numbers with letters in order to be able to write general formulae. We generally use a single letter to represent sample data and use subscripts to distinguish individual observations in the sample. Amongst the most common letters to use is $x$, although $y$ and $z$ are frequently used as well. For example, suppose we ask a random sample of three people how many mobile phone calls they made yesterday. We might get the following data: 1, 5, 7. If we take another sample we will most likely get different data, say 2, 0, 3. Using algebra we can represent the general case as $x_1, x_2, x_3$:

| Sample | 1st observation | 2nd observation | 3rd observation |
|---|---|---|---|
| 1st sample | 1 | 5 | 7 |
| 2nd sample | 2 | 0 | 3 |
| typical sample | $x_1$ | $x_2$ | $x_3$ |

This can be generalised further by referring to the data *as a whole* as $x$ and the $i$th observation in the sample as $x_i$. Hence, in the first sample above, the second observation is $x_2 = 5$ whilst in the second sample it is $x_2 = 0$. The letters $i$ and $j$ are most commonly used as the index numbers for the subscripts.

The total number of observations in a sample is usually referred to by the letter $n$. Hence in our simple example above $n = 3$.
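The subscript notation can also be tried out on a computer. The following is a minimal Python sketch, not part of the original notes; the names `first_sample`, `second_sample` and `observation` are hypothetical and chosen only for illustration. It stores the two samples from the example and looks up $x_2$ and $n$ for them.

```python
# The two samples of mobile phone call counts from the example in the notes.
first_sample = [1, 5, 7]
second_sample = [2, 0, 3]


def observation(x, i):
    """Return x_i, the i-th observation, counting from 1 as in the notes."""
    # Python lists count from 0, so the index is shifted by one.
    return x[i - 1]


# n is the total number of observations in the sample.
n = len(first_sample)

print(observation(first_sample, 2))   # x_2 in the first sample: 5
print(observation(second_sample, 2))  # x_2 in the second sample: 0
print(n)                              # 3
```

Note that the notes count observations from $i = 1$, whereas Python lists count from 0, which is why the helper subtracts one from the index.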
The next important piece of notation to introduce is the symbol $\Sigma$. This is the upper case of the Greek letter “sigma”. It is used to represent the phrase “sum the values”. This symbol is used as follows:

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n.$$

This notation is used to represent the sum of all the values in our data (from the first, $i = 1$, to the last, $i = n$), and is often abbreviated to $\sum x$ when we sum over all the data in our sample.

Two other mathematical basics need to be introduced. First, the use of powers is important in many statistical formulae. We all know that, for example, the square of three means raising 3 to the power 2, i.e. $3^2 = 3 \times 3 = 9$. This can be generalised to $x^k$, which means multiplying together $k$ copies of $x$.

The other important idea is the use of brackets. Brackets are used to impose an ordering on the way operations are carried out. The operation inside the bracket is carried out before the one outside. Consider the following three cases:

$$3 + 4^2 = 19, \qquad 3^2 + 4^2 = 25, \qquad (3 + 4)^2 = 49.$$

In the first case, we simply square 4 and then add this to 3. In the second case, we square both numbers and then add them together, while in the third case, because of the brackets, we add the