MGF 1107 – EXPLORATIONS IN MATHEMATICS
LECTURE 32
Statistical Sampling

Statistics involves forming hypotheses about a given population after having observed only a fraction of that population. For example, TV ratings are based on the viewing habits of a selected number of people, and opinion polls gauge the favorability of political parties by asking a few thousand people for their viewpoint. For these statistics to be accurate, it is imperative that those chosen for the survey represent the population. A sample that includes only women, or only people over 40, is not representative.

In this lecture we will look at the different ways to collect data in order to perform a valid statistical analysis, since a failure to observe correct procedure can lead to embarrassing results. The most famous example is the Chicago Tribune headline of the 3rd of November, 1948, "Dewey Defeats Truman", which relied on polling data obtained over the telephone to announce, incorrectly, that Thomas Dewey had defeated Harry Truman in the presidential election. Although the polling companies polled a representative number of both sexes, all racial groups, and all geographic areas, the results did not take into account that most of the people who owned telephones at the time were Republican. This led the pollsters to conclude that a landslide for Dewey was imminent.
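
The effect described in the telephone-poll example can be illustrated with a small simulation. The sketch below is not historical data: the population size, the even split between two candidates, and the telephone ownership rates (0.6 for supporters of candidate A, 0.2 for everyone else) are all invented for illustration, and the function name simulate_poll is hypothetical. It compares a simple random sample of the whole population with a sample drawn only from telephone owners.

```python
import random


def simulate_poll(pop_size=100_000, sample_size=5_000, seed=0):
    """Compare a random sample with a phone-only sample (all numbers assumed)."""
    rng = random.Random(seed)

    # Hypothetical population: each person is a pair (supports_a, owns_phone).
    population = []
    for _ in range(pop_size):
        supports_a = rng.random() < 0.5           # assumed even split
        phone_rate = 0.6 if supports_a else 0.2   # assumed ownership rates
        owns_phone = rng.random() < phone_rate
        population.append((supports_a, owns_phone))

    def share_for_a(people):
        # Fraction of the given group that supports candidate A.
        return sum(1 for supports_a, _ in people if supports_a) / len(people)

    true_share = share_for_a(population)

    # Simple random sample: every person is equally likely to be chosen.
    random_sample = rng.sample(population, sample_size)

    # Biased sample: only telephone owners can be reached.
    phone_owners = [person for person in population if person[1]]
    phone_sample = rng.sample(phone_owners, sample_size)

    return true_share, share_for_a(random_sample), share_for_a(phone_sample)


true_share, random_est, phone_est = simulate_poll()
print(f"True share for A:       {true_share:.3f}")
print(f"Random-sample estimate: {random_est:.3f}")
print(f"Phone-only estimate:    {phone_est:.3f}")
```

Under these assumed rates, the random-sample estimate should land close to the true share, while the phone-only estimate should overstate support for candidate A by a wide margin, even though both samples are the same size. Making the biased sample larger does not help, because the error comes from who can be reached rather than from how many people are asked.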

Obtaining a representative sample is not an easy task, and while much research has been done in this area, it remains an inexact science, as indicated by the margin of error in opinion polls. Of the many things that can go wrong, two common errors are selection bias and non-response bias. Selection bias occurs when a particular group within a population is represented to a disproportionate degree. It is not always easy to detect ahead of time, but it can have a significant impact, as shown by the 1948
