MAT 2379, Introduction to Biostatistics, Section 9.1 (Part 1)

Chapter 9. Introduction to statistics

9.1 Random Sampling and Data Description

We start with the question: what is a statistical problem? For example, a researcher studies a group of 50 people aged 18 or older in order to identify the factors associated with the development of heart disease. These factors include age, weight, smoking status, and family history. This problem has all the characteristics of a statistical problem:

1. Associated with this problem, there is a large group of objects about which we want to draw a conclusion. This group of objects is called the population. In the example above, the population consists of all people aged 18 or older.

2. We are interested in certain characteristics of the members of the population. These are called variables and are denoted by capital letters X, Y, Z, etc. In the example above, X = age, Y = weight, Z = smoking status.

3. The population is too large to study in its entirety. So, we must draw conclusions by studying only a portion of the population, called a sample. The number of objects in the sample is called the sample size and is denoted by n. In the example above, n = 50. The observed values of the variable

