# CH1 - STAT 200 Chapter 1 Looking at Data - Distributions...

## What is Statistics?

Statistics is a science that involves the design of studies, data collection, summarizing and analyzing the data, interpreting the results and drawing conclusions. Inferences (conclusions) are made about specific random phenomena on the basis of relatively limited sample material.

## Data and Variables

Let's look at the data extracted from medical records of 50 patients with low back pain:

| Subject | Age | Gender | In employment? | Duration of pain | Severity of pain |
|---------|-----|--------|----------------|------------------|------------------|
| 1       | 35  | F      | No             | 3 weeks          | mild             |
| 2       | 42  | F      | Yes            | 13 weeks         | severe           |
| 3       | 21  | M      | Yes            | 4 weeks          | moderate         |
| 4       | 59  | F      | No             | 72 weeks         | moderate         |
| ...     | ... | ...    | ...            | ...              | ...              |
| 50      | 40  | M      | Yes            | 30 weeks         | severe           |

A variable refers to a characteristic of interest, e.g. age and gender. A variable can be:

1. qualitative/categorical
2. quantitative (measured on a numerical scale)
   - (a) discrete (integer-valued)
   - (b) continuous (can take on any of a range of values on the number line)

The variables in this data set can be classified as follows:

| Variable         | Variable type | Discrete or continuous? | Unit  |
|------------------|---------------|-------------------------|-------|
| Age              | quantitative  | discrete                | years |
| Gender           | categorical   | N/A                     | N/A   |
| In employment?   | categorical   | N/A                     | N/A   |
| Duration of pain | quantitative  | discrete                | weeks |
| Severity of pain | categorical   | N/A                     | N/A   |

## How to Summarize a Data Set?

By means of graphical displays and descriptive statistics.

## Displaying Categorical Data (Section 1.1)

For categorical data, the key is to group similar things together.

### I. Frequency tables / Relative frequency tables

A (relative) frequency table shows all the categories of a categorical variable together with their (relative) frequencies. The relative frequency of a category is its frequency expressed as a percentage of the total number of observations. For non-overlapping categories that together cover every observation, the percentages should add up to 100%.
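As a rough illustration of the definition above, the following Python sketch builds a relative frequency table for the "Severity of pain" variable. It uses only the five subjects whose rows are shown in the notes (subjects 1 to 4 and 50), so the percentages describe those five rows, not the full sample of 50. The function name `relative_frequency_table` is a hypothetical helper introduced here for illustration.

```python
from collections import Counter

# Severity of pain for the five subjects displayed in the notes
# (subjects 1-4 and 50); the remaining 45 rows are not available.
severity = ["mild", "severe", "moderate", "moderate", "severe"]


def relative_frequency_table(values):
    """Return {category: (frequency, percentage)} for a categorical variable."""
    counts = Counter(values)            # frequency of each category
    total = sum(counts.values())        # total number of observations
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}


table = relative_frequency_table(severity)
for category, (freq, pct) in table.items():
    print(f"{category:<10}{freq:>3}{pct:>8.1f}%")
```

Because each subject falls into exactly one severity category, the percentages in `table` sum to 100%.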
Example: city of residence of past 30-day transit riders in the GVRD (city of residence is a categorical variable).

### II. Bar Charts

A bar chart shows rectangular bars, each representing a category. The bars have the same width, and their heights represent the frequency or relative frequency.
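The idea of a bar chart can be sketched in plain text without a plotting library. The Python example below is only an illustration: it draws horizontal rather than vertical bars, so the length of each bar (not its height) represents the frequency, and it uses the severity values of the five subjects displayed in the low back pain data. The helper `text_bar_chart` is a hypothetical name, not part of any library.

```python
from collections import Counter

# Severity of pain for the five subjects displayed in the notes.
severity = ["mild", "severe", "moderate", "moderate", "severe"]


def text_bar_chart(values, symbol="#"):
    """Return one text line per category; bar length equals the frequency."""
    counts = Counter(values)
    label_width = max(len(category) for category in counts)
    lines = []
    for category, n in sorted(counts.items()):
        # Every bar is built from the same symbol, so all bars have equal thickness.
        lines.append(f"{category:<{label_width}} | {symbol * n} {n}")
    return lines


chart = text_bar_chart(severity)
print("\n".join(chart))
```

Replacing the counts with percentages would give the relative frequency version of the same chart.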

### III. Pie Charts

A pie chart shows categories as slices of a circle. The area of each slice is proportional to the fraction of the whole for the category it represents.

## Displaying Quantitative Data (Section 1.1)

### I. Frequency distribution

A frequency distribution divides the data into classes and counts the number of occurrences (frequency) in each class.

Constructing a frequency distribution:
