# Chapters 1 and 2: Summary and Display of Univariate Data


Let’s start our introduction to data with an example. Suppose that people are interviewed at UBC and asked a few questions about their study habits. The results are recorded in the following data table.

| Subject | Age | Gender | Hours Spent Studying / Week | Most Common Study Location | Stress Level |
|---|---|---|---|---|---|
| 1 | 19 | M | 2 | IBLC | Low |
| 2 | 20 | M | 4 | Koerner | Low |
| 3 | 20 | F | 18 | IBLC | Medium |
| 4 | 28 | M | 10 | Coffee Shop | Medium |
| 5 | 21 | F | 7 | IBLC | Low |
| 6 | 21 | F | 4 | IBLC | Low |
| 7 | 20 | F | 5 | Koerner | Medium |
| 8 | 21 | M | 7 | IBLC | Low |

This is the rawest representation of data. When your data set is small, sometimes the best summary of the data is simply the data itself: it’s easy enough to read and digest this table by just looking at it. However, more typically, we’re going to have a big, ugly data set, so we need some nicer ways to summarize it, both numerically and with pictures. This is the ‘theme’ of chapters 1 and 2.

First, let’s note that data does not have to be numerical; it can be a label as well.
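To make ‘summarizing numerically’ concrete, here is a minimal sketch in Python that stores the study-habits table and computes a few simple summaries. The short column names (`hours`, `location`, and so on) and the choice of mean, median, and counts are assumptions made for illustration; the notes have not yet said which summaries they will use.

```python
from collections import Counter
from statistics import mean, median

# Shortened column names (an assumption; the notes use longer labels).
COLUMNS = ("subject", "age", "gender", "hours", "location", "stress")

# The eight rows of the data table, copied as recorded.
ROWS = [
    (1, 19, "M", 2, "IBLC", "Low"),
    (2, 20, "M", 4, "Koerner", "Low"),
    (3, 20, "F", 18, "IBLC", "Medium"),
    (4, 28, "M", 10, "Coffee Shop", "Medium"),
    (5, 21, "F", 7, "IBLC", "Low"),
    (6, 21, "F", 4, "IBLC", "Low"),
    (7, 20, "F", 5, "Koerner", "Medium"),
    (8, 21, "M", 7, "IBLC", "Low"),
]

# One dictionary per interviewed subject.
records = [dict(zip(COLUMNS, row)) for row in ROWS]

# Numerical summaries of a numerical column.
hours = [r["hours"] for r in records]
mean_hours = mean(hours)
median_hours = median(hours)

# A label column cannot be averaged, but it can be counted.
location_counts = Counter(r["location"] for r in records)

print("mean hours:", mean_hours)
print("median hours:", median_hours)
print("study locations:", dict(location_counts))
```

For this table the mean is 7.125 hours per week and the median is 6 hours, while IBLC is the most common study location (5 of the 8 subjects). Note that the label column is summarized by counting, not averaging.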
## Types of Variables

What is a variable? Note that we’re defining ‘variable’ differently from the typical ‘Math’ variable.

- A variable is, for our purposes at this point, a set of data describing one characteristic of the measured object or individual. ‘Subject’, ‘Age’, ‘Gender’, etc. are all variables.

We can separate variables into two broad types: categorical and quantitative.

- Categorical variables, as the name implies, denote a ‘category’, and can be either ordinal or nominal.
  - An ordinal categorical variable has an implied ordering (e.g. stress level), but we do not necessarily know the ‘distance’ between the categories.
  - A nominal categorical variable simply denotes different labels (e.g. study location).
- Quantitative variables denote quantities, or numerical data. There are two types of quantitative variables: discrete and continuous.
  - If a variable has a countable range, then we call it a discrete variable.
  - If a variable has an uncountable range, then we call it a continuous variable.

Note that there is a strange grey area between discrete and continuous variables, both in the data themselves and in how we will treat them later. For example, in our set of data, age is recorded as an integer, so perhaps it’s better characterized as a discrete variable; however, we could imagine that age could be measured to some arbitrary degree of precision, so perhaps it’s better characterized as a continuous variable. In most cases, though, either the distinction is obvious, or we are comfortable assuming that it isn’t so important … don’t worry about it too much for now.

Just because a variable is represented with numbers does not mean it is a quantitative variable! The entries for ‘Subject’ are numbers from 1 to 8, but they do not have any particular quantitative meaning; they are simply labels saying ‘this row represents the observations for subject 1’ and so on.
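The distinctions above can be sketched in a few lines of Python. This is only an illustration: the names `VARIABLE_KINDS`, `STRESS_ORDER`, and `rank` are hypothetical, and the only stress levels listed are the two that actually appear in the table.

```python
# Classification of the variables the notes discuss explicitly.
VARIABLE_KINDS = {
    "Subject": "categorical (nominal)",       # numbers used purely as labels
    "Most Common Study Location": "categorical (nominal)",
    "Stress Level": "categorical (ordinal)",
    "Age": "quantitative (discrete as recorded; arguably continuous)",
}

# Columns copied from the data table.
subjects = [1, 2, 3, 4, 5, 6, 7, 8]
stress = ["Low", "Low", "Medium", "Medium", "Low", "Low", "Medium", "Low"]

# Ordinal: the categories have an order, lowest to highest.
# Only the levels observed in the table are listed (an assumption about the scale).
STRESS_ORDER = ["Low", "Medium"]


def rank(level):
    """Position of a stress level in the ordering (a rank, not a distance)."""
    return STRESS_ORDER.index(level)


# Comparing by rank is meaningful for an ordinal variable ...
highest_stress = max(stress, key=rank)
# ... but arithmetic on ranks is not, because the 'distance'
# between 'Low' and 'Medium' is unknown.

# Nominal labels: renaming the subjects changes nothing about the data,
# which shows that the numbers 1 to 8 carry no quantitative meaning.
subject_labels = [f"S{n}" for n in subjects]

print("highest observed stress level:", highest_stress)
print("relabelled subjects:", subject_labels)
```

Relabelling the subjects as `S1` to `S8` loses no information, which is one quick way to check whether a column of numbers is really a set of labels.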
## Population vs. Sample

As statisticians, we must always keep the interplay between the sample we collect and the