Lecture 2
Section 1.1 Looking at Data

There are two components in looking at or describing data:

Individuals (units) are the objects described by a set of data. A unit can be a person, a place, or a thing (e.g., a student, the city of West Lafayette, Purdue University).

A variable is any characteristic of an individual. For you, a Purdue college student, it could be your birth date, sex, marital status, major, ...

There are two types of variables:

Quantitative variable: takes numerical values for which arithmetic operations such as adding and averaging make sense. Examples for you, as Purdue students, would be your age, your GPA, or your numeric total score for this course.

Categorical variable: places an individual into one of several groups or categories and is summarized by the count or percent of individuals in each category. Examples for you, as Purdue students, would be your major, your gender (Male/Female), or your academic year (Fresh, Soph, Junior, Senior/Super-Senior, or 1st, 2nd, 3rd, or 4th).

The distribution of a variable describes what values the variable takes and how often it takes each value.

If you have more than one variable in a problem, look at each variable by itself first, then look for any relationships between the variables.

Examples:

Identify whether the variable is quantitative or categorical in the following questions. If it is categorical, state the possible answers.

a) What letter grade did you get in your Calculus class last semester?
b) What was your score on the last exam?
c) What is your GPA?
d) Did you vote for Barack Obama?
e) Which candidate did you vote for in the last Presidential election?
f) How many votes did Barack Obama get?
g)
h)
i)
j)

We will look at describing data on a single variable by:
1. Graphing it, always a good place to start.
2. Finding numerical summaries.

Graphing Methods: Use different graphing techniques for each type of variable.

For categorical variables:
1. Bar graph
2. Pie graph

Mention: Bar graphs can show either counts or percents of observations; pie graphs show percents of observations, which must add up to 100%. Only bar graphs can be used when there are multiple observations/responses on a single unit, because the percents then need not add up to 100%.

Example:
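As a rough illustration of the bar graph versus pie graph point (a sketch with made-up data, not the example worked in lecture), the Python below computes the distribution of a categorical variable as counts and percents and prints a simple text bar graph. The data sets `majors` and `devices` and the helper names `distribution` and `text_bar_graph` are hypothetical and chosen only for this illustration.

```python
from collections import Counter

# Hypothetical data: the major reported by each of 10 students
# (one response per unit).
majors = ["Engineering", "Science", "Engineering", "Liberal Arts", "Science",
          "Engineering", "Management", "Science", "Engineering", "Liberal Arts"]


def distribution(values):
    """Return {category: (count, percent)} for a categorical variable."""
    counts = Counter(values)
    n = len(values)
    return {cat: (c, 100 * c / n) for cat, c in counts.items()}


def text_bar_graph(dist):
    """Build one bar per category, largest count first; bar length = count."""
    lines = []
    for cat, (count, pct) in sorted(dist.items(), key=lambda kv: -kv[1][0]):
        lines.append(f"{cat:<12} {'#' * count} {count} ({pct:.0f}%)")
    return "\n".join(lines)


dist = distribution(majors)
print(text_bar_graph(dist))

# One response per unit: the percents add up to 100, so a pie graph would
# also be a valid display.
total_pct = sum(pct for _, pct in dist.values())

# Hypothetical multiple-response data: each student lists every device owned.
devices = [["laptop", "phone"], ["phone"],
           ["laptop", "phone", "tablet"], ["phone", "tablet"]]
owned = Counter(d for student in devices for d in student)
pct_of_students = {d: 100 * c / len(devices) for d, c in owned.items()}

# Multiple responses per unit: the percents no longer add up to 100, so only
# a bar graph is appropriate.
multi_total = sum(pct_of_students.values())
print(total_pct, multi_total)
```

In the first data set each student contributes exactly one category, so the percents form the slices of a whole. In the second, a student can fall into several categories at once, which is why the percents exceed 100 in total and a pie graph would misrepresent the data.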
