STAT 113, Class 2: Data and Measurement. The Distribution of a Variable

OBJECTIVE A: DATA
What is data? Values, labels, or names that record information about individuals in an orderly fashion, together with a context.
Example: Each row represents a different individual. Each row can be referred to as a case.
DATA AND MEASUREMENT
Each column represents a variable.
A variable is a property for which individuals of a certain type have a value.
We’ll call the following things values: what you’d normally call values (numbers, measurements), labels, and names.
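A minimal sketch of a data table in Python may make the row and column vocabulary concrete. The table below is hypothetical (the favorite colors and the key names such as `height_in` are made up for illustration). Each dictionary is one row, that is, one case, and each key is a variable.

```python
# Hypothetical data table: each dict is one row (one case / one individual).
table = [
    {"name": "Tim", "height_in": 71, "favorite_color": "blue"},
    {"name": "Tom", "height_in": 69, "favorite_color": "green"},
    {"name": "Tina", "height_in": 70, "favorite_color": "blue"},
]

# The variables are the column names shared by every case.
variables = list(table[0].keys())

# One case: all the values recorded for a single individual.
first_case = table[0]

# One variable: the column of values, one value per individual.
heights = [row["height_in"] for row in table]

print(variables)
print(first_case)
print(heights)
```

Reading across a dictionary gives one individual's values; reading the same key down all dictionaries gives one variable's values.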

TWO TYPES OF VARIABLES
Quantitative: measurements, with units; amounts. It makes sense to think about the “average”. Examples: height, weight, number of bases stolen, miles per hour.
Categorical: types, labels, names, categories. Examples: gender, name, Social Security number, favorite color.
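One way to see the difference between the two types is to ask which summary makes sense. The sketch below uses hypothetical values: it averages a quantitative variable and counts the labels of a categorical one. The variable names are assumptions chosen for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical values for illustration only.
heights_in = [71, 69, 70]                    # quantitative: measurements with units
favorite_colors = ["blue", "green", "blue"]  # categorical: labels

# An average is meaningful for a quantitative variable.
average_height = mean(heights_in)

# For a categorical variable, counting how often each label occurs
# is the meaningful summary; an average of labels is not defined.
color_counts = Counter(favorite_colors)

print(average_height)
print(color_counts)
```

Note that being written with digits does not make a variable quantitative: a Social Security number is a label, so averaging it would not tell us anything.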
VARIABLES
There are two ways of thinking about the word “variable”. Consider the variable height (in inches).
1. We can say that the variable height is a property of an individual (i.e., the distance from their feet to their head while standing).
2. We can say that the variable height is an actual list of values, each value corresponding to an individual.
In definition 1, a variable is more of a concept. In definition 2, a variable contains data. Definition 2 more closely corresponds to a column in a data table.

QUANTITATIVE VARIABLES: TWO TYPES
Discrete: Values are numbers, but cannot fall just anywhere within some range of values. Examples: the number of stops a bus makes from Hunter to your home; how many points a player scores in a basketball game.
Continuous: Values are numbers, and can fall anywhere within some range of values. Examples: the distance in feet from Hunter to your home; the weight of a basketball.
DISCRETE VS CONTINUOUS
How can you tell the difference? Consider two individuals, each with a different value for some variables. E.g., Tim and Tom are not the same height, and don’t have the same amount of change in their pockets.
For continuous variables: No matter how close Tim’s value is to Tom’s, it’s possible that Tina, another person, has a value for this variable that’s between Tim’s and Tom’s.
For discrete variables: It’s possible that Tim’s and Tom’s values for these variables are so close that Tina’s value couldn’t possibly be in between Tim’s and Tom’s.
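The "can a third value fit in between?" test can be sketched in Python. This is only an illustration under stated assumptions: exact fractions stand in for real-valued heights, pocket change is counted in whole cents, and the function names and the change amounts (43 and 42 cents) are hypothetical.

```python
from fractions import Fraction


def value_between(a, b):
    """Return a value strictly between two different values (their midpoint)."""
    return (a + b) / 2


def cents_strictly_between(a, b):
    """List every whole-cent amount strictly between a and b."""
    low, high = sorted((a, b))
    return list(range(low + 1, high))


# Continuous: heights. However close Tom gets to Tim, a height in between exists.
tim = Fraction(71)
tom = Fraction(139, 2)  # 69.5
for _ in range(5):
    tina = value_between(tim, tom)
    assert min(tim, tom) < tina < max(tim, tom)
    tom = tina  # move Tom closer to Tim and try again

# Discrete: change in whole cents. Values one cent apart leave no room in between.
tim_change, tom_change = 43, 42
possible_tina_change = cents_strictly_between(tim_change, tom_change)
print(possible_tina_change)
```

The loop never fails for heights, while the list of possible in-between amounts of change is empty once Tim and Tom are one cent apart.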

TINA IS TALLER THAN TOM, AND SHORTER THAN TIM
Figure: heights of Tim (71), Tom (69), and Tina (70).
SUPPOSE TOM WERE TALLER … STILL WE COULD IMAGINE A THIRD PERSON TALLER THAN TOM AND SHORTER THAN TIM
Figure: heights 71, 70, and 69.5.

NO MATTER HOW CLOSE TIM AND
