Lecture 8: Data and Summarizing Data
Readings: Chapter 1, Chapter 2.1, 2.2
Apr 6, 2012
1 Basic Concepts
Statistics and Data
Statistics: the study of the collection, organization, analysis, and interpretation of data. It covers every aspect of this process, including planning data collection through the design of surveys and experiments.
Data: measurements from which information and knowledge are derived.
Dataset: a collection of data, usually arranged as a table.
Element: a single cell in a dataset.
Observation: a subject on which data are collected; observations make up the rows of a dataset.
Variable: any characteristic of an observation; variables make up the columns of a dataset.
Example of a dataset
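As a rough illustration of the terms defined above, here is a minimal Python sketch of a dataset. The names, majors, and heights are made up for illustration and do not come from the lecture; the list-of-dictionaries layout is just one convenient way to represent a table.

```python
# Hypothetical dataset (all values invented for illustration).
# Each dictionary is one observation (a row of the table);
# each key is a variable (a column of the table).
dataset = [
    {"name": "Ann", "major": "Biology", "height_in": 64},
    {"name": "Bob", "major": "Statistics", "height_in": 70},
    {"name": "Cal", "major": "History", "height_in": 68},
]

# The variables are the column names shared by every observation.
variables = list(dataset[0].keys())

# The number of observations is the number of rows.
n_observations = len(dataset)

# An element is a single cell: one variable for one observation.
element = dataset[1]["major"]

# A whole column collects one variable across all observations.
heights = [row["height_in"] for row in dataset]

print(variables)       # column names
print(n_observations)  # number of rows
print(element)         # one cell
print(heights)         # one column
```

Reading across a dictionary gives everything recorded about one subject; reading one key across all dictionaries gives one variable for every subject.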
Types of data
Categorical/Qualitative: places an individual into one of several groups (gender, eye color, college major, hometown, level of satisfaction, etc.).
– Nominal: order does NOT matter (gender, race, marital status).
– Ordinal: order DOES matter (level of satisfaction, class [fresh, soph, jr, sr]).
Continuous/Quantitative: attaches a numerical value to a variable so that adding or averaging the values makes sense (height, weight, age, income, yield, etc.).
– Interval: differences are interpretable; no natural zero (SAT score).
– Ratio: differences and ratios are interpretable; natural zero (height, weight, age).
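One way to see why the type of a variable matters is to look at which summaries make sense for it. The sketch below uses made-up values for three of the example variables named above (eye color, class, height). The variable names and the choice of summaries are illustrative assumptions, not part of the lecture.

```python
from statistics import mean, mode

# Hypothetical values, invented for illustration.
eye_color = ["brown", "blue", "brown", "green"]       # nominal
class_year = ["fresh", "sr", "soph", "jr", "soph"]    # ordinal
height_cm = [160.0, 175.0, 182.5]                     # ratio

# Nominal: categories can only be counted, so the most common
# category is a sensible summary; an average is not.
most_common_eye_color = mode(eye_color)

# Ordinal: categories have an order, so they can be ranked and a
# middle category can be found, but distances between categories
# are not defined.
CLASS_ORDER = ["fresh", "soph", "jr", "sr"]
ranks = sorted(CLASS_ORDER.index(c) for c in class_year)
median_class = CLASS_ORDER[ranks[len(ranks) // 2]]

# Quantitative (ratio): adding and averaging the values makes sense.
mean_height = mean(height_cm)

print(most_common_eye_color)
print(median_class)
print(mean_height)
```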
Exercise 1: What type of variable is...
• Smoking status
• SAT score
• Income
• Level of satisfaction
• GPA
• Clothing size
• Time it takes to run a mile
Cross-sectional: observes many subjects at the same time or at approximately the same time.
Time series: observes one subject or many subjects over several time periods.
Cross-section | Time series
Temperature on Jan 1st in each state | Temperature on the first of every month
Profit each store in Tippecanoe Mall earned on Black Friday | Profit of a store for each day from Thanksgiving to Christmas
Resting heart rate of each person in the classroom | Heart rate before, during, and immediately after vigorous exercise
Number of car accidents on Friday afternoon in each county of IN | Number of car accidents each weekend of football season in West Lafayette
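The distinction between cross-sectional and time series data can also be sketched as two data layouts. The store names, day labels, and profit figures below are hypothetical and chosen only to mirror the store-profit example; they are not real data.

```python
# Hypothetical profits, invented for illustration.

# Cross-section: many subjects (stores) observed at one time.
cross_section = {"Store A": 1200.0, "Store B": 950.0, "Store C": 1430.0}

# Time series: one subject (a single store) observed over several periods.
time_series = [("day 1", 800.0), ("day 2", 1200.0), ("day 3", 990.0)]

# A cross-section supports comparisons between subjects.
best_store = max(cross_section, key=cross_section.get)

# A time series supports comparisons across time, such as the
# change from one period to the next.
profits = [profit for _, profit in time_series]
changes = [later - earlier for earlier, later in zip(profits, profits[1:])]

print(best_store)
print(changes)
```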
2 Ways to Get Data
Sources of data
• Existing sources: data that already exist.
• Surveys: ask people for their opinions or for other information.
• Experiments: the investigator deliberately imposes some treatment on individuals in order to observe their responses (that is, makes the individuals do something in particular).
• Observational studies: investigators observe individuals and measure variables of interest without assigning treatments to the subjects. The treatment that each subject receives is determined beyond the control of the investigator.
