Often, pictures tell the entire story of the data
Use different plots for different sorts of variables
A graph (or graphic) is any visual display of numbers
The goal of a graph is to
Summarize information from a set of data into a picture that is
easy to u
Twitter is a social networking app that anyone can use for almost anything.
Twitter was founded in 2006, which is not that long ago, but it has caused an uproar ever since.
More and more people join Twitter and are activ
Important question facing researchers: what data should I collect?
The text gives the example of a study attempting to address
the question: Are people with larger brains more intelligent?
What to do?
How do you measure brain size?
How do you measur
Look for key phrases
No one has more/better/lower
One of the best
Compared to a leading brand
If the presenter stands to gain, expect bias
Suppose that four different university research teams compare
products A and B
In one study B has a slight
A measurement is valid if it is a relevant representation of the property
Example: Number of alcohol-related deaths in BC since the
proliferation of private liquor stores
Makes no sense: a raw count ignores population growth
Use a rate instead (count/population)
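To see why a rate is the more valid measure here, a minimal sketch in Python follows. The numbers are invented purely for illustration (they are not real BC figures), and the helper name `rate_per_100k` is hypothetical.

```python
def rate_per_100k(count, population):
    """Return the number of events per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population

# Hypothetical numbers, made up for illustration only (not real data).
before = {"deaths": 400, "population": 4_000_000}
after = {"deaths": 440, "population": 4_600_000}

rate_before = rate_per_100k(before["deaths"], before["population"])
rate_after = rate_per_100k(after["deaths"], after["population"])

# The raw count went up, but the rate per 100,000 people went down.
print(f"count: {before['deaths']} -> {after['deaths']}")
print(f"rate per 100,000: {rate_before:.2f} -> {rate_after:.2f}")
```

In this made-up case the count rises while the rate falls, so the two measures would lead to opposite conclusions.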
How would you decrease variability?
Get a better machine
Take several measurements and take the average
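The effect of averaging can be checked with a small simulation. This is only a sketch: it assumes measurement error is random, normally distributed noise around the true value, and the names and values (`TRUE_VALUE`, `NOISE_SD`, 10 readings per average) are chosen for illustration.

```python
import random
import statistics

def single_measurement(true_value, noise_sd, rng):
    """One noisy reading: the true value plus random measurement error."""
    return rng.gauss(true_value, noise_sd)

def averaged_measurement(true_value, noise_sd, n, rng):
    """Average of n independent noisy readings of the same quantity."""
    return statistics.mean(
        single_measurement(true_value, noise_sd, rng) for _ in range(n)
    )

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_VALUE = 100.0       # assumed true value of the quantity
NOISE_SD = 2.0           # assumed spread of the measuring machine
REPEATS = 2000           # how many times each procedure is repeated

singles = [single_measurement(TRUE_VALUE, NOISE_SD, rng) for _ in range(REPEATS)]
averages = [averaged_measurement(TRUE_VALUE, NOISE_SD, 10, rng) for _ in range(REPEATS)]

# Standard deviation across repeats measures how variable each procedure is.
spread_single = statistics.stdev(singles)
spread_average = statistics.stdev(averages)
print(f"spread of single readings:     {spread_single:.2f}")
print(f"spread of 10-reading averages: {spread_average:.2f}")
```

In this sketch, "get a better machine" corresponds to lowering `NOISE_SD`, while "take several measurements" corresponds to raising the number of readings per average.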
Basically, this chapter deals with thinking about what you are
Not everything you are told is true
Not everything you are told is even poss
Interested in something about a population
Population is a collection of individuals
Describe individuals with data
Data sets contain information/facts relating to individuals
Variables are attributes of an individual (e.g., hair color, pain severity,
Two types of variables:
Categorical Variables: each individual falls into a category
(ethnicity, machine works or does not, )
A special type of categorical data is ordered categorical
Categories are ordered in a natural way
Can apply ideas of >
A USA Today (Jan. 4, 2000) poll asked Americans who earn $35,000
or less how they expected to accumulate a $500,000 retirement nest egg.
Variable values are the category labels
Each category must appear on the plot
Percentage of area of pie covered by pie
Last day, looked at a variety of plots
For categorical variables, most useful plots were bar charts and pie
Looked at time plots for quantitative variables
Key thing is to be able to quickly make a point using graphical
For categorical data of any kind, we can summarize the distribution
Count the number of times each value is observed
Count is often called a frequency
Often compute percentages from the counts
Identify all of the values the variable can take
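The counting steps above can be sketched in Python as follows. The survey responses are invented for illustration, and `frequency_table` is a hypothetical helper name, not something from the text.

```python
from collections import Counter

def frequency_table(values):
    """Map each observed category to (count, percentage of all observations)."""
    counts = Counter(values)          # frequency of each category
    total = sum(counts.values())      # total number of observations
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}

# Hypothetical categorical data, made up for illustration only.
responses = ["agree", "disagree", "agree", "neutral",
             "agree", "disagree", "agree", "neutral"]

table = frequency_table(responses)
for category, (count, percent) in sorted(table.items()):
    print(f"{category:10s} {count:3d} {percent:6.1f}%")
```

The counts always sum to the number of observations and the percentages to 100, which is a quick check that no category was missed.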