Density Curves and Normal Distributions
Density Curves
The heights of the bars in a histogram show the number or percentage of
observations in each interval
Density curves instead show the proportion of data falling in each interval, so
that the total area of th

Describing Distributions with Numbers
Towards greater precision
o A numerical way of summarizing data
Measures of center are useful for comparing distributions, and measures of
spread allow us to compare their variability
The Mean
Mean of x = sum of the xs / number of observations
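
As a minimal sketch of the mean described above, the following Python function divides the sum of the observations by how many there are. The function name and the data values are made up for illustration and do not come from the notes.

```python
def mean(xs):
    """Return the arithmetic mean: the sum of the observations divided by their count."""
    if not xs:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(xs) / len(xs)


# Made-up illustrative data (not from the notes).
scores = [2, 4, 4, 6, 9]
print(mean(scores))  # 25 / 5 = 5.0
```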

Least Squares Regression
Summary of linear relationship between X and Y represented by the equation y =
a + bx
o a (y-intercept): y-value when x = 0
o b (slope): change in y when x increases by 1
If positive, y increases as x increases
If negative, y decreases as x increases
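
The intercept a and slope b above can be estimated from data. The sketch below uses the standard least-squares formulas (b is the sum of cross-deviations divided by the sum of squared x-deviations, and a = mean of y minus b times mean of x). The notes do not state these formulas, so treat them as an assumption about the method intended; the function name and data are hypothetical.

```python
def least_squares(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return (a, b)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = sxy / sxx          # slope: change in y when x increases by 1
    a = y_bar - b * x_bar  # intercept: predicted y when x = 0
    return a, b


# Made-up data lying exactly on the line y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = least_squares(xs, ys)
print(a, b)  # 1.0 2.0
```

Because b is positive in this example, y increases as x increases.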

Scatterplots and Correlations
To speak of a relationship between two variables usually means that one is
thought to influence the other
In the case of education and income, for example, we say that income is the
response or dependent variable and educati

Statistical Issues in Data Production
The validity of inference from data depends on the research design that produced
them:
In observational studies, such as social surveys, causality is ambiguous due to the
possibility of lurking variables
In experimen

Sampling Distributions
A sampling distribution summarizes the results based on drawing repeated
samples of the same size from a larger population
o Population Statistics: use mean (μ) and standard deviation (σ)
o Sample Statistics: use mean (x bar) and stan
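
The idea of drawing repeated samples of the same size can be illustrated with a small simulation. This is a sketch under stated assumptions: the population (the integers 1 to 100), the sample size, the number of repetitions, and the function name are all made up, and sampling is done with replacement for simplicity. The sample means should cluster around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size.

```python
import random
import statistics


def simulate_sample_means(population, n, reps, seed=0):
    """Draw `reps` samples of size `n` (with replacement) and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(reps)]


# Hypothetical population: the integers 1..100.
population = list(range(1, 101))
mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

means = simulate_sample_means(population, n=25, reps=2000)

print("population mean:", mu)
print("mean of sample means:", statistics.fmean(means))
print("sigma / sqrt(n):", sigma / 25 ** 0.5)
print("SD of sample means:", statistics.pstdev(means))
```

The list `means` is an empirical approximation of the sampling distribution of the sample mean for samples of size 25.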

Contingency Tables
Two categorical variables can be arranged in a table in which each cell
represents the number of observations in each combination of categories
o The last row and last column show the marginal distribution of each
variable
o The relatio
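
As a rough sketch of the table described above, the following Python code counts the observations in each combination of categories and then sums the cells to get the row and column totals that make up the marginal distributions. The category labels, data, and function name are invented for illustration.

```python
from collections import Counter


def contingency_table(pairs):
    """Return (cell counts, row totals, column totals) for (row, column) category pairs."""
    cells = Counter(pairs)
    row_totals = Counter()
    col_totals = Counter()
    for (row, col), count in cells.items():
        row_totals[row] += count  # marginal distribution of the row variable
        col_totals[col] += count  # marginal distribution of the column variable
    return cells, row_totals, col_totals


# Made-up observations: (group, answer) for six respondents.
observations = [
    ("A", "yes"), ("A", "no"), ("A", "yes"),
    ("B", "no"), ("B", "no"), ("B", "yes"),
]
cells, row_totals, col_totals = contingency_table(observations)
print(cells[("A", "yes")])  # 2
print(dict(row_totals))     # {'A': 3, 'B': 3}
print(dict(col_totals))     # {'yes': 3, 'no': 3}
```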

Review for First Exam
23 questions on Midterm #1
Displaying Distributions
Categorical variables
o Bar graphs
o Pie charts
Quantitative variables
o Histograms
o Stemplots (stem and leaf displays)
Shows the same visual representation as histogram just ti

Confidence Intervals
Though in practice we don't know the population characteristics, we know that
the sample means from repeated SRSs of the same size are approximately normally
distributed with mean = μ and SD = σ/√n
In reality only one sample is taken and used as an estimate of the populatio

Displaying Distributions
*Statistics is part science, part art form, and often highly interpretive.
Lying with Statistics
Possible to prove just about anything?
Playing on the ignorance of others
Given the raw data and ability to examine them graphically, care