Terms  Definitions 

Data
Systematically recorded information

Interquartile Range: resistant?
Resistant (outliers and extreme values have little effect on it)

Dotplot 
A graph in which each observation is plotted as a dot above its value on a horizontal number line scaled for the values of the variable.

treatment 
the process, intervention, or other controlled circumstance applied to randomly assigned experimental units

sample 
a representative subset of a population, examined in hope of learning about the population

Probability 
The systematic study of uncertainty; the probability of an event is estimated as the number of successes divided by the number of trials

Histogram 
Most common graph of distributions with one quantitative variable.

Sample space
Collection of all possible outcomes

standard deviation 
describes spread; the square root of the variance

bias 
any systematic failure of a sampling method to represent its population; common errors are voluntary response, undercoverage, nonresponse ____, and response ____

z-score
tells how many standard deviations a value is from the mean: z = (value − mean) / standard deviation; the z-scores of a data set have a mean of zero and a standard deviation of one
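
The z-score card can be checked numerically. The sketch below is only an illustration: the function name `z_scores` and the test scores are made up, and it assumes the sample standard deviation (dividing by n − 1) is the one intended.

```python
import statistics

def z_scores(values):
    """Return the z-score of each value: (x - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return [(x - mean) / sd for x in values]

scores = [62, 70, 75, 81, 92]  # made-up test scores
z = z_scores(scores)

# The standardized values should have mean 0 and standard deviation 1
# (up to floating-point rounding).
print(statistics.mean(z), statistics.stdev(z))
```

A value with a z-score of 2 lies two standard deviations above the mean, whatever the original units were.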

scatterplot 
a graphical display that shows the relationship between two quantitative variables

undercoverage 
type of bias that is problematic because some groups are not represented in the sample

spread 
one of three ways to describe distributions; described by range, quartiles, interquartile range, outliers, variance, standard deviation

Response variable 
Measures an outcome of a study.

Multistage Sample Design 
Restricting random selection by choosing the sample in stages. Selecting successively smaller groups within the population in stages. Each stage may employ an SRS, a stratified sample, or another type of sample.

R^2
Overall measure of how successful the regression is in linearly relating y to x

Categorical Variable 
Places an individual into one of several groups or categories.

experimental units 
individuals on whom an experiment is performed

stratified random sample 
population is divided into homogeneous groups (strata) and then a random sample is drawn from each group
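
A minimal sketch of the idea, assuming an SRS of the same size is drawn from every group. The helper `stratified_sample`, the group names, and the labels are hypothetical, and the fixed seed is there only to make the run repeatable.

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Draw a simple random sample of size per_stratum from each stratum."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical population already divided into two homogeneous groups.
strata = {
    "freshmen": [f"F{i}" for i in range(1, 11)],
    "seniors": [f"S{i}" for i in range(1, 9)],
}

rng = random.Random(0)  # fixed seed only for repeatability
sample = stratified_sample(strata, per_stratum=3, rng=rng)
print(sample)
```

Because each group is sampled separately, no group can be left out of the sample by chance.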

pie chart 
This must include all the categories that make up a whole. Use this only when you want to emphasize each category's relation to the whole. It is awkward to make by hand, but computers do it easily.

outlier 
in a graph of data, an individual observation that falls outside the overall pattern of the graph

SSE 
The sum of the squares of the deviations of the points about the regression line (sum of squared errors). SSE = ∑(y − ŷ)², where ŷ is the value predicted by the line.

Wording of questions
Confusing or leading questions can introduce strong bias, and even minor changes in wording can change a survey's outcome.

P(A ∪ B)
P(A) + P(B) − P(A ∩ B)
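
The rule can be verified by counting outcomes in a small sample space. This is a sketch under stated assumptions: one roll of a fair die, with events A and B chosen only for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                      # event: the roll is even
B = {4, 5, 6}                      # event: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

direct = prob(A | B)                        # count the union directly
by_rule = prob(A) + prob(B) - prob(A & B)   # addition rule
print(direct, by_rule)
```

Subtracting P(A ∩ B) removes the outcomes 4 and 6, which would otherwise be counted twice.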

influential point 
when omitting a point from the data results in a very different regression model, the point is an ____

advantage of stemplot 
retains the actual data values from the data set

linear transformation 
a change in the measurement unit; it changes the original variable x into the new variable xnew given by an equation of the form xnew = a + bx
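
As an illustration, converting Celsius to Fahrenheit is a change of measurement unit. The sketch assumes the usual linear form xnew = a + b·x with a = 32 and b = 1.8, and the temperatures are made up.

```python
import statistics

celsius = [10.0, 15.0, 20.0, 25.0, 30.0]  # made-up temperatures
a, b = 32.0, 1.8                          # Fahrenheit = 32 + 1.8 * Celsius

fahrenheit = [a + b * x for x in celsius]

# The mean is shifted and rescaled; the standard deviation is only rescaled.
mean_c, sd_c = statistics.mean(celsius), statistics.stdev(celsius)
mean_f, sd_f = statistics.mean(fahrenheit), statistics.stdev(fahrenheit)
print(mean_f, a + b * mean_c)
print(sd_f, abs(b) * sd_c)
```

Adding the constant a moves the center but leaves the spread alone; multiplying by b changes both.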

quantitative variable 
one of two types of variables; takes numerical values; anything that can be manipulated with arithmetic; displayed with dotplots and histograms (ex. numerical test scores) (ex. weight in lbs)

Quartiles 
With the observations in increasing order and M the overall median: Q1 = the median of the observations to the left of M. Q3 = the median of the observations to the right of M.
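
A short sketch of this rule, with the interquartile range computed as Q3 − Q1. The function name `quartiles` and the data are made up, and other software may use a different quartile convention and give slightly different values.

```python
import statistics

def quartiles(values):
    """Return (Q1, M, Q3) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]          # observations to the left of M
    upper = data[half + n % 2:]  # observations to the right of M (M itself excluded when n is odd)
    return (statistics.median(lower),
            statistics.median(data),
            statistics.median(upper))

q1, m, q3 = quartiles([3, 7, 8, 5, 12, 14, 21, 13, 18])
iqr = q3 - q1
print(q1, m, q3, iqr)
```

With these nine values the ordered list is 3, 5, 7, 8, 12, 13, 14, 18, 21, so M = 12 and the two halves have four observations each.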

Ladder of Powers
Places in order the effects that common re-expressions (such as squaring, square root, logarithm, and reciprocal) have on the data

Disjoint (mutually exclusive) 
two events are disjoint if they share NO outcomes in common

Variance s² 
s² = [(x₁ − x̄)² + (x₂ − x̄)² + ... + (xₙ − x̄)²] / (n − 1), or equivalently s² = [1/(n − 1)] ∑(xᵢ − x̄)²
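
The formula translates directly into code. In this sketch the function name `sample_variance` and the data are made up; the result is compared with Python's `statistics.variance`, which also divides by n − 1.

```python
import math
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations
s2 = sample_variance(data)
s = math.sqrt(s2)                # standard deviation = square root of variance

print(s2, statistics.variance(data))
print(s, statistics.stdev(data))
```

Here the mean is 5 and the squared deviations sum to 32, so s² = 32 / 7.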

Voluntary Response Bias
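Bias that arises when individuals choose for themselves whether to respond, so people with strong opinions are overrepresented (bias due to responses not received is nonresponse bias)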

least squares regression line 
also known as the regression line or line of best fit; it is the line that minimizes the sum of the squares of the vertical distances from the actual points to the line
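
A minimal sketch of fitting such a line, using the standard closed-form slope and intercept for one explanatory variable. The function name `least_squares` and the data are made up. The sketch also computes SSE (the sum of squared vertical distances to the line) and R^2, taken here as 1 − SSE/SST, where SST is the sum of squared deviations of y from its mean.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

xs = [1, 2, 3, 4, 5]  # made-up explanatory values
ys = [2, 4, 5, 4, 5]  # made-up responses

a, b = least_squares(xs, ys)
predicted = [a + b * x for x in xs]

# SSE: squared vertical distances from the points to the fitted line.
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predicted))

# SST: squared deviations of y about its own mean.
y_bar = sum(ys) / len(ys)
sst = sum((y - y_bar) ** 2 for y in ys)

r_squared = 1 - sse / sst
print(a, b, sse, r_squared)
```

Any other line through these points gives a larger SSE than the fitted one.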

Margin of Error
tells you the "give or take" of a confidence interval: the interval extends this far on either side of the estimate

simple random sample (SRS) 
a sample of size n chosen so that every group of n individuals in the population is equally likely to be the sample
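
In code, an SRS can be sketched with Python's `random.sample`, which draws without replacement. The population of 20 numbered individuals and the sample size are hypothetical, and the seed is fixed only so the run is repeatable.

```python
import random

population = list(range(1, 21))  # hypothetical population: individuals labeled 1..20
rng = random.Random(42)          # fixed seed only for repeatability

# Draw 5 distinct individuals; every group of 5 is equally likely to be chosen.
srs = rng.sample(population, k=5)
print(sorted(srs))
```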

