Data
Systematically recorded information
Interquartile Range - resistant?
Horizontal line representing a variable and a number scale imposed for the values of the variable.
the process, intervention, or other controlled circumstance applied to randomly assigned experimental units
a representative subset of a population, examined in hope of learning about the population
The systematic study of uncertainty; for an event, estimated as the number of successes divided by the number of trials
Most common graph of distributions with one quantitative variable.
Sample space
Collection of all possible outcomes
standard deviation
describes spread. Square root of the variance
any systematic failure of a sampling method to represent its population; common errors are voluntary response, undercoverage, nonresponse ____, and response ____
tells how many standard deviations a value is from the mean; standardized values have a mean of zero and a standard deviation of one
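As a minimal sketch of standardizing, the snippet below converts each value to the number of standard deviations it lies from the mean. The function name `z_scores` and the test scores are hypothetical, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize each value: (x - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(x - m) / s for x in values]

scores = [70, 80, 90]  # hypothetical test scores
z = z_scores(scores)

# The standardized values themselves have mean 0 and standard deviation 1.
print(z, mean(z), stdev(z))
```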
a graphical display that shows the relationship between two quantitative variables
type of bias that is problematic because some groups are not represented in the sample
one of three ways to describe distributions; described by range, quartiles, interquartile range, outliers, variance, standard deviation
Response variable
Measures an outcome of a study.
Multistage Sample Design
Restricting random selection by choosing the sample in stages. Selecting successively smaller groups within the population in stages. Each stage may employ an SRS, a stratified sample, or another type of sample.
R^2
Overall measure of how successful the regression is in linearly relating y to x
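One way to see this measure in action: for a regression with a single explanatory variable, R^2 equals the square of the correlation between x and y. The sketch below computes it from sums of squares; the function name `r_squared` and the data are made up for illustration.

```python
from statistics import mean

def r_squared(xs, ys):
    """Squared correlation between x and y (R^2 for a one-variable linear regression)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

perfect = r_squared([1, 2, 3, 4], [2, 4, 6, 8])  # points fall exactly on a line
weak = r_squared([1, 2, 3], [1, 3, 2])           # a looser linear relationship
print(perfect, weak)
```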
Categorical Variable
Places an individual into one of several groups or categories.
experimental units
individuals on whom an experiment is performed
stratified random sample
population is divided into homogeneous groups (strata) and then a random sample is drawn from each group
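A minimal sketch of this design, assuming the strata are already known and the same number of individuals is drawn from each one. The function name `stratified_sample`, the strata labels, and the ID numbers are hypothetical.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of size per_stratum from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical population: student ID numbers grouped by class year.
strata = {"freshmen": list(range(0, 50)), "seniors": list(range(50, 80))}
sample = stratified_sample(strata, per_stratum=5, seed=1)
print(sample)
```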
pie chart
This must include all the categories that make up a whole. Use this only when you want to emphasize each category's relation to the whole. It is awkward to make by hand, but computers do it easily.
in a graph of data, an individual observation that falls outside the overall pattern of the graph
The sum of the squares of the deviations of the points about this regression line.
Sum of squares of error.
SSE = ∑(y - ŷ)^2, where ŷ is the value predicted by the regression line
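As a small worked example, the sketch below sums the squared vertical deviations of the points from a given line. The function name `sse`, the data, and the line's intercept and slope are assumptions chosen for illustration.

```python
def sse(xs, ys, intercept, slope):
    """Sum of squared deviations of observed y from the line's predicted y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3]
ys = [2, 4, 7]
# Hypothetical line: predicted y = 0 + 2x, so the residuals are 0, 0, and 1.
total = sse(xs, ys, intercept=0.0, slope=2.0)
print(total)
```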
Wording of questions
Confusing or leading questions can introduce strong bias, and even minor changes in wording can change a survey's outcome.
P(A ∪ B)
P(A) + P(B) - P(A ∩ B)
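The rule can be checked on a small sample space. This sketch assumes one roll of a fair six-sided die with equally likely outcomes; the events A (even) and B (greater than 3) are chosen only for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # all possible outcomes of one fair die roll

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # roll is even
B = {4, 5, 6}  # roll is greater than 3

lhs = prob(A | B)                      # P(A or B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)  # addition rule
print(lhs, rhs)
```

Subtracting the overlap keeps the outcomes 4 and 6 from being counted twice.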
influential point
when omitting a point from the data results in a very different regression model, the point is an ____
advantage of stemplot
retains the actual data values from the data set
linear transformation
a change in the measurement unit; it changes the original variable x into the new variable x-new given by an equation of the form x-new = a + bx
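To illustrate, the sketch below applies a transformation of the form x-new = a + bx to convert Celsius to Fahrenheit (a = 32, b = 1.8) and compares the summary statistics before and after. The temperature data are made up.

```python
from statistics import mean, stdev

def linear_transform(values, a, b):
    """Apply x_new = a + b * x to every value."""
    return [a + b * x for x in values]

celsius = [0, 10, 20]  # hypothetical temperatures
fahrenheit = linear_transform(celsius, a=32, b=1.8)

# The mean is shifted and scaled; the standard deviation is only scaled by |b|.
print(mean(celsius), stdev(celsius))
print(mean(fahrenheit), stdev(fahrenheit))
```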
quantitative variable
one of two types of variables; takes numerical values; anything that can be manipulated with arithmetic; displayed with dotplots and histograms (ex. numerical test scores) (ex. weight in lbs)
Q1 = the median of the observations to the left of M. Q3 = the median of the observations to the right of M.
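A minimal sketch of this rule, assuming that when the number of observations is odd the median M is left out of both halves. The function name `quartiles` and the data are hypothetical; other software may use slightly different quartile rules.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, M, Q3) using the median-of-each-half rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                # observations left of M
    upper = data[half + 1:] if n % 2 else data[half:]  # observations right of M
    return median(lower), median(data), median(upper)

q1, m, q3 = quartiles([1, 3, 5, 7, 9, 11, 13])
iqr = q3 - q1  # interquartile range
print(q1, m, q3, iqr)
```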
Ladder of Powers
Puts the possible re-expressions of data (such as square, square root, logarithm, reciprocal) in order, so that we can choose one systematically
Disjoint (mutually exclusive)
two events are disjoint if they share NO outcomes in common
Variance s²
s² = [(x₁ - x̄)² + (x₂ - x̄)² + ... + (xₙ - x̄)²] / (n - 1), or equivalently s² = [1/(n - 1)] ∑(xᵢ - x̄)²
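The formula can be followed step by step in a short sketch. The function name `sample_variance` and the data are hypothetical; the result is compared with `statistics.variance` from the Python standard library, which also divides by n - 1.

```python
from math import sqrt
from statistics import mean, variance

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = mean(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

data = [2, 4, 4, 6]            # mean is 4, squared deviations sum to 8
s2 = sample_variance(data)     # 8 / 3
s = sqrt(s2)                   # standard deviation is the square root of the variance
print(s2, variance(data), s)
```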
Voluntary Response Bias
Bias introduced when individuals choose for themselves whether to respond; people with strong opinions tend to be overrepresented
least squares regression line
also known as the regression line or line of best fit; it is the line that minimizes the sum of the squares of the vertical distances from the actual points to the line
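As a sketch of how such a line can be found for one explanatory variable, the snippet below uses the standard closed-form solution: the slope is the sum of cross-deviations divided by the sum of squared x-deviations, and the line passes through the point of means. The function name `least_squares_line` and the data are hypothetical.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line minimizing squared vertical distances."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
    return intercept, slope

intercept, slope = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(intercept, slope)
```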
Margin of Error
tells you the "give or take" from a confidence interval
simple random sample (SRS)
in a sample of size n, every group of n is equally likely to be chosen
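A minimal sketch of drawing an SRS, using `random.sample` from the Python standard library, which selects without replacement so that every group of n individuals is equally likely. The function name `simple_random_sample` and the population of ID numbers are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Choose n distinct individuals; every group of n is equally likely."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    return rng.sample(population, n)

population = list(range(1, 101))  # hypothetical ID numbers 1..100
srs = simple_random_sample(population, n=10, seed=42)
print(srs)
```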
