Terms  Definitions 

Mean  resistant? 
Not Resistant
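A quick way to see why the mean is not resistant is to change one value into an extreme one and compare it with the median. The sketch below uses made-up numbers; the data and variable names are assumptions for illustration only.

```python
from statistics import mean, median

# Hypothetical data: five typical values.
data = [10, 12, 13, 15, 20]
# Same data, but the largest value is replaced by an extreme outlier.
with_outlier = data[:-1] + [200]

mean_before, mean_after = mean(data), mean(with_outlier)
median_before, median_after = median(data), median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)      # the mean is pulled toward the outlier
print("median:", median_before, "->", median_after)  # the median does not move
```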

Strata 
The population divided into subgroups.

Confounding 
Two variables (explanatory variables or lurking variables) are confounded when their effects on a response variable cannot be distinguished from each other.

statistic 
value calculated from data to summarize aspects of the data

residual 
the difference between the actual value of y and the predicted value of y for a specific x value

variable 
the characteristic of the individuals; can take on different values for different individuals; 2 types are categorical and quantitative (ex. in an experiment measuring hippo height differences, height is the ________)

Scatterplot 
A scatterplot shows the relationship between two quantitative variables measured on the same individuals. The values of one variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis. Each individual in the data appears as the point in the plot fixed by the values of both variables for that individual. The explanatory variable always goes on the x-axis: explanatory variable = x, response = y. If there is no explanatory-response relationship, either variable can go on the horizontal axis.

event 
any collection of outcomes from the sample space of a chance experiment

retrospective study 
an observational study in which subjects are selected and then their previous conditions or behaviors are determined

variance 
the standard deviation squared, it is a measure of spread
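As a rough illustration, the sample variance can be computed by hand and checked against Python's standard library. The data values are invented, and the sketch assumes the sample version of the formula (dividing by n - 1).

```python
from statistics import stdev, variance

# Hypothetical sample.
data = [2, 4, 4, 6, 9]

n = len(data)
x_bar = sum(data) / n
deviations = [x - x_bar for x in data]

# Sample variance: sum of squared deviations divided by n - 1.
s_squared = sum(d ** 2 for d in deviations) / (n - 1)

print(s_squared)            # variance computed by hand
print(variance(data))       # same quantity from the standard library
print(stdev(data) ** 2)     # the standard deviation squared gives the variance back
```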

correlation 
the numerical measure r of the strength and direction of the linear relationship between two quantitative variables
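One common way to compute r is to average the products of the standardized x and y values, dividing by n - 1. The sketch below assumes that formula; the data and the function name `correlation` are made up for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r as the sum of products of z-scores divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    products = [((a - x_bar) / s_x) * ((b - y_bar) / s_y) for a, b in zip(xs, ys)]
    return sum(products) / (n - 1)

# Hypothetical paired measurements on five individuals.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
print(round(r, 4))  # positive: y tends to increase as x increases
```

The value of r always lies between -1 and 1, and a variable correlated with itself gives r = 1.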

outlier 
an individual value that falls outside the overall pattern

stemplot 
used for small sets of quantitative data; stems arranged smallest on top, largest on bottom; don't skip stems; the last digit of each number becomes its leaf; list each stem only once; the result resembles a histogram turned on its side

Median 
Arrange all observations in order of size, from smallest to largest. If n is odd, M is the center observation. If n is even, M is the mean of the two center observations in the ordered list.
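The rule above translates almost line for line into code. This is a minimal sketch that assumes a non-empty list of numbers; the function name `median` is chosen here for illustration.

```python
def median(values):
    """Median by the rule: sort, then take the center (or the mean of the two centers)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n is odd: the single center observation.
        return ordered[mid]
    # n is even: the mean of the two center observations.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([5, 1, 3]))     # odd n
print(median([4, 1, 3, 2]))  # even n
```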

Convenience sampling 
Chooses the individuals easiest to reach.

Contingency Table 
Displays counts (%) of individuals falling into categories on two or more variables
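A two-way table of counts can be built by tallying each pair of categories. The sketch below uses invented survey records; the variable names and categories are assumptions, not part of the original set.

```python
from collections import Counter

# Hypothetical records: (class year, preferred study method) for six students.
records = [
    ("junior", "flashcards"), ("junior", "flashcards"), ("junior", "notes"),
    ("senior", "flashcards"), ("senior", "notes"), ("senior", "notes"),
]

counts = Counter(records)                  # tally each (row, column) pair
rows = sorted({r for r, _ in records})     # categories of the first variable
cols = sorted({c for _, c in records})     # categories of the second variable

# Nested dict: table[row][column] -> count (0 if the pair never occurs).
table = {r: {c: counts[(r, c)] for c in cols} for r in rows}

for r in rows:
    print(r, [table[r][c] for c in cols])
```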

Voluntary Response Sampling 
People choose for themselves whether to participate in response to a general invitation (for example, an online survey).

shifting 
adding a constant to each data value adds the same constant to the mean, the median, and the quartiles, but does not change the standard deviation or IQR
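The effect of shifting can be checked numerically. In this sketch the data and the constant 10 are made up; the comparison for the standard deviation uses a small tolerance because of floating-point arithmetic.

```python
from statistics import mean, median, stdev

# Hypothetical data and a shift of +10 applied to every value.
data = [3, 7, 8, 12, 20]
shift = 10
shifted = [x + shift for x in data]

print(mean(data), mean(shifted))      # center moves by the constant
print(median(data), median(shifted))  # center moves by the constant
print(stdev(data), stdev(shifted))    # spread is unchanged
```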

sampling variability 
the natural tendency of randomly drawn samples to differ

5 number summary 
includes the minimum, first quartile, median, third quartile, & the maximum
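Here is one way the five values might be computed. Quartile conventions differ between textbooks and software; this sketch assumes the common classroom rule that Q1 and Q3 are the medians of the lower and upper halves, leaving out the overall median when n is odd. The function name is hypothetical and the input needs at least two values.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # observations left of the median position
    upper = ordered[half + n % 2:]   # observations right of the median position
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

print(five_number_summary([7, 1, 3, 5, 2, 6, 4]))
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8]))
```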

y-intercept 
the value of the response variable when the explanatory variable is zero

Standard deviation 
The average distance that a typical data point would lie from the average of its distribution

normal distributions 
playing a large role in statistics, these are rather special in the sense of being average or natural. They are described by a special family of bell-shaped, symmetric density curves called Normal curves.

r^2 in Regression 
The coefficient of determination, r^2, is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x.
(SSM - SSE)/SSM 

68-95-99.7 Rule 
Says that in a Normal distribution about 68% of the data fall within 1 standard deviation of the mean, 95% fall within 2 standard deviations, and 99.7% fall within 3 standard deviations
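The three percentages are rounded areas under a Normal curve, which can be checked with the standard library. This sketch uses the standard Normal distribution (mean 0, standard deviation 1); the same proportions hold for any Normal distribution.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Proportion of the distribution within k standard deviations of the mean.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} SD: {p:.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```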

Null hypothesis 
The hypothesis that states there is no effect or no difference, for example no difference between two or more sets of data.

re-express data 
we do this by taking the logarithm, the square root, the reciprocal, or some other mathematical operation on all values in the data set
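As a small illustration, each re-expression is just a function applied to every value. The data below are invented and assumed to be positive, since the logarithm and reciprocal are not defined for zero.

```python
import math

# Hypothetical, strongly right-skewed data (each value 10 times the last).
data = [1, 10, 100, 1000]

logged = [math.log10(x) for x in data]      # logarithm
rooted = [math.sqrt(x) for x in data]       # square root
reciprocals = [1 / x for x in data]         # reciprocal

print(logged)  # the log turns equal ratios into equal steps
```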

single blind 
when the subjects in an experiment do not know if they are in the treatment or control group

histogram 
This breaks the range of values of a variable into classes and displays only the count or percent of the observations that fall into each class.
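The counting step behind a histogram can be sketched as follows. The scores, the class width of 10, and the function name `class_counts` are assumptions for illustration; each class includes its lower bound and excludes its upper bound.

```python
def class_counts(values, start, width, n_classes):
    """Count how many values fall into each equal-width class."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)   # which class this value belongs to
        if 0 <= index < n_classes:          # ignore values outside the classes
            counts[index] += 1
    return counts

# Hypothetical test scores, grouped into classes 50-59, 60-69, ..., 90-99.
scores = [52, 58, 61, 67, 69, 73, 75, 78, 84, 95]
counts = class_counts(scores, start=50, width=10, n_classes=5)

for i, c in enumerate(counts):
    low = 50 + 10 * i
    print(f"{low}-{low + 9}: {'*' * c}")    # one star per observation
```

Note that only the counts survive: the individual scores cannot be read back from the bars.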

statistically significant 
the results are too unusual to plausibly have occurred by chance alone

Extrapolation 
The use of a regression line or curve for prediction outside the domain of values of the explanatory variable x that you used to obtain the line or curve. Such predictions cannot be trusted.

standard deviation 
measure of spread; tells the average distance each value is away from the mean

changing center and spread 
doing this is equivalent to changing its units

shape, center, and spread 
the three descriptions of the overall pattern of a distribution

Degrees of freedom 
Only n - 1 of the squared deviations can vary freely.

Pilot 
Small trial run of a survey to see if questions are clear

density curve 
This is a curve that is always on or above the horizontal axis and has area exactly 1 underneath it. This describes the overall pattern of a distribution. The area under the curve and above any interval of values on the horizontal axis is the proportion of all observations that fall in that interval.

simple random sample 
a sample of size n chosen in such a way that each set of n elements in the population has an equal chance of selection
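In practice a simple random sample is usually drawn with a random number generator. The sketch below assumes a population labeled 1 to 100 and a sample size of 10; the seed is fixed only so the example can be repeated.

```python
import random

# Hypothetical population: individuals labeled 1 through 100.
population = list(range(1, 101))

rng = random.Random(42)              # seeded generator so the draw is repeatable
sample = rng.sample(population, k=10)  # draws without replacement

print(sorted(sample))
```

Because `sample` draws without replacement and treats every individual alike, every set of 10 labels is equally likely to be chosen.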

Slope of the Least-Squares Regression Line 
b = r (Sy/Sx)
This equation says that along the regression line, a change of one standard deviation in x corresponds to a change of r standard deviations in y. 
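The slope formula can be tried on a small data set, together with the residuals and r^2 defined elsewhere in this set. This is a sketch with invented data; the intercept name `a` and the fact that the line passes through the point of means are assumptions brought in from the usual least-squares setup, not from the card itself.

```python
from statistics import mean, stdev

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)

# Correlation r from the deviations and the two standard deviations.
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

b = r * s_y / s_x          # slope: b = r (Sy/Sx)
a = y_bar - b * x_bar      # intercept, assuming the line passes through (x_bar, y_bar)

predicted = [a + b * xi for xi in x]
residuals = [yi - y_hat for yi, y_hat in zip(y, predicted)]  # actual y minus predicted y

ssm = sum((yi - y_bar) ** 2 for yi in y)   # variation of y about its mean
sse = sum(e ** 2 for e in residuals)       # variation of y about the regression line
r_squared = (ssm - sse) / ssm

print(round(b, 3), round(a, 3), round(r_squared, 3))
```

For these numbers, (SSM - SSE)/SSM agrees with the square of r, and the residuals sum to zero up to rounding.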