Frequency table: class width = ((highest value) - (lowest value)) / (# of classes)

Scattergram: describe direction, form, strength, unusual features

Mode: the value that occurs most frequently, or the frequency class with the highest frequency. A data set may be unimodal, bimodal, multimodal, or have no mode.

Midrange: the value midway between the highest and lowest values in the original data set.
Midrange = (highest score + lowest score) / 2

Interquartile range: IQR = Q3 - Q1

Mean: ȳ = Σy / n = Total / n

Standard dev.: s = √(Σ(y - ȳ)² / (n - 1))

Five-number summary: max, Q3, median, Q1, min (shown with a boxplot)
On calc.: STAT, CALC, 1-Var Stats

Median and mode are not affected by outliers.

Measures of spread paired with measures of center:
- Midrange and Range
- Median and Interquartile Range
- Mean and Standard Deviation

Left-skewed: the tail stretches to the left; the majority of the data fall to the right of the mean and cluster at the upper end of the distribution.
Right-skewed: the tail stretches to the right; the majority of the data fall to the left of the mean and cluster at the lower end of the distribution.

Z-score (standardized value): z = (y - ȳ) / s
Standardizing changes the center to 0 and the spread to 1 (the standard deviation becomes 1); the shape of the distribution doesn't change.
Ordinary values: z-score between -2 and 2 (within 2 standard deviations of the mean).

68-95-99.7 Rule: in a Normal model, about 68% of values fall within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.

Find normal percentile range: 2nd DISTR normalcdf
If z has a standard normal distribution:
- P(a < z < b) = normalcdf(a, b)
- To find P(z < a), enter normalcdf(-5, a)
- To find P(z > a), enter normalcdf(a, 5)
(-5 and 5 stand in for -∞ and ∞; the area beyond 5 standard deviations is negligible.)

Correlation coefficient: r = Σ(z_x z_y) / (n - 1)
Must meet the Straight Enough Condition and the Outlier Condition (report r with and without the outliers).
Measures the strength of a linear association.
1. -1 ≤ r ≤ 1; the sign tells the direction of the association. Values of exactly ±1 are rare.
2. The value of r does not change if all values of either variable are converted to a different scale (like changing the unit of measure or using z-scores).
3. The value of r is not affected by the choice of x and y. Interchange x and y and the value of r will not change.
4. r measures the strength of a linear association.
5. r has no units.
6. The value of r is sensitive to outliers.
On calc.: STAT, CALC, LinReg(ax+b) (enter, enter)

r² gives the fraction of the data's variation accounted for by the model, and 1 - r² is the fraction of the original variation left in the residuals.

x is the independent variable (predictor variable)
y is the dependent variable (response variable); ŷ is the value the model predicts for y
ŷ = b_0 + b_1 x (b_0 = y-intercept, b_1 = slope)
Same line in other notation: ŷ = mx + b (algebra text) or ŷ = ax + b (calculator)
b_1 = r s_y / s_x
b_0 = ȳ - b_1 x̄

Residual: e = data - model = y - ŷ
Std. dev. of residuals: s_e = √(Σe² / (n - 2))
The residual plot should stretch horizontally with even scattering throughout: no bends, no outliers (boring is good).
Graph of z-scores: ẑ_y = r z_x

Sample Surveys
Sample data must be collected in an appropriate way, such as through a process of random selection. (ONLY THE SAMPLE SIZE MATTERS, not the size of the population.)
Sampling frame: the list of individuals from which the sample is drawn. It must match the population to avoid the risk of bias.
Parameter: a numerical measurement describing some characteristic of a population.
Statistic: a numerical measurement describing some characteristic of a sample.

Systematic Sampling
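
A minimal sketch of systematic sampling in Python, assuming the usual textbook definition (pick a random starting point, then take every k-th individual from the sampling frame). The function name systematic_sample, the seed parameter, and the frame of ID numbers 1 to 100 are made up for illustration and are not part of these notes.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Return n items from frame: every k-th item after a random start."""
    if n <= 0 or n > len(frame):
        raise ValueError("n must be between 1 and len(frame)")
    k = len(frame) // n            # sampling interval
    rng = random.Random(seed)      # seeded only so the run is repeatable
    start = rng.randrange(k)       # random start among the first k positions
    # Take every k-th item, then trim in case the frame size is not a multiple of n.
    return [frame[i] for i in range(start, len(frame), k)][:n]

frame = list(range(1, 101))        # hypothetical sampling frame: IDs 1..100
sample = systematic_sample(frame, 10, seed=1)
print(sample)
```

The start is chosen at random so that every individual in the frame has a chance of being selected; after that, the rest of the sample is fixed by the interval k.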

