STATS PRELIM 1 NOTES

Statistics - A way of reasoning, along with a collection of tools and methods, designed to help us understand the world.

Categorical Variable - A variable that names categories (whether with words or numerals).

Quantitative Variable - A variable in which the numbers act as numerical values. Quantitative variables ALWAYS have units.

HISTOGRAMS don't display categorical data; BAR CHARTS don't display quantitative data.

1. Adding (or subtracting) the same number to each data value in a variable shifts all measures of center (mean, median, midrange) by the amount added (or subtracted).
2. Adding (or subtracting) the same number to each data value does not change measures of spread (SD, IQR, range).
3. IF the distribution is SYMMETRIC, then MEAN = MEDIAN.

Range - The difference between the lowest and highest values: Range = Max - Min.

Resistant - A calculated summary is said to be resistant if it is affected only a limited amount by outliers (for example, the median and IQR are resistant; the mean and SD are not).

The Normal model is a special, unimodal, and symmetric probability model. It is characterized by its mean and standard deviation.
- About 68% of the values fall within 1 standard deviation of the mean.
- About 95% of the values fall within 2 standard deviations of the mean.
- About 99.7% (almost all) of the values fall within 3 standard deviations of the mean.
- A.K.A. the 68-95-99.7 Rule.

Probability models are a concise way to describe the overall pattern of a distribution.
- The curve of any model lies on or above the HORIZONTAL axis and has a total AREA OF ONE beneath it.
- The median has half the area of the probability model on either side.
- The mode is the peak of the model.
- The mean is the balance point.

Standardizing Data - Standardizing uses the standard deviation as a ruler to measure distance from the mean, creating z-scores. We standardize to eliminate units.
- Using z-scores, we can compare apples and oranges: values from different distributions or values based on different units.
- Z-scores can identify unusual or surprising values among data.
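The shifting rules and the idea of standardizing can be checked on a small example. The sketch below is only an illustration: the data values, the function names `shift` and `z_scores`, and the choice of the sample standard deviation are all assumptions made here, not part of the notes.

```python
import statistics

def shift(values, amount):
    """Add the same amount to every data value (hypothetical helper)."""
    return [v + amount for v in values]

def z_scores(values):
    """Standardize each value: (value - mean) / SD, using the sample SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Made-up data for illustration only.
heights_cm = [150, 160, 170, 180, 190]
shifted = shift(heights_cm, 5)

# Measures of center move by the amount added.
print("mean:  ", statistics.mean(heights_cm), "->", statistics.mean(shifted))
print("median:", statistics.median(heights_cm), "->", statistics.median(shifted))

# Measures of spread do not change.
print("SD:    ", statistics.stdev(heights_cm), "->", statistics.stdev(shifted))
print("range: ", max(heights_cm) - min(heights_cm), "->", max(shifted) - min(shifted))

# z-scores carry no units, so they can be compared across variables.
print("z-scores:", z_scores(heights_cm))
```

Because the data here are symmetric, the mean and median also agree, which matches rule 3.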
Z-Score - Tells how many standard deviations a value is from the mean: z = (value - mean) / SD. Z-scores have a mean of zero and a standard deviation of one. A z-score between -1 and 1 is common, but a z-score of plus or minus 3 is rare. Anything more extreme calls for attention.

5-Number Summary - The extremes (min and max), the quartiles Q1 and Q3, and the median. ONCE we have the five-number summary, we can draw a boxplot.
- IQR = Q3 - Q1
- Upper fence = Q3 + 1.5(IQR)
- Lower fence = Q1 - 1.5(IQR)
- Range = Max - Min

When comparing or describing distributions of several groups, consider their:
- SHAPE (Modes: Unimodal, Bimodal, Multimodal; no mode = Uniform. Symmetric or Skewed.)
- CENTER (Mean or Median)
- SPREAD (IQR and SD)
- Always mention Outliers.

Association between two quantitative variables:
- Direction: A positive direction or association means that as one variable increases, so does the other. When increases in one variable generally correspond to decreases in the other, the association is negative.
- Form: The form we care about most is a straight line, but you should certainly describe other patterns you see in scatterplots.
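The 5-number summary and fence formulas in these notes can be sketched in code. This is a rough illustration under stated assumptions: the data are made up, the helper names are hypothetical, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`. Textbooks and calculators use slightly different quartile rules, so Q1 and Q3 may not match a hand calculation exactly.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the 'inclusive' quartile rule."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

def fences(q1, q3):
    """Return (lower fence, upper fence) from the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Made-up data for illustration only.
scores = [2, 4, 5, 7, 8, 9, 11, 12, 30]

low, q1, med, q3, high = five_number_summary(scores)
lower_fence, upper_fence = fences(q1, q3)

# Values beyond either fence are flagged as possible outliers.
outliers = [x for x in scores if x < lower_fence or x > upper_fence]

print("5-number summary:", (low, q1, med, q3, high))
print("IQR:", q3 - q1, " Range:", high - low)
print("fences:", (lower_fence, upper_fence))
print("possible outliers:", outliers)
```

Only values outside the fences get flagged; the fences themselves are cutoffs, not data values, so they do not have to appear in the data set.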

