# Exam 1 Note Sheet


## Statistics

- Statistics: the science of converting data into knowledge; "learning in the presence of variation."
- Population: any well-defined collection of things
- Sample: any subset of a population
- Types of statistics
  - Descriptive statistics: summarizing data either graphically or numerically
  - Inferential statistics: using a sample to make statements (infer) about the properties of a population
    - Estimation, testing, fitting models
- Types of data
  - Numerical: data that consists of numbers
    - Continuous: any value in a specified range is possible (measurements)
    - Discrete: only certain specific values are possible (counts)
  - Categorical: data that is not numbers
    - Nominal: no natural order
    - Ordinal: a natural order exists, but no associated numbers

## Descriptive Statistics

### Histograms

- Purpose: helps us share data
- Frequency histogram
  - Bar height = number of observations in each bin (absolute frequency)
  - Sum of bar heights = sample size
- Relative frequency histogram
  - Bar height (proportion, percentage, relative frequency) = frequency / sample size
  - Example: bin 1 height = 3/20 = 0.15
  - Sum of bar heights = 20/20 = 1
- Density histogram
  - Bar height (density) = relative frequency / bin width
  - Bar area = relative frequency
  - Example: bin 1 height = 0.15/100 = 0.0015
  - Sum of bar areas = 1
- Larger SD on graph: median ~ mean

### Numerical Summary Measures

- Location (central tendency, center, average)
  - Median: rank-based central value (½ of the data smaller, ½ of the data larger)
  - Quartiles
    - Even sample size: split the data at the median, then find the median of each of the two halves
    - Odd sample size: take the median as the center value and include that value in both halves
  - Percentiles vs. quartiles
    - Taller than 80% = 80th percentile
    - Q1 = 25th percentile, Q2 = 50th percentile
  - Percentile vs. critical values
    - The kth percentile has k% of the data below it and (100 - k)% above it
  - Mean: arithmetic average
    - Notation: population mean (μ), sample mean (ȳ), other cases (E(Y))

| Mean | Median |
|---|---|
| Arithmetic average | Rank-based central value |
| Computed without sorting | Computed with sorting |
| Influenced by extreme values (outliers) | Influenced little by extreme values (outliers) |
| Sum divided by the number of values in the data set | Middle number of the data set, where half the numbers are below and half above |

- Variability (dispersion, scatter, spread)
  - Range: largest value - smallest value
  - IQR (interquartile range): 3rd quartile - 1st quartile
  - Standard deviation (σ): square root of the average of the squared deviations of each observation from the mean of the data set
  - Variance (σ²): the expectation of the squared deviation of a random variable from its mean; informally, it measures how far a set of numbers is spread out from its mean
  - Least affected by outliers: IQR, median, Q1. SD and mean are heavily affected.

## Probability

- Random Sample
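
As a quick check of the histogram bar-height formulas earlier in the notes, the sketch below computes relative-frequency and density heights from bin counts. Only the first bin count (3), the sample size (20), and the bin width (100) come from the notes' example; the remaining bin counts and the function name `histogram_heights` are hypothetical, chosen only so the counts add up to 20.

```python
import math


def histogram_heights(counts, bin_width):
    """Return (relative frequencies, densities) for equal-width bins."""
    n = sum(counts)                               # sample size = sum of frequency bar heights
    relative = [c / n for c in counts]            # relative frequency = frequency / sample size
    density = [r / bin_width for r in relative]   # density = relative frequency / bin width
    return relative, density


# Hypothetical bin counts: only the first bin (3) and the total (20) are from the notes.
counts = [3, 7, 6, 4]
bin_width = 100

relative, density = histogram_heights(counts, bin_width)
areas = [d * bin_width for d in density]          # bar area = density * bin width

print("relative frequencies:", [round(r, 4) for r in relative])
print("densities:", [round(d, 6) for d in density])
print("relative heights sum to 1:", math.isclose(sum(relative), 1.0))
print("bar areas sum to 1:", math.isclose(sum(areas), 1.0))
```

The comparisons use `math.isclose` rather than `==` because dividing floats can leave tiny rounding errors.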
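
The numerical summary measures from the notes can be tried out with Python's standard `statistics` module. This is a minimal sketch, not a definitive implementation: the data set is made up, and the helper `quartiles` is a hypothetical name that follows the note sheet's rule (for an odd sample size, the median is included in both halves). Other textbooks and software use slightly different quartile conventions.

```python
import statistics


def quartiles(data):
    """Return (Q1, median, Q3) using the note sheet's rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    if n % 2 == 0:
        lower, upper = xs[:half], xs[half:]        # even n: two equal halves
    else:
        lower, upper = xs[:half + 1], xs[half:]    # odd n: median goes in both halves
    return statistics.median(lower), statistics.median(xs), statistics.median(upper)


# Hypothetical data with one extreme value (40).
data = [2, 4, 4, 5, 7, 9, 10, 12, 40]
trimmed = [x for x in data if x != 40]             # same data without the outlier

q1, med, q3 = quartiles(data)
iqr = q3 - q1                                      # IQR = 3rd quartile - 1st quartile
data_range = max(data) - min(data)                 # range = largest - smallest

# The notes define sigma with the plain average of squared deviations,
# which matches the population versions pvariance / pstdev.
variance = statistics.pvariance(data)
sd = statistics.pstdev(data)

print("Q1, median, Q3:", q1, med, q3)
print("IQR:", iqr, "range:", data_range)
print("mean with / without outlier:", statistics.mean(data), statistics.mean(trimmed))
print("median with / without outlier:", med, statistics.median(trimmed))
print("population variance and SD:", round(variance, 3), round(sd, 3))
```

Dropping the single extreme value moves the mean much more than the median, which is the point of the "least affected by outliers" line in the notes. For a sample SD with an n - 1 divisor, `statistics.stdev` can be used instead of `statistics.pstdev`.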