Final Exam Study Guide

Chapter 1
o Definition of statistics
    Statistics is a collection of procedures and principles for gathering data and analyzing information to help people make decisions when faced with uncertainty.

Chapter 2
o Raw data – numbers and category labels that have been collected but have not yet been processed
o Population versus sample – We generally want to describe a population using statistics, but it is unrealistic to measure variables on every observational unit in the population. A subset of the population from which we can gather information is called a sample.
o Parameter versus statistic – Values from the entire population are called parameters. Values from the sample are called statistics.
o Variable types and roles
    Types of Variables
        Categorical – group or category names that don’t necessarily have any logical ordering
            Color of M&M’s
            Gender
            Stat 200 Section
        Ordinal – categorical variable where values or categories have a natural ordering
            Rate the roller coaster on a scale of 1-5 (1 is terrible and 5 is excellent)
            Age groups (child, teen, adult, senior citizen)
            Shirt sizes (S, M, L, XL)
        Quantitative – numerical values taken on each individual
            Height
            Temperature
            # of Red M&M’s
    Possible Roles Played by Variables:
        Response Variables – the variables of which we want to determine the outcome. These are the variables of main interest.
        Explanatory Variables – variables that partially explain the value of the response variable for the individual
o How to summarize categorical variables
    Numerical summaries:
        Frequency – count
        Relative frequency – the count in a category relative to the total count over all categories (%)
        Frequency distribution – a listing of all categories along with their frequencies
        Relative frequency distribution – a listing of all categories along with their relative frequencies
    Visual Summaries:
        Pie chart – use for one categorical variable with few categories
        Bar graph – use for one or two categorical variables
o How to summarize quantitative variables
    Summary Features:
        Distribution – the overall pattern of how often the possible values occur, defined by the location, spread and shape of the data.
        Location – the center or average value, defined by the median or mean
        Spread – the variability defined by the range, interquartile range, standard deviation or variance
        Shape –
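
The numerical summaries listed above (frequency, relative frequency, mean, median, range, interquartile range, variance, standard deviation) can be sketched in Python using only the standard library. This is a minimal illustration, not a course-prescribed method: the function names and the sample data are hypothetical, the variance and standard deviation are assumed to be the sample versions (dividing by n - 1), and the quartiles use the default convention of statistics.quantiles, which may differ slightly from the rule taught in class.

```python
from collections import Counter
import statistics


def frequency_distribution(values):
    """List every category with its frequency (count)."""
    return dict(Counter(values))


def relative_frequency_distribution(values):
    """List every category with its count divided by the total count."""
    total = len(values)
    return {category: count / total
            for category, count in Counter(values).items()}


def quantitative_summary(values):
    """Location (mean, median) and spread (range, IQR, variance, stdev)."""
    # Quartile conventions vary; this uses the library's default method.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        # Assumption: sample variance and standard deviation (n - 1).
        "variance": statistics.variance(values),
        "stdev": statistics.stdev(values),
    }


# Hypothetical data, made up for illustration only.
colors = ["red", "blue", "red", "green", "red", "blue"]   # categorical
counts = [2, 4, 4, 5, 7, 8]                               # quantitative

print(frequency_distribution(colors))           # red: 3, blue: 2, green: 1
print(relative_frequency_distribution(colors))  # red: 0.5, the rest smaller
print(quantitative_summary(counts))
```

Multiplying a relative frequency by 100 gives the percentage form mentioned in the guide.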

