Lecture 4

1/22/08 Lecture 4 1 STOR 155 Introductory Statistics Lecture 4: Displaying Distributions with Numbers The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL

Exploratory Data Analysis (EDA)

Graphical Visualization: Shape
- Bar graph
- Pie chart
- Stemplot
- Histogram
- Time plot (not for the distribution, but for the changing pattern across time)

Numerical Summary: Center and Spread
- Center: mean and median
- Spread: quartiles, five-number summary and boxplot; standard deviation

Choose one from each category.

What is the average highway (city) mileage? What is the middle value of highway (city) mileage?

Measuring center: the mean

Mean = average value

The sample mean $\bar{x}$: if the $n$ observations in a sample are $x_1, x_2, \ldots, x_n$, then their mean is

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

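
The formula for the sample mean can be sketched in a few lines of Python. The function name `sample_mean` and the mileage values are assumptions made for illustration; they are not data from the lecture.

```python
def sample_mean(observations):
    """Return the sample mean: the sum of the observations divided by n."""
    n = len(observations)
    if n == 0:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(observations) / n


# Hypothetical mileage values, chosen only to illustrate the arithmetic.
example = [20, 24, 25, 28, 33]
print(sample_mean(example))  # (20 + 24 + 25 + 28 + 33) / 5 = 26.0
```
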
Measuring center: the median
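
A minimal Python sketch of the median, assuming the usual textbook rule: sort the observations; if n is odd, take the middle one; if n is even, average the two middle ones. The function name `sample_median` and the values are hypothetical.

```python
def sample_median(observations):
    """Return the median: the middle value of the sorted observations."""
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: exactly one middle observation.
        return ordered[mid]
    # Even n: average the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical values, unsorted on purpose.
print(sample_median([33, 20, 25, 28, 24]))  # 25
print(sample_median([20, 24, 25, 28]))      # 24.5
```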

Example: Fuel economy (miles per gallon) for 2004 two-seater cars

Look at the highway mileage (without the Honda Insight):
- Mean
- Median

How about with the Honda Insight?
- Mean
- Median

What can you say?

Example: Salary Survey of UNC Graduates

Survey a certain number of graduates from UNC, covering many departments.

Question: Which department produces the students who earn the most, on average, ten years after they got their degrees?

Answer: Geography!? The reason is Michael Jordan, a single extreme earner who pulls that department's mean far upward.

Mean vs. Median

Mean:
- easy to calculate
- easy to work with algebraically
- highly affected by outliers
- not a resistant measure

Median:
- can be time consuming to calculate
- more resistant to a few extreme observations (sometimes outliers)
- robust
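
The difference in resistance can be seen with a small Python sketch that uses the standard library's `statistics` module. The salary figures are invented for illustration, in the spirit of the salary survey example; they are not real survey data.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars; not real survey data.
typical = [40, 45, 50, 55, 60]
# The same sample with one extreme earner added.
with_outlier = typical + [30000]

print("without outlier:", mean(typical), median(typical))
print("with outlier:   ", mean(with_outlier), median(with_outlier))
```

With these made-up numbers the median moves only from 50 to 52.5, while the mean jumps from 50 to more than 5000. A single extreme observation is enough to drag the mean, which is why the median is the more resistant measure of center.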
