Important Terms
• Population – Total set of subjects in which we are interested
• Sample – A subset of the population for which we have data
• Subject – Entities we measure (individuals)

Histogram Interpretation
• How many total students sampled? 60 + 82 + 60 + 41 = 243
• Which class has the highest / lowest frequency? What are those frequencies?
  – Highest: “100-109” with 82
  – Lowest: “120-129” with 41
• How many students have an IQ between 110 and 129? 60 + 41 = 101

Stem-and-Leaf Plot
• A bar chart on its side
• “Stem” is all digits except the last one
• Last digit is the “leaf”
• Ascending order
• No commas
• If nothing in a row, write the row, but leave it blank

Example (HW 2.1-2.2)
eBay selling prices
199 210 210 223 225 225 225 228 232 235

Sampling Methods
• Simple Random Sampling
  – Each subject in the population has an equal chance of being selected
  – Often done with a random number table
  – Example: choosing a company somewhere in the U.S.
• Systematic
  – Selecting every kth subject
  – Example: surveying every 10th person we meet downtown
• Convenience
  – Individuals are easily found (e.g. internet surveys)
  – Often the “laziest” way, so less reliable answers
• Stratified Sampling – Taking some subjects from all possible groups
• Cluster Sampling – Taking all subjects from some possible groups

Skewness

Outliers
• The mean is sensitive to outliers.
• The median is resistant to outliers.
• When outliers are present, it is best to use the median as the measure of central tendency.
• Example: average selling price of homes in the U.S.

Standard Deviation
• Roughly, the typical distance between a data point and the mean of the data.
• Measures how much or how little the data distribution is spread out.

Summary Stats Interpretation
• Mean – Average of the data set
• Median (also called Q2) – About 50% of the data lie below (and above) this value.
• Range – Difference between maximum and minimum
• Max & Min – Highest and lowest points in data set
• Q1 and Q3 – 25th and 75th percentiles
• Interquartile Range (IQR) – Difference between Q3 and Q1

Boxplot (HW 2.5-2.6)
Distribution of taxes (in cents)
• Minimum = 2.6
• Q1 = 31
• Median = 55
• Q3 = 105
• Maximum = 206
• Construct a boxplot for this data.
• What proportion of states have taxes…
  – Greater than 31 cents?
  – Greater than $1.05 (105 cents)?
• Find the range and the interquartile range (IQR).

Boxplot (HW 2.5-2.6)
• Greater than 31 cents: .75
• Greater than $1.05: .25
• Range = max - min = 206 - 2.6 = 203.4
• IQR = Q3 - Q1 = 105 - 31 = 74
• IQR = range for the middle half of the data.

Boxplot Outlier Test (HW 2.5-2.6)
• Any point lying above Q3 + 1.5 x IQR is an outlier....
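
The boxplot arithmetic above can be checked with a short sketch. It uses only the five-number summary quoted in the notes. The variable names are illustrative, and the lower fence (Q1 - 1.5 x IQR) is an assumption: it is the usual companion to the upper-fence rule, but it does not appear in the visible part of the notes.

```python
# Five-number summary as quoted in the notes (taxes in cents).
minimum, q1, median, q3, maximum = 2.6, 31, 55, 105, 206

data_range = maximum - minimum  # spread of the whole data set
iqr = q3 - q1                   # spread of the middle half of the data

# Upper fence from the notes: Q3 + 1.5 x IQR.
upper_fence = q3 + 1.5 * iqr
# Lower fence: assumed companion rule, not shown in the notes.
lower_fence = q1 - 1.5 * iqr

# With only the summary available, we can test just the extremes.
has_high_outlier = maximum > upper_fence
has_low_outlier = minimum < lower_fence

print(f"range = {data_range:.1f}, IQR = {iqr}")
print(f"fences = ({lower_fence}, {upper_fence})")
print(f"high outlier: {has_high_outlier}, low outlier: {has_low_outlier}")
```

Because only the summary values are given, this can test the minimum and maximum against the fences but cannot list individual outlying points. Under these numbers the maximum (206) sits below the upper fence, so the test flags no high outliers.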
This note was uploaded on 10/18/2011 for the course STAT 2000 taught by Professor Smith during the Spring '08 term at UGA.