test 5 review guide

Important Terms
• Population – Total set of subjects in which we are interested
• Sample – A subset of the population for which we have data
• Subject – Entities we measure (individuals)

Histogram Interpretation
How many total students were sampled?
60 + 82 + 60 + 41 = 243
Which class has the highest / lowest frequency? What are those frequencies?
Highest: “100-109” with 82
Lowest: “120-129” with 41
How many students have an IQ between 110 and 129?
60 + 41 = 101

Stem-And-Leaf Plot
• A bar chart on its side
• “Stem” is all digits except the last one
• Last digit is the “leaf”
• Leaves are written in ascending order
• No commas
• If nothing falls in a row, write the row, but leave it blank
Example (HW 2.1-2.2): eBay selling prices
199 210 210 223 225 225 225 228 232 235

Sampling Methods
Simple Random Sampling
– Each subject in the population has an equally likely chance of being selected
– Often done with a random number table
– Example: choosing a company somewhere in the U.S.
Systematic
– Selecting every “k-th” subject
– Example: surveying every 10th person we meet downtown
Convenience
– Individuals are easily found (e.g. internet surveys)
– Often the “laziest” way, so less reliable answers
Stratified Sampling
– Taking some subjects from all possible groups
Cluster Sampling
– Taking all subjects from some possible groups

Skewness
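Going back to the stem-and-leaf rules listed earlier in this guide, here is a minimal Python sketch that builds the plot for the eBay selling prices. The function name stem_and_leaf and the "stem | leaves" row layout are assumptions made for illustration, not part of the course material, and the sketch assumes non-negative whole-number data.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf plot as a list of strings.

    Hypothetical helper: assumes non-negative integers.
    """
    leaves = defaultdict(list)
    for v in sorted(values):            # leaves end up in ascending order
        stem, leaf = divmod(v, 10)      # stem = all digits but the last, leaf = last digit
        leaves[stem].append(leaf)

    rows = []
    # Visit every stem from smallest to largest so that a row with
    # no data is still written, just left blank.
    for stem in range(min(leaves), max(leaves) + 1):
        row_leaves = "".join(str(d) for d in leaves.get(stem, []))  # no commas
        rows.append(f"{stem} | {row_leaves}".rstrip())
    return rows


prices = [199, 210, 210, 223, 225, 225, 225, 228, 232, 235]
for row in stem_and_leaf(prices):
    print(row)
```

For these prices the stems run from 19 to 23, and the stem 20 row is printed with no leaves because no price falls between 200 and 209.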
Outliers
• The mean is sensitive to outliers.
• The median is resistant to outliers.
• When outliers are present, it is best to use the median as the measure of central tendency.
• Example: average selling price of homes in the U.S.

Standard Deviation (Quiz Question)
• Roughly the average distance between a data point and the mean of the data.
• Measures how much/little the data distribution is spread out.

Summary Stats Interpretation
Mean
– Average of the data set
Median (also called Q2)
– About 50% of the data lie below (and above) this value.
Range
– Difference between maximum and minimum
– Highest and lowest points in data set
Q1 and Q3
– 25th and 75th percentiles
Interquartile Range (IQR)
– Difference between Q3 and Q1

Box-Plot (HW 2.5-2.6)
Proportion of data greater than 31 cents: .75
Proportion of data greater than $1.05: .25
Range = max - min = 206 - 2.6 = 203.4
IQR = Q3 – Q1 = 105 – 31 = 74
IQR = range for the middle half of the data.

Box-Plot Outlier Test (HW 2.5-2.6)
Any point lying above Q3 + 1.5 x IQR is an outlier.
Any point lying below Q1 – 1.5 x IQR is also an outlier.
Are there any outliers on this box-plot?
Q1 - 1.5 x IQR = 256 - 1.5 x (1105 - 256) = -1017.5
Because there are no points beneath this cutoff, we have no lower outliers.
Q3 + 1.5 x IQR = 1105 + 1.5 x (1105 - 256) = 2378.5
Because the max is greater than this cutoff (320,000 > 2378.5), we have an upper outlier.
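The outlier test above is simple arithmetic, so it can be checked with a short Python sketch. The function names outlier_fences and is_outlier are hypothetical, chosen only for this illustration; the quartiles (256 and 1105) and the maximum (320,000) are the values from the worked example.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs of the 1.5 x IQR outlier test."""
    iqr = q3 - q1                       # interquartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_outlier(x, q1, q3):
    """True if x lies below the lower cutoff or above the upper cutoff."""
    lower, upper = outlier_fences(q1, q3)
    return x < lower or x > upper


# Values taken from the worked example: Q1 = 256, Q3 = 1105, max = 320,000
lower, upper = outlier_fences(256, 1105)
print(lower, upper)                     # -1017.5 2378.5
print(is_outlier(320_000, 256, 1105))   # True: the max is an upper outlier
```

Any point between the two cutoffs, such as the quartiles themselves, is not flagged.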

This note was uploaded on 02/08/2012 for the course STAT 2000 taught by Professor Smith during the Fall '08 term at UGA.

