Geography 231
Introduction to Geospatial Methods
Topic 12: Descriptive Statistics II

Goals
- To take a look at the limitations and potential dangers of statistics
- Selecting the “correct” measure of central tendency
- Spatial data and descriptive statistics
- MAUP (the modifiable areal unit problem)
- Look at some basic descriptive spatial statistics

The Danger of Statistics
- Descriptive statistics are simplifications of samples and/or populations
- Errors, both intentional and unintentional, in the creation and the interpretation of statistics can lead to biased or inaccurate conclusions
- “There are three kinds of lies: lies, damned lies, and statistics” (popularized by Mark Twain)

Selecting the Proper Measure of Central Tendency
- The arithmetic mean has the advantage of responding to a change in any value in the data set, and it is widely understood
- HOWEVER, it is highly sensitive to outliers

Household Income (\$)
\$21,000
\$21,000
\$22,000
\$26,000
\$27,500
\$32,500
\$349,000

Mean: \$71,286
Median: \$26,000
Mode: \$21,000
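
The effect of the single large income on each measure can be checked directly. The following is a minimal sketch using Python's standard `statistics` module and the seven household incomes as listed; the variable names and the "drop the largest value" comparison are illustrative assumptions, not part of the original slides.

```python
from statistics import mean, median, mode

# The seven household incomes ($) as listed on the slide
incomes = [21_000, 21_000, 22_000, 26_000, 27_500, 32_500, 349_000]

mean_all = mean(incomes)      # pulled upward by the $349,000 household
median_all = median(incomes)  # middle value of the sorted list
mode_all = mode(incomes)      # most frequent value

# Illustrative comparison (an assumption, not from the slides):
# remove the largest value and see how far each measure moves.
trimmed = sorted(incomes)[:-1]
mean_trim = mean(trimmed)
median_trim = median(trimmed)

print(f"All values:      mean={mean_all:,.2f}  median={median_all:,}  mode={mode_all:,}")
print(f"Without outlier: mean={mean_trim:,.2f}  median={median_trim:,.2f}")
```

Removing one household moves the mean by more than \$46,000 but moves the median by only \$2,000, which is why the median is often the safer summary for skewed data such as income.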
The “Correct” Central Tendency, Cont…
- If a distribution is unimodal and symmetric, the mean, median, and mode are the same
- If a distribution is skewed, the measures start to diverge, with the mean being the most affected by outliers
- If a distribution is bi-modal or multi-modal, the effectiveness of ANY single measure of central tendency is reduced

Central Tendency and Data Levels
- Nominal data: mode only (modal class)
- Ordinal data: mode, median
- Interval, ratio: mean, median, mode
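
The pairing of data levels with permitted measures can be sketched as a small helper. This is only an illustration: the function name `central_tendency`, the sample land-use classes, and the integer coding of the ordinal ranks are assumptions introduced here, not material from the slides.

```python
from statistics import mean, median, median_low, multimode

LEVELS = ("nominal", "ordinal", "interval", "ratio")

def central_tendency(values, level):
    """Return only the measures that are meaningful for the given data level."""
    if level not in LEVELS:
        raise ValueError(f"unknown data level: {level!r}")
    # The mode (possibly several modes) is valid at every level
    summary = {"mode": multimode(values)}
    if level == "ordinal":
        # median_low always returns an actual member of the data,
        # so no averaging of ranks is implied
        summary["median"] = median_low(values)
    elif level in ("interval", "ratio"):
        summary["median"] = median(values)
        summary["mean"] = mean(values)
    return summary

# Hypothetical examples for each level
land_use = ["forest", "urban", "forest", "water", "forest"]  # nominal
ratings = [1, 2, 2, 3, 3, 3, 4]                              # ordinal ranks coded as integers
incomes = [21_000, 21_000, 22_000, 26_000, 27_500, 32_500, 349_000]  # ratio

print(central_tendency(land_use, "nominal"))
print(central_tendency(ratings, "ordinal"))
print(central_tendency(incomes, "ratio"))
```

Using `multimode` rather than `mode` also exposes bi-modal or multi-modal data, since it returns every value tied for most frequent.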

Spatial Data and Descriptive Statistics
- The use of descriptive statistics on spatial data is fraught with issues
  - Boundary delineation
  - Scale
  - Spatial aggregation
- The modifiable areal unit problem!

Impact of Boundary Delineation
- Statistics describing household income
  - Study area boundary ‘A’ includes just inner-city neighborhoods
  - Study area boundary ‘B’ contains both inner-city and suburban neighborhoods
- Would you expect the same (or similar) summary statistics from both study areas? Why or why not?
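
One way to explore the boundary question is to compute the same statistics under both study areas. The sketch below uses entirely hypothetical incomes and neighborhood names (none of these values come from the slides or from real data); it only illustrates how the choice of boundary changes the summary.

```python
from statistics import mean, median

# HYPOTHETICAL household incomes ($) by neighborhood; invented for illustration
neighborhoods = {
    "inner_1":  [18_000, 22_000, 25_000, 31_000],
    "inner_2":  [20_000, 24_000, 27_000, 29_000],
    "suburb_1": [58_000, 64_000, 72_000, 90_000],
    "suburb_2": [61_000, 67_000, 75_000, 110_000],
}

def summarize(names):
    """Pool the households of the named neighborhoods and summarize them."""
    pooled = [income for name in names for income in neighborhoods[name]]
    return {"n": len(pooled), "mean": mean(pooled), "median": median(pooled)}

# Study area 'A': inner-city neighborhoods only
study_area_a = ["inner_1", "inner_2"]
# Study area 'B': inner-city and suburban neighborhoods
study_area_b = list(neighborhoods)

summary_a = summarize(study_area_a)
summary_b = summarize(study_area_b)

print("Study area A:", summary_a)
print("Study area B:", summary_b)
```

With these made-up numbers, the households inside boundary ‘A’ are unchanged in boundary ‘B’, yet both the mean and the median roughly double once the suburban households are included. The statistics describe the chosen study area, not the households alone.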
