Local Clusters Using Count Data


Identification of Local Clusters for Count Data: A Model-Based Moran's I Test

Tonglin Zhang* and Ge Lin
Purdue University and West Virginia University
February 14, 2007

* Department of Statistics, Purdue University, 250 North University Street, West Lafayette, IN 47907-2066, Email: tlzhang@stat.purdue.edu
Department of Geology and Geography, West Virginia University, Morgantown, WV 26506-6800, Email: glin@wvu.edu

Abstract

We set out I_DR as a loglinear model-based Moran's I test for Poisson count data that resembles the Moran's I residual test for Gaussian data. We evaluate its type I and type II error probabilities via simulations, and demonstrate its utility via a case study. When population sizes are heterogeneous, I_DR is effective in detecting local clusters through local association terms, with an acceptable type I error probability. When used in conjunction with local spatial association terms in loglinear models, I_DR can also indicate the existence of a first-order global cluster that can hardly be removed by local spatial association terms. In this situation, I_DR should not be directly applied for local cluster detection. In the case study of St. Louis homicides, we bridge loglinear model methods for parameter estimation to exploratory data analysis, so that a uniform association term can be defined with spatially varied contributions among spatial neighbors. The method makes use of exploratory tools such as Moran's I scatter plots and residual plots to evaluate the magnitude of deviance residuals, and it is effective in modeling the shape, the elevation and the magnitude of a local cluster in the model-based test.

Keywords: Cluster and clustering; deviance residual; Moran's I; permutation test; spatial autocorrelation; type I error probability.

1 Introduction

Count and cross-tabulated frequency data are common in geographical analyses. Many spatial phenomena, such as births, deaths, crimes and species richness, can be counted by spatial unit, either as a raw count or as a ratio over some exposure. Prior to the 1970s, count data were often converted to rates for statistical analyses because of limited computational power in categorical statistics. In the late 1970s, computationally expensive methods, such as loglinear models for cross-tabulated data, were introduced into the social sciences and geography [15, 38], and they were quickly included in many statistical packages.

In spatial statistical analyses, however, counts are still frequently converted to rates so that a testing method for continuous variables, such as Moran's I [26, 27] or Getis-Ord's G [20], can be directly applied. However, when population sizes are heterogeneous across spatial units, converting counts to rates often leads to variance inflation and biased type I error probabilities. Some authors propose incorporating a population weight into the test statistics [29, 35], but the heterogeneity problem still remains [5]. Since a loglinear model can take...
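The idea of applying Moran's I to deviance residuals from a Poisson loglinear model, rather than to raw rates, can be illustrated with a minimal sketch. This is not the authors' I_DR statistic, whose exact definition is not part of this preview. It assumes the simplest possible model (an intercept-only Poisson loglinear model with a log-population offset, so the expected count in each unit is its population times the overall rate) and a binary adjacency matrix. The function names `deviance_residuals` and `morans_i` are hypothetical, and the example data are made up for illustration.

```python
import math


def deviance_residuals(counts, populations):
    """Poisson deviance residuals under an intercept-only loglinear model
    with a log-population offset (an assumption made for this sketch)."""
    # The maximum likelihood estimate of the common rate is total count
    # divided by total population.
    rate = sum(counts) / sum(populations)
    residuals = []
    for y, n in zip(counts, populations):
        mu = n * rate  # expected count for this unit
        # y * log(y / mu) is taken to be 0 when y == 0.
        term = y * math.log(y / mu) if y > 0 else 0.0
        dev = 2.0 * (term - (y - mu))
        # Guard against tiny negative values caused by rounding.
        residuals.append(math.copysign(math.sqrt(max(dev, 0.0)), y - mu))
    return residuals


def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Hypothetical data: four units along a line, each adjacent to its neighbors,
# with very different population sizes.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
counts = [12, 15, 3, 40]
populations = [100, 120, 60, 1000]

residuals = deviance_residuals(counts, populations)
print("Moran's I on deviance residuals:", morans_i(residuals, adjacency))
```

In this sketch the population sizes enter through the expected counts rather than through a division of each count by its population, which is the contrast the paragraph above draws with rate-based tests. A significance assessment, for example by a permutation test, is left out.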
This note was uploaded on 02/15/2012 for the course GEO 6938 taught by Professor Staff during the Summer '08 term at University of Florida.
