10.1.1.78.5947

10.1.1.78.5947 - Selecting the Number of Bins in a Histogram A Decision Theoretic Approach Kun He Department of Mathematics University of Kansas


Selecting the Number of Bins in a Histogram: A Decision Theoretic Approach

Kun He*
Department of Mathematics
University of Kansas
Lawrence, KS 66045

Glen Meeden†
School of Statistics
University of Minnesota
Minneapolis, MN 55455

Appeared in Journal of Statistical Planning and Inference, Vol 61 (1997), 59-59.

* Research supported in part by University of Kansas General Research Fund
† Research supported in part by NSF Grant SES 9201718

Short Running Title: Number of Bins in a Histogram

ABSTRACT

In this note we consider the problem of, given a sample, selecting the number of bins in a histogram. A loss function is introduced which reflects the idea that smooth distributions should have fewer bins than rough distributions. A stepwise Bayes rule, based on the Bayesian bootstrap, is found and is shown to be admissible. Some simulation results are presented to show how the rule works in practice.

Key Words: histogram, Bayesian bootstrap, stepwise Bayes, admissibility, non-informative Bayes and entropy.

AMS 1991 Subject Classification: Primary 62C15; Secondary 62F15, 62G07.

1 Introduction

The histogram is a statistical technique with a long history. Unfortunately there exist only a few explicit guidelines, which are based on statistical theory, for choosing the number of bins that appear in the histogram. Scott [8] gave a formula for the optimal histogram bin width which asymptotically minimizes the integrated mean squared error. Since the underlying density is usually unknown, it is not immediately clear how one should apply this in practice. Scott suggested using the Gaussian density as a reference standard, which leads to the data-based choice for the bin width of $a \times s \times n^{-1/3}$, where $a = 3.49$ and $s$ is an estimate of the standard deviation. (See also Terrell and Scott [10] and Terrell [9].)
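Scott's Gaussian-reference bin width can be sketched in a few lines. The following is a minimal illustration, not code from the paper: the function names are hypothetical, the sample standard deviation is assumed as the estimate s, and converting the width into a bin count by dividing the sample range and rounding up is one common convention that the text does not specify.

```python
import math
import statistics


def scott_bin_width(data, a=3.49):
    """Bin width a * s * n**(-1/3), with s the sample standard deviation."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    s = statistics.stdev(data)  # assumption: sample SD is used as the estimate s
    return a * s * n ** (-1.0 / 3.0)


def scott_num_bins(data, a=3.49):
    """Number of equal-width bins needed to cover the sample range.

    Assumption: range / width rounded up; the source gives only the width.
    """
    h = scott_bin_width(data, a)
    if h == 0:
        return 1  # all observations identical
    span = max(data) - min(data)
    return max(1, math.ceil(span / h))


# Toy sample: 27 evenly spaced points, so n**(-1/3) is 1/3.
sample = [float(i) for i in range(1, 28)]
h = scott_bin_width(sample)
k = scott_num_bins(sample)
print(f"bin width = {h:.4f}, number of bins = {k}")
```

Because the width depends on the data only through s and n, two samples with the same size and spread get the same bin width regardless of shape.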
As Scott noted, many authors advise that for real data sets histograms based on 5-20 bins usually suffice. Rudemo [7] suggested a cross-validation technique for selecting the number of bins, but such methods seem to have large sampling variation.

In this note we give a decision theoretic approach to the problem of choosing the number of bins in a histogram. We introduce a loss function which incorporates the idea that smoother densities require fewer bins in their histogram estimates than rougher densities. A non-informative Bayesian approach, based on the Bayesian bootstrap of Rubin [6], yields a data dependent decision rule for selecting the number of bins. We then give a stepwise Bayes argument which proves the admissibility of this rule and shows the close connection of the rule to the notion of maximum likelihood, which also underlies the idea of a histogram. Finally we give some simulation results which show how our rule works in practice and how it compares to Scott's rule.

In section 2 we describe the rule and give the simulation results, while the proof of admissibility is deferred to section 3.
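To give a feel for the Bayesian bootstrap mentioned above, here is a minimal sketch of how it induces a posterior over bin probabilities for a fixed number of bins. It is not the authors' decision rule: the loss function is not reproduced in this excerpt, so no bin count is selected here. The function names, the equal-width bins on the sample range, and the toy data are all assumptions made for illustration.

```python
import random


def bayesian_bootstrap_weights(n, rng):
    """One draw of Dirichlet(1, ..., 1) weights via normalized Exp(1) variates."""
    gaps = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(gaps)
    return [g / total for g in gaps]


def bin_index(x, lo, hi, k):
    """Index of the equal-width bin on [lo, hi] holding x (hi goes in the last bin)."""
    if hi == lo:
        return 0
    j = int((x - lo) / (hi - lo) * k)
    return min(max(j, 0), k - 1)


def posterior_bin_probs(data, k, draws=1000, seed=0):
    """Simulate bin-probability vectors under the Bayesian bootstrap.

    Each draw puts random Dirichlet weights on the observed points and
    sums the weights falling in each of the k bins.
    """
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    idx = [bin_index(x, lo, hi, k) for x in data]
    samples = []
    for _ in range(draws):
        weights = bayesian_bootstrap_weights(len(data), rng)
        probs = [0.0] * k
        for j, w in zip(idx, weights):
            probs[j] += w
        samples.append(probs)
    return samples


# Hypothetical toy data: 4, 2 and 4 points fall in the three bins on [0, 9].
data = [0.0, 0.5, 1.0, 1.5, 4.0, 4.5, 7.0, 8.0, 8.5, 9.0]
samples = posterior_bin_probs(data, k=3, draws=2000, seed=1)
mean_probs = [sum(s[j] for s in samples) / len(samples) for j in range(3)]
print([round(p, 3) for p in mean_probs])
```

The simulated posterior means should sit close to the observed bin proportions (0.4, 0.2, 0.4 for the toy data). A decision rule of the kind the note describes would compare expected loss across candidate values of k using draws like these.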

This note was uploaded on 12/05/2011 for the course GRC 421 taught by Professor Dougspeer during the Fall '11 term at Cal Poly.
