Selecting the Number of Bins in a Histogram: A Decision Theoretic Approach

Kun He *
Department of Mathematics
University of Kansas
Lawrence, KS 66045

Glen Meeden
School of Statistics
University of Minnesota
Minneapolis, MN 55455

Appeared in Journal of Statistical Planning and Inference, Vol 61 (1997), 59-59.

* Research supported in part by University of Kansas General Research Fund
Research supported in part by NSF Grant SES 9201718
Short Running Title: Number of Bins in a Histogram

ABSTRACT

In this note we consider the problem of selecting, given a sample, the number of bins in a histogram. A loss function is introduced which reflects the idea that smooth distributions should have fewer bins than rough distributions. A stepwise Bayes rule, based on the Bayesian bootstrap, is found and is shown to be admissible. Some simulation results are presented to show how the rule works in practice.

Key Words: histogram, Bayesian bootstrap, stepwise Bayes, admissibility, non-informative Bayes and entropy.

AMS 1991 Subject Classification: Primary 62C15; Secondary 62F15, 62G07.
1 Introduction

The histogram is a statistical technique with a long history. Unfortunately, there exist only a few explicit guidelines based on statistical theory for choosing the number of bins that appear in the histogram. Scott [8] gave a formula for the optimal histogram bin width which asymptotically minimizes the integrated mean squared error. Since the underlying density is usually unknown, it is not immediately clear how one should apply this in practice. Scott suggested using the Gaussian density as a reference standard, which leads to the data-based choice for the bin width of $a \times s \times n^{-1/3}$, where $a = 3.49$, $s$ is an estimate of the standard deviation, and $n$ is the sample size. (See also Terrell and Scott [10] and Terrell [9].) As Scott noted, many authors advise that for real data sets histograms based on 5-20 bins usually suffice. Rudemo [7] suggested a cross-validation technique for selecting the number of bins, but such methods seem to have large sampling variation.

In this note we will give a decision theoretic approach to the problem of choosing the number of bins in a histogram. We will introduce a loss function which incorporates the idea that smoother densities require fewer bins in their histogram estimates than rougher densities. A non-informative Bayesian approach, based on the Bayesian bootstrap of Rubin [6], will yield a data dependent decision rule for selecting the number of bins. We will then give a stepwise Bayes argument which proves the admissibility of this rule and shows the close connection of the rule to the notion of maximum likelihood, which also underlies the idea of a histogram. Finally, we give some simulation results which show how our rule works in practice and how it compares to Scott's rule.

In section 2 we describe the rule and give the simulation results, while the proof of admissibility is deferred to section 3.
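For concreteness, Scott's Gaussian-reference bin width quoted in the introduction can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the paper: the function names `scott_bin_width` and `scott_num_bins` are hypothetical, the sample standard deviation is used as the estimate s, and turning the width into a bin count by dividing the sample range by the width and rounding up is an assumed convention that the text does not specify.

```python
import math
import statistics


def scott_bin_width(sample, a=3.49):
    """Bin width a * s * n^(-1/3), with s the sample standard deviation."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to estimate s")
    s = statistics.stdev(sample)  # assumption: sample std. dev. as the estimate s
    return a * s * n ** (-1.0 / 3.0)


def scott_num_bins(sample, a=3.49):
    """Number of bins implied by the width (assumed: ceil(range / width))."""
    width = scott_bin_width(sample, a)
    data_range = max(sample) - min(sample)
    if width == 0 or data_range == 0:
        return 1  # degenerate sample: all observations are equal
    return max(1, math.ceil(data_range / width))


if __name__ == "__main__":
    data = [0, 1, 2, 3, 4, 5, 6, 7]
    print(scott_bin_width(data), scott_num_bins(data))
```

Because the width shrinks only like n^(-1/3), the implied number of bins grows slowly with the sample size, which fits the remark that 5-20 bins usually suffice for real data sets.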