2 Density Estimation

2.1 Limit Theorems

Assume you are a gambler and go to a casino to play a game of dice. As it happens, it is your unlucky day and among the 100 times you toss the die, you see a 6 only eleven times. For a fair die we know that each face should occur with equal probability $\frac{1}{6}$. Hence the expected number of sixes over 100 throws is $\frac{100}{6} \approx 17$, which is considerably more than the eleven that we observed. Before crying foul you decide that some mathematical analysis is in order.

The probability of seeing a particular sequence of $m$ trials out of which $n$ are a 6 is given by $\left(\frac{1}{6}\right)^{n} \left(\frac{5}{6}\right)^{m-n}$. Moreover, there are $\binom{m}{n} = \frac{m!}{n!\,(m-n)!}$ different sequences of 6 and not 6 that contain $n$ sixes and $m-n$ non-sixes. Hence we may compute the probability of seeing a 6 eleven times or fewer via

$$\Pr(X \le 11) = \sum_{i=0}^{11} p(i) = \sum_{i=0}^{11} \binom{100}{i} \left(\frac{1}{6}\right)^{i} \left(\frac{5}{6}\right)^{100-i} \approx 7.0\% \qquad (2.1)$$

After looking at this figure you decide that things are probably reasonable. And, in fact, they are consistent with the convergence behavior of a simulated die in Figure 2.1. In computing (2.1) we have learned something useful: the expansion is a special case of a binomial series. The first term counts the number of configurations in which we could observe a 6 exactly $i$ times in a sequence of 100 throws. The second and third terms together give the probability of seeing one particular instance of such a sequence.

Fig. 2.1. Convergence of empirical means to expectations. From left to right: empirical frequencies of occurrence obtained by casting a die 10, 20, 50, 100, 200, and 500 times respectively. Note that after 20 throws we still have not observed a single 6, an event which occurs with only $\left(\frac{5}{6}\right)^{20} \approx 2.6\%$ probability.

Note that in general we may not be as lucky, since we may have considerably less information about the setting we are studying.
For instance, we might not know the actual probabilities for each face of the die, which is a realistic concern when gambling at a casino of questionable reputation. Often the outcomes of the system we are dealing with are continuous-valued random variables rather than binary ones, possibly even with unknown range. For instance, when trying to determine the average wage through a questionnaire, we need to determine how many people to ask in order to obtain a certain level of confidence....
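The tail probability in (2.1) and the probability of seeing no 6 in 20 throws can be checked numerically. The following is a minimal sketch in Python, assuming a fair die with independent throws. The function names (`binomial_pmf`, `binomial_cdf`, `empirical_frequencies`) and the fixed random seed are illustrative choices, not part of the original text. The simulation at the end mimics the sample sizes listed in the caption of Fig. 2.1, and the printed tail probability can be compared against the approximate value quoted in (2.1).

```python
import math
import random


def binomial_pmf(i, m, p):
    """Probability of exactly i successes in m independent trials."""
    # math.comb(m, i) counts the sequences with i successes; the two
    # powers give the probability of any one such sequence.
    return math.comb(m, i) * p**i * (1 - p) ** (m - i)


def binomial_cdf(n, m, p):
    """Probability of at most n successes in m independent trials."""
    return sum(binomial_pmf(i, m, p) for i in range(n + 1))


def empirical_frequencies(num_throws, rng):
    """Relative frequency of each face 1..6 after num_throws fair throws."""
    counts = [0] * 6
    for _ in range(num_throws):
        counts[rng.randrange(6)] += 1  # index 0 stands for face 1, and so on
    return [c / num_throws for c in counts]


if __name__ == "__main__":
    p_six = 1 / 6  # assumption: the die is fair
    print(f"Expected sixes in 100 throws: {100 * p_six:.2f}")
    print(f"Pr(X <= 11) for m = 100:      {binomial_cdf(11, 100, p_six):.4f}")
    print(f"Pr(no six in 20 throws):      {(5 / 6) ** 20:.4f}")

    rng = random.Random(0)  # arbitrary fixed seed, only for repeatability
    for m in (10, 20, 50, 100, 200, 500):
        freqs = empirical_frequencies(m, rng)
        print(m, [f"{f:.2f}" for f in freqs])
```

With small sample sizes the simulated frequencies typically scatter widely around 1/6, and they tend to settle closer to it as the number of throws grows, which is the behavior the figure caption describes.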
This note was uploaded on 02/23/2012 for the course STAT 598 taught by Professor Staff during the Spring '08 term at Purdue University-West Lafayette.