11.3 Estimation Theory

11.3.2 Maximum Likelihood Estimation

Consider the following problem. A box contains a number of red and blue balls. Suppose that it is known that the ratio of the numbers is 3:1, but it is not known whether the red or the blue balls are more numerous. Thus, the probability of drawing a red ball from the box is either 1/4 or 3/4. If we draw m balls from the box with replacement, we know that the number K of red balls drawn has the following binomial distribution:

p_K(k) = \binom{m}{k} p^k (1 - p)^{m-k}, \qquad k = 0, 1, 2, \ldots, m

where p is the probability of drawing a red ball and has the value p = 1/4 or p = 3/4. The idea behind maximum likelihood estimation is to obtain the best estimate of p. More formally, the maximum likelihood estimation of the parameters \theta_1, \theta_2, \ldots, \theta_n, which characterize a random variable X, chooses the value(s) that make(s) the observed values x_1, x_2, \ldots, x_n most probable.

Suppose X is a random variable whose distribution depends on a single parameter \theta. Let x_1, x_2, \ldots, x_n be an observed random sample. If X is discrete, the probability that a random sample consists of exactly these values is given by

L(\theta) = L(\theta; x_1, x_2, \ldots, x_n) = p_X(x_1; \theta)\, p_X(x_2; \theta) \cdots p_X(x_n; \theta)

L(\theta) is called the likelihood function and is a function of \theta; that is, its value depends on both the selected sample values and the choice of \theta. If X is continuous with PDF f_X(x), then the likelihood function is defined by

L(\theta) = L(\theta; x_1, x_2, \ldots, x_n) = f_X(x_1; \theta)\, f_X(x_2; \theta) \cdots f_X(x_n; \theta)

The maximum likelihood estimate of \theta is the value of \theta that maximizes the value of L(\theta). If L(\theta) is a differentiable function, then a necessary condition for L(\theta) to have a maximum value is that

\frac{\partial}{\partial \theta} L(\theta) = 0

The partial derivative is used because L(\theta) depends on both \theta and the sample values x_1, x_2, \ldots, x_n.
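The ball-drawing example above can be sketched in a few lines: with only two candidate values of p, the maximum likelihood estimate is simply whichever candidate gives the observed draw count the larger binomial probability. The sample values m = 5 draws and k = 4 red balls below are illustrative assumptions, not from the text.

```python
from math import comb

def likelihood(p, m, k):
    """Binomial likelihood p_K(k) = C(m, k) * p^k * (1 - p)^(m - k)."""
    return comb(m, k) * p**k * (1 - p)**(m - k)

# Hypothetical observation: k = 4 red balls in m = 5 draws with replacement.
m, k = 5, 4
candidates = {p: likelihood(p, m, k) for p in (0.25, 0.75)}

# The MLE is the candidate p that makes the observed k most probable.
p_hat = max(candidates, key=candidates.get)
print(p_hat)  # -> 0.75, since L(0.75) ~ 0.396 > L(0.25) ~ 0.015
```

Intuitively, seeing mostly red balls makes p = 3/4 the more plausible of the two hypotheses, which is exactly what the likelihood comparison formalizes.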
If \hat{\theta} is the value of \theta that maximizes L(\theta), then \hat{\theta} is called the maximum likelihood estimator. Also, if the likelihood function contains k parameters such that

L(\theta; x_1, x_2, \ldots, x_n) = L(\theta_1, \theta_2, \ldots, \theta_k; x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i; \theta_1, \theta_2, \ldots, \theta_k)

then the point where the likelihood function is a maximum is the solution to the following k equations:

\frac{\partial}{\partial \theta_1} L(\theta_1, \ldots, \theta_k; x_1, \ldots, x_n) = 0
\frac{\partial}{\partial \theta_2} L(\theta_1, \ldots, \theta_k; x_1, \ldots, x_n) = 0
\qquad \vdots
\frac{\partial}{\partial \theta_k} L(\theta_1, \ldots, \theta_k; x_1, \ldots, x_n) = 0

In many cases it is more convenient to work with the logarithm of the likelihood function.
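Since the logarithm is monotone, maximizing log L(θ) gives the same θ̂ as maximizing L(θ), and the product over samples becomes a sum. A minimal sketch, using an exponential distribution as an illustrative case not taken from the text: there log L(λ) = n ln λ − λ Σxᵢ, and setting the derivative to zero gives the closed form λ̂ = n / Σxᵢ = 1 / x̄. The sample below is invented, and the grid search is just a crude numerical check against that closed form.

```python
import math

def log_likelihood(lam, xs):
    """Log-likelihood of an i.i.d. exponential sample: n*ln(lam) - lam*sum(xs)."""
    return len(xs) * math.log(lam) - lam * sum(xs)

# Hypothetical sample; the closed-form MLE is 1 / (sample mean).
xs = [0.8, 1.3, 0.4, 2.1, 0.9]
lam_closed = len(xs) / sum(xs)

# Crude grid search over candidate values of lambda for comparison.
grid = [i / 1000 for i in range(1, 5000)]
lam_grid = max(grid, key=lambda lam: log_likelihood(lam, xs))

print(lam_closed, lam_grid)  # the two estimates agree to grid resolution
```

Working on the log scale also avoids numerical underflow: a product of many small densities can round to zero in floating point, while the corresponding sum of logs stays well-scaled.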