11.3 Estimation Theory

11.3.2 Maximum Likelihood Estimation

Consider the following problem. A box contains a number of red and blue balls. Suppose that it is known that the ratio of the numbers is 3 : 1, but it is not known whether the red or blue balls are more numerous. Thus, the probability of drawing a red ball from the box is either 1/4 or 3/4. If we draw $m$ balls from the box with replacement, we know that the number $K$ of red balls drawn has the following binomial distribution:

$$p_K(k) = \binom{m}{k} p^k (1 - p)^{m-k}, \qquad k = 0, 1, 2, \ldots, m$$

where $p$ is the probability of drawing a red ball and has the value $p = 1/4$ or $p = 3/4$. The idea behind maximum likelihood estimation is to obtain the best estimate of $p$. More formally, the maximum likelihood estimation of the parameters $\theta_1, \theta_2, \ldots, \theta_n$, which characterize a random variable $X$, chooses the value(s) that make(s) the observed values $x_1, x_2, \ldots, x_n$ most probable.

Suppose $X$ is a random variable whose distribution depends on a single parameter $\theta$. Let $x_1, x_2, \ldots, x_n$ be an observed random sample. If $X$ is discrete, the probability that a random sample consists of exactly these values is given by

$$L(\theta) = L(\theta; x_1, x_2, \ldots, x_n) = p_X(x_1; \theta)\, p_X(x_2; \theta) \cdots p_X(x_n; \theta)$$

$L(\theta)$ is called the likelihood function and is a function of $\theta$; that is, its value depends on both the selected sample values and the choice of $\theta$. If $X$ is continuous with PDF $f_X(x)$, then the likelihood function is defined by

$$L(\theta) = L(\theta; x_1, x_2, \ldots, x_n) = f_X(x_1; \theta)\, f_X(x_2; \theta) \cdots f_X(x_n; \theta)$$

The maximum likelihood estimate of $\theta$ is the value of $\theta$ that maximizes the value of $L(\theta)$. If $L(\theta)$ is a differentiable function, then a necessary condition for $L(\theta)$ to have a maximum value is that

$$\frac{\partial}{\partial \theta} L(\theta) = 0$$

The partial derivative is used because $L(\theta)$ depends on both $\theta$ and the sample values $x_1, x_2, \ldots, x_n$.
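Returning to the box example at the start of this section, the parameter p can take only two values, so the likelihood can be maximized by direct comparison rather than by differentiation. The following Python sketch illustrates this under stated assumptions: the function names `binomial_likelihood` and `mle_for_box` and the choice of m = 3 draws are hypothetical and are not taken from the text.

```python
from math import comb


def binomial_likelihood(p, k, m):
    """Likelihood L(p) = C(m, k) * p^k * (1 - p)^(m - k) of seeing k red balls in m draws."""
    return comb(m, k) * p**k * (1 - p) ** (m - k)


def mle_for_box(k, m, candidates=(0.25, 0.75)):
    """Return the candidate value of p with the largest likelihood.

    If two candidates tie (possible when k == m / 2), the first one listed is returned.
    """
    return max(candidates, key=lambda p: binomial_likelihood(p, k, m))


# Illustrative choice (an assumption): m = 3 draws, every possible count k of red balls.
m = 3
for k in range(m + 1):
    l_quarter = binomial_likelihood(0.25, k, m)
    l_three_quarters = binomial_likelihood(0.75, k, m)
    print(f"k={k}: L(1/4)={l_quarter:.6f}  L(3/4)={l_three_quarters:.6f}  "
          f"estimate={mle_for_box(k, m)}")
```

With m = 3, the comparison selects p = 1/4 when zero or one red ball is drawn and p = 3/4 when two or three are drawn, which matches the intuition that few red draws point to red being the rarer color.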
If $\hat{\theta}$ is the value of $\theta$ that maximizes $L(\theta)$, then $\hat{\theta}$ is called the maximum likelihood estimator. Also, if the likelihood function contains $k$ parameters such that

$$L(\theta; x_1, x_2, \ldots, x_n) = L(\theta_1, \theta_2, \ldots, \theta_k; x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i; \theta_1, \theta_2, \ldots, \theta_k)$$

then the point where the likelihood function is a maximum is the solution to the following $k$ equations:

$$\begin{aligned}
\frac{\partial}{\partial \theta_1} L(\theta_1, \theta_2, \ldots, \theta_k; x_1, x_2, \ldots, x_n) &= 0 \\
\frac{\partial}{\partial \theta_2} L(\theta_1, \theta_2, \ldots, \theta_k; x_1, x_2, \ldots, x_n) &= 0 \\
&\;\;\vdots \\
\frac{\partial}{\partial \theta_k} L(\theta_1, \theta_2, \ldots, \theta_k; x_1, x_2, \ldots, x_n) &= 0
\end{aligned}$$

In many cases it is more convenient to work with the logarithm of the likelihood function....
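As an illustration of the k-equation condition with k = 2, the sketch below assumes a normal distribution with unknown mean and variance. This distribution, the made-up sample, and all function names are assumptions for illustration and do not come from the text. Because the logarithm is an increasing function, the log-likelihood has its maximum at the same point as the likelihood, so the code works with the log-likelihood. For the normal case, setting both partial derivatives to zero gives the sample mean and the average squared deviation (divided by n, not n - 1); the code checks numerically that both partial derivatives are close to zero at that point.

```python
import math


def normal_log_likelihood(mu, var, xs):
    """Log of the product of normal PDFs with mean mu and variance var over the sample xs."""
    n = len(xs)
    squared_error = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * var) - squared_error / (2 * var)


def normal_mle(xs):
    """Closed-form solution of the two likelihood equations for the normal case."""
    n = len(xs)
    mu_hat = sum(xs) / n
    var_hat = sum((x - mu_hat) ** 2 for x in xs) / n  # divides by n, not n - 1
    return mu_hat, var_hat


def numeric_partials(f, mu, var, h=1e-6):
    """Central-difference approximations of the partial derivatives of f(mu, var)."""
    d_mu = (f(mu + h, var) - f(mu - h, var)) / (2 * h)
    d_var = (f(mu, var + h) - f(mu, var - h)) / (2 * h)
    return d_mu, d_var


# Made-up sample, used only to exercise the functions.
xs = [2.1, 1.9, 3.2, 2.8, 2.5]
mu_hat, var_hat = normal_mle(xs)


def log_l(mu, var):
    return normal_log_likelihood(mu, var, xs)


d_mu, d_var = numeric_partials(log_l, mu_hat, var_hat)
print(f"mu_hat={mu_hat:.4f}, var_hat={var_hat:.4f}")
print(f"partial derivatives at the estimate: {d_mu:.2e}, {d_var:.2e}")
```

A vanishing gradient is only a necessary condition, as the text notes for the single-parameter case, so a check like this does not by itself prove that the point is a maximum.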
