11.3.2 Maximum Likelihood Estimation

Consider the following problem. A box contains a number of red and blue balls. Suppose that it is known that the ratio of the numbers is 3 : 1, but it is not known whether the red or the blue balls are more numerous. Thus, the probability of drawing a red ball from the box is either 1/4 or 3/4. If we draw m balls from the box with replacement, we know that the number K of red balls drawn has the following binomial distribution:

    p_K(k) = \binom{m}{k} p^k (1 - p)^{m - k}, \qquad k = 0, 1, 2, \ldots, m

where p is the probability of drawing a red ball and has the value p = 1/4 or p = 3/4. The idea behind maximum likelihood estimation is to obtain the "best" estimate of p. More formally, maximum likelihood estimation of the parameters \theta_1, \theta_2, \ldots, \theta_n that characterize a random variable X chooses the value(s) that make(s) the observed values x_1, x_2, \ldots, x_n most probable.

Suppose X is a random variable whose distribution depends on a single parameter \theta. Let x_1, x_2, \ldots, x_n be an observed random sample. If X is discrete, the probability that a random sample consists of exactly these values is given by

    L(\theta) = L(\theta; x_1, x_2, \ldots, x_n) = p_X(x_1; \theta)\, p_X(x_2; \theta) \cdots p_X(x_n; \theta)

L(\theta) is called the likelihood function and is a function of \theta; that is, its value depends on both the selected sample values and the choice of \theta. If X is continuous with PDF f_X(x), then the likelihood function is defined by

    L(\theta) = L(\theta; x_1, x_2, \ldots, x_n) = f_X(x_1; \theta)\, f_X(x_2; \theta) \cdots f_X(x_n; \theta)

The maximum likelihood estimate of \theta is the value of \theta that maximizes the value of L(\theta). If L(\theta) is a differentiable function, then a necessary condition for L(\theta) to have a maximum value is that

    \frac{\partial}{\partial \theta} L(\theta) = 0

The partial derivative is used because L(\theta) depends on both \theta and the sample values x_1, x_2, \ldots, x_n. If \hat{\theta} is the value of \theta that maximizes L(\theta), then \hat{\theta} is called the maximum likelihood estimator. Also, if the likelihood function contains k parameters such that

    L(\theta_1, \theta_2, \ldots, \theta_k; x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i; \theta_1, \theta_2, \ldots, \theta_k)

then the point where the likelihood function is a maximum is the solution of the following k equations:

    \frac{\partial}{\partial \theta_1} L(\theta_1, \ldots, \theta_k; x_1, \ldots, x_n) = 0
    \frac{\partial}{\partial \theta_2} L(\theta_1, \ldots, \theta_k; x_1, \ldots, x_n) = 0
    \vdots
    \frac{\partial}{\partial \theta_k} L(\theta_1, \ldots, \theta_k; x_1, \ldots, x_n) = 0

In many cases it is more convenient to work with the logarithm of the likelihood function....
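To make the ball-drawing example concrete, here is a minimal Python sketch. It evaluates the binomial likelihood at the two admissible values p = 1/4 and p = 3/4 and keeps the larger, and it also reports the unrestricted estimate k/m, which is what setting d/dp log L(p) = 0 yields. The sample size m = 10 and observed count k = 7 are assumed values chosen only for illustration; they do not come from the text.

```python
from math import comb

def likelihood(p, k, m):
    """Binomial likelihood L(p) = C(m, k) * p^k * (1 - p)^(m - k)."""
    return comb(m, k) * p**k * (1 - p)**(m - k)

# Assumed sample for illustration: m = 10 draws, k = 7 red balls observed.
m, k = 10, 7

# Restricted problem from the text: p is known to be either 1/4 or 3/4.
candidates = [1/4, 3/4]
p_hat = max(candidates, key=lambda p: likelihood(p, k, m))
print(f"L(1/4) = {likelihood(1/4, k, m):.5f}")   # ~0.00309
print(f"L(3/4) = {likelihood(3/4, k, m):.5f}")   # ~0.25028
print(f"ML estimate over the two candidates: p_hat = {p_hat}")

# Unrestricted problem: log L(p) = log C(m,k) + k*log(p) + (m-k)*log(1-p),
# so d/dp log L(p) = k/p - (m-k)/(1-p) = 0 gives p_hat = k/m.
print(f"Unrestricted ML estimate: k/m = {k/m}")
```

Working with log L here mirrors the closing remark of the section: the logarithm turns the product of sample factors into a sum, which is easier to differentiate and, for large samples, avoids floating-point underflow, while the monotonicity of the logarithm leaves the maximizing value unchanged.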