Economics 241B
Relation to Method of Moments and Maximum Likelihood

OLS as a Maximum Likelihood Estimator

Under Assumption 1.5 we have specified the distribution of the error, so we can estimate the model parameters $\theta = (\beta, \sigma^2)$ by the principle of maximum likelihood. Under the assumption that the error is Gaussian, we will see that the OLS estimator $b$ is equivalent to the MLE, and that the OLS estimator of $\sigma^2$ differs only slightly from its ML counterpart. Further, $b$ achieves the Cramér-Rao lower bound.

ML Principle

The intuitive idea of the ML principle is to choose the value of the parameter that is most likely to have generated the data. Precisely, we assume that the probability distribution of a sample $(Y)$ is a member of a family of distributions indexed by $\theta$ (this is described as parameterizing the distribution). The joint density, viewed as a function of the parameter vector $\theta$, is called the likelihood function. In general, the likelihood function has the form of the joint density function

$$L(\theta \mid Y_1 = y_1, \ldots, Y_n = y_n) = f_{Y_1 \cdots Y_n}(y_1, \ldots, y_n; \theta).$$

For an i.i.d. sample of a continuous random variable, we form the likelihood function as

$$L(\theta \mid Y_1 = y_1, \ldots, Y_n = y_n) = \prod_{t=1}^{n} f_Y(y_t; \theta).$$

Definition. The maximum likelihood estimator (MLE) of $\theta$, $\hat{\theta}_{ML}$, is the value of $\theta$ (in the parameter space) that maximizes $L(\theta \mid Y_1 = y_1, \ldots, Y_n = y_n)$.

Conditional versus Unconditional Likelihood

For the regression model, we have a sample $(Y, X)$, whose joint density we parameterize. Because the joint density is the product of a marginal density and a conditional density, we can write the joint density of the data as

$$f(y, x; \zeta) = f(y \mid x; \theta) \cdot f(x; \psi).$$

The parameter vector of interest is $\theta$. If we knew the parametric form of $f(x; \psi)$, then we could maximize the joint likelihood function. We cannot do this, as the classic model does not specify $f(x; \psi)$. However, if there is no functional relation between $\theta$ and $\psi$ (such as
the value of an element of $\psi$ depending on an element of $\theta$), then maximizing the joint likelihood is achieved by separately maximizing the conditional and marginal likelihoods. In such a case, the ML estimate of $\theta$ is obtained by maximizing the conditional likelihood alone.

Log-Likelihood for the Regression Model

As we have already seen, Assumptions 1.2 (strict exogeneity), 1.4 (spherical error variance), and 1.5 (Gaussian) together imply $U \mid X \sim N(0, \sigma^2 I_n)$. Because $Y = X\beta + U$, we have

$$Y \mid X \sim N(X\beta, \sigma^2 I_n).$$

The log-likelihood function, which is simpler to maximize, is

$$\ln L\bigl(\tilde{\beta}, \tilde{\sigma}^2 \mid (Y_1, X_1) = (y_1, x_1), \ldots, (Y_n, X_n) = (y_n, x_n)\bigr) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\tilde{\sigma}^2 - \frac{1}{2\tilde{\sigma}^2}\bigl(Y - X\tilde{\beta}\bigr)'\bigl(Y - X\tilde{\beta}\bigr).$$

(Because the likelihood function has the form of a joint density function, it is nonnegative; whenever its value lies below one, the log-likelihood is negative.)

ML via Concentrated Likelihood

We could maximize the log-likelihood in two stages. First, maximize over $\tilde{\beta}$ for any given $\tilde{\sigma}^2$. The $\tilde{\beta}$ that maximizes the objective function could (but in this case, does not) depend on $\tilde{\sigma}^2$.
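The two-stage maximization can be sketched to completion; this is the standard derivation, consistent with the log-likelihood above. For any given $\tilde{\sigma}^2$, maximizing over $\tilde{\beta}$ amounts to minimizing $(Y - X\tilde{\beta})'(Y - X\tilde{\beta})$, whose solution is the OLS estimator $b = (X'X)^{-1}X'Y$ for every value of $\tilde{\sigma}^2$. Substituting $b$ yields the concentrated log-likelihood

$$\ln L_c(\tilde{\sigma}^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\tilde{\sigma}^2 - \frac{SSR}{2\tilde{\sigma}^2}, \qquad SSR = (Y - Xb)'(Y - Xb).$$

Setting the derivative with respect to $\tilde{\sigma}^2$ to zero gives

$$-\frac{n}{2\tilde{\sigma}^2} + \frac{SSR}{2\tilde{\sigma}^4} = 0 \quad\Longrightarrow\quad \hat{\sigma}^2_{ML} = \frac{SSR}{n},$$

which differs from the OLS estimator $s^2 = SSR/(n - K)$ only in the degrees-of-freedom correction.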
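The claimed equivalence (OLS $b$ maximizes the Gaussian likelihood, while the ML variance estimator $SSR/n$ differs from the OLS $s^2 = SSR/(n-K)$ only in its divisor) can be checked numerically. Below is a minimal sketch assuming NumPy; the simulated data-generating values are illustrative, not part of the notes.

```python
import numpy as np

# Illustrative simulated regression data (values are arbitrary).
rng = np.random.default_rng(0)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=1.5, size=n)

# OLS coefficients: b = (X'X)^{-1} X'y, and the sum of squared residuals.
b = np.linalg.solve(X.T @ X, X.T @ y)
ssr = float((y - X @ b) @ (y - X @ b))

def loglik(beta, sigma2):
    """Gaussian log-likelihood: -n/2 ln(2 pi) - n/2 ln(sigma2) - SSR(beta)/(2 sigma2)."""
    resid = y - X @ beta
    return -n / 2 * np.log(2 * np.pi) - n / 2 * np.log(sigma2) - (resid @ resid) / (2 * sigma2)

# ML variance estimator SSR/n versus the OLS estimator s^2 = SSR/(n - K).
sigma2_ml = ssr / n
s2 = ssr / (n - K)

# Perturbing b in a random direction lowers the log-likelihood at fixed sigma2,
# consistent with b being the unique maximizer over beta.
perturbed = b + 0.01 * rng.normal(size=K)
assert loglik(b, sigma2_ml) > loglik(perturbed, sigma2_ml)
```

Because the concentrated log-likelihood is maximized at $\tilde{\sigma}^2 = SSR/n$, evaluating `loglik(b, sigma2_ml)` gives a value at least as large as at any other variance, including `s2`.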
