Stat 5102 Lecture Slides: Deck 3

Charles J. Geyer
School of Statistics
University of Minnesota

(slide 1)

Likelihood Inference (slide 2)

We have learned one very general method of estimation: the method of moments. Now we learn another: the method of maximum likelihood.

Likelihood (slide 3)

Suppose we have a parametric statistical model specified by a PMF or PDF. Our convention of using boldface to distinguish between scalar data x and vector data x, and between a scalar parameter θ and a vector parameter θ, becomes a nuisance here. To begin our discussion we write the PMF or PDF as f_θ(x). But it makes no difference in likelihood inference if the data x is a vector. Nor does it make a difference in the fundamental definitions if the parameter θ is a vector. You may consider x and θ to be scalars, but much of what we say until further notice works equally well if either x or θ is a vector, or both are.

Likelihood (slide 4)

The PMF or PDF, considered as a function of the unknown parameter or parameters rather than of the data, is called the likelihood function:

    L(θ) = f_θ(x)

Although L(θ) also depends on the data x, we suppress this in the notation. If the data are considered random, then L(θ) is a random variable, and the function L is a random function. If the data are considered nonrandom, as when the observed value of the data is plugged in, then L(θ) is a number, and L is an ordinary mathematical function. Since the data X or x do not appear in the notation L(θ), we cannot distinguish these cases notationally and must do so by context.

Likelihood (cont.) (slide 5)

For all purposes that likelihood gets used in statistics (it is the key to both likelihood inference and Bayesian inference) it does not matter if multiplicative terms not containing unknown parameters are dropped from the likelihood function. If L(θ) is a likelihood function for a given problem, then so is

    L*(θ) = L(θ) h(x)

where h is any strictly positive real-valued function.

Log Likelihood (slide 6)

In frequentist inference, the log likelihood function, which is the logarithm of the likelihood function, is more useful. If L is the likelihood function, we write

    l(θ) = log L(θ)

for the log likelihood. When discussing asymptotics, we often add a subscript denoting sample size, so the likelihood becomes L_n(θ) and the log likelihood becomes l_n(θ).

Note: we have yet another capital and lower case convention: capital L for likelihood and lower case l for log likelihood.

Log Likelihood (cont.) (slide 7)

As we said before (slide 5), we may drop multiplicative terms not containing unknown parameters from the likelihood function. If

    L(θ) = h(x) g(x, θ)

we may drop the term h(x). Since

    l(θ) = log h(x) + log g(x, θ)

this means we may drop additive terms not containing unknown parameters from the log likelihood function. A worked example of both rules follows.
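Worked example (an addition for concreteness, not part of the original slides): the binomial model. If x is the observed number of successes in n Bernoulli trials with success probability p, the PMF viewed as a function of p is the likelihood, and the binomial coefficient, being a multiplicative term not containing p, may be dropped:

    \[
    L(p) = \binom{n}{x} p^x (1 - p)^{n - x},
    \qquad\text{so we may instead use}\qquad
    L^*(p) = p^x (1 - p)^{n - x}.
    \]

Taking logarithms, the corresponding additive term log C(n, x) drops from the log likelihood:

    \[
    l(p) = x \log p + (n - x) \log(1 - p).
    \]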
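A minimal computational sketch of the same example, assuming NumPy and SciPy are available (the slides do not discuss computation, and the data values here are hypothetical):

    import numpy as np
    from scipy.optimize import minimize_scalar

    n, x = 20, 13   # hypothetical data: 13 successes in 20 trials

    def log_likelihood(p):
        # l(p) = x log p + (n - x) log(1 - p); the additive term
        # log C(n, x) is dropped because it does not contain p.
        return x * np.log(p) + (n - x) * np.log(1 - p)

    # Maximize l(p) by minimizing its negative over the open interval (0, 1).
    fit = minimize_scalar(lambda p: -log_likelihood(p),
                          bounds=(1e-9, 1 - 1e-9), method="bounded")
    print(fit.x)    # approximately 0.65, which is x / n

That the maximizer is x/n foreshadows the method of maximum likelihood developed in the rest of the deck: the estimate is the parameter value at which the (log) likelihood is largest, and dropping terms not containing p changes neither the maximizer nor any likelihood-based inference.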