…anism. It is supposed that the class L(d_i) of document d_i has some relation to the words which appear in the document. This may be described by the conditional distribution p(t_1, …, t_{n_i} | L(d_i)) of the n_i words given the class. Then the Bayesian formula yields the probability of a class given the words of a document (Mitchell 1997):

$$
p(L_c \mid t_1, \ldots, t_{n_i}) = \frac{p(t_1, \ldots, t_{n_i} \mid L_c)\, p(L_c)}{\sum_{L \in \mathcal{L}} p(t_1, \ldots, t_{n_i} \mid L)\, p(L)}
$$

Note that each document is assumed to belong to exactly one of the k classes in $\mathcal{L}$. The prior probability p(L) denotes the probability that an arbitrary document belongs to class L before its words are known. Often the prior probabilities of all classes may be taken to be equal. The conditional probability on the left is the desired posterior probability that the document with words t_1, …, t_{n_i} belongs to class L_c. We may assign the class with the highest posterior probability to our document. For document classification it turned out that th…
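To make the formula concrete, here is a minimal Python sketch of the posterior computation. The function names, the toy word probabilities, and the word-independence simplification used in the toy likelihood are illustrative assumptions, not part of the excerpt; the sketch simply applies Bayes' rule over a finite set of classes and assigns the class with the highest posterior, as described above.

```python
import math

def posterior_over_classes(classes, log_likelihood, prior, tokens):
    """Bayes' rule over a finite set of classes.

    classes        -- the class labels in the set L (here: a list)
    log_likelihood -- hypothetical function (tokens, label) -> log p(t_1, ..., t_n | L)
    prior          -- dict mapping each label to its prior probability p(L)
    tokens         -- the words t_1, ..., t_n of the document
    Returns a dict mapping each label to its posterior p(L | t_1, ..., t_n).
    """
    # Unnormalised log numerators: log p(t_1, ..., t_n | L) + log p(L)
    log_joint = {L: log_likelihood(tokens, L) + math.log(prior[L]) for L in classes}
    # Denominator sum_L p(t_1, ..., t_n | L) p(L), via log-sum-exp for numerical stability
    m = max(log_joint.values())
    log_evidence = m + math.log(sum(math.exp(v - m) for v in log_joint.values()))
    return {L: math.exp(v - log_evidence) for L in log_joint}

if __name__ == "__main__":
    # Toy illustration with two hypothetical classes and made-up word probabilities,
    # assuming (as a simplification) that words occur independently given the class.
    word_probs = {
        "sports":  {"goal": 0.05,  "match": 0.04,  "market": 0.001},
        "finance": {"goal": 0.005, "match": 0.002, "market": 0.06},
    }

    def toy_log_likelihood(tokens, label):
        return sum(math.log(word_probs[label].get(t, 1e-6)) for t in tokens)

    priors = {"sports": 0.5, "finance": 0.5}   # equal priors, as the text suggests is common
    posteriors = posterior_over_classes(["sports", "finance"],
                                        toy_log_likelihood, priors,
                                        ["goal", "match"])
    best = max(posteriors, key=posteriors.get)  # class with highest posterior
    print(posteriors, "->", best)
```

Working in log space and normalising with log-sum-exp avoids underflow when many small word probabilities are multiplied, but the result is exactly the posterior given by the formula above.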