...email is classified as spam and we miss it, we could suffer a big loss.
Therefore, we should try our best to avoid errors of the first kind, that is, legitimate email classified as spam. To achieve this, we should adjust the code of our
naïve Bayes classifier function: if an email is about equally likely to be spam and non-spam, we should just label it as non-spam.
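One way to picture this adjustment is a decision rule that labels an email as spam only when the posterior probability of spam clearly exceeds a cutoff. The sketch below is a minimal illustration, not the original assignment code: the function name `label_email`, its arguments, and the cutoff value 0.9 are all assumptions chosen for the example.

```python
import math

def label_email(log_joint_spam, log_joint_ham, threshold=0.9):
    """Label an email from its two naive Bayes log joint scores.

    log_joint_spam: log P(spam) + sum of log P(word | spam)      (assumed input)
    log_joint_ham:  log P(non-spam) + sum of log P(word | non-spam)
    threshold:      hypothetical cutoff; larger values mean fewer
                    legitimate emails get flagged as spam.
    """
    # Normalize in log space (log-sum-exp) to get P(spam | email) stably.
    m = max(log_joint_spam, log_joint_ham)
    e_spam = math.exp(log_joint_spam - m)
    e_ham = math.exp(log_joint_ham - m)
    p_spam = e_spam / (e_spam + e_ham)
    # Ambiguous emails fall through to "non-spam" on purpose.
    return "spam" if p_spam > threshold else "non-spam"

# An evenly split email is kept; only a confident score is flagged.
ambiguous = label_email(math.log(0.5), math.log(0.5))
borderline = label_email(math.log(0.7), math.log(0.3))
confident = label_email(math.log(0.95), math.log(0.05))
```

Raising the cutoff trades more missed spam for fewer lost legitimate emails, which matches the preference stated above.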
Thus we get: [...]
So what we need to do next is just to find [...].
Therefore, we can conclude that the optimal [...] is an eigenvector of [...], and the eigenvalue corresponding to it is the largest eigenvalue of [...].

(b)
The problem can be transformed into: [...]
We can see that this is the standard form of principal component analysis, where the covariance matrix is replaced by the matrix representing the between-class variance relative to the within-class variance, [...], which is symmetric as well.
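To make the PCA analogy concrete, here is a small numerical sketch. The formulas are missing from this text, so the sketch assumes the usual Fisher criterion: maximize w^T S_B w / w^T S_W w, with the symmetric matrix taken to be S_W^{-1/2} S_B S_W^{-1/2}. The names `fisher_direction`, `S_B`, `S_W`, and the toy matrices are hypothetical and not taken from the assignment.

```python
import numpy as np

def fisher_direction(S_B, S_W):
    """Return (w, largest eigenvalue, M) for the assumed Fisher criterion.

    S_B: between-class scatter matrix (symmetric).
    S_W: within-class scatter matrix (symmetric positive definite).
    """
    # Build S_W^{-1/2} from the eigendecomposition of S_W.
    evals, evecs = np.linalg.eigh(S_W)
    S_W_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    # Symmetric matrix that plays the role of the covariance matrix in PCA.
    M = S_W_inv_sqrt @ S_B @ S_W_inv_sqrt
    # eigh returns eigenvalues in ascending order, so the last one is largest.
    lam, V = np.linalg.eigh(M)
    v = V[:, -1]
    # Map the top eigenvector back to the original coordinates.
    w = S_W_inv_sqrt @ v
    return w, lam[-1], M

# Toy scatter matrices, made up for illustration only.
S_W = np.array([[2.0, 0.5], [0.5, 1.0]])
d = np.array([1.0, 2.0])          # stand-in for a difference of class means
S_B = np.outer(d, d)

w, lam, M = fisher_direction(S_B, S_W)
ratio = (w @ S_B @ w) / (w @ S_W @ w)   # should equal the top eigenvalue
```

Under these assumptions, the variance ratio achieved by `w` equals the largest eigenvalue of the symmetric matrix, which is the conclusion reached in part (a).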
According to the equation (1) above, we can c...
- Spring '14
- E-mail, Singular value decomposition, Naive Bayes classifier