lecture11-vector-classify-handout-6-per

# Given a test doc, evaluate it for membership in each class



…threshold. Despite this similarity, there are noticeable performance differences.

- For separable problems, there is an infinite number of separating hyperplanes. Which one do you choose?
- What to do for non-separable problems?
- Different training methods pick different hyperplanes.
- Classifiers more powerful than linear often don't perform better on text problems. Why?

# Two-class Rocchio as a linear classifier (Sec. 14.2)

- Line or hyperplane defined by:

  $\sum_{i=1}^{M} w_i d_i = \theta$

- For Rocchio, set:

  $w = \mu(c_1) - \mu(c_2)$

  $\theta = 0.5 \times \left(|\mu(c_1)|^2 - |\mu(c_2)|^2\right)$

- [Aside for ML/stats people: Rocchio classification is a simplification of the classic Fisher Linear Discriminant where you don't model the variance (or assume it is spherical).]

# Naive Bayes is a linear classifier (Sec. 14.4)

- Two-class Naive Bayes: we compute the odds of the two classes.
- Decide class C if the odds is greater than 1, i.e., if the log odds is greater than 0.
- So the decision boundary is a hyperplane.

# A nonlinear problem (Sec. 14.4)

(Figure only; not included in this preview.)

# High Dimensional Data (Sec. 14.4)

- Pictures like the one at right are absolutely misleading!
- Documents are zero along almost all axes.
- Most document pairs are very far apart (i.e., not strictly orthogonal, but only share very common words and a few scattered others).
- In classification te… [preview truncated]
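The two-class Rocchio rule described above can be sketched in a few lines: train by taking class centroids, then classify with the hyperplane test w·d > θ. This is a minimal illustration with hypothetical helper names (`rocchio_train`, `rocchio_classify`), assuming documents are plain dense vectors (lists of floats):

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rocchio_train(docs1, docs2):
    """Return (w, theta) defining the hyperplane w.d = theta.

    w     = mu(c1) - mu(c2)                      (difference of centroids)
    theta = 0.5 * (|mu(c1)|^2 - |mu(c2)|^2)      (as on the slide)
    """
    mu1 = [sum(col) / len(docs1) for col in zip(*docs1)]
    mu2 = [sum(col) / len(docs2) for col in zip(*docs2)]
    w = [a - b for a, b in zip(mu1, mu2)]
    theta = 0.5 * (dot(mu1, mu1) - dot(mu2, mu2))
    return w, theta

def rocchio_classify(d, w, theta):
    """Class 1 if w.d > theta (d is closer to mu(c1)), else class 2."""
    return 1 if dot(w, d) > theta else 2
```

The test w·d > θ is algebraically the same as "d is nearer to centroid μ(c1) than to μ(c2)", which is why Rocchio is a linear classifier.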