lecture11-vector-classify-handout-6-per

Given a test document, evaluate it for membership in each class.
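As a concrete illustration of that rule (not part of the handout), here is a minimal Python sketch of centroid/Rocchio-style scoring: documents are assumed to already be tf-idf vectors (NumPy arrays), and the test document's membership in each class is evaluated by cosine similarity to that class's centroid. The function names and data layout are illustrative assumptions.

```python
# Minimal sketch (illustrative, not from the handout): score a test document
# against each class by cosine similarity to that class's centroid, then pick
# the best-scoring class.  Documents are assumed to be tf-idf NumPy vectors.
import numpy as np

def centroid(vectors):
    """Mean of the length-normalized training vectors for one class."""
    V = np.array([v / np.linalg.norm(v) for v in vectors])
    return V.mean(axis=0)

def classify(test_vec, centroids):
    """Return the class whose centroid is most cosine-similar to the test doc."""
    d = test_vec / np.linalg.norm(test_vec)
    scores = {c: float(d @ (mu / np.linalg.norm(mu))) for c, mu in centroids.items()}
    return max(scores, key=scores.get)

# Usage (hypothetical): centroids = {c: centroid(train_vecs[c]) for c in train_vecs}
#                       label = classify(doc_vec, centroids)
```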

…threshold

- Despite this similarity, noticeable performance differences
- For separable problems, there is an infinite number of separating hyperplanes. Which one do you choose?
- What to do for non-separable problems?
- Different training methods pick different hyperplanes
- Classifiers more powerful than linear often don't perform better on text problems. Why?

Two-class Rocchio as a linear classifier (Sec. 14.2)

- Line or hyperplane defined by: $\sum_{i=1}^{M} w_i d_i = \theta$
- For Rocchio, set: $\vec{w} = \vec{\mu}(c_1) - \vec{\mu}(c_2)$ and $\theta = 0.5 \times \left(|\vec{\mu}(c_1)|^2 - |\vec{\mu}(c_2)|^2\right)$
- [Aside for ML/stats people: Rocchio classification is a simplification of the classic Fisher Linear Discriminant where you don't model the variance (or assume it is spherical).]

Naive Bayes is a linear classifier (Sec. 14.4)

- Two-class Naive Bayes. We compute: $\log \frac{P(C \mid d)}{P(\overline{C} \mid d)} = \log \frac{P(C)}{P(\overline{C})} + \sum_{w \in d} \log \frac{P(w \mid C)}{P(w \mid \overline{C})}$
- Decide class $C$ if the odds are greater than 1, i.e., if the log odds are greater than 0.
- So the decision boundary is the hyperplane: $\alpha + \sum_{w} \beta_w n_w = 0$, where $\alpha = \log \frac{P(C)}{P(\overline{C})}$, $\beta_w = \log \frac{P(w \mid C)}{P(w \mid \overline{C})}$, and $n_w$ is the number of occurrences of $w$ in $d$

A nonlinear problem (Sec. 14.4)

High Dimensional Data (Sec. 14.4)

- Pictures like the one at right are absolutely misleading!
- Documents are zero along almost all axes
- Most document pairs are very far apart (i.e., not strictly orthogonal, but only share very common words and a few scattered others)
- In classification te…
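To make the "Two-class Rocchio as a linear classifier" slide above concrete, here is a minimal NumPy sketch (not from the handout; function names are illustrative). It computes $\vec{w} = \vec{\mu}(c_1) - \vec{\mu}(c_2)$ and $\theta = 0.5(|\vec{\mu}(c_1)|^2 - |\vec{\mu}(c_2)|^2)$, and decides class $c_1$ exactly when $\vec{w} \cdot \vec{d} > \theta$, which is equivalent to the test document being closer to $\vec{\mu}(c_1)$ than to $\vec{\mu}(c_2)$ in Euclidean distance.

```python
# Minimal sketch (illustrative): two-class Rocchio written as a linear classifier.
# With w = mu1 - mu2 and theta = 0.5 * (||mu1||^2 - ||mu2||^2), the test
# "w . d > theta" is exactly "d is closer (Euclidean) to mu1 than to mu2".
import numpy as np

def rocchio_train(class1_vecs, class2_vecs):
    mu1 = np.mean(class1_vecs, axis=0)      # centroid of class 1
    mu2 = np.mean(class2_vecs, axis=0)      # centroid of class 2
    w = mu1 - mu2                           # normal vector of the hyperplane
    theta = 0.5 * (np.dot(mu1, mu1) - np.dot(mu2, mu2))
    return w, theta

def rocchio_predict(doc_vec, w, theta):
    return 1 if np.dot(w, doc_vec) > theta else 2   # class label 1 or 2
```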
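Similarly, a hedged sketch of the "Naive Bayes is a linear classifier" slide: multinomial Naive Bayes with add-one smoothing, written explicitly as a linear classifier with bias $\alpha = \log \frac{P(C)}{P(\overline{C})}$ and per-term weights $\beta_w = \log \frac{P(w \mid C)}{P(w \mid \overline{C})}$; the document is assigned to $C$ when the log odds $\alpha + \sum_w n_w \beta_w$ is positive. The tokenized input format and function names are assumptions, not part of the handout.

```python
# Minimal sketch (illustrative): multinomial Naive Bayes as a linear classifier.
# beta_w = log P(w|C) - log P(w|~C) with add-one smoothing; alpha = log prior odds.
import math
from collections import Counter

def nb_train(docs_C, docs_notC):
    """docs_* are lists of token lists; returns (alpha, beta) defining the hyperplane."""
    counts_C, counts_N = Counter(), Counter()
    for d in docs_C:
        counts_C.update(d)
    for d in docs_notC:
        counts_N.update(d)
    vocab = set(counts_C) | set(counts_N)
    tot_C = sum(counts_C.values()) + len(vocab)   # add-one smoothing denominator
    tot_N = sum(counts_N.values()) + len(vocab)
    beta = {w: math.log((counts_C[w] + 1) / tot_C) - math.log((counts_N[w] + 1) / tot_N)
            for w in vocab}
    alpha = math.log(len(docs_C) / len(docs_notC))  # log P(C)/P(~C) from class frequencies
    return alpha, beta

def nb_predict(doc_tokens, alpha, beta):
    """Decide C iff the log odds alpha + sum_w n_w * beta_w is positive."""
    log_odds = alpha + sum(n * beta.get(w, 0.0) for w, n in Counter(doc_tokens).items())
    return "C" if log_odds > 0 else "not C"
```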