Foundations of Artificial Intelligence
Perceptrons and Optimal Hyperplanes
CS472 – Fall 2007
Thorsten Joachims

Example: Majority-Vote Function
Definition: Majority-Vote Function $f_{\text{majority}}$
– N binary attributes, i.e. $x \in \{0,1\}^N$
– If more than N/2 attributes in x are true, then $f_{\text{majority}}(x) = 1$, else $f_{\text{majority}}(x) = -1$.
How can we represent this function as a decision tree?
– Huge and awkward tree!
Is there an “easier” representation of $f_{\text{majority}}$?

Example: Spam Filtering
Instance Space X:
– Feature vector of word occurrences => binary features
– N features (N typically > 50000)
Target Concept c:
– Spam (+1) / Ham (-1)
Type of function to learn:
– Set of Spam words S, Set of Ham words H
– Classify as Spam (+1) if the example contains more Spam words than Ham words.

Linear Classification Rules
Hypotheses of the form
– unbiased: $h(x) = \mathrm{sign}(w \cdot x)$
– biased: $h(x) = \mathrm{sign}(w \cdot x + b)$
– Parameter vector w, scalar b
Hypothesis space H
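
Both examples above can be written as a linear classification rule. The sketch below is only an illustration, not code from the lecture: it assumes the usual biased form h(x) = sign(w · x + b), with a score of exactly zero mapped to -1 so that ties follow the "else -1" case. The names `linear_rule`, `majority_vote`, and `spam_weights`, and the toy vocabulary, are hypothetical.

```python
def linear_rule(w, b, x):
    """Biased linear rule: +1 if w.x + b > 0, otherwise -1 (assumed tie handling)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1


def majority_vote(x):
    """f_majority as a linear rule: every weight is 1 and b = -N/2."""
    n = len(x)
    return linear_rule([1.0] * n, -n / 2, x)


def spam_weights(vocab, spam_words, ham_words):
    """Weight +1 for Spam words, -1 for Ham words, 0 for all other words."""
    return [1.0 if t in spam_words else -1.0 if t in ham_words else 0.0
            for t in vocab]


# Toy vocabulary and word sets, made up for illustration only.
vocab = ["free", "winner", "meeting", "lecture"]
w_spam = spam_weights(vocab, {"free", "winner"}, {"meeting", "lecture"})

# Binary feature vector: which vocabulary words occur in the message.
message = [1, 1, 1, 0]                       # "free", "winner", "meeting"
label = linear_rule(w_spam, 0.0, message)    # two Spam words vs. one Ham word
```

With this encoding the majority-vote function needs only N weights and one threshold, compared with the huge decision tree, and the spam rule is the unbiased case (b = 0).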
