Ranking Problems 9.520 Class 09, 08 March 2006 Giorgos Zacharia

Supervised Ranking Problems
• Preference Modeling: Given a set of possible product configurations x_1, x_2, …, x_d, predict the most preferred one; predict the rating
• Information Retrieval: Given a query q and a set of candidate matches x_1, x_2, …, x_d, predict the best answer
• Information Extraction: Given a set of possible part-of-speech tagging choices x_1, x_2, …, x_d, predict the most correct tag boundaries
  – E.g. “The_day_they_shot_John_Lennon/WE at the Dogherty_Arts_Center/WE”
• Multiclass classification: Given a set of possible class labels y_1, y_2, …, y_d and confidence scores c_1, c_2, …, c_d, predict the correct label

Types of information available
• Preference modeling:
  – Metric based:
    • User rated configuration x_i with y_i = U(x_i)
  – Choice based:
    • Given choices x_1, x_2, …, x_d, the user chose x_f
  – Prior information about the features:
    • Cheaper is better
    • Faster is better
    • etc.
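
Both kinds of preference data can be reduced to pairwise comparisons of the form "x is preferred to x'". The following is a minimal sketch of that reduction, not a method taken from the slides; the function names `pairs_from_ratings` and `pairs_from_choice` are hypothetical, and treating ties as uninformative is an assumption.

```python
from itertools import combinations

def pairs_from_ratings(items, ratings):
    """Metric-based data: x_i is preferred to x_j whenever y_i > y_j.

    Ties produce no pair (an assumption of this sketch).
    """
    pairs = []
    for i, j in combinations(range(len(items)), 2):
        if ratings[i] > ratings[j]:
            pairs.append((items[i], items[j]))
        elif ratings[j] > ratings[i]:
            pairs.append((items[j], items[i]))
    return pairs

def pairs_from_choice(items, chosen):
    """Choice-based data: the chosen item x_f is preferred to every other option."""
    return [(chosen, x) for x in items if x != chosen]

rating_pairs = pairs_from_ratings(["a", "b", "c"], [3, 1, 3])
choice_pairs = pairs_from_choice(["a", "b", "c"], "b")
```

Each pair is ordered as (preferred, less preferred), so both sources of information end up in the same representation.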

Types of information available
• Information Retrieval:
  – Metric based:
    • Users clicked on link x_i with a frequency y_i = U(x_i)
  – Choice based:
    • Given choices x_1, x_2, …, x_d, the user clicked on x_f
  – Prior information about the features:
    • Keyword matches (the more the better)
    • Unsupervised similarity scores (TF-IDF)
    • etc.
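
As an illustration of an unsupervised similarity score, the sketch below ranks candidate matches by a simple TF-IDF score. It assumes whitespace tokenization, term frequency normalized by document length, and idf = log(N / df); this is one common variant among several, and the function name `tfidf_rank` is hypothetical.

```python
import math
from collections import Counter

def tfidf_rank(query, docs):
    """Return document indices sorted from best to worst match for the query."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if counts[term]:  # term occurs in this document
                tf = counts[term] / len(tokens)
                idf = math.log(n_docs / df[term])
                score += tf * idf
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)

docs = ["ranking svm ranking", "svm classifier", "cooking recipes"]
order = tfidf_rank("ranking", docs)
```

A term that appears in every document receives idf = 0 under this variant, so it does not influence the ranking.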
Types of information available
• Information Extraction:
  – Choice based:
    • Given tagging choices x_1, x_2, …, x_d, the hand labeling chose x_f
  – Prior information about the features:
    • Unsupervised scores
• Multiclass:
  – Choice based:
    • Given the vector of confidence scores c_1, c_2, …, c_d for class labels 1, 2, …, d, the correct label was y_f
    • The confidence scores may come from a set of weak classifiers and/or one-vs-all (OVA) comparisons
  – Prior information about the features:
    • The higher the confidence score, the more likely it is to represent the correct label
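
The prior that a higher confidence score indicates a more likely label corresponds to the baseline of predicting the top-scoring class. The sketch below shows that baseline and the rank position of the correct label under it; the helper names `predict_label` and `rank_of_label` and the example scores are hypothetical.

```python
def predict_label(scores):
    """Return the label (1..d) with the highest confidence score."""
    return max(range(len(scores)), key=lambda k: scores[k]) + 1

def rank_of_label(scores, label):
    """Position (1 = best) of `label` when labels are sorted by decreasing score."""
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return order.index(label - 1) + 1

scores = [0.2, 0.7, 0.1]           # c_1, c_2, c_3 for labels 1, 2, 3
predicted = predict_label(scores)  # label with the largest c_k
rank_if_true_is_1 = rank_of_label(scores, 1)
```

When the correct label is not ranked first, a learned ranking function over the score vector has room to improve on this baseline.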

(Semi-)Unsupervised Ranking Problems
• Learn relationships of the form: Class A is closer to B than it is to C
• We are given a set of l labeled comparisons for a user, and a set of u seemingly unrelated comparisons from other users
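
Comparisons of the form "A is closer to B than to C" can be stored as triplets (A, B, C) and checked against any candidate representation. The sketch below assumes each class is a point in Euclidean space, which the slides do not specify; the names `satisfies` and `triplet_accuracy` and the example coordinates are hypothetical.

```python
import math

def satisfies(points, a, b, c):
    """True if a is strictly closer to b than to c under Euclidean distance."""
    return math.dist(points[a], points[b]) < math.dist(points[a], points[c])

def triplet_accuracy(points, triplets):
    """Fraction of (a, b, c) comparisons that the representation agrees with."""
    hits = sum(satisfies(points, a, b, c) for a, b, c in triplets)
    return hits / len(triplets)

points = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (3.0, 0.0)}
triplets = [("A", "B", "C"),   # A is closer to B than to C: holds
            ("C", "A", "B")]   # C is closer to A than to B: does not hold
accuracy = triplet_accuracy(points, triplets)
```

The same check applies to the l labeled comparisons from one user and to the u comparisons from other users, which makes it possible to measure how well the two sets agree.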
