# 24 - Conventional Ranking Models Content relevance Boolean,...


Conventional Ranking Models

- Content relevance: Boolean, vector space, probabilistic, language model, ...
- Page importance
  - Link analysis: PageRank, HITS, ...
  - Query log mining, clickthroughs, ...

Machine learning for IR ranking?

- We've looked at methods for classifying documents using supervised machine learning classifiers: Naive Bayes, Rocchio, kNN, SVMs, ...
- Surely we can also use machine learning to rank the documents displayed in search results?
- Sounds like a good idea => "machine-learned relevance" or "learning to rank"

Learning to rank algorithms

- RankBoost (JMLR 2003)
- LDM (SIGIR 2005)
- RankNet (ICML 2005)
- IRSVM (SIGIR 2006)
- LambdaRank (NIPS 2006)
- FRank (SIGIR 2007)
- GBRank (SIGIR 2007)
- QBRank (NIPS 2007)
- MPRank (ICML 2007)
- McRank (NIPS 2007)
- AdaRank (SIGIR 2007)
- SVM-MAP (SIGIR 2007)
- SoftRank (LR4IR 2007)
- GPRank (LR4IR 2007)
- CCA (SIGIR 2007)
- MHR (SIGIR 2007)
- SVM Structure (JMLR 2005)
- Nested Ranker (SIGIR 2006)
- (TOIS 1989) [algorithm name missing in source]
- OAP-BPM (ICML 2003)
- Learning to retrieval info (SCC 1995)

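Simple example: Using classification for ad hoc IR

- Collect a training corpus of (q, d, r) triples: query, document, relevance judgment
- Relevance r is binary (relevant or nonrelevant)
- Document is represented by a feature vector x = (α, ω):
  - α is the cosine similarity between query and document
  - ω is the minimum query window size, i.e. the shortest span of document text that contains all the query terms
- Query term proximity is a very important new weighting factor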
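
As a rough illustration of the two features above, the sketch below computes α and ω for a tokenized query and document. It is a sketch under stated assumptions: the function names are hypothetical, the cosine uses raw term counts with no tf-idf weighting, and ω is measured in tokens. The slides do not specify any of these details.

```python
import math
from collections import Counter


def cosine_similarity(query_terms, doc_terms):
    """Cosine between raw term-count vectors (assumption: no idf weighting)."""
    q, d = Counter(query_terms), Counter(doc_terms)
    dot = sum(q[t] * d[t] for t in q)
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    d_norm = math.sqrt(sum(v * v for v in d.values()))
    return dot / (q_norm * d_norm) if q_norm and d_norm else 0.0


def min_query_window(query_terms, doc_terms):
    """Length in tokens of the shortest span of doc_terms that contains every
    distinct query term, or None if some query term never occurs."""
    needed = set(query_terms)
    if not needed:
        return None
    counts = Counter()   # occurrences of each query term inside the window
    covered = 0          # number of distinct query terms inside the window
    best = None
    left = 0
    for right, term in enumerate(doc_terms):
        if term in needed:
            counts[term] += 1
            if counts[term] == 1:
                covered += 1
        # Shrink from the left while the window still covers all query terms.
        while covered == len(needed):
            width = right - left + 1
            if best is None or width < best:
                best = width
            dropped = doc_terms[left]
            if dropped in needed:
                counts[dropped] -= 1
                if counts[dropped] == 0:
                    covered -= 1
            left += 1
    return best


def features(query_terms, doc_terms):
    """Feature vector x = (alpha, omega) for one (query, document) pair."""
    return (cosine_similarity(query_terms, doc_terms),
            min_query_window(query_terms, doc_terms))


query = ["ranking", "models"]
doc = "learning to rank uses ranking features and models for ranking models".split()
alpha, omega = features(query, doc)
```

Returning `None` when a query term is missing is one possible convention; a real system would need to decide how such documents are represented before scoring them.
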

Simple example: Using classification for ad hoc IR

- A linear score function is then: Score(d, q) = Score(α, ω) = aα + bω + c
- And the linear classifier is: decide relevant if Score(d, q) > θ
- ... this is exactly like text classification

Simple example: Using classification for ad hoc IR

[Figure: training examples plotted by their two features, each marked R (relevant) or N (nonrelevant), with a linear decision surface separating the two groups.]

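
The score function and threshold rule above can be sketched in a few lines. The weights `a`, `b`, `c` and the threshold `theta` below are made-up illustrative values, not learned ones, and the `rank` helper is a hypothetical addition showing how the same score could order results instead of only classifying them.

```python
def score(alpha, omega, a, b, c):
    """Linear score: Score(alpha, omega) = a*alpha + b*omega + c."""
    return a * alpha + b * omega + c


def is_relevant(alpha, omega, a, b, c, theta):
    """Linear classifier: decide relevant if the score exceeds theta."""
    return score(alpha, omega, a, b, c) > theta


def rank(candidates, a, b, c):
    """Order (doc_id, alpha, omega) tuples by descending score."""
    return sorted(candidates,
                  key=lambda doc: score(doc[1], doc[2], a, b, c),
                  reverse=True)


# Illustrative values only. b is negative on the assumption that a wider
# query window (terms further apart) should lower the score.
a, b, c, theta = 1.0, -0.1, 0.0, 0.3

candidates = [("d1", 0.73, 2), ("d2", 0.20, 10), ("d3", 0.50, 4)]
ordered = [doc_id for doc_id, _, _ in rank(candidates, a, b, c)]
decisions = {doc_id: is_relevant(al, om, a, b, c, theta)
             for doc_id, al, om in candidates}
```

With these values, a document scores higher when its cosine similarity is larger and its query window is tighter.
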
Extending the model

- We can generalize this to classifier functions over more features
- We can use methods we have seen previously for learning the linear classifier weights

Machine learning for IR ranking

- This "good idea" has been actively researched, and actively deployed at major web search engines, in the last 5 years
- Why didn't it happen earlier?
  - Modern supervised ML has been around for about 15 years
  - Naive Bayes has been around for about 45 years!
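
To make "learning the linear classifier weights" from the Extending the model slide concrete, here is a minimal sketch that fits a, b, and c from (α, ω, r) examples. It uses a plain perceptron purely for brevity; that choice, the function names, the toy data, and folding the threshold θ into the bias c (so the rule becomes score > 0) are all assumptions, not something the slides prescribe.

```python
def predict(weights, alpha, omega):
    """Return 1 (relevant) if a*alpha + b*omega + c > 0, else 0."""
    a, b, c = weights
    return 1 if a * alpha + b * omega + c > 0 else 0


def train_perceptron(examples, epochs=100, lr=0.1):
    """Fit (a, b, c) with the perceptron update rule.

    examples: list of ((alpha, omega), r) pairs with r in {0, 1}.
    Stops early once a full pass makes no mistakes.
    """
    a = b = c = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (alpha, omega), r in examples:
            error = r - predict((a, b, c), alpha, omega)  # -1, 0, or +1
            if error != 0:
                a += lr * error * alpha
                b += lr * error * omega
                c += lr * error
                mistakes += 1
        if mistakes == 0:
            break
    return a, b, c


# Hypothetical training triples reduced to features: relevant documents have
# high cosine similarity and a tight query window.
training = [
    ((0.9, 2), 1), ((0.8, 3), 1), ((0.7, 2), 1),
    ((0.2, 10), 0), ((0.3, 8), 0), ((0.1, 12), 0),
]
weights = train_perceptron(training)
predictions = [predict(weights, al, om) for (al, om), _ in training]
```

The perceptron only converges when the classes are linearly separable, as in this toy data. Generalizing to more features means replacing the pair (α, ω) with a longer vector and keeping one weight per feature.
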

Machine learning for IR ranking

- There's some truth to the fact that the IR community wasn't very connected to the ML community
- But there were a whole bunch of precursors:

## This note was uploaded on 01/21/2011 for the course CSCP 689 taught by Professor James during the Spring '10 term at Texas A&M.
