IR-part2

with the Jaccard coefficient

Introduction to Information Retrieval, Ch. 6

Take 1: Jaccard coefficient
- A commonly used measure of overlap of two sets A and B is the Jaccard coefficient.
- jaccard(A,B) = |A ∩ B| / |A ∪ B|
- jaccard(A,A) = 1
- jaccard(A,B) = 0 if A ∩ B = ∅
- A and B don't have to be the same size.
- It always assigns a number between 0 and 1.

Jaccard coefficient: Scoring example
- What is the query-document match score that the Jaccard coefficient computes for each of the two documents below?
- Query: ides of march
- Document 1: caesar died in march
- Document 2: the long march

Issues with Jaccard for scoring
- It doesn't consider term frequency (how many times a term occurs in a document).
- Rare t...
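The scoring example above can be worked through with a short Python sketch. It is an illustration, not part of the original slides: the `tokenize` helper (lowercase, split on whitespace) and the choice to return 0.0 for two empty sets are assumptions made here, and the repeated-term document at the end is an invented variation of Document 2.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        # Assumption for this sketch: the formula is undefined for two
        # empty sets, so return 0.0 rather than dividing by zero.
        return 0.0
    return len(a & b) / len(a | b)


def tokenize(text):
    # Hypothetical helper: lowercase the text and split on whitespace.
    return text.lower().split()


query = tokenize("ides of march")
doc1 = tokenize("caesar died in march")
doc2 = tokenize("the long march")

score1 = jaccard(query, doc1)  # 1 shared term out of 6 distinct terms
score2 = jaccard(query, doc2)  # 1 shared term out of 5 distinct terms

# Invented variation of Document 2: repeating "march" leaves the set of
# terms unchanged, so the score is the same as for Document 2.
score_repeated = jaccard(query, tokenize("the long march march march"))

print(f"Document 1: {score1:.3f}")
print(f"Document 2: {score2:.3f}")
print(f"Document 2 with repeats: {score_repeated:.3f}")
```

Under these assumptions Document 1 scores 1/6 and Document 2 scores 1/5, since each shares only "march" with the query. The repeated-term variation scores the same as Document 2, which illustrates the first issue listed above: the set-based measure ignores how many times a term occurs.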
