Introduction to Information Retrieval



with the Jaccard coefficient

Introduction to Information Retrieval, Ch. 6

Take 1: Jaccard coefficient
- A commonly used measure of overlap of two sets A and B is the Jaccard coefficient
- jaccard(A,B) = |A ∩ B| / |A ∪ B|
- jaccard(A,A) = 1
- jaccard(A,B) = 0 if A ∩ B = ∅
- A and B don’t have to be the same size.
- Always assigns a number between 0 and 1.

Jaccard coefficient: Scoring example
- What is the query-document match score that the Jaccard coefficient computes for each of the two documents below?
- Query: ides of march
- Document 1: caesar died in march
- Document 2: the long march

Issues with Jaccard for scoring
- It doesn’t consider term frequency (how many times a term occurs in a document)
- Rare t...
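The scoring example above can be worked through with a short Python sketch. It is only an illustration under stated assumptions: the helper names `tokenize` and `jaccard` are hypothetical, text is lowercased and split on whitespace with no stemming or stop-word removal, and returning 0.0 for two empty sets is a convention chosen here, since the slides do not cover that case.

```python
def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets are empty, so the ratio is undefined.
        # Returning 0.0 is an assumption made for this sketch.
        return 0.0
    return len(a & b) / len(union)


query = "ides of march"
documents = {
    "Document 1": "caesar died in march",
    "Document 2": "the long march",
}

# Score each document against the query using term sets only.
scores = {
    name: jaccard(tokenize(query), tokenize(text))
    for name, text in documents.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
# Document 1: 0.167   (1 shared term out of 6 distinct terms)
# Document 2: 0.200   (1 shared term out of 5 distinct terms)
```

Both documents share only the term "march" with the query, so Document 2 scores higher purely because it is shorter. Because the inputs are reduced to sets, repeating a term in a document would not change its score, which is the term-frequency issue raised above.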

This document was uploaded on 02/14/2014.
