IR-part2

# Introduction to Information Retrieval

... with the Jaccard coefficient

Introduction to Information Retrieval, Ch. 6

## Take 1: Jaccard coefficient

- A commonly used measure of the overlap of two sets A and B is the Jaccard coefficient.
- jaccard(A, B) = |A ∩ B| / |A ∪ B|
- jaccard(A, A) = 1
- jaccard(A, B) = 0 if A ∩ B = ∅
- A and B don't have to be the same size.
- It always assigns a number between 0 and 1.

## Jaccard coefficient: Scoring example

- What is the query-document match score that the Jaccard coefficient computes for each of the two documents below?
- Query: ides of march
- Document 1: caesar died in march
- Document 2: the long march

## Issues with Jaccard for scoring

- It doesn't consider term frequency (how many times a term occurs in a document).
- Rare t...
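
The Jaccard scoring example can be worked through with a minimal Python sketch. It assumes the query and documents are lowercased and split on whitespace; the `tokenize` helper is hypothetical and not part of the slides. Returning 0.0 when both sets are empty is also an assumption of this sketch, since the ratio is undefined in that case. Under these assumptions, Document 1 shares only "march" with the query out of 6 distinct terms, and Document 2 shares "march" out of 5 distinct terms.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets are empty, so the ratio is undefined.
        # Returning 0.0 here is an assumption of this sketch.
        return 0.0
    return len(a & b) / len(union)


def tokenize(text):
    # Hypothetical tokenizer: lowercase, then split on whitespace.
    return text.lower().split()


query = tokenize("ides of march")
doc1 = tokenize("caesar died in march")
doc2 = tokenize("the long march")

print(jaccard(query, doc1))  # 1 shared term / 6 distinct terms = 1/6
print(jaccard(query, doc2))  # 1 shared term / 5 distinct terms = 1/5

# Repeating a term does not change the score, because sets ignore
# how many times a term occurs (the term-frequency issue).
doc2_repeated = tokenize("the long march march march")
print(jaccard(query, doc2_repeated))  # still 1/5
```

Document 2 scores higher than Document 1 only because it is shorter. The last call shows why the measure ignores term frequency: converting a document to a set discards repeated occurrences.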

## This document was uploaded on 02/14/2014.
