

Sangmi Lee Pallickara, CS480 Principles of Data Management, Spring 2013

$$\mathrm{idf}_{t,C_Q} = \log \frac{N}{\left|\{\, c \mid (a, v) \in OD(c) \wedge t \in \mathrm{tokenize}(v) \,\}\right|}$$

for the total number of candidates, N.

•  Assigns higher weights to tokens that occur less frequently in the scope of all candidate descriptions.

CID   Name
c1    Allstate
c2    American Automobile Association
c3    American National Insurance Company
c4    Farmers Insurance
c5    GEICO
c6    John Hancock Insurance
c7    Liberty Insurance Company
c8    Mutual Insurance of America Life Insurance
c9    Safeway Insurance Group
c10   …

Tokens (Q): Allstate, America, Automobile, Association, National, Insurance, Farmers, GEICO, John, Hancock, Liberty, Westfield, …

Example continued

•  Compute the similarity between the two strings s1 = Farmers Insurance and s2 = Liberty Insurance.

Token (Q)     V (s1)   W (s2)
Allstate      0        0
America       0        0
Automobile    0        0
Association   0        0
National      0        0
Insurance     0.22     0.22
Farmers       1        0
GEICO         0        0
John          0        0
Hancock       0        0
Liberty       0        1
Westfield     0        0
…

•  $\text{tf-idf}_{\text{Farmers},\,c_4} = (1 + \log_{10} 1) \times \log_{10}(10/1) = 1$
•  $\text{tf-idf}_{\text{Insurance},\,c_4} = (1 + \log_{10} 1) \times \log_{10}(10/6) \approx 0.22$
•  $\text{tf-idf}_{\text{Liberty},\,c_7} = (1 + \log_{10} 1) \times \log_{10}(10/1) = 1$
•  $\text{tf-idf}_{\text{Insurance},\,c_7} = (1 + \log_{10} 1) \times \log_{10}(10/6) \approx 0.22$

•  Six of the candidates contain the token “Insurance”.
•  idf Insura...
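
The weights in the example can be reproduced with a short script. This is a minimal sketch, not the course's implementation: it takes N = 10 and the document frequencies stated in the example (Farmers 1, Liberty 1, Insurance 6) as given, and it assumes cosine similarity as the measure for comparing s1 and s2, which the preview does not name. The names `DF`, `tokenize`, `tf_idf_vector`, and `cosine` are illustrative.

```python
import math

N = 10  # total number of candidates, as stated in the example

# Document frequencies taken from the example: the number of candidates
# whose description contains each token.
DF = {"Farmers": 1, "Liberty": 1, "Insurance": 6}


def tokenize(s):
    """Split a string into whitespace-separated tokens."""
    return s.split()


def tf_idf_vector(s):
    """Map each token of s to (1 + log10 tf) * log10(N / df)."""
    tokens = tokenize(s)
    vec = {}
    for t in set(tokens):
        tf = tokens.count(t)
        vec[t] = (1 + math.log10(tf)) * math.log10(N / DF[t])
    return vec


def cosine(v, w):
    """Cosine similarity of two sparse vectors (assumed measure)."""
    dot = sum(v[t] * w[t] for t in v.keys() & w.keys())
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    norm_w = math.sqrt(sum(x * x for x in w.values()))
    return dot / (norm_v * norm_w)


V = tf_idf_vector("Farmers Insurance")  # weights for s1
W = tf_idf_vector("Liberty Insurance")  # weights for s2
similarity = cosine(V, W)

print(V)
print(W)
print(round(similarity, 3))
```

Because "Insurance" appears in six of the ten candidates, it carries little weight, so the two strings come out as only weakly similar even though they share a token.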

This note was uploaded on 02/11/2014 for the course CS 480 taught by Professor Staff during the Spring '08 term at Colorado State.
