

ing both c1 and c2

Dan Jurafsky

Using information content for similarity: the Resnik method

Philip Resnik. 1995. Using Information Content to Evaluate Semantic Similarity in a Taxonomy. IJCAI 1995.
Philip Resnik. 1999. Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language. JAIR 11, 95-130.

•  The similarity between two words is related to their common information
•  The more two words have in common, the more similar they are
•  Resnik: measure common information as:
    •  The information content of the most informative (lowest) subsumer (MIS/LCS) of the two nodes
    •  $\mathrm{sim}_{resnik}(c_1, c_2) = -\log P(\mathrm{LCS}(c_1, c_2))$

Dekang Lin method

Dekang Lin. 1998. An Information-Theoretic Definition of Similarity. ICML

•  Intuition: Similarity between A and B is not just what they have in common
•  The more differences between A and B, the less similar they are:
    •  Commonality: the more A and B have in common, the more similar they are
    •  Difference: the more differences between A and B, the less similar
•  Commonality: IC(common(A,B))
•  Difference: IC(description(A,B)) - IC(common(A,B))

Dekang Lin similarity theorem

•  The similarity between A and B is measured by the ratio between the amount of information needed to state the commonality of A and B and the information needed to fully describe what A and B are

$$\mathrm{sim}_{Lin}(A, B) \propto \frac{IC(\mathrm{common}(A, B))}{IC(\mathrm{description}(A, B))}$$

•  Lin (altering Resnik) defines IC(common(A,B)) as 2 x information of the LCS

$$\mathrm{sim}_{Lin}(c_1, c_2) = \frac{2 \log P(\mathrm{LCS}(c_1, c_2))}{\log P(c_1) + \log P(c_2)}$$

Lin similarity function

$$\mathrm{sim}_{Lin}(A, B) = \frac{2 \log P(\mathrm{LCS}(c_1, c_2))}{\log P(c_1) + \log P(c\ldots}$$

sim_Lin(hill, coast) = ...
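The Resnik and Lin measures above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: the toy taxonomy `PARENT`, the probabilities in `P`, and the helper names (`ancestors`, `lcs`, `sim_resnik`, `sim_lin`) are all hypothetical and chosen only to make the arithmetic visible. It assumes a tree-shaped taxonomy in which P(c) never increases going down, so the lowest common subsumer is also the most informative one.

```python
import math

# Hypothetical toy taxonomy: each concept maps to its parent (root -> None).
PARENT = {
    "entity": None,
    "geological-formation": "entity",
    "hill": "geological-formation",
    "coast": "geological-formation",
    "coin": "entity",
}

# Hypothetical probabilities P(c), made up for illustration only.
# P(c) is meant to cover c and everything below it, so P(root) = 1.
P = {
    "entity": 1.0,
    "geological-formation": 0.01,
    "hill": 0.001,
    "coast": 0.002,
    "coin": 0.005,
}


def ancestors(c):
    """Return c followed by its ancestors, ordered from c up to the root."""
    path = []
    while c is not None:
        path.append(c)
        c = PARENT[c]
    return path


def lcs(c1, c2):
    """Lowest common subsumer: the first node above c1 that also subsumes c2."""
    above_c2 = set(ancestors(c2))
    for node in ancestors(c1):
        if node in above_c2:
            return node
    raise ValueError("concepts share no ancestor")


def sim_resnik(c1, c2):
    """Resnik similarity: information content of the LCS, -log P(LCS)."""
    return -math.log(P[lcs(c1, c2)])


def sim_lin(c1, c2):
    """Lin similarity: 2 log P(LCS) / (log P(c1) + log P(c2)).

    Undefined (division by zero) when both concepts have P = 1.
    """
    return 2 * math.log(P[lcs(c1, c2)]) / (math.log(P[c1]) + math.log(P[c2]))


if __name__ == "__main__":
    for pair in [("hill", "coast"), ("hill", "coin"), ("hill", "hill")]:
        print(pair, lcs(*pair), sim_resnik(*pair), sim_lin(*pair))
```

With these made-up numbers, `hill` and `coast` share the LCS `geological-formation`, while `hill` and `coin` only meet at the root, whose information content is zero, so both measures drop to zero for that pair. Lin's ratio is bounded by 1 (reached when a concept is compared with itself), whereas Resnik's score is unbounded and depends only on the LCS.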

This document was uploaded on 02/14/2014.
