cis6930fa11_QueryDrivenER

cis6930fa11_QueryDrivenER - Query-time Entity Resolution...


Query-time Entity Resolution
Indrajit Bhattacharya, Lise Getoor
Presentation: Sean Goldberg

Overview
  - Introduction
  - Entity Resolution and Queries: Formulation
  - Collective Entity Resolution and Relational Clustering
  - Analysis of Collective ER Using Relational Clustering
  - Collective Resolution for Queries
  - Adaptive Query Expansion
  - Experiments
  - Conclusions

Introduction
  - The Entity Resolution Problem
  - ER Aliases
  - Why is Entity Resolution Useful?
  - Common Ways of Doing ER
  - Limitations
  - Query Time Entity Resolution (QTER)
  - Basic Steps

The Entity Resolution Problem
  [Figure: name variants that may or may not refer to the same person]
  Steven M. Smith, Stephen Smith, Samuel Smith, S.M. Smith, S. Smith, Steve Smith, S. Smith, Sam Smith, Stephen Smith

ER Aliases
  - Entity Resolution
  - Deduplication
  - Fuzzy Match Problem
  - Record Linkage
  - Object Consolidation
  - Reference Reconciliation

Why is ER Useful?
  - Resolution of Duplicates
  - Consolidation
  - Maximize Information Content
  - Disambiguation
  - Identification

Why is ER Useful?
  [Animation]

Common Approaches to ER
  - Attribute Similarity (traditional)
  - Exact Match
  - String Edit Distance
  - Cosine Similarity
  - TF-IDF
  - Clustering or Feature Extraction
  - Secondary Source Information
  - Relational Similarity
  - Affiliation or Co-Authorship
  - Collective Resolution

Limitations to Current Approaches
  - Traditional approaches use only string matching on attributes
  - Fields can be written in many different ways: abbreviations, spelling errors, nicknames
  - Some schemas may be completely different
  - Matching threshold difficult to determine
  - Requires resolution of entire database
  - Computationally hard
  - Difficult to adjust to persistent data

Query Time Entity Resolution
  - Allow user to query unresolved or partially resolved database
  - Resolve only relevant entities pertaining to specific queries on-the-fly

Basic Steps
  - Extract relevant records by a recursive expansion technique
  - Select 'most informative' records using an adaptive algorithm
  - Resolve selected records collectively using relational clustering

Entity Resolution and Queries: Formulation
  - Definitions
  - Running Example: Citation Matching
  - ER Queries
  - Problem Definition

  [Figure: references r1, r2, r3, each with attributes A1 and A2, shown with hyperedge h1 and entities E1 and E2]

Running Citation Example
  [Figure: author references r1 to r10 grouped into papers by hyperedges h1 to h4]
  Author names shown: C Chen, A Ansari, L Li, C Chen, W Wang, A Ansari, W W Wang, A Ansari, W Wang, W Wang
  Paper titles shown: A Mouse Immunity Model; A Better Mouse Immunity Model; Measuring Protein-bound Fluxetine; Autoimmunity in Biliary Cirrhosis

ER Queries
  - Consider R = {r} and A = {a}
  - Return entities based on attribute: Q(R.A = a)
      - Should only return unique entities
  - Return records based on entity: Q(R.A = r1.A) such that E(R) = E(r1)
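The two query forms on the ER Queries slide can be illustrated with a minimal Python sketch. It is not the authors' method: the function names, the toy reference list, and the entity labels e1 to e3 are all hypothetical, and the entity labels are simply assumed to be known, whereas in query-time ER they would have to be produced by resolving the relevant references first.

```python
# Hypothetical toy data: (reference id, name attribute A, entity label E).
# The entity labels are invented for illustration only; in practice they
# are the output of entity resolution, not an input.
REFERENCES = [
    ("r1", "W Wang", "e1"),
    ("r2", "W Wang", "e1"),
    ("r3", "W Wang", "e2"),
    ("r4", "W W Wang", "e1"),
    ("r5", "A Ansari", "e3"),
]


def entities_for_attribute(refs, a):
    """Q(R.A = a): the unique entities that have a reference with attribute a."""
    # A set removes duplicates, so each entity is returned once.
    return sorted({entity for _, name, entity in refs if name == a})


def records_for_entity(refs, r1_id):
    """Q(R.A = r1.A) such that E(R) = E(r1).

    Returns the ids of references that share both the attribute value
    and the entity of the given reference r1.
    """
    by_id = {rid: (name, entity) for rid, name, entity in refs}
    name1, entity1 = by_id[r1_id]
    return [
        rid
        for rid, name, entity in refs
        if name == name1 and entity == entity1
    ]


if __name__ == "__main__":
    print(entities_for_attribute(REFERENCES, "W Wang"))
    print(records_for_entity(REFERENCES, "r1"))
```

In this toy data the name "W Wang" is shared by two different entities, so the first query returns two entities even though three references carry that name, while the second query returns only the "W Wang" references that belong to the same entity as r1.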