CorroborateLearnFacts

1 Corroborate and learn facts from the web
Shubin Zhao and Jonathan Betz
Presentation by Yang Yu

2 Problem Definition

Known Facts
Entity_Name: Angelina Jolie
Date of Birth: June 4, 1975

More Facts
Entity_Name: Angelina Jolie
Date of Birth: June 4, 1975
Academy Awards: ?
Place of birth: ?

3 Problem Definition

Entity_Name: Angelina Jolie
Date of Birth: June 4, 1975
Academy Awards: ?
Place of birth: ?

More Facts
<tr> <td> Date of Birth</td> <td>June 4, 1975</td> </tr>
<tr> <td> Academy Awards </td> <td>……</td> </tr>

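To make the idea on this slide concrete, here is a minimal sketch of reading attribute/value pairs out of table rows like the ones shown. It is an illustration only, not the GRAZER implementation: the class `RowPairParser` and the function `extract_pairs` are hypothetical names, and the sketch assumes each fact sits in a two-cell `<tr>` row. The elided cell value from the slide is kept as-is.

```python
from html.parser import HTMLParser


class RowPairParser(HTMLParser):
    """Collect the cell texts of each <tr> as a list of strings (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None outside a row
        self._cell = None   # text chunks of the cell being read, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Join the chunks and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_pairs(html):
    """Return (attribute, value) tuples from rows that have exactly two cells."""
    parser = RowPairParser()
    parser.feed(html)
    parser.close()
    return [(row[0], row[1]) for row in parser.rows if len(row) == 2]


snippet = (
    "<tr> <td> Date of Birth</td> <td>June 4, 1975</td> </tr>"
    "<tr> <td> Academy Awards </td> <td>……</td> </tr>"
)
pairs = extract_pairs(snippet)
```

In this sketch, a row whose pair matches a known fact (here, Date of Birth) would suggest that sibling rows in the same table hold further facts about the same entity.
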
4 GRAZER System Overview
5 Retrieve Relevant Pages
- The page contains the entity names
- Matching anchor text of a page with entity names
- Address ambiguity
- MAPREDUCE

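The first two retrieval criteria above could be sketched as a simple filter. This is a rough illustration under stated assumptions, not the method from the paper: `normalize` and `is_candidate_page` are hypothetical names, matching is done on lowercased ASCII alphanumeric tokens only, and the ambiguity handling and MapReduce distribution mentioned on the slide are left out.

```python
import re


def normalize(text):
    """Lowercase the text and collapse every non-alphanumeric run to one space."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def is_candidate_page(page_text, anchor_texts, entity_names):
    """True if a name occurs in the page text or equals an inbound anchor text."""
    names = {normalize(n) for n in entity_names}
    names.discard("")
    # Pad with spaces so a name only matches on whole-token boundaries.
    padded = " " + normalize(page_text) + " "
    if any(" " + name + " " in padded for name in names):
        return True
    return any(normalize(a) in names for a in anchor_texts)


by_text = is_candidate_page(
    "Angelina Jolie was born on June 4, 1975.", [], ["Angelina Jolie"]
)
by_anchor = is_candidate_page(
    "A page about something else.", ["angelina jolie"], ["Angelina Jolie"]
)
no_match = is_candidate_page(
    "A page about something else.", ["recipes"], ["Angelina Jolie"]
)
```

A filter like this is stateless per page, which is one reason such a step lends itself to a map phase; how the original system partitions the work is not stated in the slides.
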
6 Corroborate Known Facts
- Avoid wrong corroboration on common facts

7 Corroborate Known Facts
- Corroboration Strategies
- Lexicographical sorting of tokens
- Using synonyms of attribute names
- Not counting stopwords
- Matching of attribute name is optional
- MAPREDUCE

8 Extract New Facts

9 Extract New Facts: An example

10 Extract New Facts
- Bootstrapping

11 Experiments
- Experiments on Country Facts

12 Experiments
- Experiments on Wikipedia Facts

13 Experiments
- Experiments on Wikipedia Facts

14 Conclusions
- Difference with related work
- Wrappers are generated dynamically
- Using the content examples to locate and to label the extracted data
- Bootstrapping focused on structured text in HTML

15 Questions and Comments

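The corroboration strategies listed on slide 7 (sorting tokens lexicographically, accepting synonyms of attribute names, ignoring stopwords, and making the attribute-name match optional) could be combined roughly as follows. The slides only name the strategies, so how they fit together here is an assumption; the stopword list, the synonym table, and the function names `key` and `corroborates` are hypothetical.

```python
import re

# Hypothetical stopword list and synonym table, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and"}
ATTRIBUTE_SYNONYMS = {"date of birth": {"born", "birth date", "birthday"}}


def key(text):
    """Order-insensitive key: lowercased tokens, stopwords dropped, sorted."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return tuple(sorted(t for t in tokens if t not in STOPWORDS))


def corroborates(fact, candidate, require_name=True):
    """Check whether a (name, value) pair found on a page supports a known fact."""
    fact_name, fact_value = fact
    cand_name, cand_value = candidate
    # Values must agree after normalization, whatever the token order.
    if key(fact_value) != key(cand_value):
        return False
    # Matching the attribute name is optional.
    if not require_name:
        return True
    accepted = {fact_name} | ATTRIBUTE_SYNONYMS.get(fact_name.lower(), set())
    return key(cand_name) in {key(n) for n in accepted}


known = ("Date of Birth", "June 4, 1975")
reordered = corroborates(known, ("Born", "4 June 1975"))
wrong_value = corroborates(known, ("Date of Birth", "June 5, 1975"))
name_skipped = corroborates(known, ("Trivia", "June 4, 1975"), require_name=False)
```

Sorting the tokens makes "June 4, 1975" and "4 June 1975" compare equal, which is the point of the lexicographical-sorting strategy. Dropping the name check raises recall but also the risk of false corroboration on common values, the concern raised on slide 6.
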
This note was uploaded on 08/06/2008 for the course CSE 450, taught by Professor Davison during the Spring '08 term at Lehigh University.
