AnatomyOfASearchEngine - The Anatomy of a LargeScale...

The Anatomy of a Large-Scale Hypertextual Web Search Engine
By Sergey Brin and Lawrence Page
Presented by Joshua Haley, Zeyad Zainal, Michael Lopez, Michael Galletti, Britt Phillips, Jeff Masson
8/1/11

Searching in the 90s
- Search engine technology had to cope with rapid growth.
[Charts: Web Pages Indexed, 1994 v. 1997; Queries Per Day, 1994 v. 1997]

Google will Scale
They wanted a search engine that:
- Crawls quickly
- Uses storage space efficiently
- Processes indexes quickly
- Handles queries quickly
They had to deal with scaling difficulties:
- Disk speeds and OS robustness not

The Google Goals
- Improve search quality
  - Remove junk results (prioritizing of results)
- Academic search engine research
  - Create literature on the subject of databases
- Gather usage data
  - Databases can support research

System Features
Two important features help Google produce high-precision results:
- PageRank
- Anchor text

PageRank
- The graph structure of hyperlinks hadn’t been used by other search engines
- Graph of 518 million hyperlinks
- Text matching using page titles performs well after pages are prioritized
- Similar results when looking at entire pages

PageRank Formula
- Not all pages linking to others are counted equally
PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
- A: a page
- T1...Tn: the pages linking to A
- C(A): the number of links going out of page A
- d: “damping factor”

Intuitive Justification
- A page can have a high PageRank if many pages link to it
- Or if a page with a high PageRank links to it (e.g., Yahoo News)
- A page would not be linked to if it were not high quality or if its link were broken
- PageRank handles these cases by propagating the weights of different
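
The formula and its intuitive justification can be tried out on a toy graph. The following is a minimal sketch, not the authors' implementation: it assumes a simple fixed number of update rounds, a damping factor of d = 0.85, and a starting PageRank of 1.0 for every page. The function name `pagerank`, the `links` dictionary, and the three-page graph are hypothetical and chosen only for illustration.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1 - d) + d * sum(PR(T) / C(T)).

    links maps each page to the list of pages it links to.
    d and iterations are assumed values for this sketch.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links) | {t for targets in links.values() for t in targets}

    # incoming[p] holds the pages T1...Tn that link to p.
    incoming = {p: [] for p in pages}
    for source, targets in links.items():
        for target in set(targets):
            incoming[target].append(source)

    # out_count[p] is C(p), the number of distinct links going out of p.
    out_count = {p: len(set(links.get(p, []))) for p in pages}

    # Assumed starting point: every page has PageRank 1.0.
    pr = {p: 1.0 for p in pages}
    for _ in range(iterations):
        pr = {
            p: (1 - d) + d * sum(pr[t] / out_count[t] for t in incoming[p])
            for p in pages
        }
    return pr


# Hypothetical toy graph: A links to B and C, B links to C, C links to A.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph C is linked to by both A and B, while B only receives half of A's weight, so C ends up ranked above B. That matches the intuition that both the number of incoming links and the rank of the linking pages matter.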

Anchor Text
- Anchors often provide more accurate descriptions of a page than the page itself.
- Anchors exist for documents that
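
One way to picture the anchor text idea is an index that files each link's text under the page the link points to, rather than under the page that contains the link. The following is a minimal sketch under that assumption, not Google's actual data structure: the names `build_anchor_index` and `search`, the whitespace tokenization, and the example.com URLs are all hypothetical.

```python
from collections import defaultdict


def build_anchor_index(links):
    """Map each anchor word to the set of pages that links point TO.

    links is an iterable of (source_url, target_url, anchor_text) tuples.
    """
    index = defaultdict(set)
    for _source, target, anchor_text in links:
        # Naive tokenization, assumed for this sketch only.
        for word in anchor_text.lower().split():
            index[word].add(target)
    return index


def search(index, query):
    """Return the pages whose incoming anchor text contains every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical links; the image has no text of its own to index.
links = [
    ("http://example.com/a.html", "http://example.com/logo.png", "Company logo"),
    ("http://example.com/b.html", "http://example.com/logo.png", "logo image"),
    ("http://example.com/a.html", "http://example.com/b.html", "company history"),
]
index = build_anchor_index(links)
hits = search(index, "company logo")
print(hits)
```

Here the image can be found by a text query even though it contains no text, because the words used in links pointing to it serve as its description.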