2web mining overview - CS 345A Data Mining Lecture 1...

    CS 345A Data Mining
    Lecture 1: Introduction to Web Mining

    What is Web Mining?
        Discovering useful information from the World-Wide Web and its usage patterns
    Web Mining v. Data Mining
        Structure (or lack of it)
            Textual information and linkage structure
        Scale
            Data generated per day is comparable to the largest conventional data warehouses
        Speed
            Often need to react to evolving usage patterns in real time (e.g., merchandising)

    Web Mining topics
        Web graph analysis
        Power Laws and The Long Tail
        Structured data extraction
        Web advertising
        Systems Issues

    Size of the Web
        Number of pages
            Technically, infinite
            Much duplication (30-40%)
            Best estimate of “unique” static HTML pages comes from search engine claims
                Until last year, Google claimed 8 billion(?), Yahoo claimed 20 billion
                Google recently announced that their index contains 1 trillion pages
                How to explain the discrepancy?
    The web as a graph
        Pages = nodes, hyperlinks = edges
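
    As a rough illustration of the pages-as-nodes, hyperlinks-as-edges view, the sketch below builds a directed graph from a list of (source, target) link pairs and counts in-links per page. It is only one possible representation, not taken from the lecture: the function names `build_web_graph` and `in_degrees` and the example URLs are hypothetical, and a real crawl would need URL normalization and duplicate handling that are omitted here.

```python
def build_web_graph(links):
    """Build a directed graph as {page: set of pages it links to}.

    `links` is an iterable of (source_url, target_url) pairs.
    Every page that appears as a source or a target becomes a node.
    """
    graph = {}
    for source, target in links:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())  # make sure link targets are nodes too
    return graph


def in_degrees(graph):
    """Count, for each page, how many distinct pages link to it."""
    counts = {page: 0 for page in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts


# Hypothetical example links (assumed data, for illustration only).
links = [
    ("a.example/", "a.example/about"),
    ("a.example/", "b.example/"),
    ("b.example/", "a.example/"),
    ("a.example/about", "a.example/"),
]

graph = build_web_graph(links)
degrees = in_degrees(graph)
for page in sorted(graph):
    print(page, "out-links:", len(graph[page]), "in-links:", degrees[page])
```

    Storing out-links as sets keeps repeated links between the same two pages from being counted twice, which is one reasonable choice when only the link structure matters.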