9783540378815-c1


Preface

The rapid growth of the Web in the last decade has made it the largest publicly accessible data source in the world. Web mining aims to discover useful information or knowledge from Web hyperlinks, page contents, and usage logs. Based on the primary kinds of data used in the mining process, Web mining tasks can be categorized into three main types: Web structure mining, Web content mining, and Web usage mining. Web structure mining discovers knowledge from hyperlinks, which represent the structure of the Web. Web content mining extracts useful information or knowledge from Web page contents. Web usage mining mines user access patterns from usage logs, which record the clicks made by every user.

The goal of this book is to present these tasks and their core mining algorithms. The book is intended to be a text with comprehensive coverage, and yet, for each topic, sufficient details are given so that readers can gain a reasonably complete knowledge of its algorithms or techniques without referring to any external materials. Four of the chapters, on structured data extraction, information integration, opinion mining, and Web usage mining, make this book unique. These topics are not covered by existing books, yet they are essential to Web data mining. Traditional Web mining topics such as search, crawling and resource discovery, and link analysis are also covered in detail.

Although the book is entitled Web Data Mining, it also includes the main topics of data mining and information retrieval, since Web mining uses their algorithms and techniques extensively. The data mining part mainly consists of chapters on association rules and sequential patterns, supervised learning (or classification), and unsupervised learning (or clustering), which are the three most important data mining tasks. The advanced topic of partially (semi-) supervised learning is included as well. For information retrieval, the core topics that are crucial to Web mining are described.

This book is thus naturally divided into two parts. The first part, which consists of Chaps. 2–5, covers data mining foundations. The second part, which contains Chaps. 6–12, covers Web-specific mining.

Two main principles have guided the writing of this book. First, the basic content of the book should be accessible to undergraduate students, and yet there are sufficient in-depth materials for graduate students who plan to
pursue Ph.D. degrees in Web data mining or related areas. Few assumptions are made in the book regarding the prerequisite knowledge of readers. A reader with a basic understanding of algorithms and probability concepts should have no problem with this book. Second, the book should examine Web mining technology from a practical point of view. This is important because most Web mining tasks have immediate real-world applications. In the past few years, I have been fortunate to work directly or indirectly with many researchers and engineers in several search engine and