Web Mining: Information and Pattern Discovery on the World Wide Web



...ns. As the manner in which the Web is used continues to expand, there is a continual need to figure out new kinds of knowledge about user behavior that needs to be mined.

The quality of a mining algorithm can be measured both in terms of how effective it is in mining for knowledge and how efficient it is in computational terms. There will always be a need to improve the performance of mining algorithms along both these dimensions.

Usage data collection on the Web is incremental in nature. Hence, there is a need to develop mining algorithms that take as input the existing data, mined knowledge, and the new data, and develop a new model in an efficient manner.

Usage data collection on the Web is also distributed by its very nature. If all the data were to be integrated before mining, a lot of valuable information could be extracted. However, an approach of collecting data from all possible server logs is both non-scalable and impractical. Hence, there needs to be an approach where knowledge mined from various logs can be integrated together into a more comprehensive model.

6.3 Analysis of Mined Knowledge

The output of knowledge mining algorithms is often not in a form suitable for direct human consumption, and hence there is a need to develop techniques and tools for helping an analyst better assimilate it. Issues that need to be addressed in this area include usage analysis tools and interpretation of mined knowledge. There is a need to develop tools which incorporate statistical methods, visualization, and human factors to help better understand the mined knowledge. Section 4 provided a survey of the current literature in this area.

One of the open issues in data mining, in general, and Web mining, in particular, is the creation of intelligent tools that can assist in the interpretation of mined knowledge.
Clearly, these tools need to have specific knowledge about the particular problem domain to do any more than filtering based on statistical attributes of the discovered rules or patterns. In Web mining, for example, intelligent agents could be developed that, based on discovered access patterns, the topology of the Web locality, and certain heuristics derived from user behavior models, could give recommendations about changing the physical link structure of a particular site.

7 Conclusion

The term Web mining has been used to refer to techniques that encompass a broad range of issues. However, while meaningful and attractive, this very broadness has caused Web mining to mean different things to different people [21, 36], and there is a need to develop a common vocabulary. Towards this goal we proposed a definition of Web mining, and developed a taxonomy of the various ongoing efforts related to it. Next, we presented a survey of the research in this area and concentrated on Web usage mining. We provided a detailed survey of the efforts in this area, even though the survey is short because of the area's newness. We provided a general architecture of a system to do Web usage mining, and identified the issues and problems in this area that require further research and development.

R...