ns. As the manner in which the
Web is used continues to expand, there is a continual
need to figure out new kinds of knowledge about user behavior that need to be mined.
The quality of a mining algorithm can be measured
both in terms of how effective it is in mining for knowledge and how efficient it is in computational terms.
There will always be a need to improve the performance of mining algorithms along both these dimensions.
Usage data collection on the Web is incremental
in nature. Hence, there is a need to develop mining algorithms that take as input the existing data,
mined knowledge, and the new data, and develop a
new model in an efficient manner.
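
As an illustration of the incremental setting described above, the following is a minimal sketch, not an algorithm taken from the literature surveyed here. It assumes that usage data has already been divided into sessions (lists of page URLs) and that the "mined knowledge" is simply a table of support counts for page pairs. The class name `IncrementalPairModel`, its methods, and the sample URLs are all hypothetical.

```python
from collections import Counter
from itertools import combinations


class IncrementalPairModel:
    """Hypothetical model: support counts of page pairs that co-occur in a session."""

    def __init__(self):
        self.pair_counts = Counter()  # mined knowledge carried over between updates
        self.num_sessions = 0         # total number of sessions folded in so far

    def update(self, new_sessions):
        """Fold new sessions into the model without rescanning older data."""
        for session in new_sessions:
            pages = sorted(set(session))  # ignore repeat visits within one session
            self.pair_counts.update(combinations(pages, 2))
            self.num_sessions += 1

    def frequent_pairs(self, min_support):
        """Return the pairs whose fraction of sessions is at least min_support."""
        if self.num_sessions == 0:
            return {}
        return {
            pair: count / self.num_sessions
            for pair, count in self.pair_counts.items()
            if count / self.num_sessions >= min_support
        }


# Existing data is mined first; the new batch is folded in later.
model = IncrementalPairModel()
model.update([["/home", "/products", "/cart"], ["/home", "/products"]])
model.update([["/home", "/about"], ["/products", "/cart"]])
result = model.frequent_pairs(min_support=0.5)
print(result)
```

Because this sketch keeps a count for every pair ever seen, the updated model is identical to one built from all sessions at once. That sidesteps the harder case, where infrequent patterns are pruned to save space and may later become frequent once new data arrives.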
Usage data collection on the Web is also distributed
by its very nature. If all the data were to be integrated
before mining, a lot of valuable information could be
extracted. However, an approach of collecting data
from all possible server logs is both non-scalable and
impractical. Hence, there needs to be an approach
where knowledge mined from various logs can be integrated together into a more comprehensive model.

6.3 Analysis of Mined Knowledge

The output of knowledge mining algorithms is often
not in a form suitable for direct human consumption,
and hence there is a need to develop techniques and
tools for helping an analyst better assimilate it. Issues
that need to be addressed in this area include usage
analysis tools and interpretation of mined knowledge.
There is a need to develop tools which incorporate
statistical methods, visualization, and human factors
to help better understand the mined knowledge. Section 4 provided a survey of the current literature in this area.
One of the open issues in data mining, in general,
and Web mining, in particular, is the creation of intelligent tools that can assist in the interpretation of
mined knowledge. Clearly, these tools need to have
specific knowledge about the particular problem domain to do any more than filtering based on statistical
attributes of the discovered rules or patterns. In Web
mining, for example, intelligent agents could be developed that, based on discovered access patterns, the
topology of the Web locality, and certain heuristics
derived from user behavior models, could give recommendations about changing the physical link structure
of a particular site.

7 Conclusion
The term Web mining has been used to refer to
techniques that encompass a broad range of issues.
However, while meaningful and attractive, this very broadness has caused Web mining to mean different
things to different people [21, 36], and there is a need
to develop a common vocabulary. Towards this goal
we proposed a definition of Web mining, and developed a taxonomy of the various ongoing efforts related
to it. Next, we presented a survey of the research in
this area and concentrated on Web usage mining. We
provided a detailed survey of the efforts in this area,
even though the survey is short because of the area's
newness. We provided a general architecture of a system to do Web usage mining, and identified the issues
and problems in this area that require further research
and development. R...
This document was uploaded on 02/15/2014.