Because such transaction databases usually contain extremely large
amounts of data, current association rule discovery techniques try to
prune the search space according to the support for the items under
consideration. Support is a measure based on the number of occurrences
of user transactions within transaction logs.
For organizations engaged in electronic commerce, the discovery of such
rules can help in the development of effective marketing strategies.
In addition, association rules discovered from WWW access logs can
indicate how best to organize the organization's Web space.
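As a minimal sketch of support-based pruning, the following Python code counts how often sets of pages occur together in transactions and discards any set whose support falls below a threshold. It uses a simplified level-wise search in the style of Apriori, which is an illustrative choice and not necessarily the technique used by any particular system discussed here. The function name, the `min_support` and `max_size` parameters, and the sample transactions are all assumptions made for the example; support is taken here to be the fraction of transactions that contain the itemset.

```python
from collections import Counter


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets with support >= min_support.

    `transactions` is a list of sets of page paths (hypothetical input format).
    Support is the fraction of transactions that contain the itemset.
    """
    n = len(transactions)
    frequent = {}
    # Level 1 candidates: every distinct item seen in any transaction.
    candidates = {frozenset([item]) for t in transactions for item in t}
    size = 1
    while candidates and size <= max_size:
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:  # candidate is a subset of the transaction
                    counts[c] += 1
        level = {c: counts[c] / n for c in candidates
                 if counts[c] / n >= min_support}
        frequent.update(level)
        # Pruning: larger candidates are built only from surviving itemsets,
        # so anything containing a low-support itemset is never counted.
        size += 1
        candidates = {a | b for a in level for b in level
                      if len(a | b) == size}
    return frequent


# Hypothetical transactions, one set of requested pages per user transaction.
transactions = [
    {"/company/products", "/company/product1"},
    {"/company/products", "/company/product1", "/company/product4"},
    {"/company/products", "/company/product4"},
    {"/company/about"},
]
result = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

With a threshold of 0.5, the page /company/about (support 0.25) is dropped at the first level, so no pair containing it is ever examined.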
The problem of discovering sequential patterns [35, 52] is to find
inter-transaction patterns such that the presence of a set of items is
followed by another item in the time-stamp ordered transaction set.
In Web server transaction logs, a visit by a client is recorded over a
period of time. The time stamp associated with a transaction in this
case will be a time interval which is determined and attached to the
transaction during the data cleaning or transaction identification
processes. The discovery of sequential patterns in Web server access
logs allows Web-based organizations to predict user visit patterns and
helps in targeting advertising aimed at groups of users based on these
patterns. By analyzing this information, the Web mining system can
determine temporal relationships among data items such as the
following:
30% of clients who visited /company/products had done a search in
Yahoo within the past week on keyword w; or
60% of clients who placed an online order in /company/product1 also
placed an online order in /company/product4 within 15 days.
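A statement of the second kind above can be checked against a log with a few lines of Python. The sketch below assumes a hypothetical log format of (client_id, timestamp, path) tuples and computes the fraction of clients who requested one path and then requested another within a given time window. The function name, the log layout, and the sample data are illustrative assumptions, and the code only measures one given pattern; it does not discover patterns on its own.

```python
from datetime import datetime, timedelta


def followed_within(log, first, second, window):
    """Fraction of clients who requested `first` and later requested
    `second` no more than `window` after it.

    `log` is an iterable of (client_id, timestamp, path) tuples
    (a hypothetical, already cleaned log format).
    """
    by_client = {}
    for client, ts, path in log:
        by_client.setdefault(client, []).append((ts, path))

    base = 0     # clients with at least one request for `first`
    matched = 0  # of those, clients with a qualifying later `second`
    for events in by_client.values():
        firsts = [ts for ts, p in events if p == first]
        if not firsts:
            continue
        base += 1
        seconds = [ts for ts, p in events if p == second]
        if any(timedelta(0) < s - f <= window
               for f in firsts for s in seconds):
            matched += 1
    return matched / base if base else 0.0


# Hypothetical sample log.
log = [
    ("a", datetime(1997, 1, 1), "/company/product1"),
    ("a", datetime(1997, 1, 10), "/company/product4"),
    ("b", datetime(1997, 1, 1), "/company/product1"),
    ("b", datetime(1997, 2, 20), "/company/product4"),
    ("c", datetime(1997, 1, 5), "/company/product4"),
]
share = followed_within(log, "/company/product1", "/company/product4",
                        timedelta(days=15))
print(f"{share:.0%} of product1 clients reached product4 within 15 days")
```

Client c never requested /company/product1 and is therefore excluded from the denominator, which mirrors the "of clients who ..." phrasing of the examples.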
Another important kind of data dependency that can be discovered, using
the temporal characteristics of the data, is similar time sequences.
For example, we may be interested in finding common characteristics of
all clients that visited a particular file within the time period
[t1, t2]. Or, conversely, we may be interested in a time interval
(within a day, or within a week, etc.) in which a particular file is
most accessed.
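Both questions in the preceding paragraph reduce to simple filters and counts once the log is in a convenient shape. The sketch below again assumes a hypothetical log of (client_id, timestamp, path) tuples; `clients_in_interval` and `busiest_hour` are names invented for this example, and hour of day is only one possible granularity for the time interval.

```python
from collections import Counter
from datetime import datetime


def clients_in_interval(log, path, t1, t2):
    """Set of clients that requested `path` with t1 <= timestamp <= t2."""
    return {client for client, ts, p in log if p == path and t1 <= ts <= t2}


def busiest_hour(log, path):
    """Return (hour_of_day, request_count) for the hour in which `path`
    is requested most often, or None if it never appears in the log."""
    counts = Counter(ts.hour for _, ts, p in log if p == path)
    if not counts:
        return None
    return counts.most_common(1)[0]


# Hypothetical sample log.
log = [
    ("a", datetime(1997, 3, 1, 9, 15), "/company/products"),
    ("b", datetime(1997, 3, 1, 9, 40), "/company/products"),
    ("c", datetime(1997, 3, 2, 14, 5), "/company/products"),
    ("a", datetime(1997, 3, 2, 9, 30), "/company/product1"),
]
print(clients_in_interval(log, "/company/products",
                          datetime(1997, 3, 1), datetime(1997, 3, 1, 23, 59)))
print(busiest_hour(log, "/company/products"))
```

The set returned by `clients_in_interval` is the group whose common characteristics would then be examined, for instance with the classification techniques described next.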
Discovering classification rules [20, 54] allows one to develop a
profile of items belonging to a particular group according to their
common attributes. This profile can then be used to classify new data
items that are added to the database. In Web usage mining,
classification techniques allow one to develop a profile for clients
who access particular server files based on demographic information
available on those clients, or based on their access patterns. For
example, classification on WWW access logs may lead to the discovery
of relationships such as the following:
clients from state or government agencies who visit the site tend to
be interested in the page
50% of clients who placed an online order in /company/product2 were in
the 20-25 age group and lived on the West Coast.
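A figure like the 50% in the second example can be read as the share of one attribute combination within the group of clients that accessed a given file. The Python sketch below computes such a descriptive profile; it is not a trained classifier, and the record layout (`paths`, `age_group`, `region`), the function name, and the sample clients are all assumptions made for illustration.

```python
from collections import Counter


def profile(clients, accessed, attributes):
    """Share of each attribute-value combination among the clients whose
    access set contains `accessed`.

    `clients` is a list of dicts with a "paths" set plus demographic
    fields (a hypothetical record layout).
    """
    group = [c for c in clients if accessed in c["paths"]]
    if not group:
        return {}
    counts = Counter(tuple(c[a] for a in attributes) for c in group)
    return {combo: n / len(group) for combo, n in counts.items()}


# Hypothetical client records.
clients = [
    {"paths": {"/company/product2"},
     "age_group": "20-25", "region": "West Coast"},
    {"paths": {"/company/product2"},
     "age_group": "20-25", "region": "West Coast"},
    {"paths": {"/company/product2"},
     "age_group": "30-35", "region": "Midwest"},
    {"paths": {"/company/product1", "/company/product2"},
     "age_group": "40-45", "region": "East Coast"},
    {"paths": {"/company/product1"},
     "age_group": "20-25", "region": "West Coast"},
]
shares = profile(clients, "/company/product2", ["age_group", "region"])
for combo, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(combo, f"{share:.0%}")
```

A new client could then be compared against the dominant combinations in the profile, which is the simplest form of the "classify new data items" step mentioned above.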
Clustering analysis [24, 38] allows one to group together clients or data items that have similar characteristics. C...