Unsupervised Learning - Association Analysis
Section 1: Introduction
Section 2: Market Basket Analysis
Section 3: Using Association Node
Section 4: Understanding Association Rules
Section 5: Association Analysis for Non-Binary Variables
Section 6: Disassociation Analysis
Section 7: Sequential Association Analysis
Section 8: Case Study
Appendix 1: A priori Algorithm
Appendix 2: SAS Code used in Example 3
Appendix 3: Data Set “Income.txt”
Appendix 4: SAS Code for Disassociation Node
Appendix 5: Sample SAS Code for Case Study
Appendix 6: References
Section 1: Introduction
One significant advance in data mining at the end of the 20th century is that
“association rules analysis” emerged as a popular tool for mining very large
commercial databases (say, with more than 10^4 variables and more than 10^8
observations). Association mining attempts to construct simple “rules”
(descriptive statistics) that describe regions of relatively high density in
such a database. When all variables in the database are binary, association
rules analysis is also referred to as “market basket analysis”.
For example, consider the sales database of an on-line bookstore, where the
objects represent customers and the attributes represent authors and/or books.
The rules to be discovered are the sets of books most frequently bought
together by customers. An example could be: “15% of the people who buy Dorian
Pyle’s Data Preparation for Data Mining also buy Data Mining Techniques by
Berry and Linoff.” Retail stores can use the knowledge discovered from the
analysis for enhanced shelf placement, cross marketing, catalog design, and
consumer segmentation. Although association analysis is most directly applied
in the retail industry, it can be applied to other industries as well. For
example, it has been used to predict faults in telecommunication networks.
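
A statement such as “15% of the people who buy one book also buy another” is a conditional share: among the customers whose basket contains the first book, the fraction whose basket also contains the second. This quantity is commonly called the confidence of the rule. The short Python sketch below illustrates the arithmetic only. The five baskets are made-up data, the function name `rule_share` is hypothetical, and this is not how the Association node in Enterprise Miner computes its output.

```python
# Toy transactions (made-up data): each set is one customer's purchases.
baskets = [
    {"Data Preparation for Data Mining", "Data Mining Techniques"},
    {"Data Preparation for Data Mining"},
    {"Data Mining Techniques"},
    {"Data Preparation for Data Mining", "Data Mining Techniques"},
    {"Data Preparation for Data Mining"},
]


def rule_share(baskets, antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        # No customer bought the antecedent, so the share is undefined;
        # returning 0.0 is a convention chosen for this sketch.
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


share = rule_share(
    baskets, "Data Preparation for Data Mining", "Data Mining Techniques"
)
print(f"{share:.0%} of buyers of the first book also bought the second")
```

On a real database the same count would be taken over millions of transactions and many candidate item pairs, which is why efficient rule-generation algorithms matter.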
In this session, we discuss the theoretical foundation of market basket
analysis in Section 2. In Section 3, we use a small commercial banking data set
to illustrate how to use the Association node in Enterprise Miner to obtain
association rules. Since association analysis typically produces a very large
number of rules, understanding these rules poses a challenging data analysis
task; we address this issue in Section 4. In Section 5, we extend the use of
association analysis to a data set with some non-binary variables. We address
Disassociation Analysis and Sequential Association Analysis in Sections 6 and
7, respectively. We conclude this session with a case study showing how the
miner can identify the rules found in an association exercise.

- Spring '11
- Staff
- Data Mining, association analysis