

Statistical Data Mining
ORIE 474, Fall 2007
Tatiyana Apanasovich
11/07/07

Market basket analysis
Market basket analysis uses information about what customers purchase to provide insight into who they are and why they make certain purchases. It also provides insight into the merchandise by telling us which products tend to be purchased together and which are most amenable to promotion.
The data mining technique most closely allied with market basket analysis is the automatic generation of association rules.
Association rules represent patterns in the data without a specific target; whether a pattern makes sense is left to human interpretation.

Market basket analysis
Some examples of potential applications:
- Items purchased on a credit card
- Optional services purchased by telecommunication customers
- Unusual combinations of insurance claims

Levels of market basket analysis
Market basket analysis refers to a set of business problems related to understanding point-of-sale transaction data.
The data can be viewed at three levels:
- Customers
- Orders
- Items

Association rules
Association rules are among the most popular representations for local patterns in data mining.
An association rule is a simple probabilistic statement about the co-occurrence of certain events in a database, and is particularly applicable to sparse transactional data sets.
Association rules are easy to understand.
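To make "sparse transactional data" concrete, here is a minimal sketch that encodes a handful of baskets as binary indicator variables, one per item. The baskets and item names are made up for illustration and are not taken from the course material.

```python
# Hypothetical toy baskets (orders); item names are illustrative only.
baskets = [
    {"milk", "bread"},
    {"milk", "diapers", "beer"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]

# Fixed column order: one binary variable per distinct item.
items = sorted(set().union(*baskets))

# Row i, column j is 1 if basket i contains item j, else 0.
matrix = [[1 if item in basket else 0 for item in items] for basket in baskets]

# Fraction of cells equal to 1; low values mean the data are sparse.
density = sum(map(sum, matrix)) / (len(baskets) * len(items))

print(items)
for row in matrix:
    print(row)
print(f"density = {density:.2f}")
```

With only five items the toy matrix is not very sparse. In real point-of-sale data a store carries thousands of items while a typical order contains only a few, so almost every cell is 0.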

Association rules
Assume that all variables are binary.
An association rule is a statement of the form: IF A = 1 AND B = 1 THEN C = 1 with probability p.
- p_a = Prob(C = 1 | A = 1, B = 1) is sometimes referred to as the accuracy or confidence of the rule.
- p_s = Prob(A = 1, B = 1, C = 1) is referred to as the support.
- Typically the goal is to find rules whose accuracy and support are both greater than some thresholds, for example p_s > 0.05 and p_a > 0.8.
- Lift = p(condition and result) / (p(condition) p(result)).
- When lift > 1, the rule is better at predicting the result than guessing based on item frequencies in the data; when lift < 1, the rule is worse than informed guessing.

Examples of association rules seen in data
- Wal-Mart customers who purchased Barbie dolls have a 60 percent likelihood of also buying one of three types of candy bars.
- Customers who purchase maintenance agreements are very likely to purchase large appliances.
- When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner.
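The support, confidence, and lift definitions can be checked on a small example. The sketch below is one straightforward way to compute them by counting baskets; the function name `rule_metrics` and the ten toy baskets are assumptions made for illustration, not part of the lecture.

```python
def rule_metrics(baskets, condition, result):
    """Return (support, confidence, lift) for the rule IF condition THEN result.

    baskets is a list of sets of items; condition and result are sets of items.
    """
    n = len(baskets)
    n_cond = sum(1 for b in baskets if condition <= b)            # condition holds
    n_res = sum(1 for b in baskets if result <= b)                # result holds
    n_both = sum(1 for b in baskets if (condition | result) <= b) # both hold
    if n == 0 or n_cond == 0 or n_res == 0:
        raise ValueError("condition and result must each occur at least once")

    support = n_both / n                           # p_s = Prob(condition and result)
    confidence = n_both / n_cond                   # p_a = Prob(result | condition)
    lift = support / ((n_cond / n) * (n_res / n))  # ratio to independent guessing
    return support, confidence, lift


# Hypothetical toy data: ten baskets over items A, B, C.
baskets = (
    [{"A", "B", "C"}] * 5
    + [{"A", "B"}, {"A", "C"}, {"B"}, {"C"}, {"A"}]
)

support, confidence, lift = rule_metrics(baskets, {"A", "B"}, {"C"})
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")

# Apply the example thresholds from the slide: p_s > 0.05 and p_a > 0.8.
keep_rule = support > 0.05 and confidence > 0.8
print("rule passes thresholds:", keep_rule)
```

Counting baskets directly keeps the link to the probability definitions visible. A practical rule miner would avoid rescanning the data for every candidate rule, but that optimization is outside what the slide describes.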

Types of association rules
- An actionable rule is a useful rule that contains high-quality, actionable information.
- A trivial rule is already known by anyone at all familiar with the business.
- An inexplicable rule seems to have no explanation and does not suggest a course of action.

Danger: Simpson's Paradox
In general, we expect that aggregates should evince the same relationships as the categories, levels, or individuals over which the aggregate was formed.
Simpson's paradox is the situation in which this expectation fails: a relationship that holds within every subgroup can weaken or even reverse once the subgroups are pooled.
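A small numerical sketch shows how this can affect an association rule. The counts below are invented purely for illustration: within each of two hypothetical customer segments, baskets containing A are more likely to contain C than baskets without A, yet the pooled data show the opposite.

```python
# Invented counts for two hypothetical segments.
# Each pair is (baskets that contain C, total baskets in that group).
segments = {
    "segment_x": {"with_A": (90, 100), "without_A": (800, 1000)},
    "segment_y": {"with_A": (300, 1000), "without_A": (20, 100)},
}


def rate(pair):
    """Fraction of baskets in the group that contain C."""
    bought_c, total = pair
    return bought_c / total


def pooled(key):
    """Add up the counts for one group (with_A or without_A) over all segments."""
    bought = sum(seg[key][0] for seg in segments.values())
    total = sum(seg[key][1] for seg in segments.values())
    return bought, total


# Within each segment: Prob(C | A) versus Prob(C | not A).
for name, seg in segments.items():
    print(name, f"{rate(seg['with_A']):.3f}", f"{rate(seg['without_A']):.3f}")

# After pooling the segments the comparison flips direction.
pooled_with = rate(pooled("with_A"))
pooled_without = rate(pooled("without_A"))
print("pooled", f"{pooled_with:.3f}", f"{pooled_without:.3f}")
```

The reversal arises because the segments differ both in how often A is purchased and in their baseline rate of C, so a rule mined from aggregated transactions may not describe any individual segment.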
