Market-Basket Analysis (MBA):
Determine which items are frequently bought together, or seldom
bought together. This can be used for:
catalogue design, inventory layout, and cross-marketing:
advertise or place complementary products together
Information Loss Function: The information loss for a particular
example is $-\log_2 p_i$,
where $i$ is the index for the correct class, and $p_i$ is the probability
predicted for the example being in class $i$. The information loss is therefore
high if the probability we
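
A minimal sketch of the information loss, assuming the usual form -log2(p_i), where p_i is the probability the model gave to the correct class. The function name and the example probabilities are illustrative, not taken from the slides.

```python
import math

def information_loss(probs, correct_index):
    """Return -log2 of the probability assigned to the correct class."""
    p = probs[correct_index]
    if p <= 0.0:
        # A zero probability on the true class is penalised without bound.
        return math.inf
    return -math.log2(p)

# Two made-up predictions for an example whose true class is index 0.
confident = information_loss([0.9, 0.05, 0.05], 0)
hesitant = information_loss([0.25, 0.5, 0.25], 0)
print(confident, hesitant)
```

The loss grows as the probability placed on the correct class shrinks, and it ignores how the remaining probability is spread over the wrong classes.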
Data Mining Toolkit
Here is a short demo of XLMiner.
Let us use a simple example:
a bank sends mailers to its customers, offering a special deal on Personal
Loans. In its previous campaign, it got only about 9% positive response.
Objective: How to target
The confidence of a rule is the percentage of transactions mentioning
all items in the rule body that also mention all items in the rule head.
Confidence measures the strength of the rule: it measures how likely it is
that the set of items in the rule head
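
The confidence definition above can be sketched in a few lines. The baskets and item names are made up for illustration, and the helper names `support` and `confidence` are assumptions of this sketch.

```python
def support(transactions, items):
    """Fraction of transactions that mention every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)

def confidence(transactions, body, head):
    """Of the transactions mentioning the whole body, the share that also mention the whole head."""
    body, head = set(body), set(head)
    body_count = sum(1 for t in transactions if body <= set(t))
    if body_count == 0:
        return 0.0  # the rule never fires, so report zero rather than divide by zero
    both_count = sum(1 for t in transactions if (body | head) <= set(t))
    return both_count / body_count

# Hypothetical baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

# Rule: bread -> milk. Bread appears in 4 baskets, 3 of which also contain milk.
print(confidence(baskets, {"bread"}, {"milk"}))  # 0.75
```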
Collaborative Filtering (Recommender systems): Amazon determines
which books are frequently purchased together and recommends
associated books or products to people who express interest in an item.
Healthcare: Studying the side-effects in patients with mu
Steps in constructing Tables/Charts:
1. Score the data: each score is an approximation of probability of
2. Sort the data in order of decreasing score, so that examples
   that are most likely to respond are in the upper deciles
3. For each decile, work out
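
The steps above can be sketched as follows. The source is cut off before it says what to work out per decile, so this sketch assumes the usual quantities: the response rate in the decile and its lift over the overall response rate. The function name, field names, and data are hypothetical.

```python
def decile_table(scores, responded, n_bins=10):
    """Rank examples by descending score and summarise each bin."""
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    overall_rate = sum(r for _, r in ranked) / n
    rows = []
    for b in range(n_bins):
        # Integer arithmetic keeps bin sizes as even as possible.
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        rate = sum(r for _, r in chunk) / len(chunk)
        lift = rate / overall_rate if overall_rate > 0 else float("nan")
        rows.append({"decile": b + 1, "n": len(chunk),
                     "response_rate": rate, "lift": lift})
    return rows

# Made-up data: 20 examples, and only the 4 highest-scoring ones responded.
scores = [i / 20 for i in range(20)]
responded = [1 if i >= 16 else 0 for i in range(20)]

for row in decile_table(scores, responded):
    print(row)
```

With a useful model the top deciles show a lift well above 1 and the lower deciles fall below it; a lift near 1 everywhere would mean the scores rank no better than random selection.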
Apriori looks for all item sets (hence, all rules) that exceed the minimum
support and confidence thresholds. But note that, as we saw earlier,
support and confidence are not necessarily indicative of the
interestingness of the rule: lift or, alternatively
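
A small sketch of the level-wise search behind Apriori, followed by a lift calculation. Two assumptions are made here because the source does not spell them out: candidates are pruned with the standard Apriori property (every subset of a frequent item set is itself frequent), and lift is taken as support(body and head) / (support(body) * support(head)). Only the support threshold is applied; filtering rules by confidence is left out. The baskets are made up.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every item set meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def supp(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items if supp(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = supp(s)
        k += 1
        # Join frequent (k-1)-item sets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must already be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in frequent
                             for sub in combinations(c, k - 1))}
        current = {c for c in candidates if supp(c) >= min_support}
    return frequent

def lift(frequent, body, head):
    """Lift above 1 means body and head co-occur more often than independence predicts."""
    body, head = frozenset(body), frozenset(head)
    return frequent[body | head] / (frequent[body] * frequent[head])

baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
freq = frequent_itemsets(baskets, min_support=0.4)
print(lift(freq, {"bread"}, {"milk"}))    # below 1 despite 75% confidence
print(lift(freq, {"bread"}, {"butter"}))  # above 1
```

In this toy data the rule bread -> milk has high confidence only because milk is common anyway, which is the kind of case where support and confidence alone can mislead.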
Quadratic Loss Function: the quadratic loss for a particular
instance is $\sum_j (p_j - a_j)^2$,
where $j$ ranges over the classes, $p_j$ is the probability of the instance
being in class $j$ as estimated by your model, and $a_j$ is 1 if the instance
is actually in class $j$ but 0 if t
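
A minimal sketch of the quadratic loss, assuming the usual form: the sum over classes of (p_j - a_j)^2, with a_j equal to 1 for the actual class and 0 otherwise. The function name and example probabilities are illustrative.

```python
def quadratic_loss(probs, actual_index):
    """Sum over classes of (predicted probability - actual indicator) squared."""
    return sum((p - (1.0 if j == actual_index else 0.0)) ** 2
               for j, p in enumerate(probs))

# The instance is actually in class 0.
print(quadratic_loss([0.5, 0.25, 0.25], 0))  # 0.375
print(quadratic_loss([1.0, 0.0, 0.0], 0))    # 0.0 for a perfect prediction
```

Unlike the information loss, this measure depends on the probabilities assigned to every class, not only the correct one, and it stays bounded even when the correct class is given probability zero.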
Precision & Recall are used in the field of Information Retrieval (IR) to
evaluate the quality of a query. Good queries return all examples in
the desired class, and no examples in other classes.
Precision = out of those we got, what proportion did we w
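
A short sketch of both measures for a single query. The source's definition of recall is not visible here, so the standard one is assumed: out of those we wanted, what proportion did we get. Document ids are made up.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query result."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

retrieved = {"d1", "d2", "d3", "d4"}        # what the query returned
relevant = {"d2", "d3", "d5", "d6", "d7"}   # what we actually wanted

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)  # 0.5 0.4
```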
Accuracy is measured as the percentage of correct predictions; the error rate is the percentage of prediction errors.
Overall Accuracy versus Class-specific Error Rates: Note that the
overall accuracy of a model is usually not the best indicator of its
quality as overall accuracy doesn't account for the costs
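
A sketch of why overall accuracy can mislead, using hypothetical data: 100 customers of whom 9 respond, and a model that predicts "no" for everyone. The function name and labels are assumptions of this example.

```python
from collections import Counter

def accuracy_report(actual, predicted):
    """Return overall accuracy and the error rate within each actual class."""
    pairs = list(zip(actual, predicted))
    correct = sum(1 for a, p in pairs if a == p)
    class_totals = Counter(actual)
    class_errors = Counter(a for a, p in pairs if a != p)
    class_error_rate = {c: class_errors[c] / class_totals[c] for c in class_totals}
    return correct / len(pairs), class_error_rate

actual = ["yes"] * 9 + ["no"] * 91
predicted = ["no"] * 100  # a model that never predicts a response

overall, per_class = accuracy_report(actual, predicted)
print(overall)    # 0.91
print(per_class)  # every "yes" case is misclassified
```

The model scores 91% overall while missing every responder, which is the class the campaign cares about.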