Exercise 6.1.1: Suppose there are 100 items, numbered 1 to 100, and also 100 baskets, also numbered 1
to 100. Item i is in basket b if and only if i divides b with no remainder. Thus, item 1 is in all the baskets,
item 2 is in all fifty of the even-numbered
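
The basket construction stated in Exercise 6.1.1 can be reproduced in a few lines, which may help when checking answers by hand. The following is only a minimal sketch; the names `N`, `baskets`, and `support` are illustrative assumptions and do not appear in the exercise.

```python
# Sketch of the data in Exercise 6.1.1 (names here are hypothetical).
# Item i is in basket b if and only if i divides b with no remainder.
N = 100  # items and baskets are both numbered 1..N

# Map each basket number b to the set of items it contains (the divisors of b).
baskets = {
    b: {i for i in range(1, N + 1) if b % i == 0}
    for b in range(1, N + 1)
}


def support(itemset):
    """Count the baskets that contain every item in `itemset`."""
    wanted = set(itemset)
    return sum(1 for items in baskets.values() if wanted <= items)
```

For instance, `support({1})` counts all 100 baskets and `support({2})` counts the even-numbered ones, matching the statement of the exercise.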

! Exercise 6.1.4: This question involves data from which nothing interesting can be learned about
frequent itemsets, because there are no sets of items that are correlated. Suppose the items are
numbered 1 to 10, and each basket is constructed by includin

! Exercise 6.1.8: Prove that in the data of Exercise 6.1.4 there are no interesting association rules; i.e., the
interest of every association rule is 0.
6.2 Market Baskets and the A-Priori Algorithm
We s

can obtain through a MapReduce formulation. Finally, we discuss briefly how to find frequent itemsets in
a data stream.
202 CHAPTER 6. FREQUENT ITEMSETS
6.1 The Market-Basket Model
The market-basket model of data is used to describe a common form of manym

200 CHAPTER 5. LINK ANALYSIS
HITS equations in the way they do for PageRank, so no taxation scheme is necessary.
5.7 References for Chapter 5
The PageRank algorithm was first expressed in [1]. The experiments on the structure of the Web, which
we used to ju

4. Z. Gyongi, H. Garcia-Molina, and J. Pedersen, Combating link spam with trustrank, Proc. 30th Intl.
Conf. on Very Large Databases, pp. 576-587, 2004.
5. T.H. Haveliwala, Efficient computation of PageRank, Stanford Univ. Dept. of Computer Science
technical

universities could serve as the trusted set. This technique avoids sharing the tax in the PageRank
calculation with the large numbers of supporting pages in spam farms and thus preferentially reduces
their PageRank.
Spam Mass: To identify spam farms, we