Libra-sigir-wkshp-99


... make recommendations, this approach raises concerns about privacy and access to proprietary customer data. Learning individualized profiles from descriptions of examples (content-based recommending [3]), on the other hand, allows a system to uniquely characterize each patron without having to match their interests to someone else's. Items are recommended based on information about the item itself rather than on the preferences of other users. This also allows for the possibility of providing explanations that list the content features that caused an item to be recommended, potentially giving readers confidence in the system's recommendations and insight into their own preferences. Finally, a content-based approach can allow users to provide initial subject information to aid the system.

Machine learning for text categorization has been applied to content-based recommending of web pages [26] and newsgroup messages [16]; however, to our knowledge, it has not previously been applied to book recommending.

We have been exploring content-based book recommending by applying automated text-categorization methods to semistructured text extracted from the web. Our current prototype system, Libra (Learning Intelligent Book Recommending Agent), uses a database of book information extracted from web pages at Amazon.com. Users provide 1-10 ratings for a selected set of training books; the system then learns a profile of the user using a Bayesian learning algorithm and produces a ranked list of the most recommended additional titles from the system's catalog. As evidence for the promise of this approach, we present initial experimental results on several data sets of books randomly selected from particular genres such as mystery, science, literary fiction, and science fiction and rated by different users.
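The rate, learn, and rank loop described above can be illustrated with a small sketch. The text at this point says only that the profile is learned with "a Bayesian learning algorithm", so everything specific below is an assumption made for illustration: a bag-of-words naive Bayes model, a binary like/dislike split at a rating of 6, Laplace smoothing, and the hypothetical function names `train_profile`, `log_odds`, and `recommend`. It is not a description of Libra's actual implementation.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_profile(rated_books, threshold=6):
    """Build per-class word counts from (description, rating) pairs.

    Ratings at or above `threshold` count as "like"; this cut-off is an
    assumption of the sketch, not something stated in the text.
    """
    counts = {"like": Counter(), "dislike": Counter()}
    docs = Counter()
    for description, rating in rated_books:
        label = "like" if rating >= threshold else "dislike"
        counts[label].update(tokenize(description))
        docs[label] += 1
    vocab = set(counts["like"]) | set(counts["dislike"])
    return {"counts": counts, "docs": docs, "vocab": vocab}


def log_odds(profile, description):
    """Log posterior odds of "like" versus "dislike" for one description."""
    counts, docs, vocab = profile["counts"], profile["docs"], profile["vocab"]
    total_docs = docs["like"] + docs["dislike"]
    score = 0.0
    for label, sign in (("like", 1.0), ("dislike", -1.0)):
        # Laplace-smoothed class prior.
        log_p = math.log((docs[label] + 1) / (total_docs + 2))
        denom = sum(counts[label].values()) + len(vocab)
        for word in tokenize(description):
            if word in vocab:  # words never seen in training are ignored
                log_p += math.log((counts[label][word] + 1) / denom)
        score += sign * log_p
    return score


def recommend(profile, catalog):
    """Return catalog titles sorted from most to least recommended."""
    return sorted(
        catalog,
        key=lambda title: log_odds(profile, catalog[title]),
        reverse=True,
    )


# Toy data, invented for this example only.
training = [
    ("detective murder mystery clue", 9),
    ("murder detective suspect", 8),
    ("galaxy starship alien robot", 2),
    ("alien planet starship", 3),
]
catalog = {
    "Book A": "a detective follows a clue to the murder suspect",
    "Book B": "a robot pilots a starship to an alien planet",
}

profile = train_profile(training)
ranking = recommend(profile, catalog)
print(ranking)
```

Because the score is a sum of per-word log-probability ratios, the words contributing most to it could also be listed as an explanation of a recommendation, in the spirit of the content-feature explanations mentioned earlier.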
We use standard experimental methodology from machine learning and present results for several evaluation metrics on independent test data, including the rank correlation coefficient and the average rating of top-ranked books. These experiments are based on ratings from random samplings of items, and we discuss problems with previous experiments that employ skewed samples of user-selected examples to evaluate performance.

The remainder of the paper is organized as follows. Section 2 provides an overview of the system, including the algorithm used to learn user profiles. Section 3 presents results of our initial experimental evaluation of the system. Section 4 discusses topics for further research, and Section 5 presents our conclusions on the advantages and promise of content-based book recommending.

2 SYSTEM DESCRIPTION

2.1 Extracting Information and Building a Database

First, an Amazon subject search is performed to obtain a list of book-description URLs of broadly relevant titles. Libra then downloads each of these pages and uses a simple pattern-based information-extraction system to extract data about each title. Information extraction (IE) is the task of locating specific pieces of information from...
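As a rough illustration of what pattern-based extraction from a book-description page can look like, the sketch below applies one hand-written regular expression per field to a page fragment. The markup in `SAMPLE_PAGE`, the field names, and the patterns themselves are all hypothetical, since the text shown here does not reproduce the page layout or the extraction rules Libra uses.

```python
import re

# Hypothetical page fragment; the real markup is not shown in the text.
SAMPLE_PAGE = """
<title>Example Book Title</title>
<b>Author:</b> Jane Doe<br>
<b>Subjects:</b> Mystery; Detective fiction<br>
"""

# One hand-written pattern per field (illustrative assumptions only).
FIELD_PATTERNS = {
    "title": re.compile(r"<title>(.*?)</title>", re.S),
    "author": re.compile(r"<b>Author:</b>\s*(.*?)<br>", re.S),
    "subjects": re.compile(r"<b>Subjects:</b>\s*(.*?)<br>", re.S),
}


def extract_fields(page):
    """Fill each field with the first match of its pattern, or None."""
    record = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(page)
        record[field] = match.group(1).strip() if match else None
    return record


record = extract_fields(SAMPLE_PAGE)
print(record)
```

Fields whose pattern does not match are stored as `None` rather than raising an error, so a page with a missing section still yields a partial record for the database.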

This document was uploaded on 09/12/2013.
