
automatically remove duplicates from your recommendations.

Evaluation Measures. We will evaluate your method on our secret test data (which will have exactly the same values of d and L as the training data, i.e. exactly the same set of features and items) using the two performance measures described below.

1. Precision at k (prec@k): For this, we will first choose some k ∈ [5]. Then we will ask, for every test user, what fraction of the top k recommendations given by your method for that user were actually liked by that user. This will be a fractional number in [0, 1]. Taking the average of this number across all test users will give us the prec@k of your method.

2. Macro Precision at k (mprec@k): For this, we will first choose some k ∈ [5]. Then we will go over each of the items j ∈ [L] and, for each item, look at all the test users that like that item. We will then calculate the fraction of these users for whom your method did recommend item j in its top k recommendations. This will be a fractional number in [0, 1]. Taking the average of this number across all items will give us the mprec@k of your method.

The difference between prec@k and mprec@k arises largely from the presence of rare items in the dataset, i.e. items that very few users like. You will see in your data itself that an average item is liked by just n̂ = 40 users, whereas there are a total of n = 10000 users! A method can get a very high prec@k by just recommending popular items to everyone (akin to recommending an iPhone to everyone), but such a method may do poorly on mprec@k, which gives a high score only if the method pays good attention to all items, not just the popular ones.

Your Data. You have been provided, in the assignment package, training data for n = 10000 users. Each user i is represented as a d = 16385 dimensional feature vector x_i. The feature vectors are sparse and the average number of non-zero features in a data point is only d̂ ≈ 519. There are a total of L = 3400 items and each user is associated with a label vector y_i ∈ {0, 1}^L. The label vectors are also quite sparse and an average user likes only L̂ ≈ 13.5 items.
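
As a rough illustration of the two measures defined above (Precision at k and Macro Precision at k), the following Python sketch computes both from a sparse label matrix and an array of top-k item indices. The function names, the argument layout and the toy data are assumptions made for this example and are not the evaluation routines shipped with the assignment package. In particular, the choice to score an item that no test user likes as 0 is an assumption, since the text does not say how such items are handled.

```python
import numpy as np
from scipy.sparse import csr_matrix


def prec_at_k(y_true, top_k_items, k):
    """Mean over users of the fraction of top-k recommendations the user likes."""
    n_users = y_true.shape[0]
    rows = np.repeat(np.arange(n_users), k)
    cols = np.asarray(top_k_items)[:, :k].ravel()
    # hits[i, r] is 1 if the r-th recommendation for user i is liked by user i
    hits = np.asarray(y_true[rows, cols]).reshape(n_users, k)
    return hits.mean(axis=1).mean()


def mprec_at_k(y_true, top_k_items, k):
    """Mean over items of the fraction of the item's fans who were recommended it."""
    n_users, n_items = y_true.shape
    rows = np.repeat(np.arange(n_users), k)
    cols = np.asarray(top_k_items)[:, :k].ravel()
    # rec[i, j] = 1 if item j is among the top-k recommendations for user i
    rec = csr_matrix((np.ones(rows.size), (rows, cols)), shape=(n_users, n_items))
    rec.data[:] = 1.0  # guard against duplicate recommendations being summed
    hits_per_item = np.asarray(rec.multiply(y_true).sum(axis=0)).ravel()
    likes_per_item = np.asarray(y_true.sum(axis=0)).ravel()
    # Assumption: an item liked by no user contributes 0 to the average
    frac = np.divide(hits_per_item, likes_per_item,
                     out=np.zeros(n_items), where=likes_per_item > 0)
    return frac.mean()


# Toy data (hypothetical): 4 users, 3 items; item 0 is popular, items 1 and 2 are rare
Y_toy = csr_matrix(np.array([[1, 0, 0],
                             [1, 0, 0],
                             [1, 1, 0],
                             [1, 0, 1]]))
# A recommender that suggests only the popular item 0 to every user
top_toy = np.zeros((4, 1), dtype=int)

p = prec_at_k(Y_toy, top_toy, k=1)
m = mprec_at_k(Y_toy, top_toy, k=1)
print(p, m)
```

On this toy data, every user likes the single recommended item, so the per-user measure is perfect, while only one of the three items is ever recommended to the users who like it, so the per-item measure is much lower. This mirrors the popular-item effect described above.
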
Routines have also been provided to you that read the data and show you how we will evaluate your submitted code using prec@k and mprec@k.

Caution: all matrices in this problem (feature and label) are sparse and should be stored in a compressed representation (csr_matrix from scipy). The data loading routine given to you as part of the assignment package will return compressed matrices. Be careful about densifying these matrices (e.g. using toarray() from scipy) as you may run into memory issues.

Code and Resource Repository. The following repository offers the state of the art in large-scale multi-label learning, with papers that describe various methods that do well on such problems, as well as readily available code for some of these methods.
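
To make the caution about densifying concrete, here is a back-of-the-envelope Python sketch using the sizes quoted in the text (n = 10000, d = 16385, about 519 non-zeros per row). It assumes 64-bit floating point values and 32-bit indices, which may not match the actual data files, and the helper `iter_dense_batches` is a hypothetical name introduced only for this example. It also shows one possible way to densify only a small batch of rows at a time when a dense array is unavoidable.

```python
from scipy.sparse import random as sparse_random

n, d, avg_nnz = 10000, 16385, 519  # figures quoted in the text

# A dense float64 array stores every entry, zero or not
dense_bytes = n * d * 8
# CSR stores one value and one column index per non-zero, plus n + 1 row pointers
# (assumption: float64 values and int32 indices)
csr_bytes = n * avg_nnz * (8 + 4) + (n + 1) * 4

print(f"dense: {dense_bytes / 1e9:.2f} GB")
print(f"csr  : {csr_bytes / 1e6:.1f} MB")


def iter_dense_batches(X, batch_size=256):
    """Yield (start_row, dense_block) pairs, densifying only batch_size rows at once."""
    for start in range(0, X.shape[0], batch_size):
        yield start, X[start:start + batch_size].toarray()


# Small synthetic stand-in for the real feature matrix (not the assignment data)
X_small = sparse_random(1000, 2000, density=0.03, format="csr", random_state=0)

rows_seen = 0
for start, block in iter_dense_batches(X_small, batch_size=256):
    rows_seen += block.shape[0]
print(rows_seen)
```

Under these assumptions the dense feature matrix alone would need over a gigabyte, while the compressed form needs tens of megabytes, which is why row-wise batching is a common compromise.
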
  • Fall '16
  • Piyush Rai
  • Computer file
