06137387.pdf - 2011 11th IEEE International Conference on...

Twitter Trending Topic Classification

Kathy Lee, Diana Palsetia, Ramanathan Narayanan, Md. Mostofa Ali Patwary, Ankit Agrawal, and Alok Choudhary
Department of Electrical Engineering and Computer Science
Northwestern University, Evanston, IL 60208 USA
Email: {kml649, drp925, ran310, mpatwary, ankitag, choudhar}@eecs.northwestern.edu

Abstract: With the increasing popularity of microblogging sites, we are in the era of information explosion. As of June 2011, about 200 million tweets are being generated every day. Although Twitter provides a real-time list of the most popular topics people tweet about, known as Trending Topics, it is often hard to understand what these trending topics are about. Therefore, it is important and necessary to classify these topics into general categories with high accuracy for better information retrieval. To address this problem, we classify Twitter Trending Topics into 18 general categories such as sports, politics, technology, etc. We experiment with two approaches for topic classification: (i) the well-known Bag-of-Words approach for text classification and (ii) network-based classification. In the text-based classification method, we construct word vectors from the trending topic definition and tweets, and the commonly used tf-idf weights are used to classify the topics using a Naive Bayes Multinomial classifier. In the network-based classification method, we identify the top 5 similar topics for a given topic based on the number of common influential users. The categories of the similar topics and the number of common influential users between the given topic and its similar topics are used to classify the given topic using a C5.0 decision tree learner. Experiments on a database of 768 randomly selected trending topics (over 18 classes) show that classification accuracy of up to 65% and 70% can be achieved using text-based and network-based classification modeling, respectively.

Keywords: Social Networks, Twitter, Topic Classification

I. INTRODUCTION

Twitter¹ is an extremely popular microblogging site, where users search for timely and social information such as breaking news, posts about celebrities, and trending topics. Users post short text messages called tweets, which are limited to 140 characters in length and can be viewed by the user's followers. Anyone who chooses to have others' tweets posted on one's timeline is called a follower. Twitter has been used as a medium for real-time information dissemination, and it has been used in various brand campaigns, elections, and as a news medium. Since its launch in 2006, its popularity has increased dramatically. As of June 2011, about 200 million tweets are being generated every day [1]. When a new topic becomes popular on Twitter, it is listed as a trending topic, which may take the form of short phrases (e.g., Michael Jackson) or hashtags (e.g., #election). What the Trend² provides a regularly updated list of trending topics from Twitter. It is very interesting to know what topics are trending and what people in other parts of the world are interested in. However,

¹ http://www.twitter.com
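The similarity step of the network-based method described in the abstract (finding the top 5 topics that share the most influential users with a given topic) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `top_similar_topics` and `build_features`, the user IDs, and the toy topic data are all hypothetical, and the C5.0 decision tree that would consume the resulting features is not shown.

```python
def top_similar_topics(target, influencers_by_topic, k=5):
    """Rank other topics by the number of influential users shared with `target`.

    `influencers_by_topic` maps a topic name to a set of user IDs
    (an assumed data layout, chosen only for this illustration).
    """
    target_users = influencers_by_topic[target]
    overlaps = []
    for topic, users in influencers_by_topic.items():
        if topic == target:
            continue
        shared = len(target_users & users)  # size of the set intersection
        if shared > 0:
            overlaps.append((topic, shared))
    # Most shared users first; break ties by topic name for determinism.
    overlaps.sort(key=lambda pair: (-pair[1], pair[0]))
    return overlaps[:k]


def build_features(target, influencers_by_topic, category_of, k=5):
    """Pair each similar topic's known category with its shared-user count.

    These (category, count) pairs are the kind of input the abstract says
    is passed to a decision tree learner; the learner itself is omitted.
    """
    return [
        (category_of[topic], shared)
        for topic, shared in top_similar_topics(target, influencers_by_topic, k)
    ]


# Toy, made-up data: four topics and their influential users.
influencers_by_topic = {
    "#worldcup": {"u1", "u2", "u3", "u4"},
    "Lionel Messi": {"u1", "u2", "u3"},
    "#election": {"u4", "u9"},
    "iPhone": {"u7", "u8"},
}
category_of = {
    "Lionel Messi": "sports",
    "#election": "politics",
    "iPhone": "technology",
}

similar = top_similar_topics("#worldcup", influencers_by_topic)
features = build_features("#worldcup", influencers_by_topic, category_of)
print(similar)   # [('Lionel Messi', 3), ('#election', 1)]
print(features)  # [('sports', 3), ('politics', 1)]
```

In this toy example, "iPhone" shares no influential users with "#worldcup" and is therefore dropped, so fewer than five similar topics are returned. How the paper handles that case is not stated in the text above.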