Data Mining Hands On-Lab Assignment 2 - Todor Dimov CISC...


Todor Dimov
10/26/18
CISC 4631 Data Mining
Lab Assignment 2

Task 1:

The first step I took in finding a classification model for the text classification data set automobile.arff, which labels each text as related to the auto industry or not, was in Weka’s preprocessing stage. There I manually removed, through Edit, the five instances of an irrelevant label called “first_done_at”. Because the data is text, I next split each text into words so that their frequencies could be analyzed, using the ‘StringToWordVector’ filter with the following parameters:

  • IDFTransform: True
  • TFTransform: True
  • lowerCaseTokens: True
  • outputWordCounts: True
  • tokenizer: delimiters set to .,;:’”()?!/ - ¿¡&#
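To make the effect of these filter settings concrete, here is a minimal Python sketch of the same idea: lowercase the text, split it on the listed delimiters, then weight each word count. It is not Weka's code. The function names are hypothetical, whitespace is added to the delimiter set as an assumption, and the weighting log(1 + count) * log(N / df) follows my reading of how Weka documents the TF and IDF transforms.

```python
import math
import re
from collections import Counter

# Delimiters as printed in the report. Treating whitespace as a
# delimiter too is an assumption made for this sketch.
DELIMITERS = ".,;:’”()?!/ -¿¡&#"


def tokenize(text):
    """Lowercase the text and split it on any delimiter or whitespace."""
    pattern = "[" + re.escape(DELIMITERS) + r"\s]+"
    return [tok for tok in re.split(pattern, text.lower()) if tok]


def tf_idf_vectors(documents):
    """Return one {word: weight} dict per document.

    weight = log(1 + count) * log(N / df), where count is the number of
    times the word occurs in the document, N is the number of documents,
    and df is the number of documents that contain the word.
    """
    counts = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())  # each document counts once per word
    return [
        {w: math.log(1 + f) * math.log(n_docs / doc_freq[w]) for w, f in c.items()}
        for c in counts
    ]
```

With this weighting, a word that appears in every document gets weight zero, which is why very common words contribute little to the classifier.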

Afterwards, I removed most of the word attributes that were just numbers because I did not believe they were important, keeping only prices and year dates. In the classification stage, I used NaiveBayesMultinomial with fairly good results: with a model that correctly classified 99.7807 percent of instances, I conclude
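The number filtering described above was done by hand in Weka, and the report does not state the exact rule. As a rough illustration only, the sketch below drops attribute names made up entirely of digits unless they look like a four-digit year. The 1900-2099 year range and the function name are assumptions. Price tokens such as "$500" are not purely numeric, so this rule leaves them in place.

```python
import re

# Assumption: a "year" is a four-digit number between 1900 and 2099.
YEAR = re.compile(r"(19|20)\d{2}")
ALL_DIGITS = re.compile(r"\d+")


def keep_attribute(name):
    """Return True if a word attribute should be kept.

    Purely numeric names are dropped unless they look like a year.
    Everything else, including price tokens such as "$500", is kept.
    """
    if ALL_DIGITS.fullmatch(name):
        return YEAR.fullmatch(name) is not None
    return True


def filter_attributes(names):
    """Keep only the attribute names that pass keep_attribute."""
    return [n for n in names if keep_attribute(n)]
```

For example, `filter_attributes(["engine", "42", "2015", "$500"])` keeps everything except `"42"`.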
