
Data Mining Assignment #4
CSC592 – Fall ’05

Problem Statement

Consider the following sets of data available from the course website:

The usnews data set. This dataset contains college data taken from the U.S. News & World Report's Guide to America's Best Colleges.

You are to construct association rules over the whole data set using the apriori algorithm available in the associate tab of Weka. The data set represents raw problem domain data which you will need to first translate into the ARFF format and transform in order to build association rules. Once you have the data in the appropriate format you might want to consider the following data preparation questions:

(a) Should instances with missing values be deleted?
(b) Should attributes with missing values be deleted?
(c) Should missing values be coded in a special way and then used in the data mining task?

(d) Are some of the attributes categorical even though their levels are expressed as numbers?
(e) Etc.

The apriori algorithm can only handle categorical data; therefore, you will need to discretize the attributes. For this you should use the “discretize” filter available in Weka.

Your report should discuss the following questions:

(1) What are your top five rules?
(2) What is their support/confidence?
(3) Are they intuitive from the domain perspective or completely surprising?
(4) How do the association rules change if you increase/decrease the number of bins (levels) in the attributes?

Handing in your assignment

Write a description of your experiments and your findings and submit this together with the discovered association rules. Hand in your typewritten report in class on Friday, October 21st ....
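
As a rough illustration of the discretization step and of the support/confidence figures asked for in questions (2) and (4), the following Python sketch bins two numeric attributes and scores one rule by hand. It is not Weka's code. It assumes equal-width binning, and the attribute names (tuition, grad_rate), the values, and the helper names are all invented for illustration and are not taken from the usnews data. Missing values are written as "?", the marker ARFF uses, and here they simply never match a rule item, which is only one possible answer to question (c).

```python
MISSING = "?"  # ARFF marker for a missing value


def equal_width_bins(values, n_bins):
    """Replace each number by an equal-width bin label; None stays missing."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant attribute
    labels = []
    for v in values:
        if v is None:
            labels.append(MISSING)
        else:
            # the maximum value would land in bin n_bins, so clamp it
            idx = min(int((v - lo) / width), n_bins - 1)
            labels.append(f"bin{idx + 1}of{n_bins}")
    return labels


def support(rows, items):
    """Fraction of rows that contain every (attribute, value) pair in items."""
    hits = sum(all(row.get(a) == v for a, v in items) for row in rows)
    return hits / len(rows)


def confidence(rows, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    s_a = support(rows, antecedent)
    return support(rows, antecedent + consequent) / s_a if s_a else 0.0


# Invented toy values (NOT from the usnews data); None marks a missing entry.
TUITION = [5000, 7000, 9000, 20000, 22000, None]
GRAD_RATE = [40, 45, 50, 80, 85, 90]


def build_rows(n_bins):
    """Discretize both toy attributes and return one dict per instance."""
    t = equal_width_bins(TUITION, n_bins)
    g = equal_width_bins(GRAD_RATE, n_bins)
    return [{"tuition": a, "grad_rate": b} for a, b in zip(t, g)]


if __name__ == "__main__":
    for n_bins in (2, 3):
        rows = build_rows(n_bins)
        top = f"bin{n_bins}of{n_bins}"  # label of the highest bin
        lhs, rhs = [("tuition", top)], [("grad_rate", top)]
        print(n_bins, "bins:",
              "support =", round(support(rows, lhs + rhs), 3),
              "confidence =", round(confidence(rows, lhs, rhs), 3))
```

Rerunning the loop with other bin counts shows how the same underlying relationship turns into different items, and hence different rules, as the number of levels changes. In the assignment itself the binning and the rule search are done by Weka, so this sketch is only a way to check by hand what the reported support and confidence numbers mean.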

This note was uploaded on 10/03/2011 for the course CSC 592 taught by Professor Staff during the Spring '11 term at Rhode Island.
