Final-Project-Instruction

CAP4770 – Introduction to Data Mining
Final Project Instruction

Data:
We use a gene data set for the final project. The data set is in attributes-in-rows format, as comma-separated values. It can be downloaded from this link:
http://users.cis.fiu.edu/~lli003/teaching/hw-sol/finalproject_datafiles.zip
Username/Password: CAP4770/student

The zip file contains three files:
train.csv: training data, consisting of 69 instances with 7,070 attributes.
train_class.txt: training classes, giving the true label for each instance in the training data, in the same order. There are 5 classes in total: MED, MGL, RHB, JPA and EPD.
test.csv: test data, consisting of 112 unlabeled instances with 7,070 attributes.

Goal:
Learn the best classifier from the training data and use it to predict the classes for the test data.

Due Date:
December 10th, 2010

Submission:
1. Project report, describing step by step how you built your classifier. Specifically, your report should include:
   1) How you cleaned the data;
   2) How you selected features;
   3) How you trained the classifier.
2. Predicted results (in a YourPatherId.txt file, one class per line in uppercase, in the same order as the test data).
3. Make sure that all the files you submit are zipped into a single file named “CAP4770_finalproject_firstname_lastname_patherid”.

Important hints:
1. The training and test data are both in attribute-in-row format. You will probably need to transform the data into attribute-in-column format so that it can be fed into Weka properly.
2.
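The two format requirements above (attribute-in-column input for Weka, and one uppercase class per line in the prediction file) can be sketched in Python as follows. This is only a minimal illustration, not the course's prescribed method. It assumes every row of the CSV has the same number of fields and makes no assumption about header rows or ID columns, which are simply transposed along with the values. The function names and the output file names in the usage note are hypothetical.

```python
import csv


def transpose_rows(rows):
    """Turn attribute-in-row records into attribute-in-column records.

    Assumption: every row has the same number of fields.
    """
    if not rows:
        return []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of fields")
    # zip(*rows) pairs up the i-th field of every row, i.e. one column.
    return [list(column) for column in zip(*rows)]


def transpose_csv(src_path, dst_path):
    """Read a CSV file, transpose it, and write the result to dst_path."""
    with open(src_path, newline="") as src:
        # Skip completely empty lines, which csv.reader yields as [].
        rows = [row for row in csv.reader(src) if row]
    with open(dst_path, "w", newline="") as dst:
        csv.writer(dst).writerows(transpose_rows(rows))


def format_predictions(labels):
    """Return the prediction file content: one uppercase class per line."""
    return "".join(label.strip().upper() + "\n" for label in labels)


def write_predictions(labels, path):
    """Write predicted classes to path, keeping the order of the test data."""
    with open(path, "w") as out:
        out.write(format_predictions(labels))
```

For example, `transpose_csv("train.csv", "train_columns.csv")` would write a transposed copy of the training data, and `write_predictions(predicted, "YourPatherId.txt")` would write a list of predicted labels in submission format. Converting the transposed CSV into something Weka can load (for instance, adding a header row and the class column) is a separate step that depends on the actual file layout.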

This note was uploaded on 08/29/2011 for the course CAP 4770 taught by Professor Staff during the Fall '08 term at FIU.
