
Project Report for Data Mining Course

Abstract

Newly emerging technologies in Data Mining and Artificial Intelligence have recently brought more opportunities and possibilities to medical and biological studies. Biochip technology, in which collections of microscopic DNA spots are attached to a solid surface, is one of the research fields that benefits most from deep learning's ability to recognize patterns. In this work, we implement several traditional machine learning methods as well as a deep-learning-based method to tackle the multiclass classification of biochips, and compare their performance on commonly used metrics. Specifically, we use Principal Component Analysis (PCA) to reduce the dimensionality of the raw data before any training. For the classical methods, we use Logistic Regression, KNN, SVM and Random Forests to address the classification task. For the deep-learning method, we apply a Long Short-Term Memory (LSTM) neural network to mine the contextual information hidden in the data sequence. We experiment extensively on the given dataset, Gene Chip Data.

Introduction

Machine learning methods such as Logistic Regression and SVM (Support Vector Machine) have long been applied to data-driven studies, learning specific data patterns in the form of a set of parameters that reflect some ground truth. Deep learning, on the other hand, has recently regained researchers' attention owing to its unprecedentedly strong power of data representation. This representational ability has proven so useful that studies in many fields have begun to adapt it to their own contexts and uses.

Biochip technology, in which collections of microscopic DNA spots are attached to a solid surface, is one of the research fields that benefits most from deep learning's ability to recognize patterns. The complex encoding mechanisms hidden in gene data can, in a sense, be captured by a properly designed deep learning framework. In this work, we implement several traditional machine learning methods as well as a deep-learning-based method to tackle the multiclass classification of biochips, and compare their performance on commonly used metrics.

The remainder of this report is organized as follows. Section 2 introduces the dimensionality reduction method (PCA) used in this work. Section 3 describes the four classical machine learning methods we implemented. Section 4 introduces the deep learning framework we propose and implement, which is based on a Long Short-Term Memory (LSTM) recurrent neural network. Section 5 presents the experimental details and results. Section 6 concludes this work and proposes some potential future work.

Copyright © 2019, Association for the Advancement of Artificial Intelligence. All rights reserved.
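
The "reduce with PCA, then classify" pipeline outlined in the abstract can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration: the data are synthetic (not the Gene Chip Data), the function names (`center`, `top_components`, `project`, `knn_predict`, `make_data`, `run_demo`) are hypothetical, the principal directions are found by power iteration with deflation (the report does not say how PCA was computed), and only KNN is shown out of the four classical classifiers.

```python
import math
import random
from collections import Counter

def center(X):
    """Subtract the per-feature mean; return the centered rows and the mean."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - mean[j] for j in range(d)] for row in X], mean

def top_components(Xc, k, iters=200):
    """Top-k principal directions of centered data (power iteration + deflation)."""
    n, d = len(Xc), len(Xc[0])
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(0)
    comps = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient: variance explained along v.
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
        comps.append(v)
        # Deflate so the next pass finds the next direction.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return comps

def project(X, mean, comps):
    """Map raw rows onto the principal directions."""
    d = len(mean)
    return [[sum((row[j] - mean[j]) * c[j] for j in range(d)) for c in comps]
            for row in X]

def knn_predict(train_Z, train_y, z, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(range(len(train_Z)),
                     key=lambda i: math.dist(train_Z[i], z))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def make_data(n_per_class=30, d=10, seed=0):
    """Synthetic 3-class data: classes differ only in the first three features."""
    rng = random.Random(seed)
    X, y = [], []
    for c in range(3):
        for _ in range(n_per_class):
            X.append([rng.gauss(4.0 * c if j < 3 else 0.0, 1.0) for j in range(d)])
            y.append(c)
    order = list(range(len(X)))
    rng.shuffle(order)
    return [X[i] for i in order], [y[i] for i in order]

def run_demo(split=60):
    """Fit PCA on the training rows only, then classify held-out rows with KNN."""
    X, y = make_data()
    Xc, mean = center(X[:split])
    comps = top_components(Xc, k=2)
    Z_train = project(X[:split], mean, comps)
    Z_test = project(X[split:], mean, comps)
    preds = [knn_predict(Z_train, y[:split], z) for z in Z_test]
    return sum(p == t for p, t in zip(preds, y[split:])) / len(preds)

if __name__ == "__main__":
    print(f"held-out accuracy: {run_demo():.2f}")
```

Fitting the projection on the training rows only, and reusing the same mean and directions for the held-out rows, keeps test information out of the dimensionality reduction step. In practice a library implementation of PCA and of the classifiers would likely be preferable to this hand-rolled version.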

