IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 12, NO. 3, JUNE 2008

An Evolutionary Algorithm Approach to Optimal Ensemble Classifiers for DNA Microarray Data Analysis

Kyung-Joong Kim, Member, IEEE, and Sung-Bae Cho, Senior Member, IEEE

Abstract—In general, the analysis of microarray data requires two steps: feature selection and classification. From a variety of feature selection methods and classifiers, it is difficult to find optimal ensembles composed of any feature-classifier pairs. This paper proposes a novel method based on the evolutionary algorithm (EA) to form sophisticated ensembles of features and classifiers that can be used to obtain high classification performance. In spite of the exponential number of possible ensembles of individual feature-classifier pairs, an EA can produce the best ensemble in a reasonable amount of time. The chromosome is encoded with real values to decide the weight for each feature-classifier pair in an ensemble. Experimental results with two well-known microarray datasets in terms of time and classification rate indicate that the proposed method produces ensembles that are superior to individual classifiers, as well as other ensembles optimized by random and greedy strategies.

Index Terms—Classification, DNA microarray, ensemble, evolutionary algorithm (EA), feature selection, real-valued encoding.

I. INTRODUCTION

DNA MICROARRAYS measure the expression levels of thousands of genes simultaneously. This measurement process consists of either monitoring each gene multiple times or using a single time point in different states (for example, when dealing with diseases or types of tumors). It is important to identify functionally related genes or to classify samples by using informative genes. In this paper, we focus on the latter case: the classification of DNA microarray data.
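The real-valued encoding mentioned in the abstract, where each gene of the chromosome is the weight of one feature-classifier pair, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the weighted-majority vote, the genetic operators (truncation selection, one-point crossover, Gaussian mutation, elitism), the parameter values, the function names, and the toy predictions and labels.

```python
import random

# Hypothetical toy data: binary predictions (0/1) of four feature-classifier
# pairs on eight validation samples, plus the true labels. Not from the paper.
TOY_PREDICTIONS = [
    [1, 0, 1, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 1, 0, 1, 0],
]
TOY_LABELS = [1, 0, 1, 1, 0, 0, 1, 0]


def weighted_vote(weights, pair_predictions):
    """Combine the pairs' 0/1 predictions by weighted majority vote."""
    combined = []
    for i in range(len(pair_predictions[0])):
        score_1 = sum(w for w, p in zip(weights, pair_predictions) if p[i] == 1)
        score_0 = sum(w for w, p in zip(weights, pair_predictions) if p[i] == 0)
        combined.append(1 if score_1 > score_0 else 0)
    return combined


def accuracy(weights, pair_predictions, labels):
    """Fraction of samples the weighted ensemble classifies correctly."""
    predicted = weighted_vote(weights, pair_predictions)
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


def evolve_weights(pair_predictions, labels, pop_size=20, generations=30,
                   mutation_sigma=0.1, elite=2, seed=0):
    """Evolve one real-valued weight per pair (needs at least two pairs)."""
    rng = random.Random(seed)
    n_pairs = len(pair_predictions)

    def fitness(weights):
        return accuracy(weights, pair_predictions, labels)

    # Each chromosome is a list of weights in [0, 1], one gene per pair.
    population = [[rng.random() for _ in range(n_pairs)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = [list(w) for w in population[:elite]]  # elitism
        parents = population[:pop_size // 2]              # truncation selection
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_pairs)               # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation, clamped back into [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, mutation_sigma)))
                     for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)


best = evolve_weights(TOY_PREDICTIONS, TOY_LABELS)
print("weights:", [round(w, 2) for w in best])
print("ensemble accuracy:", accuracy(best, TOY_PREDICTIONS, TOY_LABELS))
```

Because the elite chromosomes are copied unchanged into each new generation, the best fitness in the population never decreases. In a real experiment, fitness would be measured on held-out validation data rather than on the eight toy samples used here.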
In this kind of classification, classifiers receive input vectors from the feature selection step to make decisions. However, it is difficult to choose appropriate feature selection methods and classifiers because there are so many candidates. Cho and Won explored seven feature selection methods and four classifiers in three benchmark datasets to systematically evaluate the performance of both the feature selection methods, as well as the machine learning classifiers. The researchers used to test different feature-classifier pairs, using various datasets,

Manuscript received March 21, 2006; revised October 9, 2006, January 22, 2007, and June 1, 2007. This work was supported by MIC, Korea, under ITRC IITA-2007-(C1090-0701-0046).
K. J. Kim was with the Department of Computer Science, Yonsei University, Seoul 120-749, Korea (e-mail: email@example.com). He is now with the Department of Mechanical and Aerospace Engineering, Cornell University, Ithaca, NY 14853 USA....
This note was uploaded on 07/08/2011 for the course CS 101 taught by Professor Khliu during the Spring '11 term at Xiamen University.