Deutsch_2003_Evolutionary algorithms for finding optimal gene sets in microarray prediction

BIOINFORMATICS Vol. 19 no. 1 2003, Pages 45–52

Evolutionary algorithms for finding optimal gene sets in microarray prediction

J. M. Deutsch
University of California, Santa Cruz, USA

Received on August 8, 2001; revised on December 5, 2001; July 10, 2002; accepted on July 12, 2002

ABSTRACT

Motivation: Microarray data has been shown recently to be efficacious in distinguishing closely related cell types that often appear in different forms of cancer, but is not yet practical clinically. However, the data might be used to construct a minimal set of marker genes that could then be used clinically by making antibody assays to diagnose a specific type of cancer. Here a replication algorithm is used for this purpose. It evolves an ensemble of predictors, all using different combinations of genes, to generate a set of optimal predictors.

Results: We apply this method to the leukemia data of the Whitehead/MIT group that attempts to differentially diagnose two kinds of leukemia, and also to data of Khan et al. to distinguish four different kinds of childhood cancers. In the latter case we were able to reduce the number of genes needed from 96 to fewer than 15, while at the same time being able to classify all of their test data perfectly. We also apply this method to two other cases: diffuse large B-cell lymphoma data (Shipp et al., 2002), and data of Ramaswamy et al. on multiclass diagnosis of 14 common tumor types.

Availability: http://stravinsky.ucsc.edu/josh/gesses/
Contact: josh@physics.ucsc.edu

INTRODUCTION

cDNA and oligonucleotide microarrays have been used with great success to distinguish cell types from each other, and hence have promising applications to cancer diagnosis. While the histopathology of two cells may appear very similar, their clinical behavior, such as their response to drugs, can be drastically different.
The use of microarrays has been shown in many cases to provide clear differential diagnosis rivaling or surpassing other methods and leads to a clustering of data into different forms of a disease (DeRisi et al., 1996; Alon et al., 1999; Perou et al., 1999; Zhu et al., 1998; Wang et al., 1999; Schummer et al., 1999; Zhang et al., 1997; Alizadeh et al., 2000; Golub et al., 1999; Khan et al., 2001).

Many approaches have been used to classify microarray data. These include the use of artificial neural networks (Khan et al., 2001; Furey et al., 2000), logistic regression (Li and Yang, 2002), support vector machines (Brown et al., 2000; Furey et al., 2000), coupled two-way clustering (Getz et al., 2000), weighted votes with neighborhood analysis (Golub et al., 1999) and feature selection techniques (Xing et al., 2001). For much of the data all these techniques appear to give similar results and their performance improves as the amount and quality of data increases.