featureSelectionDNAMethyCancerClassification_01bioinfo



... classify mRNA expression data (Ben-Dor et al., 2001; Weston et al., 2001; Brown et al., 2000; Gaasterland & Bekiranov, 2000).

The major problem of all classification algorithms, for methylation and expression data analysis alike, is the high dimension of the input space compared to the small number of available samples. Although the support vector machine is designed to overcome this problem, it still suffers under these extreme conditions. Therefore, feature selection is of crucial importance for good performance (Blum & Langley, 1997; Weston et al., 2001; Ben-Dor et al., 2001), and we give special consideration to it by comparing several methods on our methylation data.

The data set (Adorján et al., 2001) consists of cell lines and primary tissue obtained from patients with acute lymphoblastic leukemia (ALL) or acute myeloid leukemia (AML). A total of 17 ALL and 8 AML samples were included. The methylation status of these samples was evaluated at 81 CpG dinucleotide positions located in CpG-rich regions of the promoters, intronic and coding sequences of 11 genes. These genes were randomly selected from a panel of genes representing different pathways associated with tumorigenesis. Two of the 11 selected genes are located on the X chromosome.

[F. Model et al., p. S157]

The rest of the paper is organized as follows. In Section 2, we give a short description of the process used for generating the methylation data. In particular, we demonstrate how the process can be validated and calibrated. In Section 3, we give a short introduction to the support vector machine and describe our experimental setting. In Section 4, we address the problem of feature selection by introducing and comparing several methods. Finally, we conclude in Section 5 with a discussion of the potential impact of methylation analysis and future directions.
MICROARRAY-BASED METHYLATION ANALYSIS

To allow sequence-specific distinction of methylated from unmethylated states of CpG dinucleotides by hybridization analysis, total DNA from all samples was bisulphite treated, converting all unmethylated cytosines to uracil whereas methylated cytosines were conserved (Frommer et al., 1992). Regions of interest were then amplified by PCR using fluorescently labeled primers, so that originally unmethylated CpG dinucleotides appear as TG while originally methylated CpG sites are conserved as CG. PCR primers were designed to be complementary to DNA segments containing no CpG dinucleotides. This allowed unbiased amplification of both methylated and unmethylated alleles in one reaction.

All PCR products generated from an individual sample were mixed and hybridized to glass slides carrying, for each CpG position, a pair of immobilized oligonucleotides. Each of these detection oligonucleotides was designed to hybridize to the bisulphite-converted sequence around one CpG site that was either originally unmethylated (TG) or methylated (CG). Hybridization conditions were selected to allow the detection of the single-nucleotide differences between the TG and CG variants. The ratio of the two signals was calculated by comparing the intensities of the fluorescent signals. The sensitivity of the method for detection of methylation...
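
The conversion and read-out steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the toy sequence, the log-ratio form and the pseudocount are all assumptions made for illustration; the text only states that a ratio of the two signal intensities was computed.

```python
import math


def bisulphite_convert(seq, methylated_positions):
    """Return the sequence as read after bisulphite treatment and PCR.

    Unmethylated C is converted to U and amplified as T; methylated C is
    conserved. `methylated_positions` holds 0-based indices of methylated
    cytosines. Only uppercase bases are handled (illustrative helper).
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def log_ratio(cg_signal, tg_signal, pseudocount=1.0):
    """Log ratio of CG-probe to TG-probe intensity.

    The log form and the pseudocount are assumptions, not taken from the
    source text. Positive values suggest methylation at the CpG site.
    """
    return math.log((cg_signal + pseudocount) / (tg_signal + pseudocount))


# Toy sequence with CpG sites at indices 1-2 and 5-6.
seq = "ACGTCCGA"
fully_unmethylated = bisulphite_convert(seq, [])        # every C read as T
cpg_methylated = bisulphite_convert(seq, [1, 5])        # CpG cytosines kept
print(fully_unmethylated, cpg_methylated)
print(log_ratio(200.0, 50.0))
```

In this toy example the two converted sequences differ only at the CpG cytosines, which is the single-nucleotide TG versus CG difference that the paired detection oligonucleotides are designed to distinguish.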