nonlinearly mapped into a potentially higher-dimensional
feature space by a mapping function
$\Phi : x_i \to \Phi(x_i)$.
Because the SVM algorithm in its dual formulation
uses only the inner product between elements of the
input space, knowledge of the kernel function
$k(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$ is sufficient to train the SVM.
Every hybridization experiment was repeated at least 3 times and the results
were averaged.

[Figure 1. Panel (a): probability density versus log(Ratio), panels labeled F8-5 and F8-3, with curves for methylation levels of 0%, 33%, 66% and 100%. Panel (b): CpG sites (rows, labeled by gene and CpG identifier, with significance levels p<0.1, p<0.01 and p<0.001) versus female and male samples (columns).]

Fig. 1. Validation of measurements. a) Quantification of methylation measurements for two CpG dinucleotides. A series of hybridizations was
performed with mixtures of artificially up- and downmethylated DNA fragments of the factor VIII exon 14 gene. Down- and upmethylated
DNA fragments were mixed at ratios of 0:3, 1:2, 2:1 and 3:0, representing a methylation status of 100%, 66%, 33% and 0%, respectively. For
the 4 kinds of mixtures, 59, 36, 40 and 63 identical slides were made. The log ratio of the CG and the TG detection oligomer hybridization
intensities was calculated and then averaged for experimental subgroups each containing 3 identical experiments. The distribution function of
the CG:TG ratios shows that measurement values of the different mixtures are well separated and therefore allow a high-resolution detection
of the methylation level of a single CpG. b) Gender separation. The 20 CpG sites with the most significant difference between female and
male samples are shown. Only non-cell-line leukemia and healthy control samples were used. As expected, the absolute majority of the
significant CpG dinucleotides come from the two X-chromosome genes (ELK1, AR). High probability of methylation corresponds to black,
uncertainty to grey and low probability to white. The labels on the left side of the plot are gene and CpG identifiers. The bottom-to-top ranking
of the CpGs is according to the significance of the difference between the means of the two groups, estimated by a two-sample t-test. Each
row corresponds to a single CpG and each column to the methylation levels of one sample.

It is not necessary to explicitly know the mapping and
a nonlinear SVM can be trained efficiently by computing
only the kernel function. Here we will only use the linear
kernel $k(x_i, x_j) = x_i \cdot x_j$ and the quadratic kernel
$k(x_i, x_j) = (x_i \cdot x_j + 1)^2$.
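The point that the mapping never has to be computed explicitly can be checked numerically. The following is a minimal sketch, not the authors' code: it reads the quadratic kernel as $(x_i \cdot x_j + 1)^2$ and compares it with the inner product under one explicit feature map. The function names (`linear_kernel`, `quadratic_kernel`, `quadratic_feature_map`) are hypothetical and chosen only for this illustration.

```python
import math
from itertools import combinations


def linear_kernel(x, y):
    """Linear kernel: the plain inner product x . y."""
    return sum(a * b for a, b in zip(x, y))


def quadratic_kernel(x, y):
    """Quadratic kernel: (x . y + 1)^2, computed in the input space."""
    return (linear_kernel(x, y) + 1.0) ** 2


def quadratic_feature_map(x):
    """One explicit feature map phi with phi(x) . phi(y) = (x . y + 1)^2.

    This map is an illustrative assumption; the kernel only fixes the
    inner products, not a particular choice of phi.
    """
    root2 = math.sqrt(2.0)
    features = [1.0]                                  # constant term
    features += [root2 * a for a in x]                # linear terms
    features += [a * a for a in x]                    # squared terms
    features += [root2 * x[i] * x[j]                  # cross terms, i < j
                 for i, j in combinations(range(len(x)), 2)]
    return features


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    y = [0.5, -1.0, 2.0]
    implicit = quadratic_kernel(x, y)
    explicit = linear_kernel(quadratic_feature_map(x), quadratic_feature_map(y))
    print(implicit, explicit)
```

For a d-dimensional input the explicit map has 1 + 2d + d(d-1)/2 components, whereas the kernel needs only one d-dimensional inner product, which is why the kernel form is the cheaper of the two.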
In the next section we will compare SVMs trained on
different feature sets. In order to evaluate the prediction
performance of these SVMs we used a cross-validation
method (Bishop, 1995). For each classification task, the
samples were partitioned into 8 groups of approximately
equal size. Then the SVM predicted the class for the
test samples in one group after it had been trained using
the 7 other groups. The number of misclassifications was
counted over 8 runs of the SVM algorithm for a...
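The partition-and-count procedure described above could be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: `train_fn` and `predict_fn` are hypothetical callables standing in for SVM training and prediction, the random seed is arbitrary, and a simple nearest-centroid classifier (explicitly not an SVM) is included only so the sketch runs on its own.

```python
import random


def partition_indices(n_samples, n_groups=8, seed=0):
    """Shuffle sample indices and deal them into groups of nearly equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[g::n_groups] for g in range(n_groups)]


def count_cv_errors(samples, labels, train_fn, predict_fn, n_groups=8, seed=0):
    """Count misclassifications over one full pass of n_groups train/test runs."""
    errors = 0
    for test_idx in partition_indices(len(samples), n_groups, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        # Train on the other groups, then predict the held-out group.
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        errors += sum(predict_fn(model, samples[i]) != labels[i]
                      for i in test_idx)
    return errors


def train_centroids(xs, ys):
    """Stand-in classifier (NOT an SVM): store the mean vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        acc = sums.setdefault(y, [0.0] * len(x))
        for d, value in enumerate(x):
            acc[d] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_centroid(model, x):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, model[y])))
```

Any classifier exposing the same two callables, including an SVM from a library of choice, can be passed to `count_cv_errors` in place of the stand-in.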
