classifiers, using the 70 features and 6561 emails. We do not perform any preliminary variable selection since most classifiers discussed here can perform automatic variable selection. To be fair, we use L1-SVM and penalized LR, where variable selection is performed automatically.

We test NNet using different numbers of units in the hidden layer (i.e. different sizes (s)) ranging from 5 to 35. Further, we apply different weight decays (w) on the interconnections, ranging from 0.1 to 2.5. We find that a NNet with s = 35 and w = 0.7 achieves the maximum AUC of 98.80%.

RF is optimized by choosing the number of trees used. Specifically, the number of trees we consider in this experiment is between 30 and 500. When using 50 trees on our dataset, RF achieves the maximum AUC of 95.48%.

We use the L1-SVM C-Classification machine with radial basis function (RBF) kernels. L1-SVM can automatically select input variables by suppressing parameters of irrelevant variables to zero. To achieve the maximum AUC over different parameter values, we consider cost of constraints violation values (i.e. the "c" constant of the regularization term in the Lagrange formulation) between 1 and 16, and values of the γ parameter in the kernels between 1 × 10^-8 and 2. We find that γ = 0.1 and c = 12 achieve the maximum AUC of 97.18%.

In LR we use penalized LR and apply different values of the lambda regularization parameter under the L2 norm, ranging from 1 × 10^-8 to 0.01. In our dataset λ = 1 × 10^-4 achieves the maximum AUC of 54.45%.

We use two BART models; the first is the original model and, as usual, we refer to this as "BART". The second model is the one we modify so as to be applicable to classification, referred to as "CBART". We test both models using different numbers of trees ranging from 30 to 300. Also, we apply different power parameters for the tree prior, to specify the depth of the tree, ranging from 0.1 to 2.5. We find that BART with 300 trees and power = 2.5 achieves the maximum AUC of 97.31%. However, CBART achieves the maximum AUC of 99.19% when using 100 trees and power = 1.
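
The parameter sweeps described above share one pattern: evaluate every combination of candidate values and keep the combination with the highest AUC. The following Python sketch illustrates that pattern only. The names `auc`, `grid_search_auc`, and the `evaluate` callback are hypothetical and do not come from the chapter, which does not state how the search was implemented; the scoring function in the usage lines is a toy stand-in, not a trained classifier.

```python
from itertools import product


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of classifier scores; higher means "more positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


def grid_search_auc(grid, evaluate):
    """Return (best_params, best_auc) over the Cartesian product of the grid.

    grid: dict mapping a parameter name to a list of candidate values.
    evaluate: hypothetical callback taking a params dict and returning an AUC,
              e.g. by training a classifier and scoring it on held-out data.
    """
    names = list(grid)
    best_params, best_auc = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_auc:
            best_params, best_auc = params, score
    return best_params, best_auc


# Usage with a toy scoring function (NOT a real model): it simply peaks at
# s = 35, w = 0.7 so that the search has something to find.
toy_grid = {"s": [5, 20, 35], "w": [0.1, 0.7, 2.5]}
best, _ = grid_search_auc(
    toy_grid, lambda p: -(p["s"] - 35) ** 2 - (p["w"] - 0.7) ** 2
)
print(best)
```

In practice the `evaluate` callback would fit the classifier with the given parameters and return its AUC; the pairwise `auc` shown here is quadratic in the sample size and is meant for illustration rather than for large datasets.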

Hardening Email Security via Bayesian Additive Regression Trees 203

4.4 Experimental results

In this section we present the experimental results by measuring the AUC using the complete dataset. In addition, we compare the precision, recall, f1, and WErr measures using the optimum parameters obtained in the previous section. Figure 15 illustrates the ROC curves for all classifiers.

Fig. 15. ROC for all classifiers using the complete dataset.

Table 3 reports the AUC, FP, FN, precision, recall, f1, and WErr for all classifiers. Note that the FP rate = 1 - precision and the FN rate = 1 - recall.

Table 3. Classifiers AUC, WErr, precision, recall, f1, false positive, false negative.

In Table 4, we compare the p-values of the error rates across the 10 subsamples used in cross validation by applying the Wilcoxon signed-rank test. Since CBART has an error rate comparable to those of RF, SVM, and NNet, we compare only these three classifiers.
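
To make the paired comparison concrete, the sketch below computes a two-sided exact Wilcoxon signed-rank test for two classifiers evaluated on the same folds. It is an illustration under stated assumptions, not the procedure used to produce Table 4: the function name `wilcoxon_signed_rank` is hypothetical, zero differences are dropped, tied absolute differences receive average ranks, and the input values in the usage lines are placeholders rather than measured error rates.

```python
from itertools import product


def wilcoxon_signed_rank(x, y):
    """Two-sided exact Wilcoxon signed-rank test for paired samples.

    x, y: equal-length sequences, e.g. per-fold error rates of two classifiers.
    Returns (W, p) where W = min(W+, W-) and p is the exact two-sided p-value.
    The null distribution is enumerated over all 2^n sign patterns, so this is
    only practical for small n (such as 10 cross-validation folds).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    total = n * (n + 1) / 2.0
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w = min(w_plus, total - w_plus)

    # Under the null hypothesis each rank is positive or negative with
    # probability 1/2; count sign patterns at least as extreme as observed.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        s = sum(r for r, keep in zip(ranks, signs) if keep)
        if min(s, total - s) <= w:
            extreme += 1
    return w, extreme / 2.0 ** n


# Usage with placeholder values (NOT error rates from the experiments):
# every difference has the same sign, the most extreme case for n = 10.
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [0] * 10
w, p = wilcoxon_signed_rank(a, b)
print(w, p)
```

A small p-value indicates that the per-fold differences between the two classifiers are unlikely under the hypothesis of no systematic difference; a large one is consistent with comparable error rates.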