This preview shows page 1. Sign up to view the full content.
Unformatted text preview: lidation set is used to deside the parameters and the optimal model, and the test set is used to estimate the
performance of the chosen model. A classical division is 50% for training set, and 25% each for validation set and
test set. All of them are randomly selected from the original data set.
Training set: a set of examples used for learning: to fit the parameters of the classifier.
Validation set: a set of examples used to tune the parameters of a classifier.
Test set: a set of examples used only to assess the performance of a fully trained classifier. Figure 1: Classical/Standard crossvalidation Then, we only use the part of our data marked as the "training set" to train our algorithm, while keeping the remaining
marked as the "validation set" untouched. As a result, the validation set will be totally unknown to the trained model. The error rate is then estimated by:
, where is the cardinality of the validation set. When we change the complexity, the error generated by the validation set will have the same behavior as the test set, so we are able to choose the best parameters to get
the lowest error. Kfold Cros s validation
Above is the simplest form of complexity control. However, in reality, it may be hard to collect data ??and
we usually suffer from the curse of dimensionality??, and a larger data set may be hard to come by.
Consequently, we may not be able to afford to sacrifice part of the limited resources. In this case we use
another method that addresses this problem, K fold cross validation.The advantage of K Fold Cross
validation is that all the examples in the dataset are eventually used for both training and testing. We divide
the data set into
subsets roughly equal in size. The usual choice is
.
Generally, how to choose : if
, leave one out, low bias, high variance. Each subset contains a single element, so the model is
trained with all except one point, and then validated using that point.
if , say 2 fold, 5 fold, high bias, low variance. Each subset contains approximately or of the data.
For every th
to estimate the prediction error part, we use the other parts to fit the model and test on the th part , where Figure 2: Kfold crossvalidation wikicour senote.com/w/index.php?tit...
View
Full
Document
This document was uploaded on 03/07/2014.
 Winter '13

Click to edit the document details