Stat841f09 - Wiki Course Notes

Figure 1 classicalstandard cross validation then we

This preview shows page 1. Sign up to view the full content.

This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: lidation set is used to deside the parameters and the optimal model, and the test set is used to estimate the performance of the chosen model. A classical division is 50% for training set, and 25% each for validation set and test set. All of them are randomly selected from the original data set. Training set: a set of examples used for learning: to fit the parameters of the classifier. Validation set: a set of examples used to tune the parameters of a classifier. Test set: a set of examples used only to assess the performance of a fully trained classifier. Figure 1: Classical/Standard cross-validation Then, we only use the part of our data marked as the "training set" to train our algorithm, while keeping the remaining marked as the "validation set" untouched. As a result, the validation set will be totally unknown to the trained model. The error rate is then estimated by: , where is the cardinality of the validation set. When we change the complexity, the error generated by the validation set will have the same behavior as the test set, so we are able to choose the best parameters to get the lowest error. K-fold Cros s -validation Above is the simplest form of complexity control. However, in reality, it may be hard to collect data ??and we usually suffer from the curse of dimensionality??, and a larger data set may be hard to come by. Consequently, we may not be able to afford to sacrifice part of the limited resources. In this case we use another method that addresses this problem, K- fold cross- validation.The advantage of K- Fold Cross validation is that all the examples in the dataset are eventually used for both training and testing. We divide the data set into subsets roughly equal in size. The usual choice is . Generally, how to choose : if , leave one out, low bias, high variance. Each subset contains a single element, so the model is trained with all except one point, and then validated using that point. if , say 2- fold, 5- fold, high bias, low variance. Each subset contains approximately or of the data. For every th to estimate the prediction error part, we use the other parts to fit the model and test on the th part , where Figure 2: K-fold cross-validation wikicour senote.com/w/index.php?tit...
View Full Document

This document was uploaded on 03/07/2014.

Ask a homework question - tutors are online