# Stat841f09 Wiki Course Notes

## Complexity Control (October 30, 2009)

There are two issues (http://academicearth.org/lectures/underfitting-and-overfitting) that we have to avoid in machine learning:

1. Overfitting (http://en.wikipedia.org/wiki/Overfitting)
2. Underfitting

Overfitting occurs when our model is so complex, with so many degrees of freedom, that it learns every detail of the training set. Such a model has very high accuracy on the training set but very poor ability to predict outcomes for new instances, especially outside the domain of the training data. Overfitting is dangerous because it can push predictions far beyond the range of the training data, producing wild predictions in multilayer perceptrons even with noise-free data. The best way to avoid overfitting is to use lots of training data.

In a neural network, if the depth is too great the network will have many degrees of freedom and will learn every characteristic of the training data set. It will then reproduce the training set very precisely but will not be able to generalize the common structure of the training set to predict the outcome of new cases.

Underfitting occurs when the model we picked to describe the data is not complex enough, so it has a high error rate even on the training set.

There is always a trade-off: if our model is too simple, underfitting can occur, and if it is too complex, overfitting can occur.

Figure 2. The overfitting model passes through all the points of the training set but has poor predictive power for new points. The line model, in exchange, has some error on the training points but has extracted their main characteristic, and so has good predictive power.

Example 1. Consider the example shown in the figure. We have a training set and we want to find the model that fits it best. We can find a polynomial of high degree that passes almost exactly through all the points in the training set.
But in fact the training set comes from a line model. The problem now is that although the complex model has lower error on the training set, it diverges...
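Example 1 can be sketched numerically. The data below are hypothetical (the true line `y = 2x + 1` and the noise level are assumptions, not the figure's actual points): a degree-9 polynomial interpolates all ten training points, driving training error toward zero, while the simple line generalizes better to fresh points from the same model.

```python
import numpy as np

# Hypothetical data, assumed for illustration: the true model is the line
# y = 2x + 1, observed with Gaussian noise.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2.0 * x_train + 1.0 + rng.normal(0.0, 0.2, size=x_train.shape)
x_test = np.linspace(0.0, 1.0, 50)
y_test = 2.0 * x_test + 1.0 + rng.normal(0.0, 0.2, size=x_test.shape)

def fit_and_score(degree):
    """Fit a polynomial of the given degree to the training set;
    return (training MSE, test MSE)."""
    p = np.poly1d(np.polyfit(x_train, y_train, degree))
    train_mse = float(np.mean((p(x_train) - y_train) ** 2))
    test_mse = float(np.mean((p(x_test) - y_test) ** 2))
    return train_mse, test_mse

train_lo, test_lo = fit_and_score(1)  # simple line model
train_hi, test_hi = fit_and_score(9)  # degree 9: interpolates all 10 points

# The complex model wins on the training set but loses on new points.
print(f"degree 1: train={train_lo:.4f}  test={test_lo:.4f}")
print(f"degree 9: train={train_hi:.4f}  test={test_hi:.4f}")
```

Comparing the two rows of output shows the trade-off directly: lowering training error by adding degrees of freedom is not the same as lowering error on unseen data.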

