# Chap 5.5: Regularization in Neural Networks - Machine Learning (Srihari)

Regularization in Neural Networks

Sargur Srihari
## Topics in Neural Network Regularization

What is regularization? Methods:

1. Determining the optimal number of hidden units
2. Use of a regularizer in the error function
   - Linear transformations and consistent Gaussian priors
3. Early stopping
- Invariances
  - Tangent propagation
  - Training with transformed data
  - Convolutional networks
  - Soft weight sharing
## What is Regularization?

In machine learning (also in statistics and inverse problems), regularization means introducing additional information to prevent over-fitting (or to solve an ill-posed problem).

This information is usually a penalty for complexity, e.g.:

- restrictions on smoothness
- bounds on the vector-space norm

Theoretical justification: regularization attempts to impose Occam's razor on the solution.

From a Bayesian point of view, regularization corresponds to imposing prior distributions on the model parameters.
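As a concrete sketch of penalty-based regularization (not part of the slides, and using ridge regression rather than a neural network): adding a squared-norm penalty to a least-squares fit bounds the norm of the solution, and from the Bayesian view corresponds to a zero-mean Gaussian prior on the weights.

```python
# Illustrative sketch (not from the slides): L2-penalized least squares
# (ridge regression) as the simplest penalty-based regularizer.
# Minimizing ||X w - t||^2 + lam * ||w||^2 has the closed form
#   w = (X^T X + lam * I)^{-1} X^T t
import numpy as np

def ridge_fit(X, t, lam):
    """Return the penalized solution; lam = 0 recovers ordinary least squares."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ t)

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
t = X @ np.array([3.0, -2.0, 0.5, 1.0, 0.0]) + 0.1 * rng.standard_normal(20)

# Increasing lam shrinks the weight vector toward zero (the prior mean).
for lam in (0.0, 1.0, 100.0):
    print(lam, np.linalg.norm(ridge_fit(X, t, lam)))
```

The shrinking norm with growing `lam` is exactly the "bound on the vector-space norm" mentioned above; the same mechanism reappears later as weight decay for networks.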

## 1. Regularization by Determining the Number of Hidden Units

- The numbers of input and output units are determined by the dimensionality of the data set.
- The number of hidden units M is a free parameter, adjusted to get the best predictive performance.
- A possible approach is to obtain a maximum likelihood estimate of M that balances under-fitting against over-fitting.
## Effect of Varying the Number of Hidden Units

Sinusoidal regression problem: a two-layer network is trained on 10 data points with M = 1, 3, and 10 hidden units, minimizing a sum-of-squares error function using conjugate gradient descent.

The generalization error is not a simple function of M, due to the presence of local minima in the error function.

## Using a Validation Set to Determine the Number of Hidden Units

[Plot: sum-of-squares test error for the polynomial data versus the number of hidden units M, with 30 random starts for each M.]

Plot a graph over random starts and different numbers of hidden units M; the overall best validation-set performance occurred at M = 8.
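The selection procedure can be sketched as follows. This is illustrative, not the slides' exact experiment: it uses a tiny one-hidden-layer tanh network on the earlier sinusoidal data, plain gradient descent in place of the conjugate-gradient optimizer the slides mention, and only 3 random starts per M instead of 30.

```python
# Sketch: for each candidate M, train from several random starts and
# keep the M with the best validation-set sum-of-squares error.
import numpy as np

rng = np.random.default_rng(0)
x_tr = np.linspace(0, 1, 10)
t_tr = np.sin(2 * np.pi * x_tr) + 0.1 * rng.standard_normal(10)
x_va = np.linspace(0.05, 0.95, 10)
t_va = np.sin(2 * np.pi * x_va) + 0.1 * rng.standard_normal(10)

def train(M, iters=3000, lr=0.005):
    # parameters of y(x) = w2 . tanh(w1 * x + b1) + b2
    w1, b1 = rng.standard_normal(M), rng.standard_normal(M)
    w2, b2 = 0.1 * rng.standard_normal(M), 0.0
    for _ in range(iters):
        z = np.tanh(np.outer(x_tr, w1) + b1)   # hidden activations, (N, M)
        d = z @ w2 + b2 - t_tr                 # residuals
        dz = np.outer(d, w2) * (1 - z ** 2)    # backprop through tanh
        w2 -= lr * (z.T @ d); b2 -= lr * d.sum()
        w1 -= lr * (x_tr @ dz); b1 -= lr * dz.sum(axis=0)
    return w1, b1, w2, b2

def sse(params, x, t):
    w1, b1, w2, b2 = params
    y = np.tanh(np.outer(x, w1) + b1) @ w2 + b2
    return 0.5 * np.sum((y - t) ** 2)

scores = {M: min(sse(train(M), x_va, t_va) for _ in range(3))
          for M in (1, 3, 8, 10)}
best_M = min(scores, key=scores.get)
print(scores, best_M)
```

Taking the minimum over random starts matters because, as the slides note, different starts land in different local minima, so a single run per M gives a noisy picture of how good each M can be.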
## 2. Regularization Using Simple Weight Decay

- The generalization error is not a simple function of M, due to the presence of local minima.
- We need to control network complexity to avoid over-fitting.
- Alternative: choose a relatively large M and control complexity by adding a regularization term to the error function.
- The simplest regularizer is weight decay:

$$\tilde{E}(\mathbf{w}) = E(\mathbf{w}) + \frac{\lambda}{2}\,\mathbf{w}^{\mathsf{T}}\mathbf{w}$$

- The effective model complexity is then determined by the choice of the regularization coefficient λ.
- This regularizer is equivalent to a zero-mean Gaussian prior over the weight vector **w**.
- Simple weight decay has certain shortcomings.
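A minimal numeric sketch of the weight-decay error above: the penalized error adds (λ/2)·wᵀw, and its gradient adds λw, which is the term that "decays" each weight toward zero during gradient descent. The unregularized values `E` and `grad_E` below are made up for illustration.

```python
# E~(w) = E(w) + (lambda/2) * w^T w   and   grad E~ = grad E + lambda * w
import numpy as np

def regularized_error(E, grad_E, w, lam):
    """Weight-decay-augmented error and gradient for weight vector w."""
    E_tilde = E + 0.5 * lam * (w @ w)
    grad_tilde = grad_E + lam * w
    return E_tilde, grad_tilde

w = np.array([1.0, -2.0, 0.5])
E, grad_E = 4.0, np.array([0.2, -0.1, 0.3])   # hypothetical unregularized values
E_t, g_t = regularized_error(E, grad_E, w, lam=0.1)
print(E_t, g_t)   # E_t = 4.0 + 0.05 * 5.25 = 4.2625
```

In a gradient-descent update `w -= eta * grad_tilde`, the extra `lam * w` term multiplies each weight by `(1 - eta * lam)` per step, hence the name "weight decay".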

## Consistent Gaussian Priors

(The source preview ends here.)
