Lecture 4: More on neural networks

• Finding a good network structure: overfitting
• Automatic construction of network structure:
  – Constructive methods
  – Destructive methods
• Other kinds of networks:
  – Autoencoders
  – Recurrent neural networks

September 17, 2007, COMP-652 Lecture 4

How large should the network be?

• Overfitting occurs if there are too many parameters compared to the amount of data available
• Choosing the number of hidden units:
  – Too few hidden units do not allow the concept to be learned
  – Too many lead to slow learning and overfitting
  – If the n inputs are binary, log n is a good heuristic choice
• Choosing the number of layers:
  – Always start with one hidden layer
  – Never go beyond two hidden layers, unless the task structure suggests something different

Overtraining in feed-forward networks

[Figure: two plots of error versus number of weight updates, each showing training-set error and validation-set error; in both examples the training error keeps decreasing while the validation error eventually starts to rise]

• This is a different form of overfitting, which occurs when weights take on large magnitudes, pushing the sigmoids into saturation
• Effectively, as learning progresses, the network has more parameters
• Use a validation set to decide when to stop training!

k-fold cross-validation

1. Split the training data into k partitions (folds)
2. Repeat k times:
  (a) Take one fold to be the test set
  (b) Take one fold to be the validation set
  (c) Take the remaining k − 2 folds to form the training set
  (d) Train the parameters on the training set, using the validation set to decide when to stop, then measure J_train(i) and J_test(i) on fold i
3. Report the average of J_train(i) and the average of J_test(i), i = 1, ..., k. Magic number: k = 10.

More on cross-validation

• It is good to ensure the same distribution of examples in each fold
• If two algorithms are compared, it should be on the same folds
• We get an idea not only of the average performance, but also of ...
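The cross-validation procedure above, where each fold's training run uses a second held-out fold for early stopping, can be sketched as follows. The tiny one-hidden-layer sigmoid network, the toy data, and all function names here are illustrative assumptions, not from the lecture; only NumPy is used.

```python
# Sketch: k-fold cross-validation where each run trains with early
# stopping on a validation fold (steps 1-3 above).  Toy example only.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_net(n_in, n_hid):
    # Small random weights keep the sigmoids out of saturation at the start.
    return [0.1 * rng.standard_normal((n_in, n_hid)),
            0.1 * rng.standard_normal((n_hid, 1))]

def forward(net, X):
    W1, W2 = net
    H = sigmoid(X @ W1)
    return H, sigmoid(H @ W2)

def mse(net, X, y):
    _, out = forward(net, X)
    return float(np.mean((out.ravel() - y) ** 2))

def gd_epoch(net, X, y, lr=0.5):
    # One pass of batch gradient descent on squared error (backprop).
    W1, W2 = net
    H, out = forward(net, X)
    err = out.ravel() - y
    d_out = (err * out.ravel() * (1 - out.ravel()))[:, None]   # (m, 1)
    d_hid = (d_out @ W2.T) * H * (1 - H)                       # (m, h)
    m = X.shape[0]
    W2 -= lr * (H.T @ d_out) / m    # updates are in place, so `net` changes
    W1 -= lr * (X.T @ d_hid) / m

def train_early_stopping(X_tr, y_tr, X_va, y_va, n_hid=3,
                         max_epochs=2000, patience=20):
    # Stop once validation error has not improved for `patience` epochs,
    # and return the weights that were best on the validation set.
    net = init_net(X_tr.shape[1], n_hid)
    best, best_net, waited = np.inf, None, 0
    for _ in range(max_epochs):
        gd_epoch(net, X_tr, y_tr)
        v = mse(net, X_va, y_va)
        if v < best:
            best, best_net, waited = v, [W.copy() for W in net], 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_net

def k_fold_cv(X, y, k=10):
    folds = np.array_split(rng.permutation(len(X)), k)
    J_train, J_test = [], []
    for i in range(k):
        test_idx = folds[i]                    # step 2(a)
        val_idx = folds[(i + 1) % k]           # step 2(b)
        tr_idx = np.concatenate([folds[j] for j in range(k)
                                 if j not in (i, (i + 1) % k)])  # 2(c)
        net = train_early_stopping(X[tr_idx], y[tr_idx],
                                   X[val_idx], y[val_idx])       # 2(d)
        J_train.append(mse(net, X[tr_idx], y[tr_idx]))
        J_test.append(mse(net, X[test_idx], y[test_idx]))
    return np.mean(J_train), np.mean(J_test)   # step 3

# Toy binary task: label is 1 iff the two inputs sum to more than 1.
X = rng.uniform(size=(200, 2))
y = (X.sum(axis=1) > 1.0).astype(float)
j_tr, j_te = k_fold_cv(X, y, k=10)
```

Note that this sketch does not stratify the folds; ensuring the same class distribution in each fold, as the next slide recommends, would require splitting each class's examples across folds separately.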
This note was uploaded on 09/04/2008 for the course COMP 652 taught by Professor Precup during the Fall '07 term at McGill.
