STA 414/2104, Mar 9, 2010

Notes
- Sample test questions posted
- Review and/or questions on Thursday this week
- Test will have 3 questions: one from the sample test, one specific to 414/2104
- Extra office hour Monday, March 15, 3-4
- Watch the web site for late-breaking announcements re the midterm

Neural networks
- feed-forward single-hidden-layer neural network
- $Y_k = g_k\big\{\beta_{0k} + \sum_{m=1}^{M} \beta_{km}\,\sigma\big(\alpha_{0m} + \sum_{\ell=1}^{p} \alpha_{m\ell} X_\ell\big)\big\} = f_k(X)$
- $\sigma(x) = \frac{1}{1 + e^{-x}}$; alternatively $\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$, which maps to $(-1, +1)$

... neural networks
- parameters $\theta = (\alpha_{0m}, \alpha_m, \beta_{0k}, \beta_k)$
- squared-error criterion $R(\theta) = \sum_{i=1}^{N} \sum_{k=1}^{K} \{y_{ik} - f_k(x_i)\}^2$, or
- cross-entropy criterion $R(\theta) = -\sum_{i=1}^{N} \sum_{k=1}^{K} y_{ik} \log f_k(x_i)$
- $\dim(\theta) = M(p+1) + K(M+1)$
- regularization/shrinkage, also called weight decay:
- minimize $R(\theta) + \lambda J(\theta) = R(\theta) + \lambda\Big(\sum_{k,m} \beta_{km}^2 + \sum_{m,\ell} \alpha_{m\ell}^2\Big)$
- standardize inputs to mean 0, variance 1 when regularizing
- back-propagation algorithm for minimizing $R(\theta)$ described in 11.4; extension to $R(\theta) + \lambda J(\theta)$ in 11.5.2

... neural networks
- nnet in the MASS library: recommended decay $\lambda \in (10^{-4}, 10^{-2})$ for squared-error loss; $(0.01, 0.1)$ for log-likelihood
- compare Figure 11.4, top vs. bottom
- results are very sensitive to starting values: $R(\theta)$ has many local minima
- recommendation (Ripley): average the predictions over several nnet fits
- weight decay seems to be more important than the number of hidden units
- see 11.7, 11.8, 11.9 for interesting examples where neural nets work well
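As a concrete illustration, the forward pass defined by the first equation can be sketched in NumPy. The array shapes, the parameter names, and the choice of the identity for the output function $g_k$ (the regression case) are assumptions for this sketch, not part of the notes.

```python
import numpy as np

def sigmoid(x):
    # sigma(x) = 1 / (1 + exp(-x)), the hidden-unit activation
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, alpha0, alpha, beta0, beta):
    """Forward pass of a single-hidden-layer feed-forward network.

    X:      (n, p) inputs
    alpha0: (M,)   hidden-unit intercepts alpha_{0m}
    alpha:  (M, p) input-to-hidden weights alpha_{m,ell}
    beta0:  (K,)   output intercepts beta_{0k}
    beta:   (K, M) hidden-to-output weights beta_{km}
    Returns f_k(X), (n, K), with identity output g_k (regression case).
    """
    Z = sigmoid(alpha0 + X @ alpha.T)   # (n, M) hidden units
    return beta0 + Z @ beta.T           # (n, K) outputs
```

With all weights zero except unit hidden-to-output weights, each hidden unit outputs sigmoid(0) = 0.5, so a net with M = 4 hidden units returns 2.0 for every input.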
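The penalized criterion $R(\theta) + \lambda J(\theta)$ with squared-error loss and the weight-decay penalty can be written out directly; excluding the intercepts from the penalty follows the usual weight-decay convention and is an assumption of this sketch.

```python
import numpy as np

def penalized_loss(y, f, alpha, beta, lam):
    """Weight-decay objective R(theta) + lambda * J(theta).

    y, f:  (N, K) observed targets and fitted values f_k(x_i)
    alpha: (M, p) input-to-hidden weights
    beta:  (K, M) hidden-to-output weights
    lam:   weight-decay parameter lambda
    """
    # R(theta): sum-of-squares error over all observations and outputs
    R = np.sum((y - f) ** 2)
    # J(theta): sum of squared weights (intercepts excluded by convention)
    J = np.sum(beta ** 2) + np.sum(alpha ** 2)
    return R + lam * J
```

For example, with a single residual of 1, weights alpha = 1 and beta = 2, and lambda = 0.1, the objective is 1 + 0.1 * (4 + 1) = 1.5.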
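Ripley's recommendation of averaging predictions over several fits (to tame the sensitivity to starting values) amounts to a simple ensemble average. In this sketch each element of `fits` stands in for one fitted network's prediction function, obtained from one random restart; how each fit is produced is left out.

```python
import numpy as np

def average_predictions(fits, X):
    """Average the predictions of several fitted networks.

    fits: list of prediction functions, one per random restart,
          each mapping (n, p) inputs to (n, K) predictions
    X:    (n, p) inputs
    """
    preds = [f(X) for f in fits]   # one (n, K) array per fit
    return np.mean(preds, axis=0)  # element-wise ensemble average
```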