brief introduction
To fit the training data better, we often want to increase the complexity of our RBF network. By construction, the only way to change the
complexity of an RBF network is to increase or decrease the number of basis functions: a larger number of basis functions yields a more complex network. In
theory, if we add enough basis functions, the RBF network can fit any training set; however, this does not mean that the model generalizes well. Therefore,
to avoid overfitting (see Notes below), we only want to increase the number of basis functions up to a certain point, i.e. its optimal level.
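As a rough illustration of how the number of basis functions controls complexity, the following sketch fits RBF networks of increasing size to a small noisy data set and prints the training error of each. The sine target, the noise level, the evenly spaced centers, the fixed width of 0.1, and all function names are assumptions made for this example only; they are not taken from the notes.

```python
import numpy as np

def rbf_design_matrix(x, centers, width):
    """Evaluate Gaussian basis functions at x: one column per center."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

def fit_rbf(x, y, n_basis, width=0.1):
    """Least-squares output weights; evenly spaced centers are an illustrative choice."""
    centers = np.linspace(x.min(), x.max(), n_basis)
    phi = rbf_design_matrix(x, centers, width)
    weights, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return centers, weights

def predict_rbf(x, centers, weights, width=0.1):
    """Network output: weighted sum of the basis functions."""
    return rbf_design_matrix(x, centers, width) @ weights

# Hypothetical training data: a smooth target plus Gaussian noise.
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0.0, 1.0, 30))
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.3, x_train.size)

# Training error for networks with more and more basis functions.
train_errors = {}
for n_basis in (2, 5, 10, 30):
    centers, weights = fit_rbf(x_train, y_train, n_basis)
    residuals = y_train - predict_rbf(x_train, centers, weights)
    train_errors[n_basis] = float(np.mean(residuals ** 2))
    print(n_basis, train_errors[n_basis])
```

In this setup the largest network fits the training points far more closely than the smallest one, which says nothing yet about how well either of them generalizes.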
For model selection, what we usually do is estimate the training error. After working through the training error, we will see that it can in fact be
decomposed, and that one of its components is the Mean Squared Error (MSE). In the later notes, we will find that our final goal is to get a good estimate of the MSE.
Moreover, to find an optimal model for our data, we select the model with the smallest MSE.
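The selection rule can be sketched in a simulation where the real model is known, so that the MSE between the fitted network and the real model can be computed directly instead of estimated. Everything specific here is an assumption for illustration: the sine function standing in for the real model, the candidate sizes, the Gaussian basis functions with width 0.1, and the names `design` and `true_model`.

```python
import numpy as np

def design(x, centers, width=0.1):
    """Gaussian basis functions evaluated at x: one column per center."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

def true_model(x):
    """Hypothetical real model, known only because this is a simulation."""
    return np.sin(2 * np.pi * x)

rng = np.random.default_rng(1)
x_train = np.sort(rng.uniform(0.0, 1.0, 40))
y_train = true_model(x_train) + rng.normal(0.0, 0.3, x_train.size)
x_grid = np.linspace(0.0, 1.0, 200)  # points where the two models are compared

candidates = [2, 4, 6, 8, 12, 20, 40]  # numbers of basis functions to try
mse = {}
for n_basis in candidates:
    centers = np.linspace(0.0, 1.0, n_basis)
    weights, *_ = np.linalg.lstsq(design(x_train, centers), y_train, rcond=None)
    f_hat = design(x_grid, centers) @ weights
    # MSE between the fitted network and the real model on the grid.
    mse[n_basis] = float(np.mean((f_hat - true_model(x_grid)) ** 2))

# Select the model with the smallest MSE.
best = min(mse, key=mse.get)
print(best, mse[best])
```

With real data the real model is unknown, so this direct computation is not available and the MSE has to be estimated.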
Now, let us introduce some notation that we will use in the analysis:
  the prediction model estimated by an RBF network from the training data
  the real model (not null); ideally, we want the prediction model to be close to the real model
  the training error
  the testing error
  the Mean Squared Error (MSE)
Notes
Being more complex is not always a good thing. Sometimes overfitting
(http://en.wikipedia.org/wiki/Overfitting) causes the model to lose its generality. For example, in the graph
on the left-hand side, the data points are sampled from a model made up of a linear function, shown by the
blue line, plus additive Gaussian noise. The red curve displayed in the graph shows the overfitted model.
Clearly, this overfitted model only works for the training data, and is useless for any further prediction
when new data points are introd...
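The situation described in this note can be reproduced with a small simulation. In the sketch below, the particular line, the noise level, and the sample sizes are assumptions, and an interpolating polynomial stands in for the overfitted red curve, since the notes do not say how that curve was produced.

```python
import numpy as np
from numpy.polynomial import Polynomial

def true_line(x):
    """Hypothetical linear function playing the role of the blue line."""
    return 1.0 + 2.0 * x

rng = np.random.default_rng(0)
noise_sd = 0.3  # assumed standard deviation of the Gaussian noise

# Training data and fresh data, both sampled from line + Gaussian noise.
x_train = np.linspace(0.0, 1.0, 10)
y_train = true_line(x_train) + rng.normal(0.0, noise_sd, x_train.size)
x_new = rng.uniform(0.0, 1.0, 1000)
y_new = true_line(x_new) + rng.normal(0.0, noise_sd, x_new.size)

# A straight-line fit versus a polynomial that passes through every training point.
linear_fit = Polynomial.fit(x_train, y_train, deg=1)
overfit = Polynomial.fit(x_train, y_train, deg=x_train.size - 1)

def mean_squared_error(model, x, y):
    """Average squared difference between model predictions and observations."""
    return float(np.mean((model(x) - y) ** 2))

print("training error:", mean_squared_error(linear_fit, x_train, y_train),
      mean_squared_error(overfit, x_train, y_train))
print("new-data error:", mean_squared_error(linear_fit, x_new, y_new),
      mean_squared_error(overfit, x_new, y_new))
```

The interpolating polynomial has almost zero training error because it also fits the noise, and that is what makes it worse than the straight line on the new points.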
 Winter '13
