Stat841f09 - Wiki Course Notes



[...], and the estimate changes when [...] changes. However, we can estimate [...]. [...], where [...] is additive Gaussian noise. Suppose we do not know the variance of the noise. Then we have (28.2). Substituting (28.2) into (28.1), we get (28.3).

Figure 28.1 shows that the training error decreases and the MSE increases as the number of hidden units increases (i.e. as the model becomes more complex). As the number of hidden units grows larger and larger, the training error decreases until it approaches 0. If the training error equals 0, then no matter how large [...] is, (28.3) shows that the estimate of the MSE approaches 0 as well. In practice this does not happen: when the training error is close to 0, overfitting (http://en.wikipedia.org/wiki/Overfitting) occurs, and the MSE should increase instead of being close to 0, as Figure 28.1 shows.

We can see that [...] is the average of [...]. To deal with this problem, we can average [...] over models with different numbers of hidden units. For example, we can first take 1 hidden unit and continue up to 10 hidden units. Since in reality the value of [...] is a constant adjustment to the data points and does not depend on [...], using the average value over 1 to 10 hidden units has a firm theoretical basis.

Figure 28.1: MSE vs err

Unlike the classical cross-validation (CV) or leave-one-out (LOO) techniques, SURE does not need a validation step to find the optimal model. Hence, SURE uses less data than CV or LOO, and it is suitable when there is not enough data for validation. However, to implement SURE we need to find [...], which may not be trivial for models that do not have a closed-form solution.
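The collapse of the MSE estimate toward 0 described above can be checked numerically. The following is a minimal sketch, not the notes' own code. It assumes the standard form of SURE for Gaussian noise, $\text{err} - n\sigma^2 + 2\sigma^2 \sum_i \partial \hat{y}_i / \partial y_i$, and a plug-in variance estimate $\hat{\sigma}^2 = \text{err}/(n-1)$ (an assumption, since equation (28.2) is not reproduced here). In place of a neural network with a growing number of hidden units, it uses nested polynomial least-squares fits, for which the derivative sum equals the trace of the smoother matrix. The names `smoother_matrix` and `sure_estimate` are hypothetical.

```python
import numpy as np


def smoother_matrix(x, degree):
    """Projection matrix S with y_hat = S @ y for a polynomial fit (assumed stand-in model)."""
    V = np.polynomial.chebyshev.chebvander(x, degree)  # shape (n, degree + 1)
    Q, _ = np.linalg.qr(V)                             # reduced QR, orthonormal columns
    return Q @ Q.T


def sure_estimate(err, n, sigma2, divergence):
    """Assumed standard SURE form: err - n*sigma^2 + 2*sigma^2 * sum_i d(y_hat_i)/d(y_i)."""
    return err - n * sigma2 + 2.0 * sigma2 * divergence


rng = np.random.default_rng(0)
n = 8
x = np.linspace(-1.0, 1.0, n)
y = np.sin(3.0 * x) + rng.normal(scale=0.3, size=n)  # synthetic data, illustration only

errs, divs = [], []
for degree in range(n):                # degree n - 1 interpolates the data exactly
    S = smoother_matrix(x, degree)
    resid = y - S @ y
    errs.append(float(resid @ resid))  # training error (residual sum of squares)
    divs.append(float(np.trace(S)))    # sum_i d(y_hat_i)/d(y_i) for a linear smoother
errs = np.array(errs)
divs = np.array(divs)

sigma2_plugin = errs / (n - 1)         # per-model variance estimate (assumption)
sigma2_avg = sigma2_plugin.mean()      # one variance estimate averaged over all models

sure_plugin = sure_estimate(errs, n, sigma2_plugin, divs)
sure_avg = sure_estimate(errs, n, sigma2_avg, divs)

for degree in range(n):
    print(degree, errs[degree], sure_plugin[degree], sure_avg[degree])
```

With the per-model plug-in variance, the estimate for the interpolating fit collapses to 0 together with the training error. With the averaged variance, the complexity penalty stays positive there, which is the behaviour the averaging argument aims for.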
K-means Clustering

Description: K-means clustering (http://en.wikipedia.org/wiki/K-means_clustering) is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. Th...
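The description above (assign each observation to the cluster with the nearest mean) can be illustrated with a minimal sketch of the usual alternating procedure, often called Lloyd's algorithm. This is an illustration under stated assumptions, not the notes' own code: the function names `assign` and `kmeans`, the random initialisation from data points, the iteration cap, and the synthetic two-blob data are all hypothetical choices.

```python
import numpy as np


def assign(X, centers):
    """Label each row of X with the index of its nearest center (squared Euclidean distance)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # shape (n, k)
    return d2.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Alternate between nearest-mean assignment and mean updates until the means stop moving."""
    rng = np.random.default_rng(seed)
    # Initialise the means with k distinct observations (one common choice, assumed here).
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:        # keep the old mean if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers


# Two well-separated synthetic blobs, for illustration only.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)),
               rng.normal(5.0, 0.1, size=(20, 2))])
labels, centers = kmeans(X, k=2)
print(labels)
print(centers)
```

The result depends on the initial means in general, so practical use typically repeats the procedure from several initialisations and keeps the partition with the smallest within-cluster sum of squares.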