ttp://www.cs.ualberta.ca/~papersdb/uploaded_files/801/paper_automatic basis selection for.pdf)

Case 1

Consider the case in which a new data point has been introduced to the estimated model, i.e. $(x_i, y_i) \notin \mathcal{T}$, where $\mathcal{T}$ is the training set; this new point belongs to the validation set. Since $y_i$ is a new point, $\hat f$ and $y_i$ are independent. Therefore $\mathrm{cov}(y_i, \hat f_i) = 0$ (or think about $E[\epsilon_i(\hat f_i - f_i)]$: when $y_i$ is a new point, it has nothing to do with $\hat f$, because the estimation of $\hat f$ is from the training data, so $E[\epsilon_i(\hat f_i - f_i)] = 0$), and the decomposition of $E[(\hat y_i - y_i)^2]$ in this case can be written as:

$$E[(\hat y_i - y_i)^2] = E[(\hat f_i - f_i)^2] + \sigma^2.$$

wikicoursenote.com/w/index.php?title=Stat841&printable=yes 54/74 10/09/2013 Stat841 Wiki Course Notes

This expectation means

$$\frac{1}{m}\sum_{i=1}^{m}(\hat y_i - y_i)^2 = \frac{1}{m}\sum_{i=1}^{m}(\hat f_i - f_i)^2 + \sigma^2,$$

where $m$ is the number of validation points. Based on the notation we denote above, we then obtain:

$$\text{err} = \text{Err} + m\sigma^2.$$

Since $m\sigma^2$ is constant, minimizing $\text{Err}$ is equivalent to minimizing the test error $\text{err}$. This is the justification behind the technique of cross validation: to avoid overfitting or underfitting, cross validation uses a validation data set that is independent of the estimated model.

Case 2

A more interesting case is the one in which we do not use new data points to assess the performance of the estimated model, and the training data is used for both
estimating and assessing a model $\hat f$. In this case the cross term in the decomposition of $E[(\hat y_i - y_i)^2]$ cannot be ignored, because $\hat f_i$ and $y_i$ are not independent. Therefore the cross term can be estimated
by Stein's lemma, which was originally proposed to estimate the mean of a Gaussian distribution.
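The difference between the two cases can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes a hypothetical true model f(x) = 2x + 1, Gaussian noise with known sigma, and a straight-line least-squares fit standing in for the estimated model. It estimates the cross term $E[\epsilon_i(\hat f_i - f_i)]$ by Monte Carlo, once with the noise of a training point and once with fresh noise at the same $x_i$, and compares the first with $\sigma^2\,\partial \hat f_i / \partial y_i$, the value that Stein's lemma gives for Gaussian noise.

```python
import random

def cross_terms(n=20, sigma=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of E[eps_i * (fhat_i - f_i)] at one design point.

    Returns (train_cross, fresh_cross, stein_value).
    All modelling choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    xs = [j / (n - 1) for j in range(n)]        # fixed design points on [0, 1]
    f = [2.0 * x + 1.0 for x in xs]             # hypothetical true model f(x) = 2x + 1
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    i = 0                                       # point at which the cross term is measured

    # For a straight-line least-squares fit, d fhat_i / d y_i is the i-th
    # diagonal entry of the hat matrix.
    h_ii = 1.0 / n + (xs[i] - x_bar) ** 2 / sxx

    train_sum = 0.0
    fresh_sum = 0.0
    for _ in range(trials):
        eps = [rng.gauss(0.0, sigma) for _ in range(n)]
        ys = [fj + e for fj, e in zip(f, eps)]
        y_bar = sum(ys) / n
        slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
        fhat_i = y_bar + slope * (xs[i] - x_bar)

        # Case 2: y_i was used to fit the model, so eps_i and fhat_i are dependent.
        train_sum += eps[i] * (fhat_i - f[i])

        # Case 1: a new observation at x_i has noise independent of the fit.
        eps_new = rng.gauss(0.0, sigma)
        fresh_sum += eps_new * (fhat_i - f[i])

    return train_sum / trials, fresh_sum / trials, sigma ** 2 * h_ii

train_cross, fresh_cross, stein_value = cross_terms()
print(f"training point: {train_cross:.4f}  (sigma^2 * h_ii = {stein_value:.4f})")
print(f"new point:      {fresh_cross:.4f}  (expected 0)")
```

Under these assumptions the training-point estimate should sit close to $\sigma^2 h_{ii}$ and the new-point estimate close to zero, up to Monte Carlo error.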
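Suppose $(x_i, y_i) \in \mathcal{T}$, where $\mathcal{T}$ is the training set. Then by applying Stein's lemma, proved above, we obtain

$$E[\epsilon_i(\hat f_i - f_i)] = \sigma^2 E\left[\frac{\partial \hat f_i}{\partial y_i}\right],$$

so that

$$E[(\hat y_i - y_i)^2] = E[(\hat f_i - f_i)^2] + \sigma^2 - 2\sigma^2 E\left[\frac{\partial \hat f_i}{\partial y_i}\right].$$

This expectation means

$$\frac{1}{n}\sum_{i=1}^{n}(\hat y_i - y_i)^2 = \frac{1}{n}\sum_{i=1}^{n}(\hat f_i - f_i)^2 + \sigma^2 - \frac{2\sigma^2}{n}\sum_{i=1}^{n}\frac{\partial \hat f_i}{\partial y_i},$$

where $n$ is the number of training points. In the notation above, $\text{err} = \text{Err} + n\sigma^2 - 2\sigma^2\sum_{i=1}^{n}\frac{\partial \hat f_i}{\partial y_i}$, or equivalently

$$\text{Err} = \text{err} - n\sigma^2 + 2\sigma^2\sum_{i=1}^{n}\frac{\partial \hat f_i}{\partial y_i}.$$

In statistics, this is known as Stein's unbiased risk estimate (SURE) (http://www.reference.com/browse/Stein%27s+unbiased+risk+estimate), an unbiased estimator of the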
mean squared error of a given estimator, in a deterministic estimation scenario. In other words, it provides an indication of the accuracy of a given estimator. This is
important since, in deterministic estimation, the true mean squared error of an estimator generally depends on the value of the unknown parameter, and thus cannot be
determined completely.

SURE for RBF Network

Based on SURE, the optimum number of basis functions should be chosen to give the minimum generalization error $\text{Err}$. Setting the derivative of the training error $\text{err}$ with respect to the network weights equal to zero, we ge...
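As an illustration of choosing the number of basis functions by SURE, here is a minimal sketch under stated assumptions, not the method as written in the notes. It assumes Gaussian basis functions with evenly spaced centers and a fixed width, a known noise variance, and synthetic sine data; the names `rbf_design` and `sure_for_rbf` are hypothetical. For a least-squares fit the estimate is linear in the targets, $\hat f = Hy$, so $\sum_i \partial \hat f_i / \partial y_i = \mathrm{trace}(H)$, and the SURE estimate of Err is $\text{err} - n\sigma^2 + 2\sigma^2\,\mathrm{trace}(H)$.

```python
import numpy as np

def rbf_design(x, centers, width):
    """Gaussian RBF design matrix with one column per center."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

def sure_for_rbf(x, y, n_basis, sigma2, width=0.15):
    """Return (err, dof, sure) for a least-squares RBF fit with n_basis centers.

    err  : training sum of squared residuals
    dof  : trace of the hat matrix H, i.e. sum_i d fhat_i / d y_i
    sure : err - n * sigma2 + 2 * sigma2 * dof
    """
    centers = np.linspace(x.min(), x.max(), n_basis)   # assumed: evenly spaced centers
    phi = rbf_design(x, centers, width)

    # Least squares via an orthonormal basis of the column space of phi.
    # Directions with negligible singular values are dropped for stability;
    # the cutoff does not depend on y, so the fit stays linear in y.
    u, s, _ = np.linalg.svd(phi, full_matrices=False)
    u = u[:, s > 1e-10 * s[0]]
    fitted = u @ (u.T @ y)          # H y with H = u u^T
    dof = u.shape[1]                # trace(u u^T) = number of retained directions

    residual = y - fitted
    err = float(residual @ residual)
    sure = err - len(y) * sigma2 + 2.0 * sigma2 * dof
    return err, dof, sure

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 60)
sigma = 0.3
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, sigma, size=x.size)

results = {m: sure_for_rbf(x, y, m, sigma ** 2) for m in range(1, 21)}
best_m = min(results, key=lambda m: results[m][2])
print("number of basis functions with the smallest SURE:", best_m)
```

The training error alone tends to favor large networks, while the $2\sigma^2\,\mathrm{trace}(H)$ term penalizes each additional effective basis function, so the minimum of the SURE column is the quantity to compare across candidate sizes. In practice the noise variance would have to be estimated rather than assumed known.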
 Winter '13
