Stat841f09 - Wiki Course Notes



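ttp://www.cs.ualberta.ca/~papersdb/uploaded_files/801/paper_automatic- basis- selection- for.pdf)

## Case 1

Consider the case in which a new data point has been introduced to the estimated model, i.e. $(x_i, y_i) \notin T$, where $T$ is the training set; this new point belongs to the validation set. Write $y_i = f_i + \epsilon_i$ with noise $\epsilon_i \sim N(0, \sigma^2)$, where $f$ is the true model and $\hat f$ the estimated model, so that $\hat y_i = \hat f_i$. Since $y_i$ is a new point, $\hat f_i$ and $\epsilon_i$ are independent. (Put another way: the estimation of $\hat f$ is from the training data, so when $y_i \notin T$, i.e. $y_i$ is a new point, it has nothing to do with $\hat f$.) Therefore the cross term vanishes, $E[\epsilon_i(\hat f_i - f_i)] = 0$, and the expected squared error in this case can be written as:

$$E[(\hat y_i - y_i)^2] = E[(\hat f_i - f_i)^2] + \sigma^2 .$$

This expectation means that, averaged over $m$ new points,

$$\frac{1}{m}\sum_{i=1}^{m} (\hat y_i - y_i)^2 = \frac{1}{m}\sum_{i=1}^{m} (\hat f_i - f_i)^2 + \sigma^2 .$$

Based on the notation we denoted above, with $err = \sum_i (\hat y_i - y_i)^2$ the observed error and $Err = \sum_i (\hat f_i - f_i)^2$ the true error, we obtain:

$$err = Err + m\sigma^2 .$$

This is the justification behind the technique of cross validation. Since $\sigma^2$ is constant, minimizing $Err$ is equivalent to minimizing the test error $err$. In cross validation, to avoid overfitting or underfitting, the validation data set is kept independent from the estimated model.

## Case 2

A more interesting case is the one in which we do not use new data points to assess the performance of the estimated model, and the training data is used for both estimating and assessing the model $\hat f$. In this case the cross term $E[\epsilon_i(\hat f_i - f_i)]$ cannot be ignored, because $\hat f_i$ and $\epsilon_i$ are not independent. The cross term can instead be estimated by Stein's lemma, which was originally proposed to estimate the mean of a Gaussian distribution.

Suppose $(x_i, y_i) \in T$. Then by applying Stein's lemma, proved above, we obtain

$$E[\epsilon_i(\hat f_i - f_i)] = \sigma^2 E\left[\frac{\partial \hat f_i}{\partial y_i}\right],$$

so that

$$E[(\hat y_i - y_i)^2] = E[(\hat f_i - f_i)^2] + \sigma^2 - 2\sigma^2 E\left[\frac{\partial \hat f_i}{\partial y_i}\right].$$

This expectation means that, summed over the $n$ training points,

$$err = Err + n\sigma^2 - 2\sigma^2 \sum_{i=1}^{n} \frac{\partial \hat f_i}{\partial y_i}, \qquad \text{i.e.} \qquad Err = err - n\sigma^2 + 2\sigma^2 \sum_{i=1}^{n} \frac{\partial \hat f_i}{\partial y_i}.$$

In statistics, this is known as Stein's unbiased risk estimate (SURE) (http://www.reference.com/browse/Stein%27s+unbiased+risk+estimate), an unbiased estimator of the mean-squared error of a given estimator in a deterministic estimation scenario. In other words, it provides an indication of the accuracy of a given estimator. This is important because, in deterministic estimation, the true mean-squared error of an estimator generally depends on the value of the unknown parameter, and thus cannot be determined completely.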
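The two cases can be checked numerically. The sketch below is a minimal simulation under stated assumptions and is not part of the original notes: it assumes a linear smoother $\hat f = Hy$ (here a cubic polynomial least-squares fit), for which $\sum_i \partial \hat f_i / \partial y_i = \mathrm{trace}(H)$, a known noise level $\sigma$, and a hypothetical true model $f(x) = \sin(2\pi x)$. The function name `sure_simulation` and all parameter values are illustrative choices.

```python
import numpy as np


def sure_simulation(n=50, degree=3, sigma=0.5, trials=4000, seed=0):
    """Monte Carlo check of err = Err + n*sigma^2 - 2*sigma^2*trace(H).

    Assumptions (hypothetical, for illustration only): the true model is
    sin(2*pi*x), the estimator is a polynomial least-squares fit, and the
    noise standard deviation sigma is known.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)
    f = np.sin(2.0 * np.pi * x)        # assumed true model values f_i
    X = np.vander(x, degree + 1)       # polynomial design matrix
    H = X @ np.linalg.pinv(X)          # hat matrix: f_hat = H @ y
    df = np.trace(H)                   # sum_i d f_hat_i / d y_i for a linear smoother

    err_train = np.empty(trials)       # Case 2: error on the training responses
    err_new = np.empty(trials)         # Case 1: error on fresh responses
    Err = np.empty(trials)             # true error sum_i (f_hat_i - f_i)^2
    for t in range(trials):
        y = f + sigma * rng.standard_normal(n)      # training responses
        y_new = f + sigma * rng.standard_normal(n)  # independent new responses
        f_hat = H @ y
        err_train[t] = np.sum((f_hat - y) ** 2)
        err_new[t] = np.sum((f_hat - y_new) ** 2)
        Err[t] = np.sum((f_hat - f) ** 2)

    # SURE: correct the training error for the noise and the cross term.
    sure = err_train - n * sigma**2 + 2.0 * sigma**2 * df
    return {
        "df": float(df),
        "err_train": float(err_train.mean()),
        "err_new": float(err_new.mean()),
        "Err": float(Err.mean()),
        "sure": float(sure.mean()),
    }


if __name__ == "__main__":
    for name, value in sure_simulation().items():
        print(f"{name:>10s}: {value:.3f}")
```

Averaged over many noise draws, `sure` should come out close to `Err` (Case 2) and `err_new` close to `Err` plus $n\sigma^2$ (Case 1), while the uncorrected training error `err_train` is smaller than the error on new responses.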
## SURE for RBF Network

Based on SURE, the number of basis functions should be chosen to give the minimum generalization error. Setting equal to zero, we ge...

## This document was uploaded on 03/07/2014.
