An estimator may instead be median-unbiased (rather than mean-unbiased, the standard unbiasedness property). Median-unbiasedness is invariant under one-to-one transformations, while mean-unbiasedness may be lost under nonlinear transformations.

For example, if we use an unbiased estimator with a large mean squared error to estimate a parameter, we run a high risk of a large error. In contrast, a biased estimator with a small mean squared error can improve the precision of our predictions. Hence, our goal is to minimize the mean squared error

$\mathrm{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$.

From Figure 3, we can see the relationship between the three quantities:

$\mathrm{MSE}(\hat{\theta}) = \mathrm{Bias}(\hat{\theta})^2 + \mathrm{Var}(\hat{\theta})$, where $\mathrm{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta$.

Thus, for a given MSE, low bias implies high variance and vice versa. The test error is a good estimate of the MSE. We want a somewhat balanced bias and variance (neither one high), even though the resulting estimator will carry some bias.

Referring to Figure 2, overfitting happens after the point where the training error (training sample line) continues to decrease while the test error (test sample line) starts to increase.

There are two main approaches to avoid overfitting:

1. Estimating the error rate
- The empirical training error is not a good estimate.
- The empirical test error is a better estimate.
- Cross-validation is fast (a sketch is given at the end of this section).
- Computing an error bound analytically using some probability inequality. We will not discuss computing the error bound in class; however, a popular method for doing this computation is called the VC dimension (short for Vapnik–Chervonenkis dimension). Information can be found from Andrew Moore (http://www.autonlab.org/tutorials/vcdim.html) and Steve Gunn (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.10.7171&rep=rep1&type=pdf).

2. Regularization
- Use a shrinkage method (a ridge sketch is given at the end of this section).
- Decrease the chance of overfitting by controlling the size of the weights.

Example of under- and overfitting in R

To give further intuition about under- and overfitting, consider the following example. A simple quadratic data set with some random noise is generated, and then polynomials of varying degrees are fitted to it. The errors on the training set and on a test set are then calculated.
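The original R listing is cut off in this copy of the notes, so the following is a minimal sketch, in base R, of the kind of experiment described above; the seed, sample size, true quadratic, noise level, and range of degrees are illustrative assumptions, not the original code.

# Generate a simple quadratic data set with random noise
# (all constants below are assumed for illustration).
set.seed(1)
n <- 200
x <- runif(n, -2, 2)
y <- 1 + 2 * x - 3 * x^2 + rnorm(n, sd = 1)

# Split the data into a training set and a test set.
train   <- sample(n, n / 2)
x.train <- x[train];  y.train <- y[train]
x.test  <- x[-train]; y.test  <- y[-train]

# Fit polynomials of varying degrees and record both errors.
degrees   <- 1:10
train.err <- numeric(length(degrees))
test.err  <- numeric(length(degrees))
for (d in degrees) {
  fit          <- lm(y.train ~ poly(x.train, d))
  train.err[d] <- mean((y.train - fitted(fit))^2)
  # predict() expects the predictor under the name used in the formula
  pred         <- predict(fit, newdata = data.frame(x.train = x.test))
  test.err[d]  <- mean((y.test - pred)^2)
}

# Training error keeps falling as the degree grows, while test error
# eventually rises again: the overfitting pattern of Figure 2.
plot(degrees, train.err, type = "b",
     ylim = range(c(train.err, test.err)),
     xlab = "polynomial degree", ylab = "mean squared error")
lines(degrees, test.err, type = "b", lty = 2)
legend("topright", c("training error", "test error"), lty = 1:2)

Degree 2 (the true model) should give roughly the smallest test error; degree 1 underfits, while high degrees overfit.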
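To make the cross-validation bullet above concrete, here is a minimal k-fold cross-validation sketch in base R for choosing the polynomial degree, reusing x, y, and degrees from the example above; the choice k = 5 is an assumption for illustration.

# k-fold cross-validation for choosing the polynomial degree.
k      <- 5
folds  <- sample(rep(1:k, length.out = length(x)))  # random fold labels
cv.err <- numeric(length(degrees))
for (d in degrees) {
  fold.err <- numeric(k)
  for (j in 1:k) {
    # Fit on all folds except fold j, then measure error on fold j.
    fit         <- lm(y ~ poly(x, d), subset = folds != j)
    pred        <- predict(fit, newdata = data.frame(x = x[folds == j]))
    fold.err[j] <- mean((y[folds == j] - pred)^2)
  }
  cv.err[d] <- mean(fold.err)  # average held-out error over the folds
}
degrees[which.min(cv.err)]  # degree with the best estimated test error

Each point is held out exactly once, so the averaged error approximates the test error without needing a separate test set.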
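The notes do not say which shrinkage method was meant; ridge regression is one standard choice, so here is a minimal closed-form ridge sketch in base R that shows how penalizing the size of the weights regularizes the fit. The polynomial design matrix and the lambda grid are assumptions for illustration.

# Ridge (shrinkage) estimate: minimize ||y - Xb||^2 + lambda * ||b||^2,
# which has the closed form b = (X'X + lambda I)^{-1} X'y.
ridge <- function(X, y, lambda) {
  X  <- scale(X)            # standardize the predictors
  yc <- y - mean(y)         # center the response (no penalty on intercept)
  solve(t(X) %*% X + lambda * diag(ncol(X)), t(X) %*% yc)
}

# A flexible degree-10 polynomial basis for the data above.
X <- poly(x, 10)
for (lambda in c(0, 1, 10, 100)) {
  b <- ridge(X, y, lambda)
  cat("lambda =", lambda, "  sum of squared weights =", sum(b^2), "\n")
}
# Larger lambda shrinks the weights toward zero: more bias, less
# variance, and hence less chance of overfitting.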