HW5_ML_Gang_LIU

# HW5_ML_Gang_LIU - CS 6375 Machine Learning Spring 2009 HW5...

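CS 6375 Machine Learning, Spring 2009, HW5
Gang LIU, SID: 11458407
Apr. 10, 2009
Email: [email protected]

Solution(*)

$$\bar{g}(x) = E[g(x)] = \frac{1}{M}\sum_{i=1}^{M} g_i(x)$$

$$\mathrm{Bias}^2(g) = \frac{1}{N}\sum_{t}\left[\bar{g}(x^t) - f(x^t)\right]^2$$

$$\mathrm{Variance}(g) = \frac{1}{NM}\sum_{t}\sum_{i}\left[g_i(x^t) - \bar{g}(x^t)\right]^2$$

where $f(\cdot)$ is the true function, $M$ is the number of samples $X_i$, and $N$ is the number of instances per sample.

For (i), $g_i(x) = r_i^1$, so

$$\bar{g}(x) = E[g(x)] = \frac{1}{M}\sum_{i=1}^{M} r_i^1$$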
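The definitions above can be estimated numerically. The following is a minimal sketch, not part of the original solution: the true function `true_f`, the uniform inputs, the Gaussian noise, the fixed evaluation grid, and the values of `m` and `n` are all assumptions chosen for illustration, and every function name is hypothetical. It covers estimator (i) together with estimators (ii) to (iv) discussed below.

```python
import random


def true_f(x):
    # Hypothetical true function f, chosen only for illustration.
    return 2.0 * x + 3.0


def draw_outputs(n, rng, noise_sd=1.0):
    """Draw the outputs r^1..r^n of one sample: r^t = f(x^t) + Gaussian noise."""
    return [true_f(rng.uniform(0.0, 1.0)) + rng.gauss(0.0, noise_sd)
            for _ in range(n)]


# Each estimator maps the outputs of one sample X_i to a prediction g_i
# that is constant in x.
ESTIMATORS = {
    "(i) first": lambda rs: rs[0],               # g_i(x) = r_i^1
    "(ii) constant": lambda rs: 2.0,             # g_i(x) = 2
    "(iii) mean": lambda rs: sum(rs) / len(rs),  # g_i(x) = sum_t r_i^t / N
    "(iv) min": lambda rs: min(rs),              # g_i(x) = min_t r_i^t
}


def bias2_and_variance(estimator, m=500, n=20, seed=0):
    """Estimate Bias^2(g) and Variance(g) from m samples of n instances each."""
    rng = random.Random(seed)
    # Assumption: evaluate at n fixed, evenly spaced points x^t in [0, 1].
    eval_xs = [t / (n - 1) for t in range(n)]
    fits = [estimator(draw_outputs(n, rng)) for _ in range(m)]  # g_i, i = 1..m
    g_bar = sum(fits) / m  # average prediction over the m samples
    bias2 = sum((g_bar - true_f(x)) ** 2 for x in eval_xs) / n
    # Double sum over t and i, as in the definition of Variance(g).
    variance = sum((g - g_bar) ** 2 for _ in eval_xs for g in fits) / (n * m)
    return bias2, variance


if __name__ == "__main__":
    for name, est in ESTIMATORS.items():
        b2, var = bias2_and_variance(est)
        print(f"{name:14s} bias^2={b2:.3f} variance={var:.3f}")
```

The printed numbers depend on the assumed `true_f`, noise level, and sample sizes, so they are only a way to sanity-check the qualitative comparisons under one particular setup.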

For (i), the bias and variance are

$$\mathrm{Bias}^2(g) = \frac{1}{N}\sum_{t}\left[\bar{g}(x^t) - f(x^t)\right]^2$$

$$\mathrm{Variance}(g) = \frac{1}{NM}\sum_{t}\sum_{i}\left[g_i(x^t) - \bar{g}(x^t)\right]^2$$

As a baseline for comparison, let Bias and Variance denote the bias and variance of (i).

(ii) $g_i(x) = 2$

Variance is zero because we do not use the data and all $g_i(x)$ are the same. But the Bias is high unless $f(x)$ is close to 2 for all $x$.

(iii) $g_i(x) = \sum_t r_i^t / N$

It increases the Variance because different samples $X_i$ have different averages. It decreases the Bias because we would expect the average in general to be a better estimate than the constant.

(iv) $g_i(x) = \min_t r_i^t$

For simplicity, note that $\min_t r_i^t \le r_i^1$ for every sample $i$, so

$$\bar{g}(x) = E[g(x)] = \frac{1}{M}\sum_{i=1}^{M}\left(\min_t r_i^t\right) \le \frac{1}{M}\sum_{i=1}^{M} r_i^1$$

This means the average prediction is no larger than the one obtained in (i). From the following formulas, we conclude that both bias and variance will increase.

$$\mathrm{Bias}^2(g) = \frac{1}{N}\sum_{t}\left[\bar{g}(x^t) - f(x^t)\right]^2$$

$$\mathrm{Variance}(g) = \frac{1}{NM}\sum_{t}\sum_{i}\left[g_i(x^t) - \bar{g}(x^t)\right]^2$$

In sum:

| | bias | variance |
|---|---|---|
| i | Bias | Variance |
| ii | > Bias | 0 |
| iii | < Bias | > Variance |
| iv | > Bias | > Variance |

*Reference: Ethem Alpaydin, Introduction to Machine Learning, p. 77.

2. Boosting. (30 pts)

Three learning algorithms in a binary classification problem are applied independently to a set of 1000 training examples to train three classifiers. Algorithm A produces Classifier A, which correctly classifies 800 examples and incorrectly classifies 200 examples. Algorithm B produces Classifier B, which correctly classifies 800 examples and incorrectly classifies 200 examples. All the mistakes of Classifier B are on examples that were correctly classified by Classifier A.
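The overlap between the two classifiers follows directly from the counts in the statement. As a small sketch, the tally below uses hypothetical example indices (which specific examples each classifier misses is an assumption; only the counts and the disjointness of the mistakes come from the problem).

```python
# Index the 1000 training examples 0..999. The particular indices are an
# illustrative assumption; only the counts and the fact that the two
# mistake sets are disjoint come from the problem statement.
N_EXAMPLES = 1000
mistakes_a = set(range(0, 200))    # 200 examples Classifier A gets wrong
mistakes_b = set(range(200, 400))  # 200 examples Classifier B gets wrong

# B's mistakes all fall on examples that A classified correctly.
assert mistakes_a.isdisjoint(mistakes_b)

both_correct = N_EXAMPLES - len(mistakes_a | mistakes_b)
exactly_one_wrong = len(mistakes_a ^ mistakes_b)
both_wrong = len(mistakes_a & mistakes_b)

print(both_correct, exactly_one_wrong, both_wrong)  # 600 400 0
```

So A and B agree (and are both correct) on 600 examples and disagree on the remaining 400, with no example misclassified by both.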

