CS 6375 Machine Learning, Spring 2009
HW5
Gang LIU (SID: 11458407)
Apr. 10, 2009
Email: [email protected]

1. Solution (*):

The average estimator, its bias, and its variance are defined as

  \bar{g}(x) = E[g(x)] = \frac{1}{M} \sum_{i=1}^{M} g_i(x), \qquad 1 \le i \le M

  \mathrm{Bias}^2(g) = \frac{1}{N} \sum_t \big[ \bar{g}(x^t) - f(x^t) \big]^2

  \mathrm{Variance}(g) = \frac{1}{NM} \sum_t \sum_i \big[ g_i(x^t) - \bar{g}(x^t) \big]^2

where f(.) is the true function.

(i) g_i(x) = r_i^1, 1 \le i \le M. Then

  \bar{g}(x) = E[g(x)] = \frac{1}{M} \sum_{i=1}^{M} r_i^1

  \mathrm{Bias}^2(g) = \frac{1}{N} \sum_t \big[ \bar{g}(x^t) - f(x^t) \big]^2, \qquad
  \mathrm{Variance}(g) = \frac{1}{NM} \sum_t \sum_i \big[ g_i(x^t) - \bar{g}(x^t) \big]^2

For the comparisons below, we let Bias and Variance stand for the bias and variance of this case.

(ii) g_i(x) = 2. The variance is zero because we do not use the data at all, so all the g_i(x) are the same. But the bias is high unless f(x) is close to 2 for all x.

(iii) g_i(x) = \sum_t r_i^t / N. This increases the variance relative to (ii), because different samples X_i would have different averages. It decreases the bias, because we would expect the average in general to be a better estimate than the constant.

(iv) g_i(x) = \min_t r_i^t. For simplification, note that \min_t r_i^t \le r_i^1 for every sample, so

  \bar{g}(x) = E[g(x)] = \frac{1}{M} \sum_{i=1}^{M} \Big( \min_t r_i^t \Big) \le \frac{1}{M} \sum_{i=1}^{M} r_i^1, \qquad 1 \le i \le M

This means the average estimate is less than the one deduced in (i). From the bias and variance formulas above, both the bias and the variance will increase.

In sum:

         bias      variance
  (i)    Bias      Variance
  (ii)   > Bias    0
  (iii)  < Bias    > Variance
  (iv)   > Bias    > Variance

(A small simulation sketch illustrating these four cases is given at the end of this excerpt.)

* Reference: Ethem Alpaydin, Introduction to Machine Learning, p. 77.

2. Boosting. (30 pts)

Three learning algorithms in a binary classification problem are applied independently to a set of 1000 training examples to train three classifiers.

• Algorithm A produces Classifier A, which correctly classifies 800 examples and incorrectly classifies 200 examples.
• Algorithm B produces Classifier B, which correctly classifies 800 examples and incorrectly classifies 200 examples. All the mistakes of Classifier B are on examples that were correctly classified by Classifier A. ...
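To make Problem 1's comparison concrete, here is a minimal simulation sketch. It is not part of the original solution: the true function f, the noise level, and the sizes M and N are illustrative assumptions. The code simply evaluates the bias^2 and variance formulas above for the four estimators (i)-(iv).

import numpy as np

# Minimal sketch: empirically evaluate the bias^2 and variance formulas
# from Problem 1 for the estimators (i)-(iv). The true function f, the
# noise level, and the sizes M and N are illustrative assumptions.

rng = np.random.default_rng(0)
M, N = 200, 25                        # M samples X_i, N instances each
x = rng.uniform(0.0, 3.0, size=N)     # fixed evaluation points x^t

def f(x):
    # assumed true function f(.)
    return np.sin(2.0 * x)

# r[i, t] = r_i^t = f(x^t) + Gaussian noise
r = f(x) + rng.normal(0.0, 0.5, size=(M, N))

estimators = {
    "(i)   first instance r_i^1": r[:, 0],
    "(ii)  constant 2":           np.full(M, 2.0),
    "(iii) sample average":       r.mean(axis=1),
    "(iv)  sample minimum":       r.min(axis=1),
}

for name, g in estimators.items():
    # Each g_i here is a constant function of x, so g_i(x^t) = g[i] for all t.
    g_bar = g.mean()                          # (1/M) sum_i g_i(x)
    bias2 = np.mean((g_bar - f(x)) ** 2)      # (1/N) sum_t [g_bar(x^t) - f(x^t)]^2
    var = np.mean((g - g_bar) ** 2)           # (1/(NM)) sum_t sum_i [g_i - g_bar]^2, constant in t
    print(f"{name:28s} bias^2 = {bias2:.3f}   variance = {var:.3f}")

As expected, case (ii) comes out with a variance of exactly zero; the other numbers depend on the assumed f and noise level, but they illustrate the trade-offs discussed above.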