CS 6375 Machine Learning
Homework 5
Due: 04/11/2008, 11:59pm

1. Bias and Variance. (15 pts) Alpaydin book, Chapter 4, problem 9.

Given the samples X_i = {x_i^t, r_i^t}, where t indexes the instances in the training set, we define g_i(x) = r_i^1; that is, our estimate for any x is the r value of the first instance in the (unordered) data set X_i. What can you say about its bias and variance, compared with g_i(x) = 2 and g_i(x) = (1/N) Σ_t r_i^t? What if the sample is ordered, so that g_i(x) = min_t r_i^t?

Note: for this problem you don't need to prove it mathematically; a conceptual explanation is fine. Also, the index i used in this problem is not the index of a training instance; rather, it indexes the entire training set, since you can have different data sets.

Sol: Taking any single instance has lower bias than taking a constant, but higher variance. Compared with the average, it has higher variance, and it may also have higher bias. If the sample is ordered so that the instance we pick is the minimum, the variance decreases (the minimums of different data sets tend to be similar to each other), but the bias may increase.

2. Boosting. (30 pts) Three learning algorithms for a binary classification problem are applied independently to a set of 1000 training examples to train three classifiers.
• Algorithm A produces Classifier A, which correctly classifies 800 examples and incorrectly classifies 200.
• Algorithm B produces Classifier B, which correctly classifies 800 examples and incorrectly classifies 200. All of Classifier B's mistakes are on examples that were correctly classified by Classifier A.
• Algorithm C produces Classifier C, which correctly classifies 900 examples and incorrectly classifies 100. All of Classifier C's mistakes are on examples that were correctly classified by both Classifier A and Classifier B.
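The bias/variance claims in the solution to Problem 1 can be checked with a small Monte Carlo sketch (not part of the assignment). It assumes the r values are drawn i.i.d. from a normal distribution with true mean 1.0; the distribution, sample sizes, and estimator labels are illustrative choices, not from the problem:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sets, n = 5000, 20      # 5000 different training sets X_i, each with N=20 instances
true_mean = 1.0           # assumed true value we are trying to estimate

# Each row holds the r values of one training set.
r = true_mean + rng.normal(0.0, 1.0, size=(n_sets, n))

estimators = {
    "first instance g(x)=r^1": r[:, 0],        # one r value per data set
    "constant g(x)=2":         np.full(n_sets, 2.0),
    "average g(x)=mean(r)":    r.mean(axis=1),
    "minimum g(x)=min(r)":     r.min(axis=1),  # the 'ordered sample' case
}
for name, g in estimators.items():
    bias2 = (g.mean() - true_mean) ** 2  # squared bias across data sets
    var = g.var()                         # variance across data sets
    print(f"{name:26s} bias^2={bias2:.3f}  var={var:.3f}")
```

The printout matches the conceptual answer: the first instance is roughly unbiased but has the full instance variance; the constant has zero variance but nonzero bias; the average keeps low bias while cutting variance by a factor of N; and the minimum has lower variance than a single instance but a clearly larger bias.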
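For Problem 2, the stated error sets let us reason about a majority vote of the three classifiers: since the mistakes of A, B, and C are pairwise disjoint, each example is misclassified by at most one of the three, so in a binary problem the other two always outvote it. A minimal sketch verifying this (the particular index ranges for the error sets are an arbitrary choice consistent with the problem statement):

```python
import numpy as np

n = 1000
correct_A = np.ones(n, dtype=bool)
correct_A[:200] = False        # A wrong on 200 examples
correct_B = np.ones(n, dtype=bool)
correct_B[200:400] = False     # B's 200 mistakes are disjoint from A's
correct_C = np.ones(n, dtype=bool)
correct_C[400:500] = False     # C's 100 mistakes are disjoint from A's and B's

# In a binary task, a wrong classifier votes for the opposite label,
# so the majority is correct wherever at least 2 of 3 are correct.
votes = correct_A.astype(int) + correct_B.astype(int) + correct_C.astype(int)
majority_correct = votes >= 2
print(majority_correct.sum())  # 1000: the ensemble classifies every example correctly
```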
This note was uploaded on 01/25/2012 for the course CS 6375 taught by Professor Yangliu during the Spring '09 term at University of Texas at Dallas, Richardson.
