CS 221, Autumn 2007
Problem Set #2, Programming Part: Classifiers
Due: 11:59pm, Tuesday, October 30.

1 Overview

For this problem, your programming team (1 to 3 people) will use an ensemble of decision trees to build a handwritten digit recognizer. The set of handwritten characters is supplied for you. They come from the post office in Buffalo, NY, where a classifier like yours would greatly speed up mail delivery. The goal of the recognizer is to classify 100% of the characters correctly as the appropriate digit, but real-world data is noisy: some of these digits could not even be classified by people, so getting as close as possible to 100% is a more realistic goal.

The overall structure of the problem set is as follows. We consider two types of classifiers for the data: one is a single decision tree, and the other is an ensemble of decision trees constructed using the Bagging algorithm. In this problem set, we will explore varying the classifiers along different dimensions: the size of the training set, the depth of the decision tree(s), and the number of elements in the ensemble.

1.1 Data Set

The data set consists of images of centered handwritten digits. Each image is a 14 x 14 matrix, where each pixel's intensity is in the range [0, 255], which we have normalized to be in the range [0, 1]. For now, let's just consider the case of distinguishing 0s from 1s; we'll label 1 the positive class.

Note that we are also providing you with your own independent test set, which you will use to evaluate your algorithm. You must not use your test data to train or tune the parameters of your model. The test data is there to gauge how well your classifier performs on unseen data. If you use it to train, your graphs will not show the desired behaviors and you won't get full credit.
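The normalization described above can be sketched as follows. This is an illustrative snippet, not part of the assignment's starter code; the function name `normalize` and the idea that raw images arrive as 14 x 14 arrays of integer intensities are assumptions for the example.

```python
import numpy as np

def normalize(raw_images):
    """Scale integer pixel intensities from [0, 255] to floats in [0.0, 1.0].

    `raw_images` is assumed to be a sequence of 14x14 arrays (or nested
    lists) of integers in [0, 255], as described in the handout.
    """
    return np.asarray(raw_images, dtype=np.float64) / 255.0

# Example: one all-white image maps to all ones, one all-black to all zeros.
white = np.full((14, 14), 255)
black = np.zeros((14, 14), dtype=int)
norm = normalize([white, black])
assert norm.shape == (2, 14, 14)
assert norm[0].min() == 1.0 and norm[1].max() == 0.0
```

Dividing by 255.0 (a float) is what moves the data out of integer arithmetic; forgetting the float conversion is a common source of all-zero "normalized" images.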
1.2 Decision Trees

Since our data involves continuous variables, we need to extend our notion of splits in a decision tree. We use the following approach to build a decision tree on this input: each node in the tree chooses a single pixel and a threshold value. If the value of that pixel in an image is greater than the threshold, the node sends the digit down the left branch; otherwise, it sends it down the right.

We use decision trees in two ways. The decision tree learning algorithm, as described in class, assumes that each leaf in the tree is annotated with the distribution of positive/negative instances that reach that leaf. Building on this algorithm, we use bagging to generate several such trees, and choose the output predicted by the largest number of these trees.

1.3 Bagging

The second method we will use is the Bagging algorithm. Given a training set of size N, bagging generates several training sets by sampling N training examples with replacement from the original training set.
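The two ingredients above, a per-node pixel/threshold test and bootstrap sampling with a majority vote, can be sketched as follows. This is a minimal illustration, not the assignment's required implementation: `pixel_stump` stands in for a full depth-limited tree, and the choice that the left (greater-than) branch predicts class 1 is an assumption made here for concreteness.

```python
import random

def pixel_stump(pixel, threshold):
    """A depth-1 'tree': tests one pixel against a threshold.

    Per the handout, values greater than the threshold go down the left
    branch; here the left branch predicts class 1 and the right class 0
    (an illustrative choice, not fixed by the handout).
    """
    def predict(image):
        return 1 if image[pixel] > threshold else 0
    return predict

def bootstrap_sample(training_set, rng=random):
    """Sample N examples with replacement from a training set of size N."""
    n = len(training_set)
    return [training_set[rng.randrange(n)] for _ in range(n)]

def bagged_predict(trees, image):
    """Majority vote: output the class predicted by the most trees."""
    votes = sum(tree(image) for tree in trees)
    return 1 if 2 * votes > len(trees) else 0

# Example: three stumps on the same pixel, voting on a one-pixel "image".
trees = [pixel_stump(0, t) for t in (0.2, 0.5, 0.8)]
assert bagged_predict(trees, [0.6]) == 1  # two of three stumps vote 1
```

Because sampling is done with replacement, each bootstrap set is the same size as the original but typically omits some examples and repeats others, which is what makes the resulting trees differ from one another.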
This note was uploaded on 11/30/2009 for the course CS 221, taught by Professors Koller and Ng during the Winter '09 term at Stanford.