CSCI 1950-F Homework 3:
Handwritten Digit Classification
Brown University, Spring 2012
Homework due at 12:00pm on February 23, 2012
In this problem set, we consider the problem of handwritten digit recognition. We will
use a subset of the MNIST database, which has become a benchmark for testing a wide
range of classification algorithms. See
if you’d like
to read more about it. The particular version of the data which we’ll use, as well as some
useful Matlab scripts, are available here:
/course/cs195f/asgn/hw3_mnist/
This code should be used to load and define matrices of training and test data for the various
problems below.
In the MNIST database, each training or test example is a 28-by-28 grayscale image.
To ease programming of learning algorithms, these images have been converted to vectors
of length 28^2 = 784 by sorting the pixels in raster scan (row-by-row) order. The Matlab
reshape command can be used to convert these vectors back to images for visualization.
For example, we can plot the third training example of class 1 as follows:
>> imagesc(reshape(train1(3,:), 28, 28)');
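The transpose in the command above is needed because Matlab's reshape fills matrices column-by-column, while the vectors were built row-by-row. The following is a minimal sketch of that round trip. It uses a synthetic image rather than the course data, and the names img, vec, and rec are illustrative only; they are not part of the provided scripts.

```matlab
% Synthetic 28-by-28 "image" standing in for a digit (hypothetical data):
% a single horizontal stroke along row 5.
img = zeros(28, 28);
img(5, :) = 1;

% Raster scan (row-by-row) vectorization. reshape reads column-by-column,
% so transposing first makes it walk along the rows of img.
vec = reshape(img', 1, []);      % 1-by-784 row vector

% Convert back to an image. reshape fills columns first, so the result
% must be transposed to recover the original orientation.
rec = reshape(vec, 28, 28)';

% The round trip should reproduce the original image exactly.
assert(isequal(rec, img));

% Display the recovered image; the stroke should appear horizontal.
imagesc(rec); colormap(gray); axis image;
```

Omitting the final transpose would display the stroke vertically, which is a quick way to check that the orientation is handled correctly.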
To reduce computational complexity and simulation time, in the following questions we focus
on only three of the ten handwritten digits: “1”, “2”, and “7”.