
# hw12 - CS 170 Algorithms Fall 2014 David Wagner HW12 Due...



CS 170 Algorithms, Fall 2014, David Wagner
HW12
Due Dec. 5, 6:00pm

Instructions. This homework is due Friday, December 5, at 6:00pm electronically via glookup. This homework assignment is a programming assignment that is based on a machine learning application. You may work individually or in groups of two for this assignment. You may not work with more than one other person. If you work in a group of two, both of you must turn in a solution, and you must use pair programming (the two of you write all code together) or implement everything individually; you may not split up the problems and submit code your partner wrote on their own (e.g., “you implement Problem 1, I’ll code up Problem 2” is not allowed).

1. (50 pts.) K-Nearest Neighbors

Digit classification is a classical problem that has been studied in depth by many researchers and computer scientists over the past few decades. Digit classification has many applications: for instance, postal services like the US Postal Service, UPS, and FedEx use pre-trained classifiers in order to speed up and accurately recognize handwritten addresses. Today, over 95% of all handwritten addresses are correctly classified through a computer rather than a human manually reading the address.

The problem statement is as follows: given an image of a single handwritten digit, build a classifier that correctly predicts what the actual digit value of the image is. Thus, your classifier receives as input an image of a digit, and must output a class in the set {0, 1, 2, ..., 9}. For this homework, you will attack this problem by using a k-nearest neighbors algorithm.

We will give you a data set (a reduced version of the MNIST handwritten digit data set). Each image of a digit is a 28×28 pixel image. We have already extracted features, using a very simple scheme: each pixel is its own feature, so we have 28² = 784 features. The value of a feature is the intensity of that corresponding pixel, normalized to be in the range 0..1.
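To make the setup concrete, here is a minimal Python sketch of a k-nearest neighbors classifier over feature vectors of the kind described above. It is an illustration under stated assumptions, not the assignment's reference solution: the handout does not specify a distance metric or a tie rule, so Euclidean distance and "the closest neighbor among the tied classes wins" are choices made here only for the example. The names `euclidean` and `knn_predict` are hypothetical, and the toy vectors have 4 features rather than 784.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=1):
    """Predict the class of `query` by majority vote among its k nearest training points."""
    if not train_x:
        raise ValueError("training set is empty")
    # Pair each training label with its distance to the query, nearest first.
    neighbors = sorted(
        ((euclidean(x, query), label) for x, label in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    tied = {label for label, count in votes.items() if count == top}
    # Tie rule (an assumption for this sketch): among the classes with the
    # most votes, return the class of the single closest neighbor.
    for _, label in neighbors:
        if label in tied:
            return label


# Toy data: 4 features per "image" instead of 784, values in the range 0..1.
train_x = [
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.1],
    [1.0, 1.0, 1.0, 1.0],
    [0.9, 1.0, 0.8, 1.0],
]
train_y = [0, 0, 1, 1]

print(knn_predict(train_x, train_y, [0.05, 0.0, 0.1, 0.0], k=3))
```

This brute-force version computes the distance from the query to every training point, so each prediction costs time proportional to the training set size times the number of features; that is usually acceptable for a reduced data set but is a design choice worth revisiting for larger ones.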
We have preprocessed and vectorized these images into feature vectors for you. We have split the data set into training, validation, and test sets, and we’ve provided the class of each image in the training and validation sets. Your job is to infer the class of each image in the test set. Here are five examples of images that might appear in this data set, to help you visualize the data:

[Figure: five example digit images]

We want you to do the following steps:

(i) Implement the k-nearest neighbors algorithm. You can implement it in any way you like, in any programming language of your choice. For k > 1, decide on a rule for resolving ties (if there is a tie