CS 170 Algorithms
Fall 2014, David Wagner
HW12
Due Dec. 5, 6:00pm

Instructions. This homework is due Friday, December 5, at 6:00pm electronically via glookup. This homework assignment is a programming assignment that is based on a machine learning application.

You may work individually or in groups of two for this assignment. You may not work with more than one other person. If you work with a partner, you may only use code you wrote together via pair programming or code you wrote individually. For example, you may choose to do the following: for problem 1, you decide to use pair programming, and for problem 2, you decide only to discuss your approaches with one another and use your own implementations. You may not turn in any code that you were not involved in writing (for instance, “you implement Problem 1, I’ll implement Problem 2” is not allowed). In addition, you may not discuss your approaches with anyone other than your partner.

1. (50 pts.) K-Nearest Neighbors

Digit classification is a classical problem that has been studied in depth by many researchers and computer scientists over the past few decades. Digit classification has many applications: for instance, postal services like the US Postal Service, UPS, and FedEx use pre-trained classifiers to recognize handwritten addresses quickly and accurately. Today, over 95% of all handwritten addresses are correctly classified by a computer rather than by a human manually reading the address.

The problem statement is as follows: given an image of a single handwritten digit, build a classifier that correctly predicts what the actual digit value of the image is. Thus, your classifier receives as input an image of a digit, and must output a class in the set {0, 1, 2, ..., 9}.

For this homework, you will attack this problem by using a k-nearest neighbors algorithm. We will give you a data set (a reduced version of the MNIST handwritten digit data set).
Each image of a digit is a 28 × 28 pixel image. We have already extracted features, using a very simple scheme: each pixel is its own feature, so we have 28² = 784 features. The value of a feature is the intensity of the corresponding pixel, normalized to be in the range 0..1. We have preprocessed and vectorized these images into feature vectors for you. We have split the data set into training, validation, and test sets, and we’ve provided the class of each image in the training and validation sets. Your job is to infer the class of each image in the test set.

Here are five examples of images that might appear in this data set, to help you visualize the data:

[Figure: five example images of handwritten digits]

We want you to do the following steps:
(i) Implement the k-nearest neighbors algorithm. You can implement it in any way you like, in any programming language of your choice. For k > 1, decide on a rule for resolving ties (if there is a tie for the majority vote among the k nearest neighbors when trying to classify a new observation, which one do you choose?).
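
As an illustration of step (i), here is a minimal sketch of a k-nearest neighbors classifier in plain Python. It is one possible approach, not the assignment's reference solution. Several things are assumptions that the handout does not specify: the function names euclidean_distance and knn_classify, the choice of Euclidean distance as the metric, the representation of each image as a list of 784 floats, and the tie-breaking rule (among the classes tied for the most votes, pick the one that owns the closest neighbor).

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors.

    Assumption: Euclidean distance is used; the handout does not fix a metric.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train_vectors, train_labels, query, k):
    """Predict the class of `query` by majority vote among its k nearest
    training vectors (hypothetical helper, brute-force search)."""
    if k < 1 or k > len(train_vectors):
        raise ValueError("k must be between 1 and the number of training points")

    # Rank training points by distance to the query. sorted() is stable, so
    # points at equal distance keep their original relative order.
    order = sorted(
        range(len(train_vectors)),
        key=lambda i: euclidean_distance(train_vectors[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]

    # Count the votes and find every class that reaches the top count.
    votes = Counter(nearest_labels)
    top_count = max(votes.values())
    tied = {label for label, count in votes.items() if count == top_count}

    # Tie-breaking rule (an assumption): nearest_labels is ordered from
    # closest to farthest, so the first tied class encountered is the one
    # that owns the closest neighbor.
    for label in nearest_labels:
        if label in tied:
            return label


# Toy usage with 2-feature vectors instead of 784-feature digit images.
toy_train = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
toy_labels = [0, 0, 1, 1]
print(knn_classify(toy_train, toy_labels, [0.2, 0.1], 3))
```

This brute-force version computes the distance from the query to every training point, so classifying one image costs time proportional to the training set size times 784. With k equal to 4 on the toy data above, the vote is split two to two, and the tie-breaking rule returns the class of whichever training point lies closest to the query.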