
6.034 Quiz 3
November 12, 2008

Name:
EMail:

Problem 1: Nearest Neighbors and ID Trees (50 points)

Taking advantage of shifts in MIT's requirements, Johnathan decides to take a geology subject instead of 8.02. He begins by learning how to classify minerals as sedimentary or metamorphic, and he decides to apply techniques he learned in 6.034.

Johnathan has a data set of eight minerals, each classified as either sedimentary or metamorphic, with the following characteristics:

Sample #  Mineral      Hardness  Density  Grainsize
1         Sedimentary  3         4        huge
2                      2         3        normal
3                      4         3
4                      4         2
5         Metamorphic  2         5        tiny
6                      3         2
7                      4         5
8                      5         3

He wants to classify these two minerals:

Sample  Mineral  Hardness  Density
A       ?        2         1
B       ?        4         1
Part A: Nearest Neighbors (25 points)

Johnathan decides to try using Nearest Neighbors to classify minerals A and B. To make it easier, he only uses the first two characteristics: hardness and density.

Part A1 (17 points)

On the following graph, draw the decision boundaries produced by 1-Nearest Neighbor. Ignore Samples A and B.

[Graph: sample plot, Density axis from 0 to 6]

Part A2 (8 points)

How is Sample A classified by 1-NN? By 3-NN?

How is Sample B classified by 1-NN? By 3-NN?
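The k-nearest-neighbor vote asked about in Part A2 can be checked mechanically. The following is a minimal sketch, not part of the quiz. It assumes that the "Sedimentary" label printed next to Sample 1 applies to Samples 1 through 4 and the "Metamorphic" label next to Sample 5 applies to Samples 5 through 8. It also assumes plain Euclidean distance on (hardness, density). The names TRAINING and knn_classify are hypothetical.

```python
from collections import Counter

# Training samples as (hardness, density, label).
# ASSUMPTION: the label shown on Sample 1 covers Samples 1-4 ("S")
# and the label shown on Sample 5 covers Samples 5-8 ("M").
TRAINING = [
    (3, 4, "S"), (2, 3, "S"), (4, 3, "S"), (4, 2, "S"),
    (2, 5, "M"), (3, 2, "M"), (4, 5, "M"), (5, 3, "M"),
]


def knn_classify(point, k, training=TRAINING):
    """Return the majority label among the k training samples nearest to point."""
    def squared_distance(sample):
        # Squared Euclidean distance preserves the nearest-neighbor ordering.
        return (sample[0] - point[0]) ** 2 + (sample[1] - point[1]) ** 2

    # sorted() is stable, so equal distances keep their listing order.
    nearest = sorted(training, key=squared_distance)[:k]
    votes = Counter(label for _, _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    for name, point in (("A", (2, 1)), ("B", (4, 1))):
        for k in (1, 3):
            print(f"Sample {name}, {k}-NN: {knn_classify(point, k)}")
```

Using an odd k with two classes avoids tied votes, which is one reason 1-NN and 3-NN are the usual choices for a problem like this.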
Part B: Identification Trees (25 points)

Nora is not pleased by the results of Johnathan's work with Nearest Neighbors, so she decides to use Identification Trees to classify Samples A and B. She decides to consider all three characteristics: hardness, density, and grainsize.

Part B1 (12 points)

Nora picks tests so as to minimize disorder. For the top of the tree, she considers the following three tests:

TEST 1  Hardness > 3.5
TEST 2  Density > 4.5
TEST 3  Grainsize is huge, normal, or tiny

Which test is best among the three tests listed?
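Comparing tests by disorder comes down to a weighted average of branch entropies. The following is a rough sketch, not part of the quiz, and it covers only the two numeric tests. It assumes that Samples 1 through 4 are sedimentary and Samples 5 through 8 are metamorphic, as the layout of the data table suggests. Test 3 is left out because grain sizes are listed here for only three of the eight samples. The names SAMPLES, disorder, average_disorder, and split_counts are hypothetical.

```python
from math import log2

# (hardness, density, label).
# ASSUMPTION: Samples 1-4 are sedimentary ("S"), Samples 5-8 metamorphic ("M").
SAMPLES = [
    (3, 4, "S"), (2, 3, "S"), (4, 3, "S"), (4, 2, "S"),
    (2, 5, "M"), (3, 2, "M"), (4, 5, "M"), (5, 3, "M"),
]


def disorder(counts):
    """Entropy in bits of one branch, given its per-class sample counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def average_disorder(branches):
    """Average of the branch disorders, weighted by branch size."""
    total = sum(sum(counts) for counts in branches)
    return sum(sum(counts) / total * disorder(counts) for counts in branches)


def split_counts(samples, feature_index, threshold):
    """Per-branch [sedimentary, metamorphic] counts for 'feature > threshold'."""
    branches = {True: [0, 0], False: [0, 0]}
    for sample in samples:
        passes = sample[feature_index] > threshold
        branches[passes][0 if sample[2] == "S" else 1] += 1
    # Drop empty branches so they do not cause a division by zero.
    return [counts for counts in branches.values() if sum(counts) > 0]


test1 = average_disorder(split_counts(SAMPLES, 0, 3.5))  # Hardness > 3.5
test2 = average_disorder(split_counts(SAMPLES, 1, 4.5))  # Density > 4.5

if __name__ == "__main__":
    print(f"TEST 1 average disorder: {test1:.3f}")
    print(f"TEST 2 average disorder: {test2:.3f}")
```

A lower average disorder means the branches are more homogeneous, so the test with the smallest value would be preferred at that node.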

