

for each zoom level we intend to render.

Starter Source Code

We have provided source code for you to use to get started. It should strongly inform the design of your system. All of the source code is under the package edu.washington.cse490h.geo, which contains the following subpackages:

• protocol - Defines data types which are marshaled from Mapper to Reducer and from Reducer to the next phase's Mapper.
• mapred - This is where all your mapper and reducer classes should go.

Test Data Set

Use the datasets given below for local testing and debugging. Do not use the entire US dataset for debugging.

Download the King County TIGER/Line data from: This is about 10 MB compressed. Do all your local testing on this data set. Only run on the cluster after everything works perfectly locally.

Download the BGN dataset for Washington state here:

Download the Population dataset for Washington state here: wa_pop.txt

Data Extraction and Data Types

The TIGER data files come as a set of fields organized in records. Each record is a single line of text. The fields are fixed width, which means that there are no commas, tabs, or other delimiters marking the edges of fields. Instead, we know how many characters each field can take up, and all the text within those character positions is the field.

For example, the following record has fields "A", "B", and "C", and is 15 characters wide. If A is a 3-character field, B is a 6-character field, and C is a 6-character field, the fields would occupy positions:

0123456789ABCDE (position in hex)
AAABBBBBBCCCCCC (corresponding field)

Some example records in this format may look like:

1 Foo 542571
276Record1337

Note how fields that do not occupy their full width are padded on one side or the other. (The specification determines whether fields are left- or right-side-padded.) Oth...
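The fixed-width layout described above can be sketched in Java as follows. This is only an illustration under stated assumptions, not part of the provided starter code: the class name FixedWidthRecord, the field names, and the widths (3, 6, 6) are hypothetical and simply mirror the A/B/C example, and padding is assumed to be plain spaces that can be trimmed from either side. Real TIGER record layouts come from the specification.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FixedWidthRecord {
    // Hypothetical layout mirroring the A/B/C example: 3 + 6 + 6 = 15 characters.
    private static final String[] NAMES = {"A", "B", "C"};
    private static final int[] WIDTHS = {3, 6, 6};

    /** Splits one fixed-width line into named fields, trimming space padding. */
    public static Map<String, String> parse(String line) {
        int total = 0;
        for (int w : WIDTHS) {
            total += w;
        }
        if (line.length() != total) {
            throw new IllegalArgumentException(
                "Expected " + total + " characters, got " + line.length());
        }

        Map<String, String> fields = new LinkedHashMap<>();
        int offset = 0;
        for (int i = 0; i < NAMES.length; i++) {
            // Each field occupies exactly WIDTHS[i] characters starting at offset.
            String raw = line.substring(offset, offset + WIDTHS[i]);
            // Assumption: padding is spaces, on either side of the value.
            fields.put(NAMES[i], raw.trim());
            offset += WIDTHS[i];
        }
        return fields;
    }

    public static void main(String[] args) {
        // A sample 15-character record, assumed here to be right-padded with spaces.
        System.out.println(parse("276Record1337  "));
    }
}
```

Keeping the widths in one table makes it easy to swap in the real field widths from the record specification without touching the slicing logic.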

This homework help was uploaded on 04/02/2014 for the course CSE 490 taught by Professor Staff during the Fall '08 term at University of Washington.
