project - MapReduce ECE 563 Spring 2012 Default Course...

MapReduce
ECE 563 Spring 2012 Default Course Project
version 1.0 (1/22/2012)

Projects may be performed by teams of two people. If you would like to do another project, let's talk about it.

MapReduce is a programming model that involves two steps. The first, the map step, takes an input set I and groups it into N equivalence classes I_0, I_1, I_2, ..., I_{N-1}. I can be thought of as a set of tuples <key, data>, and the function map maps I into the equivalence classes based on the value of key. In the second step, the reduce step, the equivalence classes are processed, and the set of tuples in an equivalence class I_j is reduced to a single value. MapReduce has become very popular, in part because of its use by Google, but it is an old parallel programming model. It is surprisingly general.

To perform a parallel MapReduce, the input is spread across the available processors. Each processor runs one or more instances of map, followed by one or more instances of reduce. Each instance of map will potentially form equivalence classes I_0, I_1, I_2, ..., I_{N-1}.

Consider the word counting problem, which can be solved in parallel using MapReduce. Given a list of words (or a text), the output should state how many times each word appears in it. Viewing the input as tuples, the word is the key, and the data is the constant 1. A naive map function would collect all instances of a word into an equivalence class. Each equivalence class would then be assigned to a processor n, and processor n would determine the cardinality of the equivalence class, which would be the word count.

A more intelligent map function would form singleton equivalence classes I_word, where the only element is <word, count> and count is the number of occurrences of the word seen by that instance of map. The processor assigned to reduce I_word would receive the I_word equivalence classes from all of the map functions, and would perform a reduction on the class.
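
The "more intelligent" word count described above can be sketched in a few lines of Python. This is only an illustrative, sequential simulation under stated assumptions, not the project's required implementation: the "processors" are plain list slices, and the names partition, map_words, reduce_counts, and word_count, as well as the CRC32-based assignment of words to reducers, are hypothetical choices made for this example.

```python
import zlib
from collections import Counter


def partition(word, n_reducers):
    # Assumed rule: a stable hash of the key selects the reducing processor.
    return zlib.crc32(word.encode("utf-8")) % n_reducers


def map_words(chunk, n_reducers):
    """Map step for one processor.

    Pre-counts the local words, then emits a single <word, count> tuple
    per distinct word, grouped by the reducer that will receive it.
    """
    local_counts = Counter(chunk)
    outgoing = [[] for _ in range(n_reducers)]
    for word, count in local_counts.items():
        outgoing[partition(word, n_reducers)].append((word, count))
    return outgoing


def reduce_counts(tuples):
    """Reduce step: sum the partial counts received for each word."""
    totals = {}
    for word, count in tuples:
        totals[word] = totals.get(word, 0) + count
    return totals


def word_count(words, n_procs):
    # Spread the input across the simulated processors (round-robin slices).
    chunks = [words[p::n_procs] for p in range(n_procs)]
    # Every processor runs map on its own chunk.
    mapped = [map_words(chunk, n_procs) for chunk in chunks]
    # Every processor then reduces the tuples addressed to it.
    result = {}
    for r in range(n_procs):
        incoming = [t for per_proc in mapped for t in per_proc[r]]
        result.update(reduce_counts(incoming))
    return result
```

For example, word_count("a b a c b a".split(), 3) splits the six words over three simulated processors and returns a dictionary mapping each word to its total count. Because every map instance sends at most one tuple per distinct word, far less data moves between processors than in the naive scheme, where every occurrence of a word is forwarded.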