Many instances of the reducer also run in parallel


rd
// output_values: a list of counts
result = 0
for each v in intermediate_values:
    result += int(v)
Emit(str(result))

Example: Counting Word Occurrences (2)

inputToMap = [london, bridge, is, falling, down,
 falling, down, falling, down, london, bridge, is, falling, down,
 my, fair, lady]
outputFromMap = [(london, 1), (bridge, 1), (is, 1), (falling, 1), (down, 1), (falling, 1), …, (lady, 1)]
inputToReduce = outputFromMap
outputFromReduce = [(london, 2), (bridge, 2), (is, 2), (falling, 4), (down, 4), (my, 1), (fair, 1), (lady, 1)]

The Map and Reduce Phases

Records from the data source (lines out of files, rows of a database, etc.) are fed into the map function as (key, value) pairs, e.g., (filename, line). Many parallel instances of the mapper produce one or more intermediate values along with an output key from the input.

After the map phase is over, all the intermediate values for a given output key are combined together into a list. Many parallel instances of the reducer combine those intermediate values int...
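
The word-count example and the two phases described above can be sketched in plain Python. This is only an illustrative, single-process sketch, not a real MapReduce framework: the function names (map_words, shuffle, reduce_counts, word_count) and the file name "rhyme.txt" are hypothetical, and the mapper and reducer calls run one after another here rather than as parallel instances.

```python
from collections import defaultdict


def map_words(filename, line):
    """Mapper: emit one (word, 1) pair for every word in the input line."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Combine all intermediate values for a given output key into a list."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(word, intermediate_values):
    """Reducer: sum the list of counts collected for one word."""
    result = 0
    for v in intermediate_values:
        result += int(v)
    return (word, result)


def word_count(records):
    """Run the map phase, then grouping, then the reduce phase (sequentially)."""
    intermediate = []
    for filename, line in records:
        intermediate.extend(map_words(filename, line))
    grouped = shuffle(intermediate)
    return [reduce_counts(word, values) for word, values in grouped.items()]


# Input records as (key, value) pairs, here (filename, line).
# "rhyme.txt" is a made-up file name used only for illustration.
records = [
    ("rhyme.txt", "london bridge is falling down"),
    ("rhyme.txt", "falling down falling down"),
    ("rhyme.txt", "london bridge is falling down"),
    ("rhyme.txt", "my fair lady"),
]

output_from_reduce = word_count(records)
print(output_from_reduce)
```

In a real system, each call to map_words and reduce_counts could run on a different machine, because a mapper needs only its own record and a reducer needs only the value list for its own key.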

This document was uploaded on 02/10/2014.
