... the higher-order function that controls the invocation of the user’s function, so we’re calling the latter the mapper: (map mapper data). Similarly, we’ll use reduce to refer to the higher-order function, and reducer to mean the user’s accumulation function.)

[Figure: data flow of (accumulate f2 base (map f1 data)), where MAP applies f1 and ACCUMULATE applies f2, alongside (mapreduce f1 f2 base dataname), where kv-pairs pass through several MAP steps (f1), then a SORT that groups values by key, then one REDUCE (f2) per key, each producing a result.]

The argument to the mapper is always one kv-pair. Keys are typically used to keep track of where the data came from. For example, if the input consists of a collection of Web pages, the keys might be their URLs. Another example we’ll be using is Project Gutenberg, an online collection of public-domain books; there the key would be the name of a book (more precisely, the name of the file containing that book). In most uses of a-lists, there will be only one kv-pair with a given key, but that’s not true here; for example, each...
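The relationship between (accumulate f2 base (map f1 data)) and (mapreduce f1 f2 base dataname) can be illustrated with a small sequential sketch. It is not the course's mapreduce, which runs in parallel on named data. It only imitates the MAP, SORT, and REDUCE stages on an ordinary list. The helpers make-kv-pair, kv-key, kv-value, map-phase, group-by-key, reduce-phase, and mapreduce-sketch, as well as the sample data, are assumptions made up for this illustration. The sketch also assumes that a mapper returns a list of kv-pairs.

```scheme
;; Sequential sketch only: nothing here runs in parallel.
;; All helper names below are hypothetical, not the course library.

(define (make-kv-pair key value) (cons key value))
(define (kv-key kv) (car kv))
(define (kv-value kv) (cdr kv))

;; Right fold: (accumulate + 0 '(1 2 3)) adds 1 to (2 + (3 + 0)).
(define (accumulate combiner base items)
  (if (null? items)
      base
      (combiner (car items)
                (accumulate combiner base (cdr items)))))

;; MAP: call the mapper on one kv-pair at a time (assumed to return
;; a list of kv-pairs) and append all the resulting lists.
(define (map-phase mapper data)
  (accumulate append '() (map mapper data)))

;; SORT: collect the values that share a key.
;; Returns a list of groups, each of the form (key value value ...).
(define (group-by-key kv-pairs)
  (define (insert kv groups)
    (cond ((null? groups)
           (list (list (kv-key kv) (kv-value kv))))
          ((equal? (kv-key kv) (car (car groups)))
           (cons (cons (kv-key kv)
                       (cons (kv-value kv) (cdr (car groups))))
                 (cdr groups)))
          (else (cons (car groups) (insert kv (cdr groups))))))
  (accumulate insert '() kv-pairs))

;; REDUCE: one accumulation per key, each starting from base.
(define (reduce-phase reducer base groups)
  (map (lambda (group)
         (make-kv-pair (car group)
                       (accumulate reducer base (cdr group))))
       groups))

(define (mapreduce-sketch mapper reducer base data)
  (reduce-phase reducer base
                (group-by-key (map-phase mapper data))))

;; Made-up input: the key names a book, the value is one line of words.
;; Two kv-pairs deliberately share the key book-a.
(define sample-data
  (list (make-kv-pair 'book-a '(the cat sat))
        (make-kv-pair 'book-a '(the dog))
        (make-kv-pair 'book-b '(the end))))

;; Mapper: emit one (word . 1) pair for every word in the line.
(define (word-mapper kv)
  (map (lambda (word) (make-kv-pair word 1)) (kv-value kv)))

(define counts (mapreduce-sketch word-mapper + 0 sample-data))
```

With this input, (kv-value (assq 'the counts)) is 3, because the word appears once in each of the three lines. The order of the keys in counts depends on how group-by-key happens to build its list, so look results up by key rather than by position.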
- Spring '10