CS411 - MapReduce - Note 1 - 2



What is MapReduce?

Some examples of MapReduce applications: ... reports for popular queries, large-scale machine learning problems, ...

Existing MapReduce and Similar Systems

• Google MapReduce
  • Supports C++, Java, Python, Sawzall, ...
  • Based on proprietary infrastructure
    • GFS, Sawzall, Chubby, BigTable
  • And some open source libraries
• Hadoop Map-Reduce
  • Open source (kudos to Doug and the team)
  • Plus the whole equivalent package, and more
    • HDFS, Map-Reduce, Pig, ZooKeeper, HBase, Hive
  • Used by Yahoo!, Facebook, Amazon, ...
• Dryad
  • Proprietary, based on Microsoft SQL Servers
  • Dryad, DryadLINQ

Note: For example, Hadoop: http://en.wikipedia.org/wiki/Apache_Hadoop

MapReduce Abstraction

Example 2: Average Income

• Problem: Compute the average income in a city for a given year (e.g., 2007)
• Input:
  • Personal information: <SSN, Personal Info>
    • E.g. <“12345”, {John Smith, Sunnyvale, CA}>
  • Income information: <SSN, {year, income}>
    • E.g. <“12345”, {2007, $72000}>, <“98765”, {2013, $12344}>
• Output: Average income in each city in 2007
  • E.g. <Sunnyvale, 12000>, <Champaign, 2000>, ...

Note: This time we create two types of key-value pairs. The solution is two slides ahead.

Example from Zhao et al., “MapReduce: The Programming Model and Practice”

How to Design Map() & Reduce()?

Solution

• Mapper 1a: project SSN and city from the personal information.
• Mapper 1b: project SSN and 2007 income from the income information.
• Reducer 1: records with the same SSN come together; join them.
• Mapper 2: take the joined records (keyed by SSN) and project city and 2007 income.
• Reducer 2: use city as the key, aggregate, and calculate the average.

Note: This is a bad example. The data and the operations are already relational, so MapReduce is not needed; relational operators are enough.
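The two-job solution to Example 2 can be sketched in plain Python. This is only an in-memory illustration, not Hadoop or Google MapReduce code: `run_job`, the tagging of records as "personal" or "income", and all sample rows other than the two taken from the slide are assumptions made for the sketch.

```python
from collections import defaultdict

def run_job(records, mapper, reducer):
    """In-memory stand-in for one MapReduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key, values in groups.items():
        output.extend(reducer(key, values))
    return output

def mapper_1(record):
    # Mappers 1a and 1b in one function: the source tag says which input we read.
    source, ssn, payload = record
    if source == "personal":            # 1a: project SSN, city
        yield ssn, ("city", payload[1])
    elif payload[0] == 2007:            # 1b: project SSN, 2007 income
        yield ssn, ("income", payload[1])

def reducer_1(ssn, values):
    # Same SSN comes together: join city with income.
    cities = [v for tag, v in values if tag == "city"]
    incomes = [v for tag, v in values if tag == "income"]
    for city in cities:
        for income in incomes:
            yield ssn, (city, income)

def mapper_2(record):
    # Re-key the joined record by city.
    _ssn, (city, income) = record
    yield city, income

def reducer_2(city, incomes):
    # Aggregate per city: compute the average.
    yield city, sum(incomes) / len(incomes)

def average_income_by_city(personal, income):
    tagged = [("personal", s, p) for s, p in personal]
    tagged += [("income", s, p) for s, p in income]
    joined = run_job(tagged, mapper_1, reducer_1)
    return dict(run_job(joined, mapper_2, reducer_2))

# Hypothetical sample data (only the first personal row and the first two
# income rows come from the slide).
personal = [
    ("12345", ("John Smith", "Sunnyvale", "CA")),
    ("98765", ("Jane Doe", "Champaign", "IL")),
    ("55555", ("Ann Lee", "Sunnyvale", "CA")),
]
income = [
    ("12345", (2007, 72000)),
    ("98765", (2013, 12344)),
    ("98765", (2007, 30000)),
    ("55555", (2007, 48000)),
]

result = average_income_by_city(personal, income)
print(result)
```

The first job performs the join on SSN and the second performs the group-by on city, which is why the note calls the problem relational: it is a join followed by an aggregate.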
MapReduce Architecture

The concrete steps are shown in the following six slides. For the whole framework:

1. Fork (the framework knows how many resources you have)
2. Move the data
3. Assign the map function; assign the reduce workers
4. Generate the reduce worker, do all...
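As a rough single-process sketch of this flow, the snippet below splits the input, runs a map function over each split, partitions the intermediate pairs across reduce tasks by hashing the key, and then reduces each partition. The function names, the default task counts, and the word-count job used to exercise it are all assumptions for illustration; a real framework forks worker processes on many machines and moves data over the network.

```python
from collections import defaultdict

def run_mapreduce(records, map_fn, reduce_fn, num_map_tasks=3, num_reduce_tasks=2):
    # Steps 1-2 (simulated): carve the input into one split per map task.
    splits = [records[i::num_map_tasks] for i in range(num_map_tasks)]

    # Step 3 (simulated): each map task applies map_fn and routes every
    # intermediate pair to a reduce task chosen by hashing the key.
    partitions = [defaultdict(list) for _ in range(num_reduce_tasks)]
    for split in splits:
        for record in split:
            for key, value in map_fn(record):
                target = hash(key) % num_reduce_tasks
                partitions[target][key].append(value)

    # Step 4 (simulated): each reduce task processes the keys it owns.
    result = {}
    for partition in partitions:
        for key in sorted(partition):
            result[key] = reduce_fn(key, partition[key])
    return result

# Hypothetical job used only to exercise the sketch: word count.
def word_count_map(line):
    for word in line.split():
        yield word.lower(), 1

def word_count_reduce(word, counts):
    return sum(counts)

counts = run_mapreduce(["a b a", "b c", "A"], word_count_map, word_count_reduce)
print(counts)
```

Hashing the key guarantees that all values for one key land on the same reduce task, so the final result does not depend on how many map or reduce tasks are used.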

This note was uploaded on 01/28/2014 for the course CS 411 taught by Professor Staff during the Fall '08 term at University of Illinois, Urbana Champaign.
