01_Intro_MR_cluster_access

Introduction to MapReduce Programming & Local Hadoop Cluster Access Instructions
Rozemary Scarlat
August 31, 2011
Dataflow in a MR Program
(K1, V1) -> map -> (K2, V2) -> shuffle/sort -> (K2, List<V2>) -> reduce -> (K3, V3)
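The dataflow above can be imitated in a few lines of plain Python. This is only an in-memory sketch meant to make the key/value types concrete; it is not Hadoop code, and the names `run_mapreduce`, `count_mapper`, and `count_reducer` as well as the toy word-count input are assumptions made for illustration.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Simulate map -> shuffle -> reduce in memory (illustration only)."""
    # Map: each (k1, v1) may emit zero or more (k2, v2) pairs.
    intermediate = []
    for k1, v1 in records:
        intermediate.extend(mapper(k1, v1))

    # Shuffle: group the values by key, giving (k2, [v2, ...]).
    groups = defaultdict(list)
    for k2, v2 in intermediate:
        groups[k2].append(v2)

    # Reduce: each (k2, [v2, ...]) may emit zero or more (k3, v3) pairs.
    output = []
    for k2 in sorted(groups):
        output.extend(reducer(k2, groups[k2]))
    return output


def count_mapper(offset, line):
    # (K1, V1) = (line offset, line text) -> (K2, V2) = (word, 1)
    for word in line.split():
        yield word, 1


def count_reducer(word, ones):
    # (K2, List<V2>) = (word, [1, 1, ...]) -> (K3, V3) = (word, total)
    yield word, sum(ones)


# Toy input: two lines keyed by a made-up offset.
result = run_mapreduce([(0, "a b a"), (6, "b a")], count_mapper, count_reducer)
print(result)
```

In a real job the framework performs the grouping step across machines; the programmer supplies only the map and reduce functions.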
• We have temperature readings for the years 1901–2001 and we want to compute the maximum for each year.
• In our temperature data set, each line looks like this:
  0067011990999991950051507004...9999999N9+00001+99999999999...
  0043011990999991950051512004...9999999N9+00221+99999999999...
• We know that characters 16–19 represent the year, characters 88–92 represent the temperature, and character 93 represents the quality code.
• The same lines as (key, value) input pairs, where the key is the line's byte offset in the file and the year and temperature fields are set apart:
  (0, 006701199099999 1950 051507004...9999999N9+ 0000 1+99999999999...)
  (106, 004301199099999 1950 051512004...9999999N9+ 0022 1+99999999999...)
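The character positions above are 1-based, so in a 0-based language they shift by one. Below is a minimal Python sketch of the field extraction. The helper names `parse_record` and `make_record` are hypothetical, and because the sample lines on the slide are elided in the middle, the full-length test lines are synthetic, with filler zeros in every column that the slide does not describe.

```python
def parse_record(line):
    """Extract (year, temperature, quality) from one fixed-width record."""
    year = line[15:19]              # characters 16-19
    temperature = int(line[87:92])  # characters 88-92, leading sign included
    quality = line[92]              # character 93
    return year, temperature, quality


def make_record(year, temperature, quality):
    """Build a synthetic 93-character line; the filler columns are made up."""
    return "0" * 15 + year + "0" * 68 + f"{temperature:+05d}" + quality


# The visible prefix of the first sample line already contains the year field.
prefix = "0067011990999991950051507004"
print(prefix[15:19])

# Round trip on a synthetic line shaped like the second sample (+0022, quality 1).
print(parse_record(make_record("1950", 22, "1")))
```

`int` accepts a leading `+` or `-`, so the sign character needs no special handling here.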
Implementation in MapReduce
Map (selection + projection): (0, 0067011...) -> (year, temperature) pairs such as (1950, 22)
Grouped by key: 1950 -> 0, 22, -11
Reduce (aggregation, MAX): (1950, 22)
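The slides that follow show the Mapper, Reducer, and Main classes as images, which are not recoverable here. As a stand-in, the sketch below mimics the same selection, projection, and MAX aggregation in plain Python. It is not the course's Hadoop code. Two details are assumptions that the slides do not state: that a reading of 9999 marks a missing value, and that quality codes 0, 1, 4, 5, and 9 are the acceptable ones. The input lines are synthetic, and the key passed to the mapper is a simple line index rather than a byte offset.

```python
from collections import defaultdict

MISSING = 9999          # ASSUMPTION: sentinel for a missing reading
GOOD_QUALITY = "01459"  # ASSUMPTION: acceptable quality codes


def max_temp_mapper(offset, line):
    """Selection + projection: keep valid readings, emit (year, temperature)."""
    year = line[15:19]
    temperature = int(line[87:92])
    quality = line[92]
    if temperature != MISSING and quality in GOOD_QUALITY:
        yield year, temperature


def max_temp_reducer(year, temperatures):
    """Aggregation (MAX): emit the highest temperature seen for the year."""
    yield year, max(temperatures)


def synthetic_line(year, temperature, quality="1"):
    # Hypothetical filler; only the three documented fields are meaningful.
    return "0" * 15 + year + "0" * 68 + f"{temperature:+05d}" + quality


lines = [
    synthetic_line("1950", 0),
    synthetic_line("1950", 22),
    synthetic_line("1950", -11),
    synthetic_line("1949", 111),
    synthetic_line("1949", 9999),             # dropped: missing reading
    synthetic_line("1949", 300, quality="2"), # dropped: quality code not accepted
]

# Stand-in for the framework's shuffle: group mapper output by year.
groups = defaultdict(list)
for index, line in enumerate(lines):
    for year, temp in max_temp_mapper(index, line):
        groups[year].append(temp)

results = [pair for year in sorted(groups)
           for pair in max_temp_reducer(year, groups[year])]
print(results)
```

The grouped values for 1950 are 0, 22, and -11, matching the example in the diagram, and the reducer keeps 22.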
Mapper
Reducer
Main
Beyond MaxTemperature
• What if we want to get the average temperature for a year?
• What if you are only interested in the temperature in Durham? (Assume the station ID at Durham is 212.)

Local Hadoop Cluster
• The master node is hadoop21.cs.duke.edu
• The slave nodes are hadoop[22,24-36].cs.duke.edu
• Online jobtracker address*: http://hadoop21.cs.duke.edu:50030/jobtracker.jsp
• Online HDFS health*: http://hadoop21.cs.duke.edu:50070/dfshealth.jsp
* Accessible only from within the CS trusted network. Solution: 1. ssh to any node and then use lynx.
• Now, let's see how to compile and run a MapReduce job on the local cluster.
• You can find the detailed instructions at the course website: http://www.cs.duke.edu/courses/fall10/cps216/TA_material/cluster_instructions

Mapper (old API)
Reducer (old API)
Main (old API)...
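Returning to the two "Beyond MaxTemperature" questions (the yearly average, and Durham only), one possible approach is sketched below in plain Python. It is a rough illustration, not the course's solution. The slides do not say where the station ID sits in a record, so the sketch assumes hypothetical pre-parsed records of the form (station_id, year, temperature); the names `durham_mapper` and `average_reducer` and all data values are made up.

```python
from collections import defaultdict

DURHAM_STATION_ID = "212"  # station ID the slide asks us to assume


def durham_mapper(record):
    """Selection by station: emit (year, temperature) only for Durham."""
    station_id, year, temperature = record
    if station_id == DURHAM_STATION_ID:
        yield year, temperature


def average_reducer(year, temperatures):
    """Aggregation: emit the mean of the year's readings instead of the max."""
    temperatures = list(temperatures)
    yield year, sum(temperatures) / len(temperatures)


# Hypothetical pre-parsed records: (station_id, year, temperature).
records = [
    ("212", "1950", 10),
    ("212", "1950", 20),
    ("999", "1950", 90),  # another station, filtered out by the mapper
    ("212", "1951", -4),
]

# Stand-in for the framework's shuffle: group mapper output by year.
groups = defaultdict(list)
for record in records:
    for year, temp in durham_mapper(record):
        groups[year].append(temp)

averages = dict(pair for year in sorted(groups)
                for pair in average_reducer(year, groups[year]))
print(averages)
```

The station filter belongs in the map step because it is a per-record selection. The average differs from the maximum in one respect worth noting: a mean of partial means is generally not the overall mean, so any partial aggregation would need to carry (sum, count) pairs rather than averages.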
This document was uploaded on 01/17/2012.