German-French Summer University for Young Researchers 2011
Cloud Computing: Challenges and Opportunities

Hadoop MapReduce in Practice

Speaker: Pietro Michiardi
Contributors: Antonio Barbuzzi, Mario Pastorelli
Institution: Eurecom

Pietro Michiardi (Eurecom), Hadoop in Practice

Introduction: Overview of this Lecture

Hadoop: Architecture Design and Implementation Details [45 minutes]
  - HDFS
  - Hadoop MapReduce
  - Hadoop MapReduce I/O
Exercise Session:
  - Warm up: WordCount and first design patterns [45 minutes]
  - Exercises on various design patterns: [60 minutes]
      * Pairs
      * Stripes
      * Order Inversion [Homework]
  - Solved Exercise: PageRank in MapReduce [45 minutes]

Hadoop MapReduce

Preliminaries: From Theory to Practice

The story so far
  - Concepts behind the MapReduce Framework
  - Overview of the programming model
Terminology
  - MapReduce:
      * Job: an execution of a Mapper and Reducer across a data set
      * Task: an execution of a Mapper or a Reducer on a slice of data
      * Task Attempt: instance of an attempt to execute a task
  - Example:
      * Running "Word Count" across 20 files is one job
      * 20 files to be mapped = 20 map tasks (assuming each file fits in one input split) + some number of reduce tasks
      * At least 20 map task attempts will be performed; more if a machine crashes
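The job / task / task attempt terminology above can be made concrete with a small simulation. This is only a sketch under stated assumptions, not Hadoop code: the function `run_job`, its parameters, and the failure model (each attempt fails independently with probability `failure_rate` and is retried until it succeeds) are hypothetical and chosen purely for illustration.

```python
import random


def run_job(num_files, num_reducers, failure_rate=0.1, seed=0):
    """Simulate one job and count its tasks and task attempts.

    Hypothetical model (not Hadoop's scheduler): one map task per input
    file, and every attempt fails independently with probability
    `failure_rate`, after which the task is retried until it succeeds.
    """
    rng = random.Random(seed)
    num_tasks = num_files + num_reducers  # map tasks + reduce tasks
    attempts = 0
    for _ in range(num_tasks):
        while True:
            attempts += 1                     # every try is one task attempt
            if rng.random() >= failure_rate:  # this attempt succeeded
                break
    return {
        "jobs": 1,
        "map_tasks": num_files,
        "reduce_tasks": num_reducers,
        "attempts": attempts,
    }


# "Word Count" across 20 files with an assumed 2 reducers: still one job.
stats = run_job(num_files=20, num_reducers=2)
print(stats)
```

With `failure_rate=0.0` the number of attempts equals the number of tasks; any failure adds attempts but never adds tasks or jobs. A real cluster also bounds the number of retries per task, which this sketch leaves out.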

HDFS in details

The Hadoop Distributed Filesystem

Large dataset(s) outgrowing the storage capacity of a single physical machine
  - Need to partition it across a number of separate machines
  - Network-based system, with all its complications
  - Tolerate failures of machines
Hadoop Distributed Filesystem [6, 7]
  - Very large files
  - Streaming data access
  - Commodity hardware

HDFS Blocks

(Big) files are broken into block-sized chunks
  - NOTE: A file that is smaller than a single block does not occupy a full block's worth of underlying storage
Blocks are stored on independent machines
  - Reliability and parallel access
Why is a block so large?
  - Make transfer times larger than seek latency
  - E.g.: Assume seek time is 10 ms and the transfer rate is 100 MB/s; if you want seek time to be 1% of transfer time, then the block size should be 100 MB
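The block-size example above is a one-line calculation: if seek time should be a fraction f of transfer time, the transfer must last (seek time / f) seconds, and the block size is that duration times the transfer rate. A minimal sketch follows; the function name `block_size_mb` and its parameter names are hypothetical, introduced only for this illustration.

```python
def block_size_mb(seek_time_s, transfer_rate_mb_s, seek_fraction):
    """Block size (in MB) for which seek time is `seek_fraction` of transfer time."""
    # Transfer must take this long for the seek to be only `seek_fraction` of it.
    transfer_time_s = seek_time_s / seek_fraction
    return transfer_time_s * transfer_rate_mb_s


# Numbers from the slide: 10 ms seek, 100 MB/s transfer, seek = 1% of transfer.
size = block_size_mb(seek_time_s=0.010, transfer_rate_mb_s=100, seek_fraction=0.01)
print(round(size), "MB")
```

Plugging in the slide's numbers gives a transfer time of 1 s and therefore a 100 MB block. A faster disk or a stricter seek-overhead target pushes the block size up proportionally.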

NameNodes and DataNodes

NameNode
  - Keeps metadata in RAM
  - The metadata for each block occupies roughly 150 bytes of memory
  - Without the NameNode, the filesystem cannot be used
      * Persistence of metadata: synchronous and atomic writes to NFS
Secondary NameNode
  - Merges the namespace with the edit log
  - A useful trick to recover from a failure of the NameNode is to use the ...
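Because the NameNode keeps its metadata in RAM, the figure of roughly 150 bytes per block quoted above gives a quick way to estimate memory needs. The sketch below is a rough, assumption-laden estimate: the helper `namenode_block_memory_bytes` is hypothetical, the 100 MB block size is borrowed from the block-size example rather than from any cluster configuration, and only block entries are counted (per-file and per-directory metadata are ignored).

```python
import math

BYTES_PER_BLOCK = 150  # rough per-block metadata cost quoted on the slide


def namenode_block_memory_bytes(file_sizes_mb, block_size_mb=100):
    """Estimate the NameNode RAM used for block metadata only."""
    # Each file needs ceil(size / block size) block entries,
    # so even a tiny file costs one full entry.
    num_blocks = sum(math.ceil(size / block_size_mb) for size in file_sizes_mb)
    return num_blocks * BYTES_PER_BLOCK


# Hypothetical workload: one million files of 1000 MB each.
estimate = namenode_block_memory_bytes([1000] * 1_000_000)
print(estimate / 1e9, "GB")
```

Under these assumptions the workload needs ten million block entries, about 1.5 GB of NameNode memory. The same function also shows why many small files are costly: one million 1 MB files hold a thousand times less data yet still consume a tenth of that memory.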

This note was uploaded on 01/16/2012 for the course BI 200 taught by Professor Potter during the Fall '11 term at Montgomery College.
