


CS 61C: Great Ideas in Computer Architecture (formerly called Machine Structures)
Map Reduce

Instructor: David A. Patterson
http://inst.eecs.Berkeley.edu/~cs61c/sp12
Spring 2012 -- Lecture #2

Review
•  CS61c: Learn 6 great ideas in computer architecture to enable high performance programming via parallelism, not just learn C
   1.  Layers of Representation/Interpretation
   2.  Moore's Law
   3.  Principle of Locality/Memory Hierarchy
   4.  Parallelism
   5.  Performance Measurement and Improvement
   6.  Dependability via Redundancy
•  Post-PC Era: Parallel processing, smart phones to WSC
•  WSC SW must cope with failures, varying load, and varying HW latency and bandwidth
•  WSC HW is sensitive to cost and energy efficiency

New-School Machine Structures (It's a bit more complicated!)
•  Parallel Requests -- assigned to a computer, e.g., search "Katz"
•  Parallel Threads -- assigned to a core, e.g., lookup, ads
•  Parallel Instructions -- >1 instruction @ one time, e.g., 5 pipelined instructions
•  Parallel Data -- >1 data item @ one time, e.g., add of 4 pairs of words (A0+B0, A1+B1, A2+B2, A3+B3)
•  Hardware descriptions -- all gates @ one time
•  Programming Languages
(Diagram of the software/hardware layers -- Smart Phone, Warehouse Scale Computer, Computer, Core, Memory (Cache), Input/Output, Instruction Unit(s), Functional Unit(s), Cache Memory, Logic Gates -- annotated "Harness Parallelism & Achieve High Performance", with a box marking which layers today's lecture covers.)

Agenda
•  Request-Level Parallelism
•  Data-Level Parallelism
•  MapReduce
•  Administrivia + 61C in the News + The secret to getting good grades at Berkeley
•  MapReduce Examples
•  Technology Break
•  Costs in Warehouse Scale Computer (if time permits)

Request-Level Parallelism (RLP)
•  Hundreds or thousands of requests per second
   –  Not your laptop or cell phone, but popular Internet services like Google search
   –  Such requests are largely independent
      •  Mostly involve read-only databases
      •  Little read-write (aka "producer-consumer") sharing
      •  Rarely involve read-write data sharing or synchronization across requests
•  Computation is easily partitioned within a request and across different requests (a small sketch follows)
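To make "largely independent, read-mostly requests" concrete, here is a minimal Python sketch, purely illustrative: the index contents and the names read_only_index and handle_query are invented, not course code. Because each request only reads shared data, requests need no coordination and can be handled concurrently by a pool of workers.

    # Illustrative sketch of request-level parallelism: independent, read-mostly
    # queries served concurrently against a shared read-only index.
    from concurrent.futures import ThreadPoolExecutor

    read_only_index = {
        "katz": ["doc12", "doc47"],
        "patterson": ["doc3", "doc12"],
    }

    def handle_query(query):
        # Reads shared data only: no locks, no cross-request synchronization.
        words = query.lower().split()
        return {w: read_only_index.get(w, []) for w in words}

    queries = ["Randy Katz", "Patterson", "Katz Patterson"]
    with ThreadPoolExecutor(max_workers=8) as pool:
        for q, hits in zip(queries, pool.map(handle_query, queries)):
            print(q, "->", hits)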
Google Query-Serving Architecture
(Diagram of the path a query takes inside a Google WSC: from the Internet through a load balancer to Google Web Servers, which consult index servers and document servers to build the response.)

Anatomy of a Web Search
•  Google "Randy H. Katz"
   1.  Direct the request to the "closest" Google Warehouse Scale Computer
   2.  A front-end load balancer directs the request to one of many clusters of servers within the WSC
   3.  Within the cluster, select one of many Google Web Servers (GWS) to handle the request and compose the response pages
   4.  The GWS communicates with Index Servers to find documents that contain the search words "Randy" and "Katz", using the location of the search as well
   5.  Return a document list with associated relevance scores
•  In parallel,
   –  Ad system: books by Katz at Amazon.com
   –  Images of Randy Katz
•  Use docids (document IDs) to access the indexed documents
•  Compose the page
   –  Result document extracts (with keyword in context) ordered by relevance score
   –  Sponsored links (along the top) and advertisements (along the sides)

Anatomy of a Web Search (continued)
•  Implementation strategy
   –  Randomly distribute the entries
   –  Make many copies of the data (aka "replicas")
   –  Load balance requests across the replicas
•  Redundant copies of indices and documents
   –  Break up hot spots, e.g., "Justin Bieber"
   –  Increase opportunities for request-level parallelism
   –  Make the system more tolerant of failures

Question: Which statements are NOT TRUE about Request-Level Parallelism?
☐  RLP runs naturally independent requests in parallel
☐  RLP also runs independent tasks within a request
☐  RLP typically uses an equal number of reads and writes

Data-Level Parallelism (DLP)
•  2 kinds
   –  Lots of data in memory that can be operated on in parallel (e.g., adding together 2 arrays)
   –  Lots of data on many disks that can be operated on in parallel (e.g., searching for documents)
•  The Feb 28 lecture and the 3rd project do Data-Level Parallelism (DLP) in memory
•  Today's lecture and the 1st project do DLP across 1000s of servers and disks using MapReduce (a small sketch of the in-memory kind follows)
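As a tiny illustration of the first kind of DLP, the sketch below adds two arrays element-wise by splitting them into chunks that are processed independently. This is only a hedged Python illustration of the idea, not the SIMD/OpenMP techniques the course uses later; the function names are invented.

    # Illustrative sketch of in-memory data-level parallelism: element-wise
    # addition of two arrays, split into independent chunks handled by a pool.
    from multiprocessing import Pool

    def add_chunk(chunk):
        a_part, b_part = chunk
        return [x + y for x, y in zip(a_part, b_part)]

    def parallel_add(a, b, n_chunks=4):
        size = (len(a) + n_chunks - 1) // n_chunks
        chunks = [(a[i:i + size], b[i:i + size]) for i in range(0, len(a), size)]
        with Pool(n_chunks) as pool:
            partial_sums = pool.map(add_chunk, chunks)   # chunks are independent
        return [x for part in partial_sums for x in part]

    if __name__ == "__main__":
        a = list(range(16))
        b = list(range(16, 32))
        print(parallel_add(a, b))    # [16, 18, 20, ..., 46]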
Problem Trying To Solve
•  How do we process large amounts of raw data (crawled documents, request logs, …) every day to compute derived data (inverted indices, page popularity, …), when the computation is conceptually simple but the input data is large and distributed across 100s to 1000s of servers, so that the job finishes in a reasonable time?
•  Challenge: Parallelize the computation, distribute the data, and tolerate faults without obscuring the simple computation with complex code to deal with these issues
•  Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," Communications of the ACM, Jan. 2008.

MapReduce Solution
•  Apply the Map function to the user-supplied record of key/value pairs
•  Compute a set of intermediate key/value pairs
•  Apply the Reduce operation to all values that share the same key to combine the derived data properly
   –  Often produces a smaller set of values
   –  Typically 0 or 1 output value per Reduce invocation
•  The user supplies the Map and Reduce operations in a functional model, so the system can parallelize them and re-execute them for fault tolerance

Data-Parallel "Divide and Conquer" (MapReduce Processing)
•  Map:
   –  Slice data into "shards" or "splits", distribute these to workers, compute sub-problem solutions
   –  map(in_key, in_value) -> list(out_key, intermediate_value)
      •  Processes an input key/value pair
      •  Produces a set of intermediate pairs
•  Reduce:
   –  Collect and combine the sub-problem solutions
   –  reduce(out_key, list(intermediate_value)) -> list(out_value)
      •  Combines all intermediate values for a particular key
      •  Produces a set of merged output values (usually just one)
•  Fun to use: focus on the problem, let the MapReduce library deal with the messy details

MapReduce Execution
•  Fine granularity tasks: many more map tasks than machines
•  Bucket sort to get the same keys together
•  2000 servers => ≈ 200,000 Map Tasks, ≈ 5,000 Reduce tasks
(A sketch of this overall split/map/shuffle/reduce structure follows.)
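The following is a minimal, single-process Python sketch of that divide-and-conquer flow: split the input into shards, apply the user's map to each shard, bucket the intermediate pairs by hash(key) mod R so the same keys land together, then reduce each key's list. It is only an illustration under the assumption that everything fits in memory; the real library distributes the shards and buckets across thousands of workers, and the function and variable names here are invented.

    # Minimal illustrative sketch of the MapReduce flow.
    from collections import defaultdict

    def run_mapreduce(records, map_fn, reduce_fn, num_shards=4, R=2):
        # Split: slice the input into shards; each shard is an independent map task.
        shards = [records[i::num_shards] for i in range(num_shards)]

        # Map phase: in a real system each shard runs on a different worker.
        intermediate = []
        for shard in shards:
            for in_key, in_value in shard:
                intermediate.extend(map_fn(in_key, in_value))

        # Shuffle: partition intermediate pairs into R buckets, grouping by key.
        buckets = [defaultdict(list) for _ in range(R)]
        for out_key, value in intermediate:
            buckets[hash(out_key) % R][out_key].append(value)

        # Reduce phase: one reduce call per unique intermediate key.
        results = {}
        for bucket in buckets:
            for out_key, values in bucket.items():
                results[out_key] = reduce_fn(out_key, values)
        return results

    # Example use (hypothetical data): maximum reading per sensor.
    readings = [("sf", 58), ("la", 74), ("sf", 63), ("la", 70)]
    print(run_mapreduce(readings, lambda k, v: [(k, v)], lambda k, vs: max(vs)))
    # {'la': 74, 'sf': 63}  (dictionary order may differ)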
Google Uses MapReduce For …
•  Web crawl: Find outgoing links from HTML documents, aggregate by target document
•  Google Search: Generating inverted index files using a compression scheme
•  Google Earth: Stitching overlapping satellite images to remove seams and to select high-quality imagery
•  Google Maps: Processing all road segments on Earth and rendering the map tile images that display those segments
•  More than 10,000 MR programs at Google in 4 years; 100,000 MR jobs run per day (2008)

MapReduce Popularity at Google
                                       Aug-04    Mar-06     Sep-07      Sep-09
  Number of MapReduce jobs             29,000    171,000    2,217,000   3,467,000
  Average completion time (secs)       634       874        395         475
  Server years used                    217       2,002      11,081      25,562
  Input data read (TB)                 3,288     52,254     403,152     544,130
  Intermediate data (TB)               758       6,743      34,774      90,120
  Output data written (TB)             193       2,970      14,018      57,520
  Average number of servers per job    157       268        394         488

What if Google Ran Its Workload on EC2?
(Same workload rows as the table above, with the estimated Amazon EC2 cost added.)
                                       Aug-04    Mar-06     Sep-07      Sep-09
  Average cost/job on EC2              $17       $39        $26         $38
  Annual cost if run on EC2            $0.5M     $6.7M      $57.4M      $133.1M

Question: Which statements are NOT TRUE about MapReduce?
☐  Users express the computation as 2 functions, Map and Reduce, and supply code for them
☐  MapReduce is good for tasks like Search and Matrix Multiply
☐  There are typically many more Map Tasks than Reduce Tasks (e.g., 40:1)

Course Organization
•  Grading
   –  Participation and Altruism (5%)
   –  Homework (5%)
   –  Labs (20%)
   –  Projects (40%)
      1.  Data Parallelism (Map-Reduce on Amazon EC2, with partner)
      2.  Computer Instruction Set Simulator (C)
      3.  Performance Tuning of a Parallel Application/Matrix Multiply using cache blocking, SIMD, MIMD (OpenMP, with partner)
      4.  Computer Processor Design (Logisim)
   –  Extra Credit: Matrix Multiply Competition, anything goes
   –  Midterm (10%): 6-9 PM Tuesday March 6, 155 Dwinelle
      •  How many are taking CS188?
   –  Final (20%): 11:30-2:30 PM Wednesday May 9

Late Policy
•  Assignments are due Sundays at 11:59:59 PM
•  Late homework is not accepted (100% penalty)
•  Late projects get a 20% penalty, accepted up to Tuesdays at 11:59:59 PM
   –  No credit if more than 48 hours late
   –  No "slip days" in 61C
•  Used by Dan Garcia and a few faculty to cope with 100s of students who often procrastinate, without having to hear the excuses; not widespread in EECS courses
•  More assignments come in late when everyone has no-cost options; better to learn now how to cope with real deadlines

YOUR BRAIN ON COMPUTERS; Hooked on Gadgets, and Paying a Mental Price
NY Times, June 7, 2010, by Matt Richtel

SAN FRANCISCO -- When one of the most important e-mail messages of his life landed in his in-box a few years ago, Kord Campbell overlooked it. Not just for a day or two, but 12 days. He finally saw it while sifting through old messages: a big company wanted to buy his Internet start-up.

''I stood up from my desk and said, 'Oh my God, oh my God, oh my God,' '' Mr. Campbell said. ''It's kind of hard to miss an e-mail like that, but I did.''

The message had slipped by him amid an electronic flood: two computer screens alive with e-mail, instant messages, online chats, a Web browser and the computer code he was writing. While he managed to salvage the $1.3 million deal after apologizing to his suitor, Mr. Campbell continues to struggle with the effects of the deluge of data. Even after he unplugs, he craves the stimulation he gets from his electronic gadgets. He forgets things like dinner plans, and he has trouble focusing on his family. His wife, Brenda, complains, ''It seems like he can no longer be fully in the moment.''

The Rules (and we really mean it!)
This is your brain on computers. Scientists say juggling e-mail, phone calls and other incoming information can change how people think and behave. They say our ability to focus is being undermined by bursts of information. These play to a primitive impulse to respond to immediate opportunities and threats. The stimulation provokes excitement -- a dopamine squirt -- that researchers say can be addictive. In its absence, people feel bored.

The resulting distractions can have deadly consequences, as when cellphone-wielding drivers and train engineers cause wrecks. And for millions of people like Mr. Campbell, these urges can inflict nicks and cuts on creativity and deep thought, interrupting work and family life. While many people say multitasking makes them more productive, research shows otherwise. Heavy multitaskers actually have more trouble focusing and shutting out irrelevant information, scientists say, and they experience more stress. And scientists are discovering that even after the multitasking ends, fractured thinking and lack of focus persist. In other words, this is also your brain off computers.

Peer Instruction
•  Increase real-time learning in lecture; test understanding of concepts vs. details
   (mazur-www.harvard.edu/education/pi.phtml)
•  As we complete a "segment," ask a multiple choice question
   –  <1 minute: decide yourself, vote
   –  <2 minutes: discuss in pairs, then team vote; use a flash card to pick an answer
•  Try to convince your partner; learn by teaching
•  Mark and save flash cards (get them in discussion section)

Architecture of a Lecture
(Graph of attention vs. time in minutes: full attention at the start, administrivia at roughly minutes 20-25, a tech break at roughly minutes 50-53, and "And in conclusion…" at minutes 78-80, with 61C content segments in between.)
61C in the News
•  IEEE Spectrum Top 11 Innovations of the Decade

The Secret to Getting Good Grades at Berkeley
•  A grad student said he finally figured it out
   –  (Mike Dahlin, now a Professor at UT Texas)
•  My question: What is the secret?
•  Do the assigned reading the night before, so that you get more value from the lecture
•  61C comment on an end-of-semester survey: "I wish I had followed Professor Patterson's advice and did the reading before each lecture."

Agenda
•  Request-Level Parallelism
•  Data-Level Parallelism
•  MapReduce
•  Administrivia + 61C in the News + The secret to getting good grades at Berkeley
•  MapReduce Examples
•  Technology Break
•  Costs in Warehouse Scale Computer (if time permits)

MapReduce Processing Example: Count Word Occurrences
•  Pseudo code; the simple case assumes just 1 word per document
•  Reduce sums all counts emitted for a particular word

    map(String input_key, String input_value):
      // input_key: document name
      // input_value: document contents
      for each word w in input_value:
        EmitIntermediate(w, "1");  // Produce count of words

    reduce(String output_key, Iterator intermediate_values):
      // output_key: a word
      // intermediate_values: a list of counts
      int result = 0;
      for each v in intermediate_values:
        result += ParseInt(v);     // get integer from key-value
      Emit(AsString(result));

Another Example: Word Index (How Often Does a Word Appear?)
•  Types:
   –  map (k1, v1) -> list(k2, v2)
   –  reduce (k2, list(v2)) -> list(v2)
   –  Input keys and values come from a different domain than the output keys and values
   –  Intermediate keys and values come from the same domain as the output keys and values
•  Distribute: "that that is is that that is not is not is that it it is"
   –  Map 1: "that that is"     -> is 1, that 2
   –  Map 2: "is that that"     -> is 1, that 2
   –  Map 3: "is not is not"    -> is 2, not 2
   –  Map 4: "is that it it is" -> is 2, it 2, that 1
•  Shuffle:
   –  is: 1, 1, 2, 2      it: 2
   –  that: 2, 2, 1       not: 2
•  Reduce:
   –  Reduce 1: is 6; it 2
   –  Reduce 2: not 2; that 5
•  Collect: is 6; it 2; not 2; that 5
(A runnable version of this example is sketched below.)
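For concreteness, here is a small runnable Python version of this word-index example. It is an illustration only: it assumes the four splits shown above, and it combines counts inside each map task, which is what the per-map outputs on the slide suggest.

    # Runnable sketch of the word-index example: map with local combining,
    # shuffle to group counts by word, then reduce by summing the counts.
    from collections import Counter, defaultdict

    splits = ["that that is", "is that that", "is not is not", "is that it it is"]

    # Map (with local combining): document contents -> (word, count) pairs.
    map_outputs = [Counter(split.split()).items() for split in splits]
    for i, out in enumerate(map_outputs, 1):
        print(f"Map {i}:", dict(out))           # e.g. Map 1: {'that': 2, 'is': 1}

    # Shuffle: group every intermediate count by its word.
    groups = defaultdict(list)
    for out in map_outputs:
        for word, count in out:
            groups[word].append(count)

    # Reduce: sum all counts emitted for a particular word.
    result = {word: sum(counts) for word, counts in sorted(groups.items())}
    print(result)                               # {'is': 6, 'it': 2, 'not': 2, 'that': 5}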
MapReduce Processing: Execution Setup
•  Map invocations are distributed by partitioning the input data into M splits
   –  Typically 16 MB to 64 MB per piece
•  The input is processed in parallel on different servers
•  Reduce invocations are distributed by partitioning the intermediate key space into R pieces
   –  E.g., hash(key) mod R
•  The user picks M >> number of servers, R > number of servers
   –  A big M helps with load balancing and recovery from failure
   –  There is one output file per Reduce invocation, so R should not be too large

MapReduce Processing (the shuffle phase is shown in the accompanying figures)
1.  MapReduce first splits the input files into M "splits", then starts many copies of the program on the servers.
2.  One copy -- the master -- is special. The rest are workers. The master picks idle workers and assigns each one of the M map tasks or one of the R reduce tasks.
3.  A map worker reads its input split. It parses key/value pairs out of the input data and passes each pair to the user-defined map function. (The intermediate key/value pairs produced by the map function are buffered in memory.)
4.  Periodically, the buffered pairs are written to local disk, partitioned into R regions by the partitioning function.
5.  When a reduce worker has read all the intermediate data for its partition, it bucket sorts using the intermediate keys so that occurrences of the same key are grouped together. (The sorting is needed because typically many different keys map to the same reduce task.)
6.  The reduce worker iterates over the sorted intermediate data and, for each unique intermediate key, passes the key and the corresponding set of values to the user's reduce function. The output of the reduce function is appended to a final output file for this reduce partition.
7.  When all map tasks and reduce tasks have been completed, the master wakes up the user program. The MapReduce call in the user program returns back to user code. The output of MR is in R output files (one per reduce task, with file names specified by the user); it is often passed into another MR job, so the files are not combined.

Master Data Structures
•  For each map task and reduce task
   –  State: idle, in-progress, or completed
   –  Identity of the worker server (if not idle)
•  For each completed map task
   –  Stores the location and size of its R intermediate files
   –  Updates the locations and sizes as the corresponding map tasks complete
•  Locations and sizes are pushed incrementally to workers that have in-progress reduce tasks

MapReduce Processing Time Line
•  The master assigns map and reduce tasks to "worker" servers
•  As soon as a map task finishes, the worker server can be assigned a new map or reduce task
•  Data shuffle begins as soon as a given Map finishes
•  A Reduce task begins as soon as all of its data shuffles finish
•  To tolerate faults, reassign a task if a worker server "dies"

Show MapReduce Job Running
•  ~41 minutes total
   –  ~29 minutes for Map tasks & Shuffle tasks
   –  ~12 minutes for Reduce tasks
   –  1707 worker servers used
•  Map (green) tasks read 0.8 TB, write 0.5 TB
•  Shuffle (red) tasks read 0.5 TB, write 0.5 TB
•  Reduce (blue) tasks read 0.5 TB, write 0.5 TB
(A sequence of snapshots of the running job follows in the original slides.)

MapReduce Failure Handling
•  On worker failure:
   –  Detect the failure via periodic heartbeats
   –  Re-execute completed and in-progress map tasks
   –  Re-execute in-progress reduce tasks
   –  Task completion is committed through the master
•  Master failure:
   –  Could handle it, but don't yet (master failure is unlikely)
•  Robust: lost 1600 of 1800 machines once, but finished fine
(See the bookkeeping sketch below.)
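To make the master's bookkeeping concrete, here is a hedged Python sketch; the class and method names are invented, not Google's code. Each task records its state and worker, workers send heartbeats, and tasks on a worker that stops responding go back to idle so they can be re-executed, including completed map tasks whose output lived on that worker's local disk.

    # Illustrative sketch of the master's task table and failure handling.
    import time

    class Master:
        def __init__(self, map_tasks, reduce_tasks, heartbeat_timeout=10.0):
            self.tasks = {t: {"state": "idle", "worker": None, "kind": "map"}
                          for t in map_tasks}
            self.tasks.update({t: {"state": "idle", "worker": None, "kind": "reduce"}
                               for t in reduce_tasks})
            self.last_heartbeat = {}
            self.timeout = heartbeat_timeout

        def assign(self, worker):
            # Hand any idle task to an idle worker.
            for name, task in self.tasks.items():
                if task["state"] == "idle":
                    task.update(state="in-progress", worker=worker)
                    return name
            return None

        def heartbeat(self, worker):
            self.last_heartbeat[worker] = time.time()

        def complete(self, name):
            self.tasks[name]["state"] = "completed"

        def reap_dead_workers(self):
            now = time.time()
            for worker, last_seen in list(self.last_heartbeat.items()):
                if now - last_seen > self.timeout:   # missed heartbeats: worker "dies"
                    for task in self.tasks.values():
                        in_progress = task["state"] == "in-progress"
                        done_map = task["state"] == "completed" and task["kind"] == "map"
                        if task["worker"] == worker and (in_progress or done_map):
                            task.update(state="idle", worker=None)  # re-execute later
                    del self.last_heartbeat[worker]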
MapReduce Redundant Execution
•  Slow workers significantly lengthen completion time
   –  Other jobs consuming resources on the machine
   –  Bad disks with soft errors transfer data very slowly
   –  Weird things: processor caches disabled (!!)
•  Solution: Near the end of a phase, spawn backup copies of tasks
   –  Whichever one finishes first "wins"
•  Effect: Dramatically shortens job completion time
   –  3% more resources, large tasks 30% faster

Impact of Restart and Failure on Execution
(Sorting 10B records on 1800 servers: with no backup tasks the job runs 44% longer; killing 200 workers mid-run makes it only 5% longer.)

MapReduce Locality Optimization during Scheduling
•  Master scheduling policy:
   –  Ask GFS (Google File System) for the locations of the replicas of the input file blocks
   –  Map tasks are typically split into 64 MB pieces (== GFS block size)
   –  Map tasks are scheduled so a replica of the GFS input block is on the same machine or the same rack
•  Effect: Thousands of machines read their input at local disk speed
•  Without this, rack switches limit the read rate

Question: Which statements are NOT TRUE about MapReduce?
☐  MapReduce divides computers into 1 master and N-1 workers; the master assigns MR tasks
☐  Towards the end, the master assigns uncompleted tasks again; the first to finish wins
☐  The Map worker sorts by input keys to group all occurrences of the same key

Agenda
•  Request-Level Parallelism
•  Data-Level Parallelism
•  MapReduce
•  Administrivia + 61C in the News + The secret to getting good grades at Berkeley
•  MapReduce Examples
•  Technology Break
•  Costs in Warehouse Scale Computer (if time permits)

Design Goals of a WSC
•  Unique to warehouse scale:
   –  Ample parallelism:
      •  Batch apps: a large number of independent data sets with independent processing. Also known as Data-Level Parallelism
   –  Scale and its opportunities/problems:
      •  The relatively small number of WSCs makes the design cost expensive and difficult to amortize
      •  But price breaks are possible from purchases of very large numbers of commodity servers
      •  Must also prepare for high component failure rates
   –  Operational costs count:
      •  Cost of equipment purchases << cost of ownership

WSC Case Study: Server Provisioning
  WSC Power Capacity                    8.00 MW
  Power Usage Effectiveness (PUE)       1.45
  IT Equipment Power Share              0.67   (5.36 MW)
  Power/Cooling Infrastructure          0.33   (2.64 MW)
  IT Equipment Measured Peak (W)        145.00
  Assume Average Power @ 0.8 Peak (W)   116.00
  # of Servers                          46,207 (provision 46,000)
  # of Servers per Rack                 40     -> # of Racks: 1,150
  Top of Rack (TOR) Switches            1,150
  # of TOR Switches per L2 Switch       16     -> # of L2 Switches: 72
  # of L2 Switches per L3 Switch        24     -> # of L3 Switches: 3
(Network hierarchy: Internet -> L3 switch -> L2 switches -> TOR switches -> server racks.)
(A quick check of this arithmetic follows.)
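Here is a back-of-the-envelope check of those provisioning numbers, as illustrative arithmetic only; the small difference in the server count from the slide comes from rounding.

    # Back-of-the-envelope check of the server-provisioning numbers above.
    facility_power_w   = 8_000_000        # 8.00 MW WSC power capacity
    it_share           = 0.67             # IT equipment's share of facility power
    avg_server_power_w = 145 * 0.8        # 145 W measured peak, ~80% of peak on average

    it_power_w = facility_power_w * it_share            # ~5.36 MW for IT equipment
    servers    = int(it_power_w / avg_server_power_w)   # ~46,207; provision 46,000

    racks        = 46_000 // 40            # 40 servers per rack     -> 1,150 racks
    tor_switches = racks                   # one TOR switch per rack -> 1,150
    l2_switches  = -(-tor_switches // 16)  # 16 TOR switches per L2  -> 72
    l3_switches  = -(-l2_switches // 24)   # 24 L2 switches per L3   -> 3

    print(servers, racks, tor_switches, l2_switches, l3_switches)
    # 46206 1150 1150 72 3   (the slide rounds the server count to 46,207)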
Cost of WSC
•  US accounting practice separates the purchase price and the operational costs
•  Capital Expenditure (CAPEX) is the cost to buy the equipment (e.g., buy servers)
•  Operational Expenditure (OPEX) is the cost to run the equipment (e.g., pay for the electricity used)

WSC Case Study: Capital Expenditure (CAPEX)
  Facility Cost         $88,000,000
  Total Server Cost     $66,700,000
  Total Network Cost    $12,810,000
  Total Cost           $167,510,000
•  Facility cost and total IT cost look about the same

Cost of WSC (continued)
•  US accounting practice allows converting Capital Expenditure (CAPEX) into Operational Expenditure (OPEX) by amortizing the cost over a time period
   –  Servers: 3 years
   –  Networking gear: 4 years
   –  Facility: 10 years
•  Accordingly, replace servers every 3 years, networking gear every 4 years, and the facility every 10 years

WSC Case Study: Operational Expense (OPEX)
  Amortized Capital Expense       CAPEX          Years   Monthly Cost
    Server                        $66,700,000    3       $2,000,000   (55%)
    Network                       $12,530,000    4       $295,000     (8%)
    Facility: Power & Cooling     $72,160,000    10      $625,000     (17%)
    Facility: Other               $15,840,000    10      $140,000     (4%)
    Total Amortized Cost                                 $3,060,000
  Operational Expense
    Power (8 MW @ $0.07/kWh)                             $475,000     (13%)
    People (3)                                           $85,000      (2%)
  Total Monthly                                          $3,620,000   (100%)
•  Monthly power costs:
   –  $475k for electricity
   –  $625k + $140k to amortize facility power distribution and cooling
   –  60% is amortized power distribution and cooling

How much does a watt cost in a WSC?
•  8 MW facility
•  Amortized facility cost, including power distribution and cooling, is $625k + $140k = $765k per month
•  Monthly power usage = $475k
•  Watt-year = ($765k + $475k) * 12 / 8M = $1.86, or about $2 per year
•  To save a watt, if you spend more than $2 a year, you lose money

WSC Case Study: Operational Expense (OPEX), continued
•  ~$3.8M / 46,000 servers = ~$80 per month per server in revenue to break even
•  ~$80 / 720 hours per month = $0.11 per hour
•  So how does Amazon EC2 make money???
(The short calculation below checks these numbers.)
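The following is just a quick check of those two calculations; the inputs are the monthly figures from the case-study slides, and the slides round the ~$83 per server down to ~$80.

    # Quick check of the watt-year and per-server break-even arithmetic above.
    amortized_facility = 625_000 + 140_000   # $/month, power distribution + cooling
    monthly_power      = 475_000             # $/month, electricity
    facility_watts     = 8_000_000           # 8 MW facility

    cost_per_watt_year = (amortized_facility + monthly_power) * 12 / facility_watts
    print(round(cost_per_watt_year, 2))      # 1.86 -> about $2 per watt-year

    monthly_total    = 3_800_000             # ~$3.8M total monthly cost of the WSC
    servers          = 46_000
    per_server_month = monthly_total / servers   # ~$83 per server per month
    per_server_hour  = per_server_month / 720    # ~720 hours per month
    print(round(per_server_month), round(per_server_hour, 2))   # 83 0.11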
January 2012 AWS Instances & Prices
  Instance                            Per Hour   Ratio to Small   Compute Units   Virtual Cores   Compute Units/Core   Memory (GB)   Disk (GB)   Address
  Standard Small                      $0.085     1.0              1.0             1               1.00                 1.7           160         32 bit
  Standard Large                      $0.340     4.0              4.0             2               2.00                 7.5           850         64 bit
  Standard Extra Large                $0.680     8.0              8.0             4               2.00                 15.0          1690        64 bit
  High-Memory Extra Large             $0.500     5.9              6.5             2               3.25                 17.1          420         64 bit
  High-Memory Double Extra Large      $1.200     14.1             13.0            4               3.25                 34.2          850         64 bit
  High-Memory Quadruple Extra Large   $2.400     28.2             26.0            8               3.25                 68.4          1690        64 bit
  High-CPU Medium                     $0.170     2.0              5.0             2               2.50                 1.7           350         32 bit
  High-CPU Extra Large                $0.680     8.0              20.0            8               2.50                 7.0           1690        64 bit
  Cluster Quadruple Extra Large       $1.300     15.3             33.5            16              2.09                 23.0          1690        64 bit
•  The closest computer to the WSC example is the Standard Extra Large
•  At a cost of ~$0.11/hour per server, Amazon EC2 can make money!
   –  even if the servers are used only 50% of the time

Question: Flash memory is non-volatile, $20/GB, 10 GB capacity, 0.01 Watts. Disk is $0.1/GB, 1000 GB, 10 Watts. Should we replace Disk with Flash to save $?
☐  No: CAPEX costs are 100:1 of the OPEX savings!
☐  We don't have enough information to answer the question
☐  Yes: It returns the investment in a single year!

Summary
•  Request-Level Parallelism
   –  High request volume, each request largely independent of the others
   –  Use replication for better request throughput and availability
•  MapReduce Data Parallelism
   –  Divide a large data set into pieces for independent parallel processing
   –  Combine and process intermediate results to obtain the final result
•  WSC CAPEX vs. OPEX
   –  Servers dominate cost
   –  Spend more on power distribution and cooling infrastructure than on monthly electricity costs
   –  Economies of scale mean a WSC can sell computing as a utility
