CS 61C, Spring 2011, Lecture #1 (1/19/11): Course Introduction

CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Course Introduction
Instructors: Randy H. Katz and David A. Patterson
http://inst.eecs.Berkeley.edu/~cs61c/sp11

Agenda
•  Great Ideas in Computer Architecture
•  Administrivia
•  PostPC Era: From Phones to Datacenters
•  Technology Break
•  Warehouse Scale Computers in Depth

CS61c is NOT really about C Programming
•  It is about the hardware-software interface
   –  What does the programmer need to know to achieve the highest possible performance?
•  Languages like C are closer to the underlying hardware, unlike languages like Scheme!
   –  Allows us to talk about key hardware features in higher-level terms
   –  Allows the programmer to explicitly harness underlying hardware parallelism for high performance

Old School CS61c
[Figure]

New School CS61c
[Figures: Personal Mobile Devices, Warehouse Scale Computer]

Old-School Machine Structures
•  Application (e.g., browser)
•  Operating System (e.g., Mac OS X)
•  Compiler
•  Assembler
•  Instruction Set Architecture
•  Processor, Memory, I/O system
•  Datapath & Control
•  Digital Design
•  Circuit Design
•  Transistors
   (Software sits above the Instruction Set Architecture, hardware below; the CS61c bracket on the slide marks the layers this course covers.)

New-School Machine Structures (It's a bit more complicated!)
•  Parallel Requests: assigned to a computer, e.g., search "Katz"
•  Parallel Threads: assigned to a core, e.g., lookup, ads
•  Parallel Instructions: >1 instruction at one time, e.g., 5 pipelined instructions
•  Parallel Data: >1 data item at one time, e.g., add of 4 pairs of words (A0+B0, A1+B1, A2+B2, A3+B3)
•  Hardware descriptions: all gates functioning in parallel at the same time
•  Harness parallelism and achieve high performance, from smart phone to warehouse scale computer
   (The slide's diagram shows a warehouse scale computer built from computers, cores, memory (cache), input/output, instruction unit(s), functional unit(s), main memory, and logic gates; Projects 1 through 4 are mapped onto these levels.)

6 Great Ideas in Computer Architecture
1.  Layers of Representation/Interpretation
2.  Moore's Law
3.  Principle of Locality/Memory Hierarchy
4.  Parallelism
5.  Performance Measurement & Improvement
6.  Dependability via Redundancy

Great Idea #1: Levels of Representation/Interpretation
•  High-Level Language Program (e.g., C):
      temp = v[k];  v[k] = v[k+1];  v[k+1] = temp;
•  Compiler -> Assembly Language Program (e.g., MIPS):
      lw $t0, 0($2)    lw $t1, 4($2)    sw $t1, 0($2)    sw $t0, 4($2)
•  Assembler -> Machine Language Program (MIPS): strings of 0s and 1s
   (anything can be represented as a number, i.e., data or instructions)
•  Machine Interpretation -> Hardware Architecture Description (e.g., block diagrams)
•  Architecture Implementation -> Logic Circuit Description (circuit schematic diagrams)
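To make the top layer concrete, the slide's three-statement fragment can be dropped into a complete, compilable C program. This is a minimal sketch: the function name swap and the main driver are illustrative additions, not part of the slide, and a MIPS compiler would translate the function body into a load/store sequence like the lw/sw instructions shown above.

    #include <stdio.h>

    /* The high-level-language layer: swap v[k] and v[k+1],
       the three statements shown on the slide. */
    void swap(int v[], int k) {
        int temp = v[k];
        v[k] = v[k + 1];
        v[k + 1] = temp;
    }

    int main(void) {
        int v[2] = {7, 42};
        swap(v, 0);
        printf("%d %d\n", v[0], v[1]);  /* prints "42 7" */
        return 0;
    }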
Great Idea #2: Moore's Law
•  Predicts: 2X transistors / chip every 2 years
•  Gordon Moore, Intel Cofounder, B.S. Cal 1950
[Figure: number of transistors on an integrated circuit (IC) vs. year]

Great Idea #3: Principle of Locality/Memory Hierarchy
[Figure: memory hierarchy]

Great Idea #4: Parallelism
[Figure: parallel hardware]

Great Idea #5: Performance Measurement and Improvement
•  Matching the application to the underlying hardware to exploit:
   –  Locality
   –  Parallelism
   –  Special hardware features, like specialized instructions (e.g., matrix manipulation)
•  Latency
   –  How long to set the problem up
   –  How much faster does it execute once it gets going
   –  It is all about time to finish

Great Idea #6: Dependability via Redundancy
•  Redundancy so that a failing piece doesn't make the whole system fail
      1+1=2    1+1=2    1+1=1  FAIL!    ->  2 of 3 agree, so the answer is 2
•  Increasing transistor density reduces the cost of redundancy
•  Applies to everything from datacenters to storage to memory
   –  Redundant datacenters so that we can lose one datacenter but the Internet service stays online
   –  Redundant disks so that we can lose one disk but not lose data (Redundant Arrays of Independent Disks/RAID)
   –  Redundant memory bits so that we can lose one bit but no data (Error Correcting Code/ECC memory)
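The 2-of-3 vote sketched on the slide can be written in a few lines of C. This is an illustrative sketch, not code from the course: the function name vote3 is made up, and real triple-modular-redundancy voting is normally done in hardware, but the logic is the same.

    #include <stdio.h>

    /* Majority vote over three redundant results: as long as at most one
       unit has failed, the two good units outvote it. */
    int vote3(int a, int b, int c) {
        if (a == b || a == c)
            return a;   /* a agrees with at least one other unit */
        return b;       /* otherwise b and c agree (assuming a single failure) */
    }

    int main(void) {
        /* Three redundant units compute 1 + 1; the third one has failed. */
        int r1 = 1 + 1, r2 = 1 + 1, r3 = 1;   /* the "1 + 1 = 1  FAIL!" case */
        printf("voted answer: %d\n", vote3(r1, r2, r3));  /* prints 2 */
        return 0;
    }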
Agenda
•  Great Ideas in Computer Architecture
•  Administrivia
•  Technology Break
•  From Phones to Datacenters

Course Information
•  Course Web: http://inst.eecs.Berkeley.edu/~cs61c/sp11
•  Instructors:
   –  Randy Katz, Dave Patterson
•  Teaching Assistants:
   –  Andrew Gearhart, Conor Hughes, Yunsup Lee, Ari Rabkin, Charles Reiss, Andrew Waterman, Vasily Volkov
•  Textbooks: average 15 pages of reading/week
   –  Patterson & Hennessy, Computer Organization and Design, 4th Edition (not ≤3rd Edition, not the Asian version of the 4th Edition)
   –  Kernighan & Ritchie, The C Programming Language, 2nd Edition
   –  Barroso & Hölzle, The Datacenter as a Computer, 1st Edition
•  Google Group:
   –  61CSpring2011UCB-announce: announcements from staff
   –  61CSpring2011UCB-disc: Q&A, discussion by anyone in 61C
   –  Email Andrew Gearhart (agearh@gmail.com) to join

Reminders
•  Discussions and labs will be held this week
   –  Switching sections: if you find another 61C student willing to swap discussion AND lab, talk to your TAs
   –  Partner (only Project 3 and extra credit): OK if partners mix sections but have the same TA
•  First homework assignment due this Sunday, January 23rd, by 11:59:59 PM
   –  There is a reading assignment as well on the course page

Course Organization
•  Grading
   –  Participation and Altruism (5%)
   –  Homework (5%)
   –  Labs (20%)
   –  Projects (40%)
      1.  Data Parallelism (Map-Reduce on Amazon EC2)
      2.  Computer Instruction Set Simulator (C)
      3.  Performance Tuning of a Parallel Application/Matrix Multiply using cache blocking, SIMD, MIMD (OpenMP, done with a partner)
      4.  Computer Processor Design (Logisim)
   –  Extra Credit: Matrix Multiply Competition, anything goes
   –  Midterm (10%): 6-9 PM Tuesday March 8
   –  Final (20%): 11:30-2:30 PM Monday May 9

EECS Grading Policy
•  http://www.eecs.berkeley.edu/Policies/ugrad.grading.shtml
   "A typical GPA for courses in the lower division is 2.7. This GPA would result, for example, from 17% A's, 50% B's, 20% C's, 10% D's, and 3% F's. A class whose GPA falls outside the range 2.5-2.9 should be considered atypical."
•  Fall 2010: GPA 2.81; 26% A's, 47% B's, 17% C's, 3% D's, 6% F's
•  Recent 61C GPAs:
      Year   Fall   Spring
      2010   2.81   2.81
      2009   2.71   2.81
      2008   2.95   2.74
      2007   2.67   2.76
•  Job/intern interviews: they grill you with technical questions, so it's what you say, not your GPA (the new 61C gives you good stuff to say)

Late Policy
•  Assignments due Sundays at 11:59:59 PM
•  Late homework is not accepted (100% penalty)
•  Late projects get a 20% penalty, accepted up to Tuesday at 11:59:59 PM
   –  No credit if more than 48 hours late
   –  No "slip days" in 61C
      •  Used by Dan Garcia and a few faculty to cope with hundreds of students who often procrastinate without having to hear the excuses, but not widespread in EECS courses
      •  More late assignments if everyone has no-cost options; better to learn now how to cope with real deadlines

Policy on Assignments and Independent Work
•  With the exception of laboratories and assignments that explicitly permit you to work in groups, all homework and projects are to be YOUR work and your work ALONE.
•  You are encouraged to discuss your assignments with other students, and extra credit will be given to students who help others, particularly by answering questions on the Google Group, but we expect that what you hand in is yours.
•  It is NOT acceptable to copy solutions from other students.
•  It is NOT acceptable to copy solutions from the Web, or to use them as a starting point.
•  We have tools and methods, developed over many years, for detecting this. You WILL be caught, and the penalties WILL be severe.
•  At a minimum: a ZERO for the assignment, possibly an F in the course, and a letter in your university record documenting the incident of cheating.
•  (We caught people last semester!)

The Rules (and we really mean it!)
[Figure]

Architecture of a Lecture
[Figure: "full attention" plotted against time in minutes, with markers at 0, 20, 25, 50, 53, 78, and 80 minutes; Administrivia, the Technology Break, and "And in conclusion..." occupy the short segments between stretches of lecture]

Agenda
•  Great Ideas in Computer Architecture
•  Administrivia
•  PostPC Era: From Phones to Datacenters
•  Technology Break
•  Warehouse Scale Computers in Depth

Computer Eras: Mainframe, 1950s-60s
•  Processor (CPU), I/O
•  "Big Iron": IBM, UNIVAC, ... build $1M computers for businesses => COBOL, Fortran, timesharing OS

Minicomputer Era: 1970s
•  Using integrated circuits, Digital, HP, ... build $10k computers for labs and universities => C, UNIX OS
PC Era: Mid 1980s - Mid 2000s
•  Using microprocessors, Apple, IBM, ... build $1k computers for one person => Basic, Java, Windows OS

PostPC Era: Late 2000s - ??
•  Personal Mobile Devices (PMD): relying on wireless networking, Apple, Nokia, ... build $500 smartphone and tablet computers for individuals => Objective C, Android OS
•  Cloud Computing: using Local Area Networks, Amazon, Google, ... build $200M Warehouse Scale Computers with 100,000 servers for Internet services for PMDs => MapReduce, Ruby on Rails

iPhone Innards
[Figure: iPhone logic board; 1 GHz ARM Cortex A8 processor, memory, I/O]
•  You will learn about multiple processors, data-level parallelism, and caches in 61C

Advanced RISC Machine (ARM) instruction set inside the iPhone
•  You will learn how to design and program a related RISC computer: MIPS

The Big Switch: Cloud Computing
   "A hundred years ago, companies stopped generating their own power with steam engines and dynamos and plugged into the newly built electric grid. The cheap power pumped out by electric utilities didn't just change how businesses operate. It set off a chain reaction of economic and social transformations that brought the modern world into existence. Today, a similar revolution is under way. Hooked up to the Internet's global computing grid, massive information-processing plants have begun pumping data and software code into our homes and businesses. This time, it's computing that's turning into a utility."

Why Cloud Computing Now?
•  "The Web Space Race": build-out of extremely large datacenters (10,000's of commodity PCs)
   –  Build-out driven by growth in demand (more users)
      => infrastructure software and operational expertise
•  Discovered economy of scale: 5-7x cheaper than provisioning a medium-sized (1000-server) facility
•  More pervasive broadband Internet, so remote computers can be accessed efficiently
•  Commoditization of HW & SW
   –  Standardized software stacks

Agenda
•  Great Ideas in Computer Architecture
•  Administrivia
•  PostPC Era: From Phones to Datacenters
•  Technology Break
•  Warehouse Scale Computers in Depth

Coping with Failures
•  4 disks/server, 50,000 servers
•  Failure rate of disks: 2% to 10% / year
   –  Assume 4% annual failure rate
•  On average, how often does a disk fail?
   a)  1 / month
   b)  1 / week
   c)  1 / day
   d)  1 / hour

Coping with Failures (answer)
•  50,000 servers x 4 disks = 200,000 disks
•  200,000 disks x 4% = 8,000 disks fail per year
•  365 days x 24 hours = 8,760 hours per year
•  So roughly one disk fails every hour: answer (d)
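The same back-of-the-envelope arithmetic can be restated as a tiny C program. This is just the slide's calculation; the variable names are illustrative, and the inputs (4 disks/server, 50,000 servers, 4% annual failure rate) are the slide's assumptions.

    #include <stdio.h>

    /* Disk-failure rate for the slide's warehouse scale computer. */
    int main(void) {
        double disks             = 50000.0 * 4.0;   /* 200,000 disks        */
        double failures_per_year = disks * 0.04;    /* 8,000 failures/year  */
        double hours_per_year    = 365.0 * 24.0;    /* 8,760 hours/year     */
        printf("failures per hour: %.2f\n",
               failures_per_year / hours_per_year); /* ~0.91, about one an hour */
        return 0;
    }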
Warehouse Scale Computers
•  Massive scale datacenters: 10,000 to 100,000 servers, plus the networks to connect them together
   –  Emphasize cost-efficiency
   –  Attention to power: distribution and cooling
•  Homogeneous hardware/software
•  Offer a small number of very large applications (Internet services): search, social networking, video sharing
•  Very highly available: <1 hour down/year
   –  Must cope with failures common at scale

E.g., Google's Oregon WSC
[Figure: aerial view of the datacenter]

Equipment Inside a WSC
•  Server (in rack format): 1¾ inches high ("1U"), 19 inches wide, 16-20 inches deep; 8 cores, 16 GB DRAM, 4x1 TB disk
•  7-foot Rack: 40-80 servers plus an Ethernet local area network switch (1-10 Gbps) in the middle ("rack switch")
•  Array (aka cluster): 16-32 server racks plus a larger local area network switch ("array switch"); 10X faster => cost 100X: cost is roughly f(N^2)

Server, Rack, Array / Google Server Internals
[Figures: Google server board, rack, and array]

Datacenter Power
[Figure: breakdown of peak power %]

Coping with Performance in Array
•  Lower latency to DRAM in another server than to local disk
•  Higher bandwidth to local disk than to DRAM in another server

                                   Local       Rack       Array
   Racks                              --          1          30
   Servers                             1         80       2,400
   Cores (Processors)                  8        640      19,200
   DRAM Capacity (GB)                 16      1,280      38,400
   Disk Capacity (GB)              4,000    320,000   9,600,000
   DRAM Latency (microseconds)       0.1        100         300
   Disk Latency (microseconds)    10,000     11,000      12,000
   DRAM Bandwidth (MB/sec)        20,000        100          10
   Disk Bandwidth (MB/sec)           200        100          10

   What is the impact of latency, bandwidth, failures, and varying workload on WSC software?

Workload
•  Online service: peak usage is about 2X off-peak
[Figure: load over a day, midnight to noon to midnight]

Coping with Workload Variation
•  WSC software must take care where it places data within an array to get good performance
•  WSC software must cope with failures gracefully
•  WSC software must scale up and down gracefully in response to varying demand
•  The more elaborate hierarchy of memories, failure tolerance, and workload accommodation make WSC software development more challenging than software for a single computer

Power vs. Server Utilization
•  Server power usage as load varies from idle to 100%:
   –  Uses ½ peak power when idle!
   –  Uses ⅔ peak power when only 10% utilized!
   –  About 90% of peak power at 50% utilization
•  Most servers in a WSC are utilized 10% to 50%
•  Goal should be energy proportionality: % peak load = % peak energy

Power Usage Effectiveness
•  Overall WSC energy efficiency: amount of computational work performed divided by the total energy used in the process
•  Power Usage Effectiveness (PUE): total building power / IT equipment power
   –  A power-efficiency measure for the WSC, not including the efficiency of the servers and networking gear themselves
   –  1.0 = perfection

PUE in the Wild (2007)
[Figure: survey of measured PUEs across datacenters]

High PUE: Where Does Power Go?
•  Uninterruptable Power Supply (battery)
•  Power Distribution Unit
•  Chiller (cools warm water returned from the air conditioner)
•  Computer Room Air Conditioner
•  Servers + networking
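As a worked example of the PUE definition above: a minimal C sketch, where the wattage figures are made up for illustration and are not measurements of any real facility.

    #include <stdio.h>

    /* PUE = total building power / IT equipment power (1.0 = perfection). */
    int main(void) {
        double building_kw = 1240.0;  /* hypothetical total facility power, kW  */
        double it_kw       = 1000.0;  /* hypothetical servers + networking, kW  */
        printf("PUE = %.2f\n", building_kw / it_kw);  /* 1.24, the value reported
                                                         for Google WSC A below */
        return 0;
    }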
Google WSC A, PUE: 1.24
1.  Careful air flow handling
   •  Don't mix server hot-air exhaust with cold air (separate the warm aisle from the cold aisle)
   •  Short path to cooling, so little energy is spent moving cold or hot air long distances
   •  Keeping servers inside containers helps control air flow

Containers in WSCs / Inside a WSC / Inside a Container
[Figures: shipping-container modules inside the warehouse]

Google WSC A, PUE: 1.24 (continued)
2.  Elevated cold-aisle temperatures
   •  81°F instead of the traditional 65-68°F
   •  Found reliability is OK if servers run hotter
3.  Use of free cooling
   •  Cool warm water outside by evaporation in cooling towers
   •  Locate the WSC in a moderate climate so it is not too hot or too cold
4.  Per-server 12-V DC UPS
   •  Rather than a WSC-wide UPS, place a single battery on each server board
   •  Increases WSC efficiency from 90% to 99%
5.  Measure vs. estimate PUE, publish PUE, and improve operation

Google WSC PUE: Quarterly Average
[Figure: PUE over time]
•  www.google.com/corporate/green/datacenters/measuring.htm

Summary
•  CS61c: learn 6 great ideas in computer architecture to enable high-performance programming via parallelism, not just learn C
   1.  Layers of Representation/Interpretation
   2.  Moore's Law
   3.  Principle of Locality/Memory Hierarchy
   4.  Parallelism
   5.  Performance Measurement and Improvement
   6.  Dependability via Redundancy
•  PostPC Era: parallel processing, from smart phone to WSC
•  WSC software must cope with failures, varying load, and varying hardware latency and bandwidth
•  WSC hardware is sensitive to cost and energy efficiency