AsyncReview1

AsyncReview1 - Montek Singh, COMP790-084, Oct 27, 2011...

Introduction to Asynchronous Design
  ◦ What is asynchronous design?
  ◦ Why do we want to do it?
Data Representation and Communication
  ◦ How is data represented in an asynchronous system?
  ◦ How is information exchanged?

Introduction: Clocked Digital Design

Most current digital systems are synchronous:
  ◦ Clock: a global signal that paces operation of all components
Benefit of clocking: enables discrete-time representation
  ◦ all components operate exactly once per clock tick
  ◦ component outputs need to be ready by next clock tick
  ◦ allows “glitchy” or incorrect outputs between clock ticks

Microelectronics Trends

Current and Future Trends: Significant Challenges
  ◦ Large-Scale “Systems-on-a-Chip” (SoC)
      – 100 Million ~ 1 Billion transistors/chip
  ◦ Very High Speeds
      – multiple GigaHertz clock rates
  ◦ Explosive Growth in Consumer Electronics
      – demand for ever-increasing functionality …
      – … with very low power consumption (limited battery life)
  ◦ Higher Portability/Modularity/Reusability
      – “plug ’n play” components, robust interfaces

Challenges to Clocked Design

Breakdown of Single-Clock Paradigm:
  ◦ Chip will be partitioned into multiple timing domains
  ◦ challenge: gluing together multiple timing domains
      – glue logic is susceptible to “metastability” (= incorrect values transferred) and latency overheads
Increasing Difficulties with Clocked Design:
  ◦ Clock distribution: requires significant designer effort
  ◦ Performance bottleneck: a single slow component
  ◦ Clock burns large fraction of chip power (~40-70%)
  ◦ Fixed clock rate: poor match for
      – designing reusable components
      – interfacing with mixed-timing environments

What is Asynchronous Design?

  ◦ Digital design with no centralized clock
  ◦ Synchronization using local “handshaking”

[Figure: a synchronous system (centralized control) driven by a clock, next to an asynchronous system (distributed control) whose components are linked by handshaking interfaces]

Why Asynchronous Design? (1)

Higher Performance
  ◦ May obtain “average-case” operation (not “worst-case”)
      – not limited by slowest component
  ◦ Avoids overheads of multi-GHz clock distribution
Lower Power
  ◦ No clock power expended
  ◦ Inactive components consume negligible power
Better Electromagnetic Compatibility
  ◦ Smooth radiation spectra: no clock spikes
  ◦ Much less interference with sensitive receivers [e.g., Philips pagers, smartcards]
Greater Flexibility/Modularity
  ◦ Naturally adapt to variable-speed environments
  ◦ Supports reusable components

Why Asynchronous Design? (2)

The world already is mostly asynchronous!
  ◦ Events at the level of (or in between) large-scale systems are asynchronous
      – several seconds to several milliseconds
      – e.g., PC-printer communication, keyboard inputs, network comm.
  ◦ Events at the board level (or between chips) are often asynchronous
      – milliseconds to 100 nanoseconds
      – e.g., CPU-memory interface, interface with I/O subsystem (interrupts)
  ◦ Events within a chip, at the level of functional units (e.g., adders, control logic) are currently mostly synchronous
      – several nanoseconds to 100 picoseconds
  ◦ Events at the level of a single logic gate are asynchronous
      – 10 picoseconds
  ◦ Events at the quantum level are asynchronous
      – picoseconds to femtoseconds
So, why bother with clocks at all?!
  ◦ make everything asynchronous
  ◦ greater elegance and robustness

Challenges of Asynchronous Design

Hazards: potential “glitches” on wire
  ◦ communication must be hazard-free!
  ◦ special design challenge = “hazard-free synthesis”
Testability Issues:
  ◦ absence of clock means no “single-stepping”
Lack of Commercial CAD Tools:
  ◦ chicken-and-egg problem

[Figure: clean and hazardous signals relative to the clock tick, annotated “no problem for clocked systems”]

Asynchronous Design: Past & Present

Async Design: In existence for 50 years, but …
… many recent technical advances:
  ◦ Hazard-Free Circuit Design: several practical techniques for controllers [Stanford/Columbia]
  ◦ Design for Testability: several test solutions, e.g. Philips Research
  ◦ Maturing Computer-Aided-Design (“CAD”) Tools:
      – software tools for automated design [Philips, Columbia, Manchester]
      – recent DARPA program [Boeing, Philips, UNC, Columbia, …]
  ◦ Successful Fabricated Chips: embedded processors, high-speed pipelines, consumer electronics…

Recent Commercial Interest (1)

Several commercial asynchronous chips:
  ◦ Philips: asynchronous 80c51 microcontrollers
      – used in commercial pagers [1998] and smartcards [2001]
  ◦ Univ. of Manchester: async ARM processor [2000]
  ◦ Motorola: async divider in PowerPC chip [2000]
  ◦ HAL: async floating-point divider in HAL-I and II processors [early 1990’s]
Recent experimental chips:
  ◦ IBM, Sun and Intel: fast pipelines, arbiters, instruction-length decoder…
  ◦ IBM/Columbia/UNC: asynchronous digital FIR filter
Several recent startups:
  ◦ Handshake Solutions, Theseus Logic, Codetronix, Fulcrum, Silistix, …

Recent Commercial Interest (2)

Major DARPA program: ~$13M
Goals:
  ◦ commercial-strength automated CAD tool (= silicon compiler)
      – direct translation from algorithms to chip layout
      – capable of producing chips with 50M transistors or more
      – rich suite of analysis and optimization tools
  ◦ demonstration chip
      – Boeing application
      – show dramatic improvements in: design time, power consumption, noise pollution, speed (?)
Team:
  ◦ led by Boeing
  ◦ async startups: Theseus, Handshake Solutions, Codetronix
  ◦ universities: UNC, Columbia, UW, OrSU

Data Representation and Communication

A 5-minute Homework Problem

Alice and Bob live on opposite sides of a wide river.
  ◦ Alice is supposed to send a message (say, a “Yes”/“No”) across to Bob around midnight.
  ◦ Both have flashlights, but neither owns a watch.
What should they do? Suggest several strategies, and discuss pros and cons of each.
Solution 1

  ◦ Alice uses 2 lamps: 1 to indicate that she is ready with the message, and 1 for the message itself
  ◦ Bob uses 1 lamp: to indicate that he has received the message

[Figure: Alice sends “ready” and “yes/no” to Bob; Bob sends “got it” back to Alice]

Solution 2

  ◦ Alice uses 2 lamps:
      – Green lamp to indicate “yes”
      – Red lamp to indicate “no”
  ◦ Bob uses 1 lamp: to indicate that he has received the message

[Figure: Alice sends “yes” or “no” to Bob; Bob sends “got it” back to Alice]

Solution 3

What if Alice and Bob could keep time?
  ◦ Alice uses 1 lamp for the message:
      – At 12 midnight: turns on lamp if message = “yes”
      – At 12:01: turns lamp off
  ◦ Bob needs no lamps!
      – Takes down the message between 12 and 12:01
Pros: Fewer signals, less processing needed
Cons: Alice and Bob must keep their clocks closely synchronized
  ◦ If Bob’s watch is off by a minute, incorrect communication is possible

Homework!

  ◦ Think of all scenarios in which Solution #1 can fail
  ◦ Are any of those scenarios a problem for Solution #2 as well?

Data Representation and Communication

  ◦ How is data represented in an asynchronous system?
  ◦ How is information exchanged? Control signaling (handshake styles)

Data Encoding: “Bundled Data”

Single-rail “Bundled Datapath”: simplest approach, widely used
Features:
  ◦ datapath: 1 wire per bit (e.g. standard sync blocks)
  ◦ matched delay: produces delayed “done” signal
      – worst-case delay: longer than slowest path
+ Practical style: can reuse sync components; small area
– Fixed (worst-case) completion time

[Figure: function block with inputs bit 1 … bit n and outputs bit 1 … bit m; a matched delay turns “request” into “done”; done indicates valid data]

Bundled Data: Completion Sensing

Delay Matching:
  ◦ either single worst-case delay
  ◦ or, fine-grain delay
Speculative completion: choose delay “on the fly”
  ◦ start with shortest delay; increase as needed

[Figure: request feeds a bank of delays; a delay selector drives a MUX whose output is done]

Data Encoding: Dual-Rail

Dual-rail: uses 2 wires per data bit

  Dual-rail code    Meaning
  00                “reset” value
  01                value 0
  10                value 1
  11                unused

Each Dual-Rail Pair: provides both data value and validity
+ provides robust data-dependent completion
– needs completion detectors

Dual-Rail: Completion Sensing

Dual-Rail Completion Detector:
  ◦ combines dual-rail signals
  ◦ indicates when all bits are valid (or reset)
  ◦ OR together 2 rails per bit
  ◦ Merge results using a Muller “C-element”
C-element:
  ◦ if all inputs=1, output 1
  ◦ if all inputs=0, output 0
  ◦ else, maintain output value

[Figure: one OR gate per bit (bit 0, bit 1, …, bit n), all feeding a C-element whose output is Done]

Handshaking Styles: 4-phase

4-Phase: requires 4 events per handshake
+ “Level-sensitive”: simpler logic implementation
– Overhead of “return-to-zero” (RTZ or resetting): extra events which do no useful computation

[Figure: Request and Acknowledge waveforms with events “start event”, “event done”, “get ready for next event”, “ready for next event”]

Handshaking Styles: 2-phase

2-Phase: requires 2 events per handshake
  ◦ a.k.a. transition signaling
+ Elegant: no return-to-zero
– Slower logic implementation: logic primitives are inherently level-sensitive, not event-based (at least in CMOS)

[Figure: Request and Acknowledge waveforms with events “start event”, “event done”, “start next event”, “next event done”]

Handshaking Styles: Pulse Mode

Pulse Mode: combines benefits of 2-phase and 4-phase
  ◦ use pulses to represent events
+ No return-to-zero (like 2-phase)
+ Level-based implementation (like 4-phase)
– Need a timing constraint on pulse width

[Figure: Request and Acknowledge pulses with events “start event”, “event done”, “start next event”, “next event done”]

Handshaking Styles: Single-Track

Single-Track: combines req and ack onto single wire!
  ◦ one wire used for bidirectional communication
  ◦ sender raises, receiver lowers
+ Efficient protocol: no return-to-zero, level-based
– Need aggressive low-level design techniques
  ◦ much effort to ensure reliability, satisfy timing constraints

[Figure: a single “req + ack” wire replacing separate Request and Acknowledge wires; req and ack events alternate on the same wire]

Handshaking + Data Representation

Several combinations possible: dual-rail 4-phase, single-rail 4-phase, dual-rail 2-phase, and single-rail 2-phase
Example: dual-rail 4-phase
  ◦ dual-rail data: functions as an implicit “request”
  ◦ 4-phase cycle: between acknowledge and implicit request

[Figure: A sends dual-rail bits 1 … m to B; B returns ack to A]

Other Data Representation Styles

Level-Encoded Dual-Rail (LEDR)
  ◦ 2 wires per bit: “data” and “phase”
  ◦ exactly one wire per bit changes value
      – if new value is different, “data” wire changes value
      – else “phase” wire changes value
M-of-N Codes
  ◦ N wires used for a data word
  ◦ M wires (M <= N) change value
  ◦ Values of N and M have impact on information transmitted, power consumed and logic complexity
Knuth codes, Huffman codes, …

Which to use?

Depends on several performance parameters:
  ◦ speed
      – single-rail vs. dual-rail: single-rail may be faster (if designed aggressively); dual-rail may be faster (if completion times vary widely)
      – 2-phase vs. 4-phase: 2-phase may be faster (if logic overhead is small); 4-phase may be faster (if overhead of return-to-zero is small)
  ◦ power consumption
      – 2-phase typically has fewer gate transitions (→ lower power)
  ◦ amount of logic used (#gates/wires/pins → chip area)
      – single-rail needs fewer gates/wires/pins
  ◦ design and verification effort
      – dual-rail, 1-of-N, M-of-N, Knuth codes…: delay-insensitive, i.e. robust in the presence of arbitrary delays
      – single-rail: requires greater timing verification effort

Homework!

  ◦ Suppose you are given N wires
      – Which M-of-N encoding (i.e. what M) encodes most information?
  ◦ Suppose you have to encode 4-bit values
      – Which M-of-N encoding yields fewest wires?
  ◦ Suppose you can switch at most 2 wires
      – Which M-of-N encoding yields fewest wires for 4-bit values?
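As a starting point for exploring these questions, the counting behind M-of-N codes can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original slides: it assumes a codeword is identified purely by which M of the N wires change value, so the number of codewords is the binomial coefficient C(N, M). The function names `codewords`, `bits_per_word` and `min_wires` are hypothetical helpers introduced here.

```python
from math import comb, log2


def codewords(m: int, n: int) -> int:
    """Number of distinct M-of-N codewords (assumption: a codeword is
    defined only by which M of the N wires change value)."""
    return comb(n, m)


def bits_per_word(m: int, n: int) -> float:
    """Information carried by one M-of-N codeword, in bits."""
    return log2(codewords(m, n))


def min_wires(m: int, data_bits: int) -> int:
    """Smallest N for which an M-of-N code has at least 2**data_bits codewords."""
    if m < 1:
        raise ValueError("m must be at least 1")
    needed = 2 ** data_bits
    n = m  # an M-of-N code needs at least M wires
    while codewords(m, n) < needed:
        n += 1
    return n


if __name__ == "__main__":
    # Tabulate how the number of codewords varies with M for a fixed N.
    n = 8
    for m in range(n + 1):
        print(f"{m}-of-{n}: {codewords(m, n)} codewords")
```

Under this model, dual-rail corresponds to a 1-of-2 code: `codewords(1, 2)` gives two codewords, i.e. one bit per wire pair. Calling `min_wires(m, 4)` for different values of `m` is one way to compare candidate encodings for 4-bit values.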

This note was uploaded on 11/28/2011 for the course COMP 790 taught by Professor Staff during the Fall '08 term at UNC.
