Ch10_RT - Optimizing Power @ Runtime Circuits and Systems...


Optimizing Power @ Runtime
Circuits and Systems
Jan M. Rabaey
Low Power Design Essentials ©2008, Chapter 10

Chapter Outline
§ Motivation behind run-time optimization
§ Dynamic voltage and frequency scaling
§ Adaptive body biasing
§ General self-adaptation
§ Aggressive deployment
§ Power domains and power management

Why Run-Time Optimization for Power?
§ Power dissipation is a strong function of activity
§ In many applications, activity varies strongly over time:
  – Example 1: Operational load varies dramatically in general-purpose computing. Some computations also require faster response than others.
  – Example 2: The amount of computation to be performed in many signal processing and communication functions (such as compression or filtering) is a function of the input data stream and its properties.
§ The optimum operation point in the performance-energy space hence varies over time
§ Changes in manufacturing, environmental, or aging conditions also lead to variable operation points
Designs for a single fixed design point are sub-optimal

Variable Workload in Media Processing
Example: Video Compression
[Figure: typical MPEG IDCT histogram]
True also for voice processing, graphics, multimedia, and communications
[Courtesy: A. Chandrakasan]

Variable Workloads in General-Purpose Computing
[Figure: laptop CPU usage chart; workload traces of three processor styles (dialup server, workstation, file server) over 60 seconds]
[Ref: A. Sinha, VLSI'01]

Adapting to Variable Workloads
§ Goal: Position design in optimal operational point given required throughput
§ Useful dynamic design parameters: VDD and VTH
  – Changing transistor sizes dynamically is nontrivial
§ Variable supply voltage is most effective for dynamic power reduction

Adjusting Only the Clock Frequency
§ Often used in portable processors
§ Only reduces power; leaves energy/operation constant
  – Does not save battery life
[Figure: delivered throughput versus time. Compute ASAP: always high throughput, with excess throughput. Clock frequency reduction: reduced fCLK; energy/operation remains unchanged while throughput scales down with fCLK]
[Ref: T. Burd, UCB'01]

Dynamic Voltage Scaling (DVS)
Vary VDD and fCLK based on requested throughput
[Figure: delivered throughput versus time]
• Matches execution speed to requirements
• Minimizes average energy/operation
• Extends battery life up to one order of magnitude with the exact same hardware!
[Ref: T. Burd, UCB'01]

Flashback: VDD and Throughput
$f = \left(\frac{v - v_t}{1 - v_t}\right)^{\alpha} \cdot \frac{1}{v}$
With f and v the throughput and supply voltage normalized to the nominal values, and vt the ratio between threshold and nominal supply voltages.
For α = 2 and VDD >> VTH, f = v (long channel device)
[Figure: normalized supply v versus normalized performance f; nominal operation point marked; f = v shown for reference]

Dynamic Voltage Scaling (DVS)
Reduces Dynamic Energy/Operation Superlinearly
[Figure: normalized energy e versus normalized performance f; nominal operation point (α = 1.3, VDDnom/VTH = 4)]
When performance is not needed, relax and save energy.
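The normalized throughput relation from the "Flashback" slide can be explored numerically. The following is a minimal sketch, assuming that dynamic energy per operation scales as CV² (so e = v² in normalized form) and that power is energy per operation times throughput; the function names and the bisection helper are illustrative and not part of the original slides.

```python
def throughput(v, vt=0.25, alpha=1.3):
    """Normalized throughput f for normalized supply v (alpha-power law).

    vt is VTH / VDDnom; the defaults mirror the plotted case
    (alpha = 1.3, VDDnom / VTH = 4).
    """
    if v <= vt:
        return 0.0
    return ((v - vt) / (1.0 - vt)) ** alpha / v


def energy_per_op(v):
    """Normalized dynamic energy per operation, assuming E ~ C * V^2."""
    return v * v


def power(v, vt=0.25, alpha=1.3):
    """Normalized dynamic power: energy per operation times throughput."""
    return energy_per_op(v) * throughput(v, vt, alpha)


def supply_for_throughput(f_target, vt=0.25, alpha=1.3, tol=1e-9):
    """Smallest normalized supply that still delivers f_target.

    Uses bisection, which is valid because throughput(v) increases
    monotonically with v on (vt, 1] for alpha >= 1.
    """
    if not 0.0 < f_target <= 1.0:
        raise ValueError("f_target must lie in (0, 1]")
    lo, hi = vt, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if throughput(mid, vt, alpha) < f_target:
            lo = mid
        else:
            hi = mid
    return hi
```

With these assumptions, `supply_for_throughput(0.74)` returns a normalized supply of roughly 0.70, so energy per operation drops to about half its nominal value. In the long-channel limit (`vt=0.0, alpha=2.0`) the model reduces to f = v, and `power` becomes v³.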
Dynamic Voltage Scaling (DVS)
Even more impressive when considering power
[Figure: normalized power p versus normalized performance f; nominal operation point (α = 1.3, VDDnom/VTH = 4)]
Third-order reduction in power when scaling supply voltage with workload (for α = 2 and VDD >> VTH)
But … needs continuously variable supply voltage

Using Discrete Voltage Levels
§ DVS needs close integration with voltage regulation
§ Continuously variable supply voltage not always available
Dithering supply voltage between discrete levels approximates continuous scaling
[Figure: normalized energy e versus normalized performance f; nominal operation point (α = 1.3, VDDnom/VTH = 4); VDDnom/2 level marked]
Example:
• Operate 50% of time at VDDnom, and 50% at VDDnom/2
• Reduces e to 0.625 for f = 0.74
• Continuous DVS would yield e ≈ 0.5
[Ref: V. Gutnik]

Challenge: Estimating the Workload
§ Adjusting supply voltage is not instantaneous and may take multiple clock cycles
§ Efficiency of DVS is a strong function of accuracy in workload estimation
§ Depends upon type of workload(s), their predictability and dynamism
  – Stream-based computation
  – General-purpose multi-processing

Example 1: Stream-based Processing
§ Examples: voice or multimedia processing
[Figure: stream in → REG → FIFO → processor (CLK) → FIFO → REG → stream out; control block sets VDD and fclk]
§ FIFO measures workload
§ Control dynamically adjusts VDD (and hence fclk)
[Ref: L. Nielsen, TVLSI'94]

Stream-based Processing and Voltage Dithering
(also known as voltage hopping)
[Figure: MPEG-4 encoding timeline for slices #n and #n+1, marking where the n-th slice finishes and the next milestone; normalized power versus number of frequency levels; transition time between ƒ levels = 200 µs]
Two hopping levels are sufficient.
[Ref: T. Sakurai]

Relating VDD and fclk
§ Self-timed
  – Avoids clock altogether
  – Supply is set by closed loop between VDD setting, processor speed, and FIFO occupation
§ On-Line Speed Estimation
  – Closed loop compares desired and actual frequency
  – Needs "dummy" critical path to estimate actual delay
§ Table Look-Up
  – Stores relationship between fclk (processor speed) and VDD
  – Obtained from simulations or calibration

On-Line Speed Estimation
[Figure: battery → DC/DC converter → VDD → processor; frequency detector compares freq (+) with factual (−) from the VCO; the difference (Σ) passes through a loop filter to the DC/DC converter]
• Simultaneously performs regulation and clock generation
• VCO sets clock frequency
• Uses replica of critical path of processor

Table-Lookup Frequency-Voltage Translation
[Figure: user logic with temp sensor; calibration unit (delay analysis) with reference clock; frequency-to-voltage translation table (F-V table); control; DC-DC converter supplying VDD]
§ Voltage-frequency (V-F) relationship measured at start-up time (or periodically)
§ Delay measurements for different voltages obtained from actual module or using array of ring oscillators
§ Inverse function (F-V) stored in look-up table, taking into account logic structure
§ Can compensate for temperature variations
[Ref: H. Okano, VLSI'06]

Example 2: General-Purpose Processor
• Applications supply completion deadlines.
• Voltage Scheduler (VS) predicts workload to estimate CPU cycles.
• CPU cycles = Fdesired · Δtime
[Figure: software/OS passes the required speed to the hardware controller, which sets clock & VDD of the G.P. processor; plot of FDESIRED (MHz, 0 to 80) versus time (0 to 1.4 sec) for processor speed (MPEG)]
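The relation CPU cycles = Fdesired · Δtime suggests how a voltage scheduler might turn a predicted cycle count and a deadline into a required speed, and how two adjacent discrete levels could then be time-shared (voltage hopping) to deliver it. The sketch below is only an illustration under simplifying assumptions: the function names and the frequency levels are hypothetical, and transition time between levels is ignored.

```python
def desired_frequency(cycles, delta_time):
    """Required clock frequency (Hz) to finish `cycles` within `delta_time` s."""
    if delta_time <= 0:
        raise ValueError("delta_time must be positive")
    return cycles / delta_time


def hopping_plan(f_desired, levels):
    """Pick two adjacent discrete levels and the time share of the higher one.

    Returns (f_low, f_high, fraction_high) such that
    fraction_high * f_high + (1 - fraction_high) * f_low == f_desired.
    Transition overhead between levels is ignored in this sketch.
    """
    levels = sorted(levels)
    if f_desired > levels[-1]:
        raise ValueError("deadline cannot be met at the highest level")
    if f_desired <= levels[0]:
        # The lowest level already exceeds the requirement.
        return levels[0], levels[0], 0.0
    for low, high in zip(levels, levels[1:]):
        if low <= f_desired <= high:
            return low, high, (f_desired - low) / (high - low)
    raise AssertionError("unreachable: f_desired lies within the level range")


# Hypothetical example: 15 million predicted cycles, 250 ms deadline,
# and three available clock levels.
f_req = desired_frequency(15e6, 0.25)
plan = hopping_plan(f_req, [20e6, 40e6, 80e6])
```

In this hypothetical case the required speed falls midway between the 40 MHz and 80 MHz levels, so the plan splits the interval evenly between them. A real scheduler would also have to budget for the level transition time.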
Impact of Voltage/Frequency Scheduling
Normalized energy by scheduling algorithm:
Benchmark   ASAP   Oracle   Zero
MPEG        100%   67%      89%
UI          100%   25%      30%
Audio       100%   16%      22%
§ Oracle: perfect knowledge of the future
§ Zero: heuristic scheduling algorithm
§ Largest savings for less-demanding or bursty apps (UI, audio)
§ Difficult to get large gains in compute-intensive code (MPEG)
[Ref: T. Pering]

Impact of Voltage Scheduling
Example: User interface processing (very bursty)
[Figure: VDD traces (1.0 to 3.5, 200 ms/div). Compute ASAP: max. speed or idle. With voltage scheduler: low speed & idle, with increased speed for shorter process deadlines]
High-latency computation done @ low speed/energy
[Ref: T. Burd]

Converter Loop Sets VDD, fCLK
[Figure: register holding FDES (set by O.S.); counter and latch (f1MHz) measuring the ring oscillator output as FMEAS; Σ forming FERR; digital loop filter driving PENAB/NENAB of the buck converter (VBAT, L, CDD); VDD and IDD supplied to the processor; ring oscillator generating fCLK]
• Operating system sets FDES
• Ring oscillator delay-matched to CPU critical paths.
• Feedback loop sets VDD so that FERR → 0.
[Ref: T. Burd]

A High-Performance Processor at Low-Energy
[Figure: Dhrystone 2.1 MIPS (0 to 100) versus energy (mW/MIPS, 0 to 6); static VDD versus dynamic VDD; operating points 6 MIPS @ 0.54 mW/MIPS (1.2V) and 85 MIPS @ 5.6 mW/MIPS (3.8V)]
If processor in low-performance mode most of the time: 85 MIPS processor @ 0.54 mW/MIPS
[Ref: T. Burd, JSSC'00]

Examples of DVS-Enabled Microprocessors
§ Early Research Prototypes
  – Toshiba MIPS 3900: 1.3-1.9V, 10-40 MHz [Kuroda98]
  – Berkeley ARM8: 1.2-3.8V, 6-85 MIPS, 0.54-5.6 mW/MIPS [Burd00]
§ Xscale: 180nm 1.8V bulk-CMOS
  – 0.7-1.75V, 200-1000MHz, 55-1500mW (typ)
  – Max. Energy Efficiency: ~23 MIPS/mW
§ PowerPC: 180nm 1.8V bulk-CMOS
  – 0.9-1.95V, 11-380MHz, 53-500mW (typ)
  – Max. Energy Efficiency: ~11 MIPS/mW
§ Crusoe: 130nm 1.5V bulk-CMOS
  – 0.8-1.3V, 300-1000MHz, 0.85-7.5W (peak)
§ Pentium M: 130nm 1.5V bulk-CMOS
  – 0.95-1.5V, 600-1600MHz, 4.2-31W (peak)
§ Extended to embedded processors (ARM, Freescale, TI, Fujitsu, NEC, …)

DVS Challenge: Verification
§ Functional verification
  – Circuit design constraints
§ Timing verification
  – Circuit delay variation
§ Power distribution integrity
  – Noise margin reduction
  – Delay sensitivities (local power grid)
Need to verify at every voltage operation point?

Design for Dynamically Varying VDD
§ Logic needs to be functional under varying VDD
  – Careful choice of logic styles is important (static versus dynamic, tristate busses, memory cells, sense amplifiers)
§ Also: need to determine max |dVDD/dt|

Static CMOS Logic
[Figure: static CMOS gate with In = 0 and Vout = VDD; equivalent circuit with Vout tied to VDD through RDS,PMOS and loaded by CL]
Static CMOS operates robustly with varying VDD

Dynamic Logic
[Figure: dynamic gate with clk = 1; VDD and Vout (volts) versus time, showing supply steps ΔVDD and −ΔVDD]
Errors:
• False logic low: ΔVDD > VTP
• Latch-up: ΔVDD > Vbe
Sets strong upper limit on |dVDD/dt|
• Cannot gate clock in evaluation state.
• Tri-state busses fail similarly; use hold circuit.

DVS System Transient Response
Ring oscillator (for |dVDD/dt| = 20 V/µsec), 0.6 µm CMOS
[Figure: VDD and fCLK versus time (ns)]
Output fCLK instantaneously adapts to new VDD.
[Ref: T. Burd, JSSC'00]

Relative Timing Variation
[Figure: percent delay variation relative to ring oscillator versus VDD (VTH to 4VTH) for four extreme cases of critical paths: gate, interconnect, diffusion, series]
§ Delay for all components varies monotonically with VDD
§ Timing verification only needed at min & max VDD.
[Ref: T. Burd, UCB'01]

Delay Sensitivity
$\frac{\Delta \mathrm{Delay}}{\mathrm{Delay}} = \frac{\partial \mathrm{Delay}}{\partial V_{DD}} \cdot \frac{\Delta V_{DD}}{\mathrm{Delay}(V_{DD})}, \quad \Delta V_{DD} = I(V_{DD}) \cdot R$
[Figure: normalized ΔDelay/Delay (0 to 1.0) versus VDD (VTH to 4VTH)]
§ Sensitivity max at 2 VTH
§ Local power grid only needs to be verified at VDD = 2VTH
[Ref: T. Burd, UCB'01]

Adaptive Body Biasing (ABB)
§ Similar to DVS, transistor thresholds can be varied dynamically during operation using body biasing
§ Extension of DBB approach considered for standby leakage management
§ Motivation:
  – Extends dynamic E-D optimization scope (as a function of activity)
  – Helps to manipulate and control leakage
  – Helps to manage process and environmental variability (especially VTH variations)
  – Is becoming especially important for low VDD/VTH ratios

Threshold Variability and Performance
[Figure: normalized delay (0.7 to 1.4) versus ΔVTH (−0.025 V to +0.025 V) for VDD = 0.45V, 0.6V, and 1V; VTHnom = 0.325V; 90 nm CMOS]
Delay variations at 1V and 0.45V are 13% and 55%, respectively

Self-Adjusting Threshold Voltage Scheme (SATS)
[Figure: leakage sensor switching the well-bias circuit ON/OFF to set Vwell]
Low VTH → large leakage → SSB ON → VBB↓ → High VTH
High VTH → little leakage → SSB OFF → VBB↑ → Low VTH
[Ref: T. Kobayashi, CICC'94]

SATS Experimental Results
© IEEE 1994
[Ref: T. Kobayashi]

Adaptive Body Bias: Experiment
[Figure: 5.3 mm x 4.5 mm die with multiple subsites; each subsite contains PD & counter, resistor networks, delay CUT, and bias amplifier] © IEEE 2002
Technology: 150 nm CMOS
Number of subsites per die: 21
Subsite size: 1.6 x 0.24 mm
Body bias range: 0.5V FBB to 0.5V RBB
Bias resolution: 32 mV
[Ref: J. Tschanz]
Adaptive Body Bias: Results
[Figure: number of dies versus frequency, with "too leaky" and "too slow" regions around ftarget; FBB and RBB move dies toward ftarget; accepted die (%) in low and high frequency bins for noBB, ABB, and within-die ABB, annotated "100% yield" and "97% highest bin"]
For given frequency and power density
• 100% yield with DBB
• 97% highest freq bin with ABB for within-die variability
[Ref: J. Tschanz]

Advantage of Adaptive Biasing at Low VDD/VTH
[Figure: Eswitching (fJ, 15 to 50) versus path delay (ps, 1.0E+03 to 1.0E+07) for VDD: 200-500mV; curves: worst case without and with Vth tuning, nominal without and with Vth tuning; adaptive tuning annotated "12 x"]
VTH tuning allows operation at nominal conditions
[Courtesy: K. Cao]

Combining DVS and ABB
© IEEE 2002
[Ref: M. Miyazaki, ISSCC'02]

Adapting VDD and VTH
[Figure: power (mW, 0 to 140) versus frequency (MHz, 0 to 60) for dynamic voltage scaling and for adaptive supply and body bias; 180 nm CMOS] © IEEE 2002
[Ref: M. Miyazaki, ISSCC'02]

Combining DVS and ABB
© IEEE 2003
[Ref: T. Chen]

A Generalized Self-Adapting Approach
Motivation: Most variations are systematic or slowly varying, and can be measured and adjusted for on a periodic basis
• Parameters to be measured: temperature, delay, leakage
• Parameters to be controlled: VDD, VTH (or VBB)
[Figure: sensors on the module (clocked at Tclock) feed a controller that sets VBB and VDD]
• Achieves the maximum power saving under technology limit
• Inherently improves the robustness of design timing
• Minimum design overhead required over the traditional design methodology
Aggressive Deployment (AD)
§ Also known as "Better-than-worst-case (BTWC) design"
§ Observation:
  – Current design targets worst-case conditions, which are rarely encountered in actual operation
§ Remedy:
  – Operate circuits at lower voltage levels than allowed by worst case and deal with the occasional errors in other ways
Example: Operate memory at voltages lower than allowed by worst case, and deal with the occasional errors through error correction
[Figure: histogram of DRV (mV, 0 to 400) for 32K SRAM cells (counts up to 6000), with the aggressive deployment point marked]
Distribution ensures that error rate is low

Aggressive Deployment: Concepts
§ Probability of hitting tail of distribution at any time is small
  – Function of critical path distribution, input vectors and process variations
§ Worst-case design expensive from energy perspective
  – Supply voltage set to worst case (+ margins)
§ Aggressive deployment scales supply voltage below worst-case value
  – "Better-than-worst-case" design strategy
  – Uses error detection and correction techniques to handle rare failures

Aggressive Deployment: Components
Must include the following components:
§ Voltage-setting mechanism
  – Distribution profile learned through simulation or dynamic learning
§ Error Detection
  – Simple and energy-efficient detection is crucial
§ Error Correction
  – Since errors are rare, its overhead is only of secondary importance
Concept can be employed at many layers of the abstraction chain (circuit, architecture, system)
[Figure: module supplied by VDD; error detection feeds an error count that drives the VDD setting; correction acts on the module]

Error Rate versus Supply Voltage
Example: 18x18 bit Multiplier @ 90 MHz on FPGA (using random input patterns)
[Figure: error rate versus supply voltage; annotations: 35% energy savings with 1.3% error; 22% saving]
[Courtesy: T. Austin]
Error Rate versus Supply Voltage
Example: Kogge-Stone adder (870 MHz) (SPICE simulations) with realistic input patterns
[Figure: error rate versus supply voltage; 200 mV annotation]
[Courtesy: T. Austin]

AD@Circuit Level: Razor Logic
[Figure: Razor flip-flop with main flip-flop clocked by clk and shadow latch clocked by the delayed clock clk_del]
§ Error Detection
  – Double-sampling latches (latch + shadow latch) detect timing errors
  – Second sample is correct-by-design
§ Error Correction
  – Microarchitectural support restores state
  – Timing errors treated like branch mispredictions
§ Challenges: metastability and short-path constraints
[Ref: D. Ernst, Micro'03]

Razor: Distributed Pipeline Recovery
[Figure: pipeline stages IF, ID, EX, MEM, WB (reg/mem) separated by Razor FFs with error outputs; bubble, recover, and flushID signals with flush control; PC; stabilizer FF before WB; die photo (3.3 mm x 3.0 mm) showing Icache, Dcache, RF, IF, ID, EX, MEM, WB] © IEEE 2003
§ Builds on existing branch prediction framework
§ Multiple-cycle penalty for timing failure
§ Scalable design as all communication is local
[Ref: D. Ernst, Micro'03]

Razor: Voltage Setting Mechanism
Ediff = Eref − Esample
[Figure: error signals from the pipeline are accumulated (Σ, with reset) into Esample; Ediff feeds the voltage control function, which drives the voltage regulator that sets VDD]
§ Energy reduction can be realized with a simple proportional control function
  – Control algorithm implemented in software
[Ref: D. Ernst, Micro'03]

Energy/Performance Characteristics
[Figure: energy and pipeline throughput (IPC) versus decreasing supply voltage; curves: energy of processor operations Eproc, energy of pipeline recovery Erecovery, total energy Etotal = Eproc + Erecovery with its optimum marked, and energy of processor without Razor support] © IEEE 2003
1% performance impact, 50% energy reduction
[Ref: D. Ernst, Micro'03]
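The proportional control idea from the "Razor: Voltage Setting Mechanism" slide (Ediff = Eref − Esample) can be sketched in a few lines. This is an illustrative sketch only: the gain, the voltage limits, and especially `toy_error_rate` are hypothetical stand-ins, not measurements or parameters from the Razor work.

```python
def toy_error_rate(vdd):
    """Hypothetical error-rate model: 1% at 1.2 V, 10x more per 50 mV drop."""
    return min(1.0, 0.01 * 10.0 ** ((1.2 - vdd) / 0.05))


def next_vdd(vdd, e_ref, e_sample, gain=0.5, v_min=0.9, v_max=1.8):
    """One step of proportional control on the sampled error rate.

    e_diff > 0 means fewer errors than targeted, so the supply is lowered;
    e_diff < 0 means too many errors, so the supply is raised.
    """
    e_diff = e_ref - e_sample
    vdd_new = vdd - gain * e_diff
    return min(v_max, max(v_min, vdd_new))


def settle(vdd, e_ref, steps=400):
    """Iterate the control loop against the toy error model."""
    for _ in range(steps):
        vdd = next_vdd(vdd, e_ref, toy_error_rate(vdd))
    return vdd


# Starting from a worst-case supply, the loop drifts down until the
# sampled error rate matches the reference.
v_final = settle(1.8, e_ref=0.01)
```

Under this toy model the loop settles near the supply where the error rate equals the 1% reference. In practice the gain would have to be chosen with the regulator response time and the error-sampling interval in mind.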
The Industrial Experience
§ Under typical-case conditions all chips are at least 39% more energy efficient
  – Worst-case design uses margins for corners that are very infrequent, or even impossible
§ Typical-case operation requires an understanding of when and how systems break!
[Courtesy: K. Flautner, ARM]

Aggressive Deployment at the Algorithm Level
[Figure: input x[n] feeds the Main Block (output ya[n]) and the Estimator (output ye[n]); a threshold comparison (> Th) selects the final output ŷ[n]]
§ Main Block aggressively scaled in voltage
§ Error detection: Estimator provides upper and lower bounds for output y
§ Error correction: Estimator bounds used when output of Main Block falls outside.
§ Mostly applicable to signal processing and communication applications where …
[Ref: B. Shim]

Example: Motion Estimation for Video Compression
Up to 60% power savings using AD, 6X reduction in PSNR variance in presence of process variations
[Figure: decoded frames: error-free 23.95 dB; with errors 22.44 dB; error-corrected 23.54 dB]
[Ref: G. Varatkar]

Other Better-Than-Worst-Case Strategies
§ Self-Tuning Circuits [Kehl93]
  – Early work on dynamic timing error avoidance
  – Adaptive clock control
§ Time-Based Transient Fault Detection [Anghel00]
  – Double-sampling latches for speed testing
§ Going beyond worst-case specs with TEAtime [Uht00]
§ On-Chip Self-Calibrating Communication
IEEE Computer Magazine, March 2004. © IEEE 2004

Power Domains (PDs)
Introduction of multiple voltage domains on single die creates extra challenges:
§ Need for multiple voltage regulators and/or voltage up-down converters
§ Reliable distribution of multiple supplies
§ Interface circuits between voltage domains
§ System-level management of domain modes
  – Trade off gains of changing power modes against overhead of doing so
Power Manager (PM)
[Figure: power manager with timer subsystem, clock subsystem, power subsystem, and command/event dispatcher, connected through the power network interface (PIF) and the power network to PIF agents in power domains A, B, and C]
PM: Centralizes power control
§ Power subsystem: gates block power rails
§ Clock subsystem: gates block clocks
§ Timer subsystem: system time-wheel and wake-up timers
§ Standardized interface (PIF) between PM and Power Domains
[Ref: M. Sheets, VLSI'06]

Managing the Timing
§ Basic scheduling schemes
  – Reactive
    § Sleep when not actively processing
    § Wake up in response to a pending event
  – Stochastic
    § Sleep if idle and probably not needed in near future [Simunic'02]
    § Wake up due to expected event in the near future
§ Metrics
  – Correctness: PD awake when required to be active
  – Latency: time required to change modes
  – Efficiency: minimum total energy consumption [Liao'02]
    § Minimum idle time: time required for savings in lower-power mode to offset energy spent switching modes
$\text{Min. Idle Time} = \frac{E_{lost}}{P_{savings}} = \frac{E_{overhead} - P_{idle}\, t_{switch\_modes}}{P_{idle} - P_{sleep}}$

Interfacing between Power Domains
Separate internal logic of block from its interfaces
1. Communicate with other PDs by bundling related signaling into "ports"
   – Communication through a port requires permission (session-based)
   – Permission is obtained through power control interface
2. Signal wall maintains interface regardless of power mode
   – Can force to a known value (e.g. the non-gated power rail)
   – Can perform level conversion
[Figure: interface for block with two ports (Port A, Port B) behind a signal wall, with power control interface; example signal wall schematic (port) with In, Out, sleep/!sleep, and open/!open signals]

Example: PDs in Sensor Network Processor
2.7 x 2.7 mm² (130 nm CMOS)
Clock Rates: 8 MHz – 80 kHz
Supply: 0.3-1V
Leakage Power: 53 µW
Average Power: 150 µW
Peak Power: 5 mW
[Figure: die photo with blocks: 64kB code/data RAM, dw8051 µP, 1kB TX/RX queues, DLL, baseband, PM, neighbor, location, serial if, clk osc, volt conv.; power (µW) trace and per-block sleep signals during RX listen windows and a TX broadcast packet] © IEEE 2006
[Ref: M. Sheets, VLSI'06]

Integrated Switched-Capacitor Voltage Converter
[Figure: switched-capacitor converter with 10 pF capacitors and Clk-driven switches between a 1 V supply and Rload; equalizing phase and charging phase configurations]
85% efficiency at 1V when optimized for load
Output voltage ripple is a function of Rload and fClk
[Ref: H. Qin, ISQED'04]

Integrated Power Converter for Sensor Networks
[Figure: energy sources (solar cell, electromagnetic shaker, piezoelectric bender, thermoelectric generator); storage (NiMH cell (1.2 V), Li-ion cell (3.6 V), ultracapacitor); integrated power manager with 1:2 converter, 3:2 converter, and level shifters; loads (microcontroller, sensors, memory, radio)]
Switched-capacitor converters operate at high efficiency (up to 80%) at low current levels
[Ref: M. Seeman, CICC'07]

LC-based DC-DC (buck) converter
[Figure: VDD,High Voltage → controller → output filter (LF, RS, CF) → VDD,INTERNAL → load; stacked-chip photo (2 mm, pad, 1 nF cap)] © IEEE 2006
Challenge:
• Need good low-resistance L & high-capacitance C
• Hard to achieve on-chip
• Option: Use multi-chip stacking
[Ref: K. Onizuka]

Revisiting Power Distribution Concept
§ Current on-chip power distribution: single metal grid extended with switches for power gating
§ Need for: higher voltage distribution and integrated level converters and switches
§ Hard to accomplish on standard integrated circuit
§ Opportunities offered by stacked dies and 2.5D integration
§ Thicker wires, better inductors and capacitors
§ Towards "PGE on a chip"
[Ref: J. Rabaey]
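The minimum idle time metric from the "Managing the Timing" slide lends itself to a small worked example. The sketch below assumes the power savings are P_idle − P_sleep and that E_lost is the switching overhead minus what idling would have cost during the switch; the function names and all numeric values are hypothetical.

```python
def min_idle_time(e_overhead, t_switch, p_idle, p_sleep):
    """Idle time (s) beyond which entering the low-power mode saves energy.

    e_overhead: energy (J) spent switching into and out of the sleep mode
    t_switch:   time (s) spent switching modes
    p_idle:     power (W) when idle but awake
    p_sleep:    power (W) in the low-power mode
    """
    if p_sleep >= p_idle:
        raise ValueError("sleep mode must consume less power than idle")
    # Energy lost by switching, relative to simply idling for t_switch.
    e_lost = e_overhead - p_idle * t_switch
    return max(0.0, e_lost / (p_idle - p_sleep))


def should_sleep(expected_idle, e_overhead, t_switch, p_idle, p_sleep):
    """Reactive policy sketch: sleep only if the idle period pays off."""
    return expected_idle > min_idle_time(e_overhead, t_switch, p_idle, p_sleep)


# Hypothetical domain: 10 uJ switching overhead, 1 ms switching time,
# 2 mW idle power, 0.5 mW sleep power.
t_min = min_idle_time(10e-6, 1e-3, 2e-3, 0.5e-3)
```

For these made-up numbers the break-even idle period is a little over 5 ms. A power manager could compare that threshold against its prediction of the coming idle period before gating a domain.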
Revisiting Power Distribution
[Figure: 3D-stacked assembly: package, pads & bumps, base chip with parallel processors with own DC-DC converters; interposer with embedded L&C cell array (inductors, capacitors), power supply & other wires, and THVs; stacked memories (thinned); 3D-stacked sensor, MEMS, high voltage generation, analog, RF, etc.]
[Ref: K. Onizuka, JSSC'07]

Summary
§ Power and energy optimality is a function of operational parameters
§ Run-time power optimization tracks changes in activity and environmental conditions to dynamically set supply and threshold voltages
§ Aggressive deployment scales supply voltage below the traditional worst case and uses error detection/correction to deal with rare errors.
§ Interesting idea: errors are not always fatal and can be allowed under certain conditions
§ Challenge: Integrated power management and distribution supporting dynamic variations

Literature
Books, Magazines, Theses
§ T. Burd, "Energy-Efficient Processor System Design," http://bwrc.eecs.berkeley.edu/Publications/2001/THESES/energ_eff_process-sys_des/index.htm, UCB, 2001.
§ Numerous authors, "Better than worst case design," IEEE Computer Magazine, March 2004.
§ T. Simunic, "Dynamic Management of Power Consumption," in Power-Aware Computing, edited by R. Graybill, R. Melhem, Kluwer Academic Publishers, 2002.
§ A. Wang, Adaptive Techniques for Dynamic Processor Optimization, Springer, 2008.
Articles
§ L. Anghel and M. Nicolaidis, "Cost reduction and evaluation of temporary faults detecting technique," Proc. DATE 2000, pp. 591–598, 2000.
§ T. Burd, T. Pering, A. Stratakos, R. Brodersen, "A dynamic voltage scaled microprocessor system," IEEE Journal of Solid-State Circuits, vol. 35, pp. 1571-1580, November 2000.
§ T. Chen and S. Naffziger, "Comparison of adaptive body bias (ABB) and adaptive supply voltage (ASV) for improving delay and leakage under the presence of process variation," Trans. VLSI Systems, vol. 11, issue 5, pp. 888-899, Oct 2003.
§ D. Ernst et al., "Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation," Micro Conference, December 2003.
§ V. Gutnik, A. P. Chandrakasan, "An Efficient Controller for Variable Supply Voltage Low Power Processing," IEEE Symposium on VLSI Circuits, pp. 158-159, June 1996.
§ T. Kehl, "Hardware self-tuning and circuit performance monitoring," Proceedings ICCD 1993.
§ T. Kobayashi, T. Sakurai, "Self-adjusting threshold-voltage scheme (SATS) for low-voltage high-speed …
§ T. Kuroda et al., "Variable Supply-Voltage Scheme for Low-Power High-Speed CMOS Digital Design," IEEE J. Solid-State Circuits, vol. 33, no. 3, pp. 454-462, Mar. 1998.
§ W. Liao, J. M. Basile, and L. He, "Leakage power modeling and reduction with data retention," in Proceedings IEEE ICCAD, pp. 714-19, San Jose, Nov. 2002.
§ M. Miyazaki, J. Kao, A. Chandrakasan, "A 175mV Multiply-Accumulate Unit Using an Adaptive Supply Voltage and Body Bias (ASB) Architecture," IEEE ISSCC, pp. 58-59, San Francisco, California, February 2002.
§ L. Nielsen, C. Niessen, "Low-power operation using self-timed circuits and adaptive scaling of the supply voltage," IEEE Transactions on VLSI Systems, pp. 391-397, December 1994.
§ H. Okano, T. Shiota, Y. Kawabe, W. Shibamoto, T. Hashimoto, and A. Inoue, "Supply voltage adjustment technique for low power consumption and its application to SOCs with multiple threshold voltage CMOS," Symp. VLSI Circuits Dig., pp. 208-209, June 2006.
§ K. Onizuka, H. Kawaguchi, M. Takamiya, and T. Sakurai, "Stacked-chip Implementation of On-Chip Buck Converter for Power-Aware Distributed Power Supply Systems," A-SSCC, Nov. 2006.
§ K. Onizuka, K. Inagaki, H. Kawaguchi, M. Takamiya, and T. Sakurai, IEEE JSSC, accepted, to be published, 2007.
§ T. Pering, T. Burd, and R. Brodersen, "The Simulation and Evaluation of Dynamic Voltage Scaling Algorithms," Proceedings of Int'l Symposium on Low Power Electronics and Design 1998, pp. 76-81, June 1998.
§ H. Qin, Y. Cao, D. Markovic, A. Vladimirescu, J. Rabaey, "SRAM leakage suppression by minimizing standby supply voltage," Proceedings 5th International Symposium on Quality Electronic Design, April 2004.
§ J. Rabaey, "Power Management in Wireless SoCs," Invited presentation MPSOC 2004, Aix-en-Provence, Sept. 2004; http://www.eecs.berkeley.edu/~jan/Presentations/MPSOC04.pdf
§ T. Sakurai, "Perspectives on power-aware electronics," IEEE International Solid-State Circuits Conference, vol. XLVI, pp. 26-29, February 2003.
§ M. Seeman, S. Sanders, J. Rabaey, "An Ultra-Low-Power Power Management IC for Wireless Sensor Nodes," Proceedings CICC 2007, San Jose, Sept. 2007.
§ A. Sinha, A. P. Chandrakasan, "Dynamic Voltage Scheduling Using Adaptive Filtering of Workload Traces," VLSI Design 2001, pp. 221-226, Bangalore, India, January 2001.
§ M. Sheets et al., "A Power-Managed Protocol Processor for Wireless Sensor Networks," Digest of Technical Papers VLSI06, pp. 212-213, June 2006.
§ B. Shim and N. R. Shanbhag, "Energy-efficient soft error-tolerant digital signal processing," IEEE Transactions on VLSI, vol. 14, no. 4, pp. 336-348, April 2006.
§ J. Tschanz et al., "Adaptive body bias for reducing impacts of die-to-die and within-die parameter variations on microprocessor frequency and leakage," IEEE International Solid-State Circuits Conference, vol. XLV, pp. 422-423, February 2002.
§ A. Uht, "Achieving typical delays in synchronous systems via timing error toleration," Technical Report TR-032000-0100, University of Rhode Island, Mar. 2000.
§ G. Varatkar, N. Shanbhag, "Energy-Efficient Motion Estimation using Error-Tolerance," Proceedings ISLPED 06, pp. 113-118, Oct 2006.
§ F. Worm, P. Ienne, P. Thiran, and G. D. Micheli, "An adaptive low-power transmission scheme for on-chip networks," Proc. of the International Symposium on System Synthesis (ISSS), pp. 92–100, 2002.