

Low-Power High-Speed Links
Gu-Yeon Wei
Division of Engineering and Applied Sciences
Harvard University
6.976 Guest Lecture, Spring 2003

Outline
- Motivation
- Brief Overview of High-Speed Links
- Design Considerations for Low Power
- DVS Link Design Example
- Summary

Wei    Low-Power High-Speed Links

Motivation
- Demand for high bandwidth communications
- Advancements in IC fabrication technology
  - Higher performance
  - More complex functionality
- Chip I/O becomes performance bottleneck
- Increasing power consumption
- Network router example
  - 10's to 100's of links on a single crosspoint switch
[Figure: network router; labels: network card, switch card, digital crosspoint, transceivers, backplane PCB, 50-Ω traces]

High-Speed Links Overview
- High-speed data communication between chips across an impedance controlled channel
  - Shared communication bus (memories)
  - Point-to-point links
- Types of link architectures and implementations
  - Parallel vs. serial
  - Differential vs. single-ended
  - Low-impedance vs. high-impedance driver
  - Transmitter-only vs. receiver-only vs. double termination
- We will focus on point-to-point serial links using differential high-impedance drivers with double termination for network routers
  - Techniques to reduce power are applicable to other link types

Link Components
- High-speed links consist of 4 main components
  - Serializing transmitter driver
  - Communication channel
  - De-serializing receiver samplers
  - Timing recovery
[Figure: link block diagram; labels: data in (10010), TX, channel, RX, timing recovery, data out (10010)]

Performance Limitations
- Important to look at performance because higher performance can lead to lower power by trading off performance for energy reduction
- Several factors limit the performance of high-speed links
  - Non-ideal channel characteristics
  - Bandwidth limits of transceiver circuitry
  - Noise from power supply, cross talk, clock jitter, device mismatches, etc.
- Eye diagrams are a qualitative measure of link performance
[Figure: random bit stream with bit time Tbit; ideal data eye vs. realistic data eye]

Channel Impairments and ISI
- One of the dominant causes of eye closure is inter-symbol interference (ISI) due to channel bandwidth limitations
- Two ways to view channel impairments

Equalization
- Placing a high-pass filter in the signal path can counter the rolloff effects of the channel
- Preemphasis or transmit-side equalization is commonly used

Critical Path in Links
- The critical path in links can be as short as 1~2 gate delays
[Figure: transmit path and receive path schematics; labels: D/Q flip-flops, Rterm, Zo = 50 Ω, Vref, Deven, Dodd]

Clock Frequency Limit
- While a symbol time can be short, there is a limit to the maximum on-chip clock frequency
  - Must distribute a clock driven by a buffer chain
- Overcome this limitation with parallelism
[Figure: buffer chain from CLKIN to CLKOUT]

Parallelism
- Parallelism can increase bit rate even with limited clock frequency
  - Time-interleaved multiplexing
  - Multi-level signaling
- Some performance issues to be wary of
  - Static timing offsets in multi-phase clock generator (DLL or PLL)
  - Requires higher voltage dynamic range in transmitter and receiver
- Parallelism can also be low power
[Figure: clk[n], clk[n+1], and data; symbol values 10, 11, 01, 00]

Sources of Noise
- Power supply noise
  - Translates into voltage and timing uncertainty
- Cross talk
  - Near- and Far-End Cross Talk (NEXT and FEXT)
  - High-frequency coupling
- Clock jitter
  - Timing uncertainty in transmitted and sampled symbol
  - Probabilistic distribution of timing edges (bounded and unbounded components)
- Device mismatches and systematic offsets
  - Deterministic or systematic variation in timing edges from multi-phase clock generators

Considerations for Low Power
- Low noise → low power
  - Target some signal to noise ratio (SNR)
  - Reducing noise allows for lower signal power
- Trade speed for lower power
  - Reducing bit rate improves SNR
  - Many noise sources are fixed, so the ratio of timing uncertainty to bit time improves (have longer bit times)
- Let's look at a few design choices for low power
  - Circuit level
  - Architecture level
  - System level

Offset Calibration
- Two sets of offsets can manifest themselves as voltage and timing uncertainty, closing the eye and possibly requiring higher power to overcome them:
  - Multi-phase clock generator timing offsets
  - Receiver input voltage offsets
- Static offsets are due to systematic (layout) and random (device) mismatches
- Calibration enables more timing and voltage margins (i.e., lower noise)
[Figure: ideal Tx timing with zero Rx offset (higher margins) vs. timing with offsets and non-zero Rx offset (lower margins); sampling edge marked]

Differential Signaling
- Differential communication can lead to a lower power solution
  - Immunity to common mode noise
  - Injects less noise into the supplies
  - Signal amplitudes can be smaller on both channels
- Alternative is pseudo-differential signaling, but it needs a reference voltage, which can be noisy and require larger Vswing
- But aren't there now two channels that switch? Yes, but...
- What does it cost? Requires two pins per link
[Figure: differential link (Rterm, Zo = 50 Ω) with $V_{swing,diff,pk2pk} = 2(I \cdot R)$; pseudo-differential link (Rterm, Zo = 50 Ω, Vref) with $V_{swing} = I \cdot R$]

Signal Multiplexing
- There are different options for choosing where to combine pulses to create sub-clock period symbols
  - Combine at the final transmitter stage vs. farther upstream
  - Cload for the clocks is higher when combined at the final transmitter vs. need a faster signal path after the multiplexer when combined farther upstream
- Best choice depends on implementation (see Zerbe, ISSCC2003)
[Figure: multiplexing transmitter; labels: D0, D1, D2, D3, high-speed path, Rterm, Zo = 50 Ω]

Multiplexing
[Figure: Parallelized Transmitter (TX, TX, TX) → channel → Parallelized Receiver (RX, RX, RX), driven by Multiphase Clocks; bit rate $= M \cdot f_{CLK}$]

Multiplexing = Low Power?
- With M:1 multiplexing, $f_{CLK} = BR / M$, where BR is the bit rate
- $\text{Power} = M C V^2 f_{CLK} = M C V^2 (BR/M) = C V^2 \, BR$
- With fixed supply, power does not vary with M
- But wait, at lower frequencies, I can lower voltage!
- With lower supply voltage ($V \propto f_{CLK} = BR/M$), power decreases as $1/M^2$!

Power vs. M
- Larger M
  - Can reduce voltage → lower power
  - Less accurate timing: static phase offsets, jitter
- Cannot make M arbitrarily large b/c there is a lower limit to Vdd
- Choice: M = 4~6
- This begs the question... What if we make Vdd adaptive w/ fCLK?
[Figure: power vs. 1/M or fCLK; curves: Fixed Supply, Lower Supply]

DVS Links
- Dynamic Voltage Scaling (DVS)
  - Technique first introduced for digital systems (e.g., uP, DSP chips)
  - Lots of work done in both academia and industry (e.g., Intel, Transmeta)
  - Allows trade off between speed and power
- Let's investigate DVS for high-speed links
  - Motivation and potential benefits
  - Design example from Dr. Jaeha Kim (ISSCC2002, JSSC2002, PhD thesis 2002)

DVS Links
- Dynamic Voltage Scaling (DVS) can reduce power consumption in two ways
- 1) Digital circuits operate at their most energy-efficient point in the presence of PVT variations by eliminating extra performance margins
[Figure: normalized frequency vs. supply voltage (V); annotations: $P_{dynamic} = C_{switched} V^2 F$, 88%, 0.5 V, 0.8 V]

Trade Performance for Energy Savings
- 2) DVS enables trade off between performance and energy
  - Reducing frequency alone reduces power but not energy per bit
[Figure: normalized energy/bit vs. normalized bit rate; curves: fixed Vdd = 3.3 V, dynamically scaled Vdd; annotations: $E \propto C V^2$, energy savings]

DVS Link Components
- DVS links require two additional components
  - Mechanism to measure circuit critical path to appropriately adjust voltage with respect to frequency
    - Use an on-chip performance monitor circuit (inverter delay elements of core DLL)
  - Efficient supply-voltage regulator (buck converter)
- Overall block diagram (Wei et al., ISSCC2000)
[Figure: block diagram; labels: core DLL, FCLK, VCTRL, Digital Controller, Buck Converter, I/O Transceiver, DTX, DRX, RVdd]

Performance Monitoring
- Reduces design complexity by enabling one to replace precision analog components with simple digital gates. How?
  - Inverters of the delay line model the critical path (clock distribution)
  - Delays of gates in I/O circuitry are fixed relative to clock period
[Figure: DLL schematic; labels: clk, 0°, 180°, PD, UP, DN, CP, VCP, VCTRL]

Adaptive Receiver Filter
- Filter signal frequencies beyond the Nyquist rate at the receiver (helps for dealing with cross talk)
- Receiver example
- Filter's corner frequency tracks fsymbol
[Figure: receiver schematic (IN, IN, Vbias; Preamplifier, Regenerative Latch) and gain A (dB) vs. f with fsymbol marked]

Example: Adaptive Supply Serial Links
- Jaeha Kim (Ph.D. defense 2002)
[Figure: block diagram; labels: fref, Adaptive Power Supply Regulator, Adaptive Supply V, Multiphase Clock Generation, 5:1 Multiplexing Transmitter, Multiphase Clock Recovery, 1:5 Demultiplexing Receiver]

Multi-Phase Clock Generation
- Must minimize static offsets between phases
- Generate multiphase clocks locally at each pin, but watch out for power and area overhead
[Figure: labels: Static Offsets, Jitter, TX]

Dual-Loop Clock Generation
[Figure: block diagram; labels: fref, Adaptive Power Supply Regulator, Coarse Control, Reference VCO, Adaptive Supply V, Local Multiphase Clock Generation (Fine Control, Local VCO), Local Multiphase Clock Recovery (Fine Control, Local VCO), 5:1 Multiplexing Transmitter, 1:5 Demultiplexing Receiver]

Dual-Loop Clock Generation (2)
- Global loop brings the local VCO frequency close to lock
- Narrow local tuning range (+/-15%) is sufficient to compensate for on-chip mismatches
- Narrow tuning range leads to low VCO gain
  - Small loop capacitor area (2.5 pF)
  - Low sensitivity to Vctrl noise

Clock Recovery
- Optimal receiver timing is recovered from the incoming data stream
[Figure: clock recovery loop; labels: Data, RX, PD, Filter, VCO]

Phase Detection
- Phase detector made of an identical set of receivers minimizes timing error
[Figure: labels: RX, PD[4:0], Edge Detection / Majority Voting, Early/Late/None, Data, Received Data]

Chip Prototype
- 0.25 µm CMOS
- 2.5V / 0.55Vth
- 3.1 × 2.9 mm²
- 0.4~5.0 Gb/s
- 0.9~2.5 V
- 5.6~375 mW
- BER < 10^-15
- Reg. Efficiency: 83-94%
[Figure: die photo; labels: TX, RX, TX-PLL, TX-DLL, RX-PLL, RX-DLL, TX data gen, PRBS, Digital Sliding Controller, Feedback Biasing, Testing Interface, Power Transistors]

Power and Performance
- 3.1 Gb/s, 113 mW

Power Breakdown

Summary
- Higher performance links require low noise solutions
- Low noise solutions lead to lower power
- Trade performance (speed) for power reduction
- DVS links enable energy-efficient link operation and also have some nice properties
- Outstanding issues with using DVS links
  - Communication during frequency and voltage transitions
  - Supply voltage regulator slew rate limits
  - Overhead of multiple regulators
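
The scaling argument from the "Multiplexing = Low Power?" and "Power vs. M" slides can be checked numerically. The following is a minimal sketch, not something taken from the lecture: the function names `link_power` and `energy_per_bit`, the 1 pF capacitance, and the idealized assumption that the supply scales exactly linearly with clock frequency down to a floor `v_min` are all illustrative assumptions. Only the relations Power = M·C·V²·f_CLK and f_CLK = BR/M come from the slides.

```python
def link_power(bit_rate_hz, m, c_farads, v_nominal, scale_supply=False, v_min=0.0):
    """Dynamic power of an M-way multiplexed link: P = M * C * V^2 * f_clk.

    Assumption (illustrative): when scale_supply is True, the supply is
    lowered in proportion to the clock frequency, V = v_nominal / M,
    but never below the floor v_min.
    """
    f_clk = bit_rate_hz / m              # each of the M slices runs at BR / M
    v = v_nominal
    if scale_supply:
        v = max(v_nominal / m, v_min)    # lower limit to Vdd
    return m * c_farads * v ** 2 * f_clk


def energy_per_bit(bit_rate_hz, m, c_farads, v_nominal, scale_supply=False, v_min=0.0):
    """Energy per transmitted bit, i.e. power divided by bit rate."""
    power = link_power(bit_rate_hz, m, c_farads, v_nominal, scale_supply, v_min)
    return power / bit_rate_hz


# Hypothetical operating point: 4 Gb/s, 1 pF switched per slice, 2.5 V nominal.
BIT_RATE = 4e9
C_SLICE = 1e-12
V_NOM = 2.5

for m in (1, 2, 4, 8):
    fixed = link_power(BIT_RATE, m, C_SLICE, V_NOM)
    scaled = link_power(BIT_RATE, m, C_SLICE, V_NOM, scale_supply=True)
    floored = link_power(BIT_RATE, m, C_SLICE, V_NOM, scale_supply=True, v_min=0.9)
    print(f"M={m}: fixed={fixed:.4e} W  scaled={scaled:.4e} W  floored={floored:.4e} W")
```

With a fixed supply the power is independent of M, with ideal scaling it falls as 1/M², and once the supply hits the `v_min` floor, increasing M further brings no additional savings, which is the reason the slides give for not making M arbitrarily large.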

This note was uploaded on 10/29/2011 for the course EE 6.976 taught by Professor Michael Perrott during the Spring '03 term at MIT.
