High speed serial links: High-speed Design Trends and Challenges
Vladimir Stojanović
Integrated Systems Group, Massachusetts Institute of Technology
Backbone router: lots of high-speed links
source: Alcatel, Tyco; Juniper Networks
State of the art: up to 1 Tb/s throughput
Lots of line cards: a power-constrained system
What matters is energy cost per bit
Inside the router
Line cards: 8 to 16 per system
Switch cards: 2 to 4 per system
Passive backplane
[Figure: router block diagram. Blocks shown: Optics, SerDes, MAC, NPU, TM/Fabric IF, MEM, Crossbar]
OC-192: 12.5 Gb/s laser driver link
4x3.125 Gb/s XAUI Serial Links (chip-to-chip)
3.125-12.5Gb/s Backplane Serial Links
Regardless of where the links are, there is a constant desire to signal faster and with less power
Scaling the throughput to 100 Tb/s
Electrical I/O Challenges
100 Tb/s I/O throughput
With 10 Gb/s per link:
10,000 transceivers
20,000 high-speed I/O pairs
10,000 mm² in 0.13 µm technology
Power: 4 kW (40 mW/Gb/s energy cost per bit)
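The figures on this slide follow from simple arithmetic. The sketch below reproduces them. The function name is illustrative. Two inputs are assumptions inferred from the slide's totals rather than stated on it: two differential pairs (one Tx, one Rx) per transceiver, and about 1 mm² per transceiver.

```python
# Back-of-the-envelope scaling of electrical I/O to 100 Tb/s.
# Throughput, link rate, and energy/bit come from the slide; the per-transceiver
# pair count and area are assumptions chosen to match the slide's totals.

def io_budget(total_gbps, link_gbps, energy_mw_per_gbps,
              pairs_per_link=2, area_mm2_per_link=1.0):
    links = total_gbps / link_gbps                       # number of transceivers
    pairs = pairs_per_link * links                       # differential I/O pairs
    power_w = total_gbps * energy_mw_per_gbps / 1000.0   # mW -> W
    area_mm2 = links * area_mm2_per_link
    return links, pairs, power_w, area_mm2

links, pairs, power_w, area_mm2 = io_budget(
    total_gbps=100_000,       # 100 Tb/s
    link_gbps=10,             # 10 Gb/s per link
    energy_mw_per_gbps=40,    # 40 mW/Gb/s
)
print(f"{links:.0f} transceivers, {pairs:.0f} pairs, "
      f"{power_w / 1000:.0f} kW, {area_mm2:.0f} mm^2")
```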
Scaling the throughput to 100 Tb/s
Density issues
Connectors: 50 diff pairs/inch, 400" long connector
Trace routing: 50 mils pitch, 250" wide 4-signal-layer line card
Backplane less critical
Package: package/chip ball pitch (1 mm / 200 µm), 4000 mm² / 160 mm²
source: Teradyne, Rambus
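The density numbers can be checked the same way. This is a minimal sketch assuming the 20,000 differential pairs from the previous slide. Variable names are illustrative.

```python
# Density check for 20,000 differential pairs (figure from the previous slide).
pairs = 20_000

# Connector: 50 differential pairs per inch of connector length.
connector_length_in = pairs / 50

# Trace routing: 50 mil pitch per pair, spread over 4 signal layers.
routing_width_in = pairs * 50 / 1000 / 4      # mils -> inches, then per layer

# Package vs. chip: area per ball scales with the square of the ball pitch.
area_ratio = (1000 / 200) ** 2                # 1 mm pitch vs. 200 um pitch

print(connector_length_in, routing_width_in, area_ratio)
```

The 25x area ratio matches the slide's 4000 mm² versus 160 mm².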
Design challenge
Goal
Fit 100 Tb/s on a 100 W crossbar chip
Reasonable system/rack size
Need
Power: reduce energy/bit to 1 mW/Gb/s
Density: increase data rate per link by 10-15x
What makes it challenging
High-speed link chip
> 2 GHz signals
Now, the bandwidth limit is in wires
source: Rambus
High-speed link efficiency: energy cost per bit
How efficient are high-speed links?
[Figure: energy cost per bit [mW/Gb/s], log scale, for a 56Kb/s V.92 modem, a 12x12Mb/s ADSL modem, Gigabit Ethernet, and a 10Gb/s high-speed link]
2-3 orders of magnitude more energy-efficient than traditional wireline systems
[Figure: energy cost per bit [mW/Gb/s] vs. data rate [Gb/s] for PAM2 and PAM4 links with different equalizer sizes (Tx5 Rx20, Tx5 Rx1+20, Tx50 Rx80), and a breakdown by block: TxTap, RxTap, RxSamp, PLL, CDR]
Starting to pay the price for band-limited channels
Outline
Show the path to efficient 100 Tb/s systems
Look at all aspects of system design
High-speed link environment
Improving the channel
What can chips do?
Backplane environment
[Figure: backplane signal path. Labels: on-chip parasitic (termination resistance and device loading capacitance), package, package via, line card trace, line card via, backplane connector, backplane via, backplane trace]
Line attenuation
Reflections from stubs (vias)
Backplane channel
Loss is variable
Same backplane
Different lengths
Different stubs (top vs. bottom)
[Figure: attenuation [dB] vs. frequency [GHz] for 9" FR4, 26" FR4, 9" FR4 with via stub, and 26" FR4 with via stub]
Attenuation is large
>30 dB @ 3 GHz
But is that bad?
Required signal amplitude set by noise
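To put the attenuation in perspective, a loss in dB converts to a voltage ratio through 20·log10. The sketch below does the conversion. The 1000 mV transmit swing is an assumed value for illustration, not a number from the slides.

```python
# Convert channel attenuation in dB to a voltage ratio and a received swing.

def db_to_amplitude_ratio(attenuation_db):
    """Voltage ratio corresponding to an attenuation given in dB."""
    return 10 ** (attenuation_db / 20.0)

def received_swing_mv(tx_swing_mv, attenuation_db):
    """Received amplitude after the channel, ignoring reflections and crosstalk."""
    return tx_swing_mv / db_to_amplitude_ratio(attenuation_db)

tx_swing_mv = 1000.0   # assumed transmit swing, for illustration only
for loss_db in (10, 20, 30, 40):
    rx = received_swing_mv(tx_swing_mv, loss_db)
    print(f"{loss_db} dB loss -> {rx:.1f} mV at the receiver")
```

At 30 dB the signal is about 32 times smaller at the receiver. Whether that is acceptable depends on the noise floor, which is the point of the slide.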
Interference
[Figure: attenuation [dB] vs. frequency [GHz] for the THROUGH, NEXT, and FEXT channels; pulse response over 3 ns with Tsymbol = 160 ps]
Inter-symbol interference
Dispersion (skin effect, dielectric loss): short latency
Reflections (impedance mismatches: connectors, via stubs, device parasitics, package): long latency
Co-channel interference (far-end and near-end crosstalk)
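A symbol-spaced pulse response makes the effect of ISI concrete. For binary signalling with levels ±1, the worst-case eye opening is twice the main cursor minus the sum of the magnitudes of all other samples. The pulse values below are made up for illustration. Only Tsymbol = 160 ps comes from the slide.

```python
# Worst-case eye opening from a symbol-spaced pulse response (PAM2, levels +/-1).

def symbol_rate_gbaud(t_symbol_ps):
    """Symbol rate in Gsymbols/s for a symbol time in picoseconds."""
    return 1000.0 / t_symbol_ps

def worst_case_eye(pulse, cursor):
    """Eye height when every interfering symbol takes its worst-case sign.

    pulse  -- samples of the pulse response, one per symbol time
    cursor -- index of the main (desired) sample
    A result <= 0 means the eye is closed without equalization.
    """
    main = pulse[cursor]
    isi = sum(abs(p) for i, p in enumerate(pulse) if i != cursor)
    return 2.0 * (main - isi)

# Hypothetical pulse response: one pre-cursor, a dispersion tail,
# and a late sample standing in for a reflection.
pulse = [0.05, 0.60, 0.25, 0.10, 0.05, 0.00, 0.03]

print(symbol_rate_gbaud(160.0), "Gsymbols/s")
print("worst-case eye:", round(worst_case_eye(pulse, cursor=1), 3))
```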
Reflections and Crosstalk
Don't just receive the signal you want
Get versions of signals close to you
Vertical connections have the worst coupling
Signals are close in these vertical connection regions
[Figure: desired signal, reflections, far-end crosstalk (FEXT), and near-end crosstalk (NEXT)]
source: Sercu, DesignCon03
A complex system
PCB only
PCB + Connectors
PCB, Connectors, Via stubs & Devices
Outline
Show the path to efficient 100 Tb/s systems
Look at all aspects of system design
High-speed link environment
Improving the channel
What can chips do?
Dispersion: material loss
[Figure: attenuation vs. frequency [Hz], 1 MHz to 10 GHz, for an FR4 dielectric, 8 mil wide, 1 m long, 50 Ohm strip line: total loss, conductor loss, dielectric loss]
source: Kollipara, DesignCon03
PCB loss: skin and dielectric loss
Skin loss ∝ √f
Dielectric loss ∝ f: a bigger issue at higher f
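The usual first-order model has skin loss (in dB) growing as √f and dielectric loss growing as f. A two-term fit shows why the dielectric term takes over at high frequency. The coefficients below are hypothetical, chosen only to illustrate the crossover, and are not taken from the plotted strip line.

```python
import math

# Two-term PCB trace loss model: loss_dB(f) = k_skin * sqrt(f) + k_diel * f.
# Coefficients are hypothetical placeholders, not fitted to measured data.

def loss_db(f_ghz, k_skin, k_diel):
    """Total trace loss in dB at frequency f_ghz."""
    return k_skin * math.sqrt(f_ghz) + k_diel * f_ghz

def crossover_ghz(k_skin, k_diel):
    """Frequency where both terms are equal: k_skin*sqrt(f) = k_diel*f."""
    return (k_skin / k_diel) ** 2

K_SKIN = 6.0   # dB per sqrt(GHz), assumed
K_DIEL = 8.0   # dB per GHz, assumed

print("crossover at", crossover_ghz(K_SKIN, K_DIEL), "GHz")
for f in (0.1, 1.0, 5.0, 10.0):
    skin, diel = K_SKIN * math.sqrt(f), K_DIEL * f
    print(f"{f:5.1f} GHz: skin {skin:5.1f} dB, dielectric {diel:5.1f} dB")
```

Above the crossover frequency, a lower-loss dielectric helps more than a wider trace.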
Better dielectric
[Figure: channel loss for Rogers, FR4, and FR4 with stubs]
source: Alcatel, Tyco
Rogers is expensive but has the smallest loss
Minimizing reflections - the vias
Minimizing via stubs
Thinner PCBs are better, but sometimes impossible
Counter-boring
Blind vias
SMT technology
All are costly: 1.1x - 2x
[Figure: plated through hole, counter-bored via, blind via]
Connector technologies
[Figure: connector cross-sections showing line card (LC), backplane (BP), microvia, and trace. Panels: Standard press-fit; Side-Interface (Tyco); Surface-mount + microvia; Orthogonal, Teradyne (Differential Plated Through Hole)]
Stubs are a big problem in standard press-fit connectors
Side-Interface eliminates DC stubs and diff-pair length mismatch
Orthogonal interconnect (DPTH) eliminates the backplane
Surface-mount
Eliminating the backplane - orthogonal interconnect
source: Teradyne
No backplane trace
No backplane via-stub
Coax-like shielding and diff-pair matching in DPTH mid-plane
M. Cartier et al., "Optimized Signal Path for Orthogonal System Architectures," DesignCon 2005.
DPTH connector performance
[Figure: connector performance, two cases: no shared vias (non-DPTH) and shared vias (DPTH)]
Insertion loss of DPTH very small
Reflections minimized
NEXT and FEXT minimized
Outline
Show the path to efficient 100Tb/s systems
Look at all aspects of system design
High-speed link environment
Improving the channel
What can chips do?
New link design
Dealing with bandwidth-limited channels
This is an old research area
Textbooks on digital communications
Think modems, DSL
Standard approach requires high-speed A/Ds and digital signal processing
20 GS/s A/Ds are expensive
But can't directly apply their solutions
(Un)fortunately, need to rethink issues
Baseline Channels
[Figure: S21 [dB] vs. frequency [GHz] for three channels: Short ATCA BP, 3"; N6K BP, 26"; Legacy FR4 BP, 26", via stub]
Legacy (FR4): lots of reflections
Microwave engineered (N6K)
Emerging standards (IEEE 802.3ap, ATCA)
Capacity and MT data rates: the impact of noise
[Figure, two panels: "Capacity: thermal and phase noise" (capacity [Gb/s] vs. noise factor [dB]) and "Uncoded MT: thermal and phase noise" (data rate [Gb/s] vs. noise factor [dB]); curves for Short ATCA BP, N6K BP, and Legacy FR4 BP]
Capacity
Much higher than data rates in today's links
Noise
Thermal: 50 Ohm termination
Phase noise: best LC PLL (0.14% UI rms)
Uncoded MT
Half the capacity
BER target of 10^-15
Peak-power constraint
Coding can help
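The shape of the capacity curves can be reproduced with a Shannon integral over frequency. This is a rough sketch under loud assumptions: a flat transmit spectrum (no water-filling or peak-power constraint), a loss that grows linearly in dB with frequency, and a made-up low-frequency SNR. None of these values come from the measured channels.

```python
import math

# Shannon capacity of a lossy channel: C = integral of log2(1 + SNR(f)) df.
# The channel model (linear-in-dB loss) and all numbers are assumptions.

def capacity_gbps(snr0_db, loss_db_per_ghz, noise_factor_db,
                  f_max_ghz=20.0, df_ghz=0.01):
    """Midpoint-rule integral of spectral efficiency from 0 to f_max_ghz."""
    total = 0.0
    steps = int(round(f_max_ghz / df_ghz))
    for i in range(steps):
        f = (i + 0.5) * df_ghz
        snr_db = snr0_db - loss_db_per_ghz * f - noise_factor_db
        total += math.log2(1.0 + 10 ** (snr_db / 10.0)) * df_ghz
    return total   # GHz * bit/s/Hz = Gb/s

for nf_db in (0, 5, 10, 15, 20):
    c = capacity_gbps(snr0_db=60.0, loss_db_per_ghz=4.0, noise_factor_db=nf_db)
    print(f"noise factor {nf_db:2d} dB -> about {c:5.1f} Gb/s")
```

Each extra dB of noise factor removes capacity in every frequency bin that still has usable SNR, so capacity falls steadily as the noise factor rises.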
Removing ISI: baseband link
[Figure: baseband link block diagram. Labels: Tx Data, linear transmit equalizer (anticausal taps, causal taps), outP/outN outputs, channel, sampled data, decision-feedback equalizer (feedback taps, deadband, TapSel logic)]
Transmit and receive equalization
Changes signal to correct for ISI
Often easier to work at transmitter
DACs easier than ADCs
J. Zerbe et al., "Design, Equalization and Clock Recovery for a 2.5-10Gb/s 2-PAM/4-PAM Backplane Transceiver Cell," IEEE Journal of Solid-State Circuits, Dec. 2003.
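A toy simulation shows what the decision-feedback equalizer contributes. It is a minimal sketch, not the cited transceiver's implementation. The channel taps and bit pattern are invented, the link is noiseless, and the feedback taps are assumed to match the channel's post-cursor samples exactly.

```python
# Decision-feedback equalization on a toy channel (PAM2, levels +/-1, no noise).
# Channel taps and data pattern are hypothetical.

def channel(symbols, pulse):
    """Convolve the symbol stream with a symbol-spaced pulse response."""
    out = []
    for n in range(len(symbols)):
        y = 0.0
        for k, h in enumerate(pulse):
            if n - k >= 0:
                y += h * symbols[n - k]
        out.append(y)
    return out

def slice_plain(samples):
    """Slicer with no equalization."""
    return [1 if y >= 0 else -1 for y in samples]

def slice_dfe(samples, feedback_taps):
    """Subtract post-cursor ISI estimated from past decisions, then slice."""
    decisions = []
    for n, y in enumerate(samples):
        for k, tap in enumerate(feedback_taps, start=1):
            if n - k >= 0:
                y -= tap * decisions[n - k]
        decisions.append(1 if y >= 0 else -1)
    return decisions

pulse = [1.0, 0.7, 0.5]                      # main cursor + two post-cursors
bits = [1, 1, -1, -1, 1, -1, 1, 1, 1, -1] * 5
rx = channel(bits, pulse)

errors_plain = sum(d != b for d, b in zip(slice_plain(rx), bits))
errors_dfe = sum(d != b for d, b in zip(slice_dfe(rx, pulse[1:]), bits))
print("errors without DFE:", errors_plain, "| with DFE:", errors_dfe)
```

Here the post-cursor ISI (0.7 + 0.5) exceeds the main cursor, so the unequalized eye is closed. The feedback taps remove that ISI without amplifying noise, but each tap costs power.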
Pulse amplitude modulation
Binary (NRZ)
1 bit / symbol
Symbol rate = bit rate
Levels: 1, 0
PAM4
2 bits / symbol
Symbol rate = bit rate / 2
Levels: 00, 01, 11, 10
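The PAM4 level labels 00, 01, 11, 10 form a Gray code: adjacent levels differ in one bit, so a slicer error between neighbouring levels corrupts only one bit. The sketch below encodes a bit stream both ways. The normalized levels (-3, -1, +1, +3) and the lowest-to-highest ordering of the labels are assumptions.

```python
# PAM2 vs. Gray-coded PAM4 mapping. Level values and label order are assumed.

GRAY_PAM4 = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}

def pam2_encode(bits):
    """One bit per symbol: symbol rate equals bit rate."""
    return [1 if b else -1 for b in bits]

def pam4_encode(bits):
    """Two bits per symbol: symbol rate is half the bit rate."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

bits = [0, 0, 0, 1, 1, 1, 1, 0]
print(pam2_encode(bits))   # 8 symbols
print(pam4_encode(bits))   # 4 symbols
```

Halving the symbol rate lowers the bandwidth the channel must pass. The price is that each eye is one third of the binary eye for the same peak swing.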
Multi-level: offset and jitter are crucial
[Figure, three panels: data rate [Gb/s] vs. symbol rate [Gs/s] for PAM2, PAM4, PAM8, and PAM16 under (a) thermal noise, (b) thermal noise + offset, (c) thermal noise + offset + jitter]
To make better use of available bandwidth, need better circuits
PAM2/PAM4: robust candidates for next-generation links
Full ISI compensation too costly
[Figure, three panels: data rate [Gb/s] vs. symbol rate [Gs/s] for PAM2, PAM4, PAM8, and PAM16 under (a) thermal noise, (b) thermal noise + offset, (c) thermal noise + offset + jitter]
Today's links cannot afford to compensate all ISI
Too much power
Limits today's maximum achievable data rates
Capacity: Bit Loading
[Figure, two panels: # bits per dimension vs. frequency [GHz] at excess noise factor 0 dB and 20 dB, for Short ATCA BP, N6K BP, and Legacy FR4 BP]
Bandwidth is limited by attenuation and noise
Can't just keep increasing the signaling frequency
Need to focus on available bandwidth (at most 10-20 GHz)
Need circuits that can create/sense 4-8 bits/dim
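Bit loading assigns each frequency bin the number of bits its SNR can support: b = 0.5·log2(1 + SNR/Γ) bits per dimension. Γ = 1 (0 dB) gives the capacity loading, and a larger gap Γ models an uncoded system at a target BER. The sketch below is illustrative only. The SNR profile and the 12 dB gap are assumed values, not taken from the plotted channels.

```python
import math

# Bit loading per frequency bin with an SNR gap.
# The SNR profile and the gap value are hypothetical.

def bits_per_dimension(snr_db, gap_db=0.0, integer=False):
    """Bits per real dimension at a given SNR; gap_db = 0 is the capacity limit."""
    snr = 10 ** ((snr_db - gap_db) / 10.0)
    bits = 0.5 * math.log2(1.0 + snr)
    return math.floor(bits) if integer else bits

def data_rate_gbps(snr_db_per_bin, bin_width_ghz, gap_db=0.0, integer=False):
    """Sum over bins; a bin of width W carries 2W real dimensions per second."""
    return sum(2.0 * bin_width_ghz * bits_per_dimension(s, gap_db, integer)
               for s in snr_db_per_bin)

# Assumed profile: 45 dB near DC, falling 3 dB per GHz, in 1 GHz bins up to 15 GHz.
snr_profile = [45.0 - 3.0 * (f + 0.5) for f in range(15)]

capacity = data_rate_gbps(snr_profile, 1.0)
uncoded = data_rate_gbps(snr_profile, 1.0, gap_db=12.0, integer=True)
print(f"capacity loading: {capacity:.1f} Gb/s, uncoded integer loading: {uncoded:.1f} Gb/s")
```

With this profile the low-frequency bins call for around 7 bits per dimension, which is why the slide asks for circuits that can create and sense 4-8 bits/dim.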
Uncoded Multi-tone Bit Loading
[Figure, two panels: # bits per dimension vs. frequency [GHz] at excess noise factor 0 dB and 20 dB, for Short ATCA BP, N6K BP, and Legacy FR4 BP]
Integer constellations and target BER = 10^-15
Bandwidth not affected much (still 10-20 GHz)
In the high-noise case: less advantage over baseband
With coding, can improve by up to 2x, closer to capacity
Impact of jitter on baseband
[Figure, two panels: log10 BER vs. jitter factor [dB]; Legacy FR4 BP at 6, 8, 10, and 12 Gb/s; Short ATCA BP at 10, 15, 20, and 25 Gb/s]
Original jitter rms = 1.4% UI (ring-oscillator-based PLL)
With proper coding
Increase data rate
Relax PLL jitter spec: save power
BER vs. hardware complexity
[Figure, two panels: log10 BER vs. # feedback eq taps; Legacy FR4 BP at 6, 8, 10, and 12 Gb/s; Short ATCA BP at 15, 20, and 25 Gb/s]
Partially eliminate ISI (leave most of the reflections)
Let a simple code take care of the rest
Can recover from raw BER of 10^-5
And save up to 50 feedback taps: up to 15 mW/Gb/s in 0.13 µm
But, need to be careful
Always know what you're optimizing
Powerful coders/encoders often costly
Example: fastest RS (255,239) implementation
10-40 Gb/s throughput
Energy cost: 12 mW/Gb/s
50x area of the high-speed link (extensive parallelism)
Need to include the energy cost per bit in the code design spec
L. Song, M-L Yu, M.S. Shaffer, "10- and 40-Gb/s Forward Error Correction Devices for Optical Communications," IEEE Journal of Solid-State Circuits, vol. 37, no. 11, Nov. 2002.
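The trade-off can be put in numbers. An RS(n, k) code corrects up to (n - k)/2 symbol errors per block at a rate of k/n. The sketch below computes those parameters for RS(255,239). It then sets the codec's 12 mW/Gb/s against the up to 15 mW/Gb/s saved in feedback taps quoted on the previous slide. The two figures come from different designs, so the net saving is only a rough indication.

```python
# Reed-Solomon code parameters and a rough energy comparison.
# Energy figures are quoted from the slides; comparing them directly is an assumption.

def rs_parameters(n, k):
    """Basic properties of an RS(n, k) code."""
    parity = n - k
    return {
        "code_rate": k / n,                 # fraction of the line rate carrying data
        "overhead": parity / k,             # extra symbols per data symbol
        "correctable_symbols": parity // 2, # symbol errors corrected per block
    }

params = rs_parameters(255, 239)
print(params)

taps_saved_mw_per_gbps = 15.0   # up to 15 mW/Gb/s saved by dropping feedback taps
rs_codec_mw_per_gbps = 12.0     # RS(255,239) codec energy cost
net_saving = taps_saved_mw_per_gbps - rs_codec_mw_per_gbps
print("net saving:", net_saving, "mW/Gb/s, before counting the 50x area")
```

A powerful code can consume most of the energy it saves, which is the argument for putting energy cost per bit into the code design spec.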
Opportunity for coding
Break the coding/equalization/modulation hierarchy
Goal to minimize overall energy cost per bit
Proper coding can be more energy-efficient in achieving the low BER than modulation/equalization
Especially with lots of crosstalk and numerous small reflections
Need new paradigms in code development to specification
Non-Gaussian (system) noise
Circuit non-idealities
Crosstalk and residual channel memory (ISI)
Energy cost constraint on code performance
Bridging the gap: Multi-tone link
[Figure: multi-tone data rates with thermal noise; # bits/Hz vs. frequency [GHz]; Nelco: 64 Gb/s, FR4: 38 Gb/s]
A. Amirkhany, V. Stojanovic, M.A. Horowitz, "Multi-tone Signaling for High-speed Backplane Electrical Links," IEEE Global Telecommunications Conference, November 2004.
Bridging the gap: Multi-tone link
[Figure: multi-tone data rates with thermal noise; # bits/Hz and # levels vs. frequency [GHz]; Nelco: 64 Gb/s, FR4: 38 Gb/s]
[Figure: multi-tone transceiver block diagram. Data streams data0 ... dataN pass through LPF and BPF blocks and mixers with carriers e^{jw1t} ... e^{jwNt} at the transmitter and at the receiver]
Challenge: balancing the inter-symbol and inter-channel interference
Microwave filter techniques
Custom signal processing
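The mixing structure in the block diagram can be sketched in idealized form. Each data stream is multiplied by its own complex carrier, the tones are summed, and the receiver multiplies by the conjugate carrier and averages. The average stands in for the low-pass filter. The sketch assumes a perfect channel, carriers that complete a whole number of cycles per symbol, and complex baseband signals, so it shows no inter-symbol or inter-channel interference. A real link uses real passband signals and analog filters.

```python
import cmath
import math

# Idealized multi-tone modulation over one symbol period of n samples.
# Carrier k completes tone_cycles[k] whole cycles per symbol, so tones are orthogonal.

def modulate(symbols, tone_cycles, n):
    """Sum of data-weighted complex carriers, sampled n times per symbol."""
    return [sum(s * cmath.exp(2j * math.pi * c * t / n)
                for s, c in zip(symbols, tone_cycles))
            for t in range(n)]

def demodulate(signal, cycles):
    """Mix down with the conjugate carrier and average (an ideal low-pass filter)."""
    n = len(signal)
    return sum(x * cmath.exp(-2j * math.pi * cycles * t / n)
               for t, x in enumerate(signal)) / n

symbols = [1, -3, 3]        # one PAM symbol per tone (hypothetical data)
tone_cycles = [0, 2, 5]     # tone 0 is baseband; values are illustrative
line_signal = modulate(symbols, tone_cycles, n=64)

recovered = [demodulate(line_signal, c) for c in tone_cycles]
print([round(r.real, 6) for r in recovered])
```

Once the channel adds dispersion and the filters are no longer ideal, the tones leak into each other and into neighbouring symbols. That is the balancing act named on the slide.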
Conclusions
Interfaces are challenging system designs
Good space to explore system-level optimization
2-3x improvement in data rate possible
Better backplanes are around the corner
State-of-the-art baseband links (chips)
Far from utilizing the capacity of the channels
10-20x difference in data rates
Looking into multi-tone and coding to bridge the gap
Useful channel bandwidth 10-20 GHz
Focus on lower-speed precision circuits for higher-order constellations
Coding
If careful, can lower the energy cost per bit for the whole system
Problem formulation different in so many ways
Acknowledgments
MARCO Interconnect Focus Center
Jared Zerbe and Ravi Kollipara - Rambus
John D'Ambrosia - Tyco
IEEE 802.3ap, ATCA forum
Alcatel, Teradyne, Juniper Networks