### 100G SERDES Power

Phil Sun, Credo

IEEE 802.3ck Task Force

## <u>Introduction</u>

- The 100 Gbps SERDES power challenge, and potential solutions that balance TX/RX EQ, have been presented previously.
  - <u>sun\_3ck\_01a\_0518</u> introduced "balanced lower-power EQ", a training protocol, and silicon test results.
  - <u>healey 3ck 01b 0718</u> pointed out that "extensions to TX FFE" can improve margin while keeping C2M power low.
  - <u>welch 3ck adhoc 01 081518</u> concluded that the power budget for the C2M interface is very small for some future modules.
  - <u>lim 3ck 01b 0718</u> showed that 8 FFE taps may be needed for C2M and that SERDES power is a concern.
- This contribution discusses the power of different SERDES architectures.
- Power optimization may differ for each design. This contribution is mainly based on published papers and general design rules.

# Major Blocks of a Typical SERDES



- The high-power blocks are the TX driver, RX FFE/DFE, PLL/clock buffers, and CTLE. Some SERDES also have an ADC.
- The FFE and DFE may be implemented in the analog or digital domain, depending on whether there is a high-precision ADC.

#### SERDES Structure with "Balanced EQ"

- "Balanced EQ" is proposed to move part of the equalization from RX to TX to save power.
- For C2M, the module RX is CTLE only and the host has an extended TX FFE. There are two possible structures, depending on the module TX:
  - 1. Asymmetric structure: module has short TX FFE (e.g. 4 taps with 2 pre). Host has full RX.
  - 2. Symmetric structure: module has extended TX FFE. Host RX does not have long FFE/DFE.

|                           | Module TX                   | Module RX                          | Host TX                      | Host RX              |
|---------------------------|-----------------------------|------------------------------------|------------------------------|----------------------|
| Asymmetric<br>Balanced EQ | Short FFE (e.g. 4 taps)     | CTLE only                          | Extended FFE (e.g. 11 taps)  | Full RX              |
| Symmetric<br>Balanced EQ  | Extended FFE (e.g. 11 taps) | CTLE only                          | Extended FFE (e.g. 11 taps)  | Shorter<br>Equalizer |
| Traditional<br>Structure  | Short FFE (e.g. 4 taps)     | CTLE + FFE/DFE with 8 post cursors | Regular TX FFE (e.g. 6 taps) | Full RX              |

Equalization Configuration (assuming 2 pre and 8 post cursors for C2M)

# PAM4 SERDES Power Survey

| Reference        | [1] Dickson<br>ISSCC 2017 | [2] Frans<br>JSSC 2017                       | [3] Im<br>ISSCC 2017                   | [4] Upadhyaya<br>ISSCC 2018                                               | [5] Wang<br>ISSCC 2018                                                                    | [6] Depaoli<br>ISSCC 2018                  | [7] Menolfi<br>ISSCC 2018 |
|------------------|---------------------------|----------------------------------------------|----------------------------------------|---------------------------------------------------------------------------|-------------------------------------------------------------------------------------------|--------------------------------------------|-------------------------|
| Technology       | 14nm                      | 16nm                                         | 16nm                                   | 16nm                                                                      | 16nm                                                                                      | 28nm                                       | 14nm                    |
| Data Rate [Gb/s] | 56                        | 56                                           | 56                                     | 56                                                                        | 63.375                                                                                    | 64                                         | 112                     |
| TX FFE           | 3-tap                     | 3-tap                                        | -                                      | 4-tap                                                                     | 3-tap                                                                                     | 4-tap                                      | 8-tap                   |
| RX EQ            | -                         | CTLE<br>24-tap FFE<br>1-tap DFE<br>ADC based | CTLE<br>10-tap direct-<br>feedback DFE | CTLE<br>14-tap FFE<br>1-tap DFE                                           | CTLE                                                                                      | CTLE                                       | -                       |
| ADC Res (bits)   | TX Only                   | 8                                            | Non-ADC                                | 7<br>3 if FFE/DFE Off                                                     | 6<br>2 for easy channels                                                                  | Non-ADC                                    | TX Only                 |
| TX Power (mW)    | 101                       | 140                                          | -                                      | -                                                                         | 89.7                                                                                      | 135                                        | 264<br>34 for FIR       |
| RX Power (mW)    | -                         | 370<br>DSP Power<br>not included             | 230                                    | -                                                                         | 283.9 for 6b ADC<br>100 for 2b ADC<br>FFE, Deserializer, PLL,<br>CDR are not included     | 180<br>110 if scaled for 56G<br>and 16nm** | -                       |
| Total Power (mW) | -                         | 510<br>DSP Power<br>not included             | 350*                                   | 545 (PMA 325,<br>digital 220)<br>360 w/o FFE/DFE<br>(PMA 295, digital 65) | 373.6 for 6b ADC<br>189.7 for 2b ADC<br>(FFE, Deserializer, PLL,<br>CDR are not included) | 315<br>193 if scaled for 56G<br>and 16nm** | -                       |

- Most of the listed data rates are close to 56 Gbps. For the same structure, power will almost double at 112 Gbps, since the majority of circuit power scales with clock rate/bandwidth.
- \* [3] total power is around 350mW if assuming a 120mW TX.
- \*\*Assuming 30% power saving from 28nm to 16nm.
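
The scaled figures quoted for [6] appear consistent with scaling power linearly with data rate and then applying the stated 30% node saving. A minimal sketch of that arithmetic follows; the function name `scale_power` and the strictly linear rate model are assumptions for illustration, not something taken from the cited papers.

```python
def scale_power(power_mw, rate_from_gbps, rate_to_gbps, node_saving=0.0):
    """Scale power linearly with data rate, then apply a fractional node saving.

    Assumed model: power is proportional to data rate, and a process change
    removes a fixed fraction (e.g. 0.30 for 28nm -> 16nm, as in the footnote).
    """
    return power_mw * (rate_to_gbps / rate_from_gbps) * (1.0 - node_saving)


# [6]: 64 Gb/s in 28nm, scaled to 56 Gb/s in 16nm
rx_6_scaled = scale_power(180, 64, 56, node_saving=0.30)
total_6_scaled = scale_power(315, 64, 56, node_saving=0.30)
print(round(rx_6_scaled), round(total_6_scaled))  # 110 193

# Same node, 56G -> 112G: power doubles under this model
print(scale_power(110, 56, 112))  # 220.0
```

Under this model the 56G-to-112G step is a plain factor of two, which matches the "almost double" statement above.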

## PAM4 SERDES Power Survey Summary

- Different receiver architectures published at ISSCC and in JSSC are listed: CTLE only, direct-feedback DFE, and ADC-based.
- On average, TX power is about 120mW for 50G and 240mW for 100G.
- [4] and [5] show that ADC-based receiver power can be reduced by 185.0mW and 183.9mW, respectively, by turning off RX FFE/DFE. In [4], enabling RX FFE/DFE increases SERDES power by about 51% (545mW vs. 360mW). If scaled to 100G, this difference will be about 370mW. Because the same design can be used for both long reach and short reach with optimized power, design cost is reduced.
- Can receiver FFE/DFE be turned off for C2M channels?
  - sun nea 01a 0517 shows that TX FIR effectively cancels bad reflections for a 33dB channel.
  - <u>sun 3ck 01a 0518</u> shows that the channel output eye is wide open for a 14dB channel with extended TX FIR. No RX FFE/DFE will be needed.
  - <u>twombly 3ck 01a 0718</u> shows good performance on a 30dB channel by extending TX FIR. Only a 3-tap FFE and DFE are needed on the RX side to deal with material loss.
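
The FFE/DFE power figures in this summary can be rebuilt from the survey table. The sketch below uses the total-power entries of the two ADC-based designs that report a reduced mode ([4] and [5] in the table); the variable names are illustrative, and the factor of two for 100G is the same rate-scaling assumption used in the survey notes.

```python
# Total power in mW, copied from the survey table.
with_eq = {"[4]": 545.0, "[5]": 373.6}      # RX FFE/DFE on ([5]: 6b ADC)
without_eq = {"[4]": 360.0, "[5]": 189.7}   # RX FFE/DFE off ([5]: 2b ADC)

# Power saved by turning the RX FFE/DFE off, per reference.
savings = {ref: with_eq[ref] - without_eq[ref] for ref in with_eq}

# Relative increase needed to enable RX FFE/DFE in [4].
increase_pct = 100.0 * savings["[4]"] / without_eq["[4]"]

# Assumed doubling from ~56G to ~112G.
saving_100g_mw = 2 * savings["[4]"]

print(round(savings["[4]"], 1), round(savings["[5]"], 1))  # 185.0 183.9
print(round(increase_pct), saving_100g_mw)                 # 51 370.0
```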

#### PAM4 SERDES Power Survey Summary cont.





Eye after 3 Tap TX FFE

Eye after 19 Tap TX FFE

Eye diagram of 13.2dB channel. Most of the 19 TX FFE taps are zero. [sun 3ck 01a 0518]







TX FIR effectively cancels bad reflections for a 33dB channel. [sun\_nea\_01a\_0517]

C2M or even tougher channel output eye should be wide open with extended TX FIR.

#### 100G C2M SERDES Power - 8 post cursors

| Architecture                                 | Balanced EQ (1. Asymmetric, 2. symmetric)*                                              | 3. Direct Feedback**                                           | 4. ADC Based                                                                                                       |
|----------------------------------------------|-----------------------------------------------------------------------------------------|----------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------|
| Equalization                                 | TX: FIR (2/4 taps for asymmetric structure, 2/11 taps for symmetric structure) RX: CTLE | TX: FIR (2/4)<br>RX: CTLE, 8-tap direct-<br>feedback DFE       | TX: FIR (2/4)<br>RX: CTLE, 6-bit ADC, 8<br>postcursor digital FFE                                                  |
| TX Power (mW)                                | 247 (asymmetric structure) 277 (symmetric structure) (by scaling TX FIR of [7])         | 247 (by scaling TX FIR of [7] to 4 taps)                       | 247<br>(by scaling TX FIR of [7] to 4<br>taps)                                                                     |
| RX Power (mW)                                | 220<br>(by scaling [6] to 112G)                                                         | 460 (by scaling [3] to 112G, 2 DFE tail tap power is very low) | 763 (568 by scaling [5] to 112G; 115 for FFE by scaling FIR of [7] for 6b input; 80 for PLL, deserializer and CDR) |
| Relative total Power (mW)                    | <b>0</b> (467 as Baseline for asymmetric)<br><b>30</b> (497 for symmetric)              | 240<br>(total 707)                                             | 543<br>(total 1010)                                                                                                |
| Power Difference for 800G<br>Module C2M (mW) | 0<br>(Total 3736)                                                                       | <b>1,920</b> (Total <b>5656</b> )                              | <b>4,344</b> (Total <b>8080</b> )                                                                                  |

- The power of the different SERDES structures is derived from the survey results. 8 postcursor taps are assumed.
- The asymmetric structure adds 30mW of power on the switch (0.96W for 32 ports) in exchange for the lowest module power. The symmetric structure enables close to the lowest-power RX for both module and host.
- \*\* DFE tap 1 timing is tight. It is assumed that it can be implemented in other, power-equivalent ways for C2M.
- Total power ratio for architecture 1, 2, 3, and 4 is **1 : 1.06 : 1.51 : 2.16**.
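
The table entries and the ratio above can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: only the FIR share of the [7] transmitter is scaled, linearly with tap count; RX power values are copied from the table; and an 800G module is taken to be 8 lanes of 100G (implied by the table totals, not stated explicitly). All names are illustrative.

```python
# Per-lane C2M power at 112G, in mW, rebuilt from the survey numbers.
TX_TOTAL_7 = 264.0   # [7] total TX power with an 8-tap FIR
TX_FIR_7 = 34.0      # [7] FIR share of that total
TX_TAPS_7 = 8
LANES_PER_800G = 8   # assumption: 8 x 100G lanes per 800G module


def tx_power(taps):
    """Scale only the FIR share of [7] linearly with tap count (assumed model)."""
    return TX_TOTAL_7 - TX_FIR_7 + TX_FIR_7 * taps / TX_TAPS_7


architectures = {
    # name: (TX FIR taps, RX power in mW copied from the table)
    "asymmetric": (4, 220),
    "symmetric": (11, 220),
    "direct_feedback": (4, 460),
    "adc_based": (4, 763),
}

totals = {
    name: round(tx_power(taps)) + rx
    for name, (taps, rx) in architectures.items()
}
baseline = totals["asymmetric"]

for name, total in totals.items():
    ratio = round(total / baseline, 2)
    module_delta = LANES_PER_800G * (total - baseline)
    print(name, total, ratio, module_delta)
# asymmetric 467 1.0 0
# symmetric 497 1.06 240
# direct_feedback 707 1.51 1920
# adc_based 1010 2.16 4344
```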

# Module Power Budget

#### 2x400GBase DR4: Gen 1 excluding Electrical I/O

Lowest Max Power (ex. electrical I/O) ~ 9.9 W

Highest Max Power (ex. electrical I/O) ~ 16.8 W



Power Available for Electrical I/O ~ 5.1 W



Power Available for Electrical I/O ~ - 1.8 W

- <u>welch 3ck adhoc 01 081518</u> analyzed the power budget for electrical I/O.
- Power available for C2M is 5.1W in the best case. In the C2M power table, only "balanced EQ" meets this budget.
- "Balanced EQ" needs extra logic for adaptive tuning. If the management network is used for this purpose, the extra logic is mainly for register access and its power should be negligible.
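
As a quick check of the budget statement, the 800G totals from the C2M power table (8 postcursors) can be compared against the best-case 5.1W figure. The sketch takes the table totals as-is and treats them as the module's electrical I/O power, which is the comparison made above; the dictionary keys are illustrative names.

```python
BUDGET_BEST_CASE_W = 5.1  # best-case power available for electrical I/O

# 800G module C2M totals in W, copied from the C2M power table.
module_totals_w = {
    "balanced_eq": 3.736,
    "direct_feedback": 5.656,
    "adc_based": 8.080,
}

# True when the architecture fits within the best-case budget.
fits = {name: total <= BUDGET_BEST_CASE_W for name, total in module_totals_w.items()}

for name, total in module_totals_w.items():
    margin = round(BUDGET_BEST_CASE_W - total, 3)
    print(name, "fits" if fits[name] else "exceeds", margin)
```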

#### C2M SERDES Power – 5 post cursors

- Besides the implementations in the survey table, the FFE can also be implemented in the analog domain. Circuit distortion is a challenge if too many FFE taps are required.
- Assuming 5 FFE postcursors are enough after tightening the channel or relaxing the pre-FEC BER target, the power ratio of C2M with asymmetric TX FFE, symmetric TX FFE, and analog RX FFE is about 1.00 : 1.04 : 1.40.
- The TX FIR has 4 or 11 taps depending on whether there is an RX FFE. The TX in this survey is different from [7]: its tail taps are assumed to have fewer bits than the major taps, and its TX power is also lower.



# **Conclusions**

- Multiple SERDES architectures are investigated for C2M with 8 or 5 postcursors. Compared to "balanced EQ", power of other architectures is 40% to 116% higher.
- If 8 postcursors are defined for robust performance, 800G module power is 1.9W to 4.3W lower by using "balanced EQ" for C2M.
- With an extended TX FIR, an ADC-based receiver can switch to a low-power mode and save about 370mW per 100G lane. The same design can be used for both long reach and short reach to save design cost.
- The symmetric "balanced EQ" structure has similar power to the asymmetric structure, and it also allows the host SERDES to use a lower-power RX.

## References

- [1] T. O. Dickson, et al., "A 1.8pJ/b 56Gb/s PAM-4 Transmitter with Fractionally Spaced FFE in 14nm CMOS," ISSCC, pp. 118–119, Feb. 2017.
- [2] Y. Frans, et al., "A 56-Gb/s PAM4 Wireline Transceiver Using a 32-Way Time-Interleaved SAR ADC in 16-nm FinFET," *IEEE JSSC*, vol. 52, no. 4, pp. 1101–1110, Apr. 2017.
- [3] J. Im, et al., "A 40-to-56Gb/s PAM-4 Receiver with 10-Tap Direct Decision-Feedback Equalization in 16nm FinFET," ISSCC, pp. 114–115, Feb. 2017.
- [4] P. Upadhyaya, et al., "A Fully Adaptive 19-to-56Gb/s PAM-4 Wireline Transceiver with a Configurable ADC in 16nm FinFET," ISSCC, pp. 108–109, Feb. 2018.
- [5] L. Wang, et al., "A 64Gb/s PAM-4 Transceiver Utilizing an Adaptive Threshold ADC in 16nm FinFET," ISSCC, pp. 110–111, Feb. 2018.
- [6] E. Depaoli, et al., "A 4.9pJ/b 16-to-64Gb/s PAM-4 VSR Transceiver in 28nm FDSOI CMOS," ISSCC, pp. 112–113, Feb. 2018.
- [7] C. Menolfi, et al., "A 112Gb/s 2.6pJ/b 8-Tap FFE PAM-4 SST TX in 14nm CMOS," ISSCC, pp. 103–104, Feb. 2018.
- [8] http://www.ieee802.org/3/100GEL/public/18\_03/farjadrad\_100GEL\_01a\_0318.pdf
- [9] http://www.ieee802.org/3/ad\_hoc/ngrates/public/17\_05/sun\_nea\_01a\_0517.pdf

Thanks!

# **Backup Slides**

# Long-Reach SERDES Power

- Channel quality impacts the equalization and power of KR/CR SERDES.
- <u>farjadrad 100GEL 01a 0318</u> shows that 32-tap FFE power is about 400mW. The same number of TX taps costs about 100mW because the input is 2 bits instead of 8. ADC ENOB can also be reduced to save power with balanced EQ [<u>sun nea 01a 0517</u>]. In this case, the power ratio of a SERDES with heavy RX to one with "balanced EQ" is more than 1.4:1.
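
The 400mW versus 100mW comparison is consistent with FIR power scaling in proportion to input word width. A minimal sketch under that assumption (linear scaling with bit width is an illustrative model, not a claim from the cited contribution):

```python
RX_FFE_32TAP_MW = 400.0  # 32-tap RX FFE power quoted above
RX_INPUT_BITS = 8        # ADC samples feeding the RX FFE
TX_INPUT_BITS = 2        # one PAM4 symbol feeding the TX FIR

# Assumed model: FIR power is proportional to input bit width for equal tap count.
tx_fir_32tap_mw = RX_FFE_32TAP_MW * TX_INPUT_BITS / RX_INPUT_BITS
print(tx_fir_32tap_mw)  # 100.0
```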