# 448G/Lane Modulation & FEC

Halil CIRIT, Meta; Sanjeev GUPTA, Meta

May 29, 2025



### OUTLINE



- 448G/Lane Simulations
  - Channel Models
  - Simulation Results with PAM
  - Bidirectional Signaling
- New FEC Proposal
  - 2D RS FEC Results (Under Burst Error)
  - Complexity and Latency Analysis
- Conclusion (Summary and Future Work)

## 448G/Lane Channel Model

Figure: Backplane Cable Cartridge for Scale Up Applications. Elements shown: die (SERDES), interposer, MTIA package substrate (~35 mm), PCB, CPC connectors, 30 AWG 300 mm cables, backplane connectors, and a 26 AWG 1 m backplane cable.

IEEE channels (+ Package Substrate + Interposer):

- shah\_e4ai\_02\_250430\_25dB\_RO\_85G\_XT\_50dB
- shah\_e4ai\_02\_250430\_25dB\_RO\_85G\_XT\_55dB
- shah\_e4ai\_02\_250430\_25dB\_RO\_85G\_XT\_60dB
- shah\_e4ai\_02\_250430\_30dB\_RO\_85G\_XT\_50dB
- shah\_e4ai\_02\_250430\_30dB\_RO\_85G\_XT\_55dB
- shah\_e4ai\_02\_250430\_30dB\_RO\_85G\_XT\_60dB
- shah\_e4ai\_02\_250430\_35dB\_RO\_85G\_XT\_50dB
- shah\_e4ai\_02\_250430\_35dB\_RO\_85G\_XT\_55dB
- shah\_e4ai\_02\_250430\_35dB\_RO\_85G\_XT\_60dB
- tracy\_efai\_250430\_DAC

Limiting Factors:

- Channel insertion loss
- Resonance/roll-off within the pass band
- High crosstalk (NEXT and FEXT)

We acknowledge IEEE and connector vendor for kindly providing the channel models: https://www.ieee802.org/3/ad\_hoc/E4AI/public/channel/index.html

#### **S-Parameter Plots**

# Meta



Ask from Industry:

- Provide SERDES Interposer + Package models
- Find ways to reduce crosstalk and eliminate resonance around Nyquist frequency



### Channel: shah\_e4ai\_02\_250430\_25dB\_RO\_85G\_XT\_55dB + Package + Interposer



|                     | PAM-6  | PAM-6  | PAM-8  | PAM-8  |
|---------------------|--------|--------|--------|--------|
| Component BWs (GHz) | 90     | 100    | 75     | 100    |
| Slicer SNR          | 23.7   | 23.9   | 26.3   | 27.0   |
| SNR Margin (@1e-3)  | 3.6    | 3.8    | 3.8    | 4.5    |
| DFE SER             | 1.3e-4 | 8.1e-5 | 1.5e-5 | 3.3e-6 |
| MLSE SER            | 3.0e-6 | 1.8e-6 | 3.6e-6 | 1.3e-6 |

**Performance Matrix** 



#### **Performance Matrix**

|                     | PAM-6  | PAM-6  | PAM-8   | PAM-8    |
|---------------------|--------|--------|---------|----------|
| Component BWs (GHz) | 90     | 100    | 75      | 100      |
| Slicer SNR          | 23.9   | 24.1   | 26.5    | 27.4     |
| SNR Margin (@1e-3)  | 3.8    | 4.0    | 4.0     | 4.9      |
| DFE SER             | 1.1e-4 | 7.1e-5 | 1.08e-5 | < 1.0e-6 |
| MLSE SER            | 1.8e-6 | 2.3e-6 | 2.20e-6 | < 1.0e-6 |





|                     | PAM-6  | PAM-6  | PAM-8  | PAM-8  |
|---------------------|--------|--------|--------|--------|
| Component BWs (GHz) | 90     | 100    | 75     | 100    |
| Slicer SNR          | 21.4   | 21.7   | 24.9   | 25.3   |
| SNR Margin (@1e-3)  | 1.3    | 1.6    | 2.4    | 2.8    |
| DFE SER             | 2.2e-3 | 1.7e-3 | 4.7e-4 | 1.8e-4 |
| MLSE SER            | 9.6e-5 | 9.0e-5 | 1.6e-5 | 1.1e-5 |

**Performance Matrix** 



|                     | PAM-6  | PAM-6  | PAM-8  | PAM-8  |
|---------------------|--------|--------|--------|--------|
| Component BWs (GHz) | 90     | 100    | 75     | 100    |
| Slicer SNR          | 21.9   | 22.3   | 25.1   | 25.7   |
| SNR Margin (@1e-3)  | 1.8    | 2.2    | 2.6    | 3.2    |
| DFE SER             | 1.5e-3 | 8.6e-4 | 3.3e-4 | 4.8e-5 |
| MLSE SER            | 5.7e-5 | 3.0e-5 | 1.4e-5 | 7.6e-6 |

**Performance Matrix** 

# Channel: tracy\_efai\_250430\_DAC + Package + Interposer





#### Channel Model Response

Performance Matrix

|                     | PAM-6  | PAM-6  | PAM-8  | PAM-8  |
|---------------------|--------|--------|--------|--------|
| Component BWs (GHz) | 90     | 100    | 75     | 100    |
| Slicer SNR          | 22.1   | 22.2   | 25.1   | 25.8   |
| SNR Margin (@1e-3)  | 2.0    | 2.1    | 2.6    | 3.3    |
| DFE SER             | 1.1e-3 | 9.6e-4 | 2.8e-4 | 3.1e-5 |
| MLSE SER            | 2.7e-5 | 2.3e-5 | 1.4e-5 | 8.7e-6 |
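
The SNR margin rows in the performance matrices can be sanity-checked with a textbook calculation. The sketch below is not the simulation used for these slides; it assumes an ideal AWGN slicer, the standard PAM-M symbol error rate formula, and BER approximated as SER divided by the bits per symbol (3 for PAM-8, and an assumed 2.5 for PAM-6). Under those assumptions the required SNR at 1e-3 BER comes out at about 20.1 dB for PAM-6 and about 22.5 dB for PAM-8, which is close to slicer SNR minus margin in the tables.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pam_ser(snr_db, m):
    """Symbol error rate of uniform PAM-M over AWGN at a given slicer SNR (dB)."""
    snr = 10.0 ** (snr_db / 10.0)
    return 2.0 * (1.0 - 1.0 / m) * q_func(math.sqrt(3.0 * snr / (m * m - 1)))

def required_snr_db(m, target_ber, bits_per_symbol):
    """Bisect for the SNR (dB) at which BER ~= SER / bits_per_symbol hits the target.

    The BER approximation and bits_per_symbol values are assumptions of this sketch.
    """
    lo, hi = 0.0, 40.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if pam_ser(mid, m) / bits_per_symbol > target_ber:
            lo = mid  # still too many errors: need more SNR
        else:
            hi = mid
    return hi

def snr_margin_db(slicer_snr_db, m, bits_per_symbol, target_ber=1e-3):
    """Margin = measured slicer SNR minus the SNR required at the target BER."""
    return slicer_snr_db - required_snr_db(m, target_ber, bits_per_symbol)

if __name__ == "__main__":
    print(round(snr_margin_db(26.3, 8, 3.0), 1))  # PAM-8, 75 GHz column of the first table
    print(round(snr_margin_db(23.7, 6, 2.5), 1))  # PAM-6, 90 GHz column of the first table
```
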

### **Bidirectional Signaling (2x224G PAM-4)**




- Due to high insertion loss, TX noise limits performance, unlike in RX-limited unidirectional systems

| Parameter                                                                               | Pessimistic | Expected | Optimistic |
|-----------------------------------------------------------------------------------------|-------------|----------|------------|
| Cancellation Level (dB)                                                                 | 10          | 20       | 30         |
| Bandwidth Mismatch (GHz)                                                                | 10          | 5        | 3          |
| SNR Correlation Coeff                                                                   | 0.5         | 0.7      | 0.9        |
| Jitter Correlation Coeff                                                                | 0.5         | 0.7      | 0.9        |
| ~ Slicer SNR Margin (dB): 224G Channel                                                  | -6.4        | -3.3     | 2.1        |
| ~ Slicer SNR Margin (dB): tracy\_efai\_250430\_DAC + Package                            | -0.5        | 1.6      | 5.0        |
| ~ Slicer SNR Margin (dB): shah\_e4ai\_02\_250430\_25dB\_RO\_85G\_XT\_60dB + Package     | -1.2        | 1.0      | 4.6        |

### New FEC Proposal

- 2D codes are both simple and powerful, especially for constructing very long block codes from smaller, more manageable components
- Iterative Decoding
  - Each bit/symbol is encoded by two component codewords (one row, one column)
  - Decode rows and columns alternately
- Example:
  - Assume a two-error-correcting component code
  - Far more than 2 errors in a row/column can be corrected by iterative decoding
- Minimum undecodable pattern: $(t_1+1)\times(t_2+1)$
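
The iterative row/column decoding described above can be sketched on a grid of error locations. This is an idealized illustration, not the decoder evaluated in these slides: it assumes each component decoder clears a row or column when it holds at most t errors and otherwise leaves it untouched (miscorrection is ignored). The function name and arguments are hypothetical.

```python
def iterative_decode(errors, t_row=2, t_col=2, max_iters=8):
    """Return the error positions left after alternating row/column decoding.

    errors: iterable of (row, col) positions that are in error.
    Assumes an ideal bounded-distance component decoder with no miscorrection.
    """
    errors = set(errors)
    for _ in range(max_iters):
        before = len(errors)

        # Row pass: any row with at most t_row errors is fully corrected.
        rows = {}
        for r, c in errors:
            rows.setdefault(r, []).append((r, c))
        for cells in rows.values():
            if len(cells) <= t_row:
                errors.difference_update(cells)

        # Column pass on whatever the row pass left behind.
        cols = {}
        for r, c in errors:
            cols.setdefault(c, []).append((r, c))
        for cells in cols.values():
            if len(cells) <= t_col:
                errors.difference_update(cells)

        if not errors or len(errors) == before:
            break  # decoded, or stuck on an undecodable pattern
    return errors

if __name__ == "__main__":
    burst = {(0, c) for c in range(6)}                      # 6 errors in one row
    square = {(r, c) for r in range(3) for c in range(3)}   # (t+1) x (t+1) block
    print(len(iterative_decode(burst)), len(iterative_decode(square)))
```

With t = 2, the six-error burst in a single row is cleared by the column pass, while the 3x3 block stalls because every row and every column holds t + 1 errors.
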





Figure: iterative decoding example (1st iteration $\rightarrow$ 2nd iteration $\rightarrow$ 3rd iteration)




- 2D RS FEC: RS component codes with t=2 or t=3
- Gain with respect to RS10(544,514) + Hamming(128,120) (KP4+Hamming):
  - >1.5 dB with RS9(69,65)
  - >2 dB with RS9(102,96)



### **Complexity and Latency Analysis**


#### Total Complexity and Latency of some 2D-RS codes at 1.6T PCS wrt KP4

|                                                        | RS9(69,65): Lower Complexity (8 iterations) | RS9(69,65): Lower Latency (6 iterations) | RS9(102,96) | RS10(544,514), KP4 (+Hamming) |
|--------------------------------------------------------|---------------------------------------------|------------------------------------------|-------------|-------------------------------|
| Overhead (OH)                                          | 12.68%                                      | 12.68%                                   | 12.89%      | 5.84% (12.89%)                |
| Total LUT Utilization (FPGA)                           | 100K                                        | 181K                                     | 171K        | 171K <sup>(*)</sup>           |
| Total Latency (ns)                                     | 339.8                                       | 201                                      | 673.3       | 83.2 <sup>(*)</sup>           |
| SNR Gain wrt (KP4+Hamming) @10<sup>-15</sup> BER (dB)  | 1.8                                         | 1.7                                      | 2.3         | 0                             |

(\*) Additional complexity & latency due to Hamming Code
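
The overhead figures in the table follow from the code parameters. The short sketch below assumes the 2D code applies the same RS(n,k) component code to both rows and columns, so its rate is (k/n) squared, and that the KP4+Hamming overhead is the product of the two code expansions; the helper name is hypothetical.

```python
def code_overhead(*codes):
    """Overhead of a product/concatenation of (n, k) codes: prod(n/k) - 1."""
    expansion = 1.0
    for n, k in codes:
        expansion *= n / k
    return expansion - 1.0

if __name__ == "__main__":
    cases = {
        "2D RS9(69,65)": [(69, 65), (69, 65)],
        "2D RS9(102,96)": [(102, 96), (102, 96)],
        "KP4 RS10(544,514)": [(544, 514)],
        "KP4 + Hamming(128,120)": [(544, 514), (128, 120)],
    }
    for name, codes in cases.items():
        print(f"{name}: {100 * code_overhead(*codes):.2f}%")
```
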

- High latency due to iterations can be decreased with further optimizations such as:
  - Increase the complexity to a level equivalent to current KP4 for 2D-RS codes
  - Incremental computation of syndromes for subsequent iterations
  - Number of iterations can be decreased with marginal performance penalty
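
One possible reading of "incremental computation of syndromes" is sketched below: when a row decoder corrects a symbol, the syndromes of the crossing column codeword are updated with a single term instead of being recomputed from scratch. The field polynomial x^9 + x^4 + 1, the code roots alpha^1..alpha^4, and all function names are assumptions of this sketch, not details taken from the slides.

```python
M = 9
N = (1 << M) - 1          # 511 nonzero elements of GF(2^9)
PRIM_POLY = 0x211         # x^9 + x^4 + 1 (assumed field polynomial)

# Exponential and logarithm tables for GF(2^9).
EXP = [0] * (2 * N)
LOG = [0] * (N + 1)
_x = 1
for _i in range(N):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x >> M:
        _x ^= PRIM_POLY
for _i in range(N, 2 * N):
    EXP[_i] = EXP[_i - N]

def gf_mul(a, b):
    """Multiply two GF(2^9) elements using the log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def syndromes(word, n_syn):
    """Full computation: S_j = sum_i word[i] * alpha^(j*i), for j = 1..n_syn."""
    out = []
    for j in range(1, n_syn + 1):
        s = 0
        for pos, sym in enumerate(word):
            s ^= gf_mul(sym, EXP[(j * pos) % N])
        out.append(s)
    return out

def update_syndromes(syn, pos, delta):
    """Incremental update after symbol `pos` changes by `delta` (XOR difference)."""
    return [s ^ gf_mul(delta, EXP[(j * pos) % N])
            for j, s in enumerate(syn, start=1)]

if __name__ == "__main__":
    word = [(37 * i + 5) % 512 for i in range(69)]   # arbitrary 9-bit symbols
    syn = syndromes(word, 4)
    word[10] ^= 0x55                                 # a correction made by the other dimension
    print(update_syndromes(syn, 10, 0x55) == syndromes(word, 4))
```

The update costs one multiplication per syndrome per corrected symbol, whereas the full computation touches every symbol of the codeword.
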

#### Conclusion

- With the current package and channel models, PAM-6, PAM-8, or bidirectional signaling are feasible options
  - Bidirectional signaling needs more analog models for the hybrid subtractor
- Channels and crosstalk need further improvement
  - More package models are needed
- 2D RS codes offer more coding gain than KP4 (+Hamming)
  - Minimal additional latency: >1.5 dB coding gain
  - Same complexity: >2 dB coding gain
- Work is ongoing on alternative decoding strategies for 2D RS codes to lower complexity

# **Thank You!**



# **Reserved Slides**



#### Channels



### **Communication Topologies**




#### Uni-Directional Transmission

- Higher Bandwidth
  - Higher Insertion Loss
  - Increased Crosstalk Interference
- Higher Modulation Order
  - Increased SNR Penalty
- One-Way Communication
  - Simple RX Design
  - More Signal Power

#### Simultaneous Bi-Directional Transmission

- Same Bandwidth
  - No Extra Loss
  - No Extra Crosstalk Interference
- Same Modulation Order
  - Same SNR Requirements
- Two-Way Communication
  - Complex RX Design with additional components
  - Very high interference

## **Channel Model and Additional Components**


#### Major Differences over Uni-Directional Channel

- Overlap of Desired Inbound Signal and Very Strong Outbound Signal: Figure (a)
  - Solution: Hybrid Circuitry for cancellation by subtraction of overlapped signal: Figure (b)
  - To utilize dynamic range of ADC better
  - Requires a Replica Driver
- Echoes of Outbound Signal: Figure (c)
  - Analog Echo Cancellation (AEC)
    - Strong Near-End Echo ( $\propto S_{11}$  of Channel)
  - Digital Echo Cancellation (DEC)
    - Weak Far-End Echo ( $\propto S_{12} \& S_{21}$  of Channel)
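
As a generic illustration of digital echo cancellation, the sketch below adapts an FIR estimate of the echo path with the LMS algorithm and subtracts the estimated echo of the outbound symbols from the received samples. It is a textbook technique under assumed values (echo taps, step size, signal levels, and function names are all hypothetical), not the canceller modeled in these slides.

```python
import random

def lms_echo_canceller(tx, rx, n_taps=4, mu=0.02):
    """Adapt FIR taps so that the filtered outbound symbols match the echo in rx.

    Returns (taps, residual), where residual is rx with the estimated echo removed.
    """
    w = [0.0] * n_taps
    residual = []
    for n in range(len(rx)):
        # Most recent outbound symbols, newest first (zeros before start-up).
        window = [tx[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        echo_est = sum(wk * xk for wk, xk in zip(w, window))
        e = rx[n] - echo_est
        residual.append(e)
        w = [wk + mu * e * xk for wk, xk in zip(w, window)]  # LMS tap update
    return w, residual

def demo(n_samples=5000, seed=0):
    """Synthetic test: PAM-4 outbound symbols, assumed 3-tap echo, weak inbound signal."""
    rng = random.Random(seed)
    levels = [-1.0, -1.0 / 3.0, 1.0 / 3.0, 1.0]
    tx = [rng.choice(levels) for _ in range(n_samples)]
    inbound = [0.05 * rng.choice(levels) for _ in range(n_samples)]
    echo_path = [0.3, -0.2, 0.1]  # assumed echo response
    rx = []
    for n in range(n_samples):
        echo = sum(h * tx[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        rx.append(echo + inbound[n])
    return lms_echo_canceller(tx, rx)

if __name__ == "__main__":
    taps, _ = demo()
    print([round(t, 2) for t in taps])
```

The inbound signal is uncorrelated with the outbound symbols, so it acts only as adaptation noise and the taps settle near the assumed echo response.
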



### Simple Block Diagram of SBT (Single Side)



### **TX - Replica Imperfections**

