

## **Requirements for 100Gb/s FEC**

Sudeep Bhoja, Broadcom Corp.

www.broadcom.com






- John Wang, Broadcom Corp.
- Will Bliss, Broadcom Corp.

## Outline



- Review of 10G-KR FEC
- Impact of 100GE MLD on latency and burst errors
- DFE burst errors

#### Framework and potential solutions

- Lower latency codes with higher coding gain (2 error correcting BCH)
- Precoding
- 28Gb/s with 7% overhead FEC

#### Summary

## **10G-KR FEC Summary**



#### • 10G-KR QC(2112, 2080) FEC

- DFEs used in 10G-KR can create burst errors, which cause Mean Time To False Packet Acceptance (MTTFPA) issues
- FEC corrects 1 random error, or a burst error of up to 11 bits, per block of 2112 bits
  - 32 bit overhead for 32 PCS frames. Each PCS frame is 66 bits.
- Random errors
  - Provides SNR gain of ~2.5dB
  - Input BER of 1E-12, Output BER of 1E-21
- DFE burst errors
  - Backplane traces can generate post-cursor ISI that results in the 1<sup>st</sup> DFE tap in the 0.5 to 1 range
  - For a  $(1+z^{-1})$  channel with DFE tap = 1,
    - FEC provides ~3 orders of magnitude reduction in BER
    - Input BER of 1E-12, Output BER of 1E-15
    - Provides ~1dB SNR gain
  - For a 1-tap DFE with tap coefficient 0.5, the FEC provides an SNR gain of 1.9dB
    - FEC provides ~6 orders of magnitude reduction in BER

## **10G-KR FEC Latency**



## QC(2112,2080) FEC adds >250ns of latency

- At the Tx, the 32 parity bits are added at the end of the block, negligible latency
- At the Rx, the syndrome computation requires 1 block of latency (2112 bits)
- A standard Meggitt decoder requires another block of latency
  - M Parallel implementations can reduce latency to ~(2112 + 32 + 2112/M) bits
  - For highly parallel implementation this latency can be made very small
- Error marking for the 66 bit blocks can require an additional 1 block of latency
- Typically overall latency is 250ns to 410ns
  - The maximum per Clause 74.6 is 614.4 ns for the sum of transmit and receive delay
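
As a rough check on these figures, the sketch below converts the bit counts above into time. It assumes the 10GBASE-KR serial line rate of 10.3125 Gb/s; the function and constant names are illustrative and not taken from the presentation.

```python
KR_LINE_RATE_GBPS = 10.3125  # assumed 10GBASE-KR serial line rate
BLOCK_BITS = 2112            # QC(2112, 2080) block length
PARITY_BITS = 32             # parity bits per block


def bits_to_ns(bits, rate_gbps=KR_LINE_RATE_GBPS):
    """Time in ns to transfer `bits` at `rate_gbps` (1 Gb/s equals 1 bit/ns)."""
    return bits / rate_gbps


def rx_latency_bits(m):
    """Approximate Rx latency in bits for an M-parallel Meggitt decoder,
    using the estimate 2112 + 32 + 2112/M from the bullet above."""
    return BLOCK_BITS + PARITY_BITS + BLOCK_BITS / m


for m in (1, 8, 64):
    print(f"M={m:2d}: {bits_to_ns(rx_latency_bits(m)):.1f} ns")
print(f"one block: {bits_to_ns(BLOCK_BITS):.1f} ns")
```

Under this assumption one block lasts 204.8 ns, so the 614.4 ns limit corresponds to three block times, and the serial case (M = 1) comes to about 413 ns, in line with the upper end of the typical range.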

#### • Please see 10G FEC tutorial for additional details

http://www.ieee802.org/802\_tutorials/06-July/10GBASE-KR\_FEC\_Tutorial\_1407.pdf

## Impact of 100GE MLD PCS on FEC



### • 100GE MLD PCS is 20 virtual lanes. Assume reuse of 10G-KR FEC

- FEC is per virtual lane (i.e. 5G virtual line rate)
- Latency is 20 times as many bits but only double the time of 10GBASE-KR, since the link is 10x faster
  - ~500ns to 820ns latency
- For random errors input BER of 1E-12, output BER of 5.3E-21
  - Very small drop in performance because of larger block size, 2.45dB coding gain
- Burst error tolerance improves
  - Each physical lane is de-interleaved by 5x, so total burst tolerance length is increased to 55 from 11 bits. This is overkill since most KR FECs only require 11 bit tolerance
  - For a $(1+z^{-1})$ channel with DFE tap = 1, the burst error tolerance is better than random errors
    - Input BER of 1E-12, Output BER of 5E-29
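
The doubling of block time and the 5x burst tolerance can be illustrated with a short sketch. It assumes a 103.125 Gb/s aggregate rate split over 20 virtual lanes and simple round-robin bit multiplexing of 5 virtual lanes onto each physical lane; both the multiplexing model and the function names are assumptions made for illustration.

```python
def block_time_ns(block_bits, lane_rate_gbps):
    """Time in ns for one FEC block on a lane (1 Gb/s equals 1 bit/ns)."""
    return block_bits / lane_rate_gbps


def errors_per_virtual_lane(start, burst_len, n_lanes=5):
    """Count errored bits landing on each virtual lane for a contiguous burst
    on the physical lane, assuming physical bit i belongs to lane i % n_lanes."""
    counts = [0] * n_lanes
    for i in range(start, start + burst_len):
        counts[i % n_lanes] += 1
    return counts


vl_rate_gbps = 103.125 / 20  # assumed per-virtual-lane rate
print(block_time_ns(2112, 10.3125))       # 10GBASE-KR block time
print(block_time_ns(2112, vl_rate_gbps))  # block time on one virtual lane
print(errors_per_virtual_lane(3, 55))     # 55-bit physical-lane burst
print(errors_per_virtual_lane(3, 56))     # one bit longer
```

With these assumptions a 55-bit burst on the physical lane puts exactly 11 errored bits on each virtual lane, which is the per-block burst capability of the KR code, while a 56-bit burst pushes one lane to 12.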

Summary: Reusing the 10G-KR FEC in 100GE doubles latency but improves MTTFPA in the presence of DFE error propagation on the 25.8Gb/s PHY

## 2 error correcting BCH improves performance & reduces latency



## BCH(1452, 1430, 5) can correct any 2 random bit errors

- Encoding is over 22 x 65-bit PCS blocks
- Latency is 1452/2112 = 0.6875 of KR
  - 31.25% reduction in latency
  - Latency is 344ns to 563ns
    - 100G-KR Fire code latency is 500-820ns
  - Still >10G-KR latency of ~250-410ns
- For random errors input BER of 1E-12, output BER of 5E-28
  - 3.81dB coding gain
- DFE Burst Error
  - BCH with 100G 5x de-interleaver
    - Can correct one 10-bit burst error
    - Or two 5-bit bursts ...
  - Provides ~3 orders of magnitude improvement in BER
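
The block parameters and latency ratio above can be checked with a few lines of arithmetic. The sketch also includes a leading-order estimate of the probability that a block holds more than 2 errors, assuming independent bit errors; this is a simplification for illustration and not necessarily how the figures on this slide were derived.

```python
from math import comb

N, K, T = 1452, 1430, 2   # BCH(1452, 1430, 5): minimum distance 5, so t = 2
KR_BLOCK_BITS = 2112      # 10G-KR block length, for the latency comparison


def latency_ratio(n=N, kr_bits=KR_BLOCK_BITS):
    """Block length relative to the KR code, used as a latency scaling factor."""
    return n / kr_bits


def prob_more_than_t_errors(n, t, p):
    """Leading-order estimate of P(more than t errors in n bits),
    assuming independent errors with small probability p: C(n, t+1) * p^(t+1)."""
    return comb(n, t + 1) * p ** (t + 1)


print(K == 22 * 65, N - K)                   # payload of 22 x 65 bits, parity bits
print(latency_ratio())                       # 0.6875
print(500 * latency_ratio(), 820 * latency_ratio())
print(prob_more_than_t_errors(N, T, 1e-12))
```

For an input error rate of 1E-12 the estimate comes to about 5.1E-28, the same order as the output figure quoted above.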

#### **Comparison of Codes for random errors**





 2 error correcting BCH provides more coding gain than 10G-KR Fire code

## **Required burst error tolerance**

#### DFEs are well known to propagate errors in their feedback loops

A single error will become a burst error

#### • Consider NRZ 1 tap DFE with tap coeff = 1

- If the previous decision is wrong, there is a ½ probability of making a successive decision error
- i.e. the probability of k consecutive errors = $\left(\frac{1}{2}\right)^{k}$
- If the DFE input error rate = 1E-10, the probability of a 10-bit DFE error burst is ~1E-13

#### For PAM4 case on same channel model

- Probability of k consecutive symbol errors = $\frac{1}{4}\left(\frac{3}{4}\right)^{k-1}$
- If the DFE input error rate = 1E-10, the probability of a 10-bit (5-symbol) DFE error burst is ~7.9E-12
- PAM4 error propagation is more severe
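
The two burst probabilities quoted above follow directly from the formulas. The sketch below evaluates them, treating the burst probability as the input error rate multiplied by the propagation term; the function names are illustrative.

```python
def nrz_burst_prob(p_in, k_bits):
    """NRZ, 1-tap DFE with tap = 1: input error rate times (1/2)^k."""
    return p_in * 0.5 ** k_bits


def pam4_burst_prob(p_in, k_symbols):
    """PAM4 on the same channel model: input error rate times
    (1/4) * (3/4)^(k - 1), with k counted in symbols."""
    return p_in * 0.25 * 0.75 ** (k_symbols - 1)


p_in = 1e-10
print(f"NRZ  10-bit burst:            {nrz_burst_prob(p_in, 10):.2e}")
print(f"PAM4 10-bit (5-symbol) burst: {pam4_burst_prob(p_in, 5):.2e}")
```

Note that the PAM4 case uses k = 5 because a 10-bit burst spans 5 two-bit symbols.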

#### Is this acceptable? Depends on channel model

- A study group topic



 The burst error length of the DFE error events for both PAM4 and NRZ can be reduced by using precoding

#### NRZ Tx precoding uses a 1/(1+D) mod 2

- Identical to a duo-binary precoder
- Rx uses a (1+D) mod 2 after slicing

#### • PAM4 Tx precoding uses a 1/(1+D) mod 4

- Multilevel version of the duo-binary precoder
- Rx uses a (1+D) mod 4 after slicing
- Reduces practical burst error runs to a maximum of 2 errors
  - One error at the entry, one error at the exit
- BCH(1452,1430,5) can then address a single error event up to 1452\*5 bits long for NRZ or 1452\*5/2 bits long for PAM4
- Precoders have been well researched and previously proposed in IEEE 802.3 but 100G-KR challenges make them attractive now!
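
A minimal sketch of the precoding idea follows. It implements the Tx 1/(1+D) mod m precoder and the Rx (1+D) mod m operation, then injects a contiguous error burst between them. The burst model (slicer errors alternating +1, -1 and applied modulo m) is an assumption chosen for illustration, as are the function names, sequence length and seed.

```python
import random


def precode(data, m):
    """Tx precoder 1/(1+D) mod m: p[n] = (d[n] - p[n-1]) mod m."""
    out, prev = [], 0
    for d in data:
        prev = (d - prev) % m
        out.append(prev)
    return out


def decode(rx, m):
    """Rx (1+D) mod m applied after slicing: d[n] = (r[n] + r[n-1]) mod m."""
    out, prev = [], 0
    for r in rx:
        out.append((r + prev) % m)
        prev = r
    return out


def errored_positions(m, start, length, n=40, seed=0):
    """Positions of decoded errors after a burst of `length` slicer errors."""
    rng = random.Random(seed)
    data = [rng.randrange(m) for _ in range(n)]
    rx = precode(data, m)
    for i in range(length):
        # Assumed burst model: errors alternate +1, -1, ... (mod m)
        err = 1 if i % 2 == 0 else -1
        rx[start + i] = (rx[start + i] + err) % m
    decoded = decode(rx, m)
    return [i for i, (a, b) in enumerate(zip(data, decoded)) if a != b]


print(errored_positions(2, 10, 8))  # NRZ
print(errored_positions(4, 10, 8))  # PAM4
```

In both cases an 8-symbol burst starting at position 10 leaves decoded errors only at positions 10 and 18, one at the entry and one at the exit of the burst.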

## 2 error correcting BCH with Precoding NRZ vs. PAM4



#### Burst error performance for 100GE with 5 virtual lanes

- With precoding and DFE tap = 1, a single error in a block can be corrected
- NRZ: For input error rate p = 1e-12, the BCH output BER is 3.63E-21
- PAM4: For input error rate p = 1e-12, the BCH output BER is 9.1E-22
  - Assumes gray coded PAM4
- PAM4 and NRZ have approx 2-2.5dB coding gain
- A single random error results in 2 errors because of precoding
  - The random error performance is similar to burst error performance
- Precoding and 2 error correcting BCH provide better performance with 30% lower latency compared to the 10G-KR Fire code

## 28Gb/s vs. 25.78Gb/s Serdes





Increasing the rate to 28Gb/s (OIF-VSR-28G) provides for 7% FEC overhead

- ITU-T RS(255,239) on GF(256) provides 6.36dB coding gain for 2040 block size

- Much better performing FEC codes are possible with less latency
  - Coding gain can be used to address legacy backplanes
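
The overhead arithmetic can be sketched as follows, assuming a per-lane payload rate of 25.78125 Gb/s and RS(255,239) applied directly to that payload. This is an illustrative calculation, not a proposed framing.

```python
N_SYMBOLS, K_SYMBOLS = 255, 239   # RS(255, 239) over GF(256), 8-bit symbols
PAYLOAD_RATE_GBPS = 25.78125      # assumed per-lane rate before FEC


def overhead_fraction(n=N_SYMBOLS, k=K_SYMBOLS):
    """Rate expansion of an (n, k) code expressed as a fraction of the payload."""
    return n / k - 1


def coded_rate_gbps(payload_gbps=PAYLOAD_RATE_GBPS, n=N_SYMBOLS, k=K_SYMBOLS):
    """Line rate after adding the code's parity symbols."""
    return payload_gbps * n / k


print(N_SYMBOLS * 8)                      # block size in bits
print(f"{overhead_fraction():.2%}")       # overhead
print(f"{coded_rate_gbps():.3f} Gb/s")    # coded line rate
```

Under these assumptions the overhead is about 6.7% and the coded rate is about 27.5 Gb/s, which fits within a 28Gb/s lane.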

## **Options to reduce latency**


#### • Currently the 100G FEC is placed on a virtual lane boundary

- 2 error correcting BCH has latency of 344ns - 563ns

#### The FEC can be placed after each physical lane i.e. operates on 5 virtual lanes of data

- Decreases latency by 5x
- Needs to find the 64/66 boundaries.
- FEC is running at 25G. Latency ~100ns

#### • The FEC can also be aggregated across physical lanes

- Further reduces latency by 4x
- Currently implemented in 10GBASE-T
- Latency ~25ns
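
The latency figures above can be cross-checked by scaling the per-virtual-lane BCH latency range. The sketch below applies the 5x and 4x factors literally, which is a simplification that ignores any fixed implementation delay.

```python
BCH_LATENCY_NS = (344.0, 563.0)  # per-virtual-lane BCH latency range from the slide


def scale(latency_range_ns, factor):
    """Divide both ends of a latency range by a speed-up factor."""
    return tuple(x / factor for x in latency_range_ns)


per_physical_lane = scale(BCH_LATENCY_NS, 5)   # FEC over 5 virtual lanes
across_lanes = scale(per_physical_lane, 4)     # FEC aggregated over 4 physical lanes

print(per_physical_lane)
print(across_lanes)
```

The scaled ranges, roughly 69 to 113 ns and 17 to 28 ns, bracket the ~100ns and ~25ns figures quoted above.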

#### • Please see wang\_01\_0111 presentation for additional details

#### Summary



- Reuse of 100G MLD PCS and 10G-KR FEC results in >0.5us latency which may not be acceptable for latency critical applications
- Precoding and shorter BCH codes can reduce latency by 30% without impacting performance for both NRZ and PAM4
- 28Gb/s line rate with 7% overhead FEC can provide much higher coding gain with smaller latency
- Coding across physical lanes can reduce latency to ~25ns and provide >6dB coding gain
- Coding layer can optionally be auto-negotiated to optimize latency based on application needs