# **Toward Convergence of FEC Interleaving Schemes for 400GE**

Zhongfeng Wang and Phil Sun, Broadcom Corp. and Marvell

IEEE P802.3bs Task Force, Sep. 2015



#### **INTRODUCTION**

- This presentation discusses tradeoffs among different FEC interleaving schemes for 400GE.
- It aims to narrow down FEC interleaving options.



### **BASICS OF CODING THEORY**

- It has been known for decades that interleaving multiple codewords can increase the burst error correction capability of RS, BCH, and other kinds of FEC codes.
- To the best of the authors' knowledge, codeword interleaving has not yet been used in Ethernet systems. Why?
  - Linearly increased latency is the major drawback.
  - The technique has been used in OTN systems (G.709), where the interleaving latency is acceptable.
- What does 400GE bring us?
  - Cons: higher HW cost and higher power consumption.
  - Pros: higher data rate and much reduced transmission latency. In fact, one RS(544, 514) codeword takes only 12.8ns to transmit.
- In brief, 400GE brings an unprecedented advantage for FEC coding: the latency penalty of interleaving multiple (2 to 4) codewords is not significant.
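The 12.8ns figure can be reproduced with simple arithmetic. The sketch below assumes 10-bit RS symbols and a 425 Gb/s aggregate line rate (400 Gb/s scaled by 257/256 transcoding and 544/514 FEC overhead); neither number is stated on this slide, and the function name is only illustrative.

```python
from fractions import Fraction

# Assumed parameters (not stated in the slide text):
#   - RS(544, 514) uses 10-bit symbols
#   - aggregate line rate = 400 Gb/s * 257/256 * 544/514 = 425 Gb/s
ASSUMED_LINE_RATE_GBPS = Fraction(400) * Fraction(257, 256) * Fraction(544, 514)


def codeword_time_ns(n_symbols=544, bits_per_symbol=10,
                     line_rate_gbps=ASSUMED_LINE_RATE_GBPS):
    """Time to transmit one codeword; bits divided by Gb/s gives ns."""
    bits = n_symbols * bits_per_symbol
    return Fraction(bits) / line_rate_gbps


if __name__ == "__main__":
    one = codeword_time_ns()
    print(f"line rate: {float(ASSUMED_LINE_RATE_GBPS)} Gb/s")
    print(f"one codeword: {float(one)} ns")
    # Transmission time of a group of interleaved codewords grows linearly
    # with depth; this is not the same quantity as the end-to-end latency
    # penalty reported in [1].
    for depth in (2, 4):
        print(f"{depth} codewords: {float(depth * one)} ns")
```

Under these assumptions even four codewords occupy the line for only about 51ns, which is the intuition behind the last bullet above.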

# **LATENCY COMPARISON OF VARIOUS OPTIONS [1]**

- Latency for interleaving schemes with PMA bit muxing

| FEC    | Schemes 1, 2, 3: no pre-interleave | Scheme 6: 4-way interleaving | Scheme 7: FOM | Scheme 8: 2-way interleaving |
|--------|------------------------------------|------------------------------|---------------|------------------------------|
| 1x400G | 75ns                               | 150ns                        | -             | 99ns                         |
| 2x200G | 87ns                               | 138ns                        | -             | 87ns                         |
| 4x100G | 113ns                              | 113ns                        | 113ns         | 113ns                        |

- From the above table, it can be seen that the latency penalty for 2-code interleaving (over the non-interleaved case) is 12ns (87ns at best versus 75ns at best).

- The latency penalty for 4-code interleaving is 38ns (113ns at best versus 75ns at best).
- The difference in HW complexity among the schemes is not significant [1].
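The penalties quoted above are consistent with comparing the lowest latency available to each scheme across the three FEC arrangements. The sketch below reproduces them from the table values; reading the penalty as "best case versus best case" is an assumption, and the dictionary keys and function names are illustrative.

```python
# Latency figures (ns) copied from the table above; None marks an empty cell.
LATENCY_NS = {
    "no pre-interleave (1,2,3)": {"1x400G": 75, "2x200G": 87, "4x100G": 113},
    "4-way interleaving (6)": {"1x400G": 150, "2x200G": 138, "4x100G": 113},
    "FOM (7)": {"1x400G": None, "2x200G": None, "4x100G": 113},
    "2-way interleaving (8)": {"1x400G": 99, "2x200G": 87, "4x100G": 113},
}


def best_latency(scheme):
    """Return (latency_ns, fec_arrangement) with the lowest latency."""
    candidates = [(ns, fec) for fec, ns in LATENCY_NS[scheme].items()
                  if ns is not None]
    return min(candidates)


def penalty_ns(scheme, baseline="no pre-interleave (1,2,3)"):
    """Best-case latency of `scheme` minus best-case latency of `baseline`."""
    return best_latency(scheme)[0] - best_latency(baseline)[0]


if __name__ == "__main__":
    for name in ("2-way interleaving (8)", "4-way interleaving (6)"):
        ns, fec = best_latency(name)
        print(f"{name}: best {ns} ns with {fec}, penalty {penalty_ns(name)} ns")
```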

## **PERFORMANCE COMPARISON OF VARIOUS OPTIONS [2]**



- From the above figure, it can be seen that the performance gain of 2-code interleaving is about 1.6 dB for target BER=1e-13 in the simulated case.
- The performance gain from 4-code interleaving is about 1.8 dB.

## **PERFORMANCE COMPARISON OF VARIOUS OPTIONS [2]**

| Case                               | At slicer output, FLR = 6.2E-11 | At slicer output, FLR = 6.2E-13 | At FEC input, FLR = 6.2E-11 | At FEC input, FLR = 6.2E-13 |
|------------------------------------|---------------------------------|---------------------------------|-----------------------------|-----------------------------|
| No FEC                             | 1E-13                           | 1E-15                           | 1E-13                       | 1E-15                       |
| Same cwd (1), a = 0.75             |                  |               | 7.6E-6*       | 1.6E-7*              |
| Same cwd, symb inter (2), a = 0.75 |                  |               | 2E-5*         | 4.9E-7*              |
| Same cwd (1), a = 0.5              |                  |               | 9E-5*         | 3.9E-5*              |
| 1:4 Pre-interleaved (4), a=0.75    |                  |               | 1.1E-4*       | 5.5E-5*              |
| 1:2 Pre-interleaved (8), a=0.75    |                  |               | 1.8E-4*       | 8.6E-5*              |
| Diff cwd (FOM) (7), a = 0.75       |                  |               | 1.9E-4*       | 1E-4*                |
| Same cwd precoded, a=0.75          | 2.3E-4*          | 1.3E-4*       | 1.1E-4        | 6.3E-5               |
| 1:4 Pre-interleaved (6), a=0.75    |                  |               | 2.5E-4*       | 1.5E-4*              |
| 1:2 Pre-int, sym mux (10), a=0.75  |                  |               | 3.5E-4*       | 1.9E-4*              |
| 1:4 Pre-int, sym mux (9), a=0.75   |                  |               | 4.2E-4*       | 2.6E-4*              |
| Random errors                      |                  |               | 3.2E-4        | 2.3E-4               |

- To achieve the 1e-13 BER target on a PAM4 link, the tolerable FEC input BER for scheme 8 (2-code interleaving) is more than an order of magnitude higher than for scheme 1.
- The FEC input BER for scheme 8 (2-code interleaving) and scheme 6 (4-code interleaving) differs by less than a factor of 2.
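As a quick check of the two statements above, the ratios can be computed directly from the "At FEC input" columns of the table. The values below are copied from the rows for schemes 1, 8, and 6 (all with a = 0.75); the names are illustrative only.

```python
# FEC input BER from the table: (FLR = 6.2E-11, FLR = 6.2E-13).
FEC_INPUT_BER = {
    "scheme 1": (7.6e-6, 1.6e-7),   # Same cwd (1), a = 0.75
    "scheme 8": (1.8e-4, 8.6e-5),   # 1:2 Pre-interleaved (8), a = 0.75
    "scheme 6": (2.5e-4, 1.5e-4),   # 1:4 Pre-interleaved (6), a = 0.75
}


def ber_ratio(numerator, denominator):
    """Ratio of tolerable FEC input BER at each of the two FLR targets."""
    return tuple(a / b for a, b in
                 zip(FEC_INPUT_BER[numerator], FEC_INPUT_BER[denominator]))


if __name__ == "__main__":
    r81 = ber_ratio("scheme 8", "scheme 1")
    r68 = ber_ratio("scheme 6", "scheme 8")
    print("scheme 8 / scheme 1:", [round(x, 1) for x in r81])
    print("scheme 6 / scheme 8:", [round(x, 2) for x in r68])
```

The first ratio exceeds 10 at both FLR targets, while the second stays below 2.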

#### **ANALYSES**

- Based on the previous latency and performance comparisons, we may want to narrow down our selection to options 8 and 6.
- In addition, since both schemes use bit-muxing and distribute each codeword over all lanes, they also address other implementation concerns, such as keeping the optical module simple and tolerating one bad channel.



# **VARIOUS DATA STRIPING METHODS**



- In the above, Case-I shows the bit-muxing scheme. Case-III shows RS symbol-muxing.
- Case-II is based on the 8-lane-stripe idea [3] with data alignment in the middle. Data alignment ensures RS symbol interleaving over 8 lanes.
- Roughly speaking, performance increases from Case-I to Case-III, while Case-II and Case-III bring more design complexity.

### **OPTION-A FOR STRIPING DATA OVER 8 LANES**

- In this presentation, data alignment is assumed for Case II.
- Without data alignment in the middle, symbol interleaving is not guaranteed over 8 lanes.



## **OPTION-B FOR STRIPING DATA OVER 8 LANES**

- Pre-bit-interleaving is used.
- Data alignment is needed in the middle. Otherwise, RS symbol interleaving is not guaranteed over 8 lanes.



# **PERFORMANCE ESTIMATION**



- Assume 2-code interleaving:
  - The performance gap between Case-I and Case-III is less than 0.4dB [2].
  - The performance gap between Case-I and Case-II is even smaller.
- Assume 4-code interleaving:
  - The gap between Case-I and Case-II (or Case-III) is smaller than 0.3dB [2].
- The performance of 2-code interleaving with bit-muxing may be sufficient.

## **DATA FLOW OF 2-WAY INTERLEAVED FEC CODING**



- A stripes the data into 2 RS frames. Alignment marker mapping may be simpler if the DEMUX block size is a multiple of the RS FEC symbol size.
- B symbol pre-interleaves the encoded FEC frames.
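The two steps can be illustrated with a minimal sketch. It assumes a DEMUX block size of exactly one RS symbol and omits the RS encoder that sits between A and B, so symbols are just opaque tokens; all function names are hypothetical. It also counts how a contiguous symbol burst is shared between codewords, using the fact that RS(544, 514) corrects up to 15 symbol errors per codeword.

```python
def stripe(symbols, ways=2):
    """Step A (sketch): deal symbols round-robin into `ways` RS frames.
    Assumes a DEMUX block size of one symbol."""
    return [symbols[i::ways] for i in range(ways)]


def symbol_interleave(codewords):
    """Step B (sketch): interleave equal-length codewords symbol by symbol."""
    return [sym for group in zip(*codewords) for sym in group]


def deinterleave(stream, ways=2):
    """Receiver side: undo the symbol interleaving."""
    return [stream[i::ways] for i in range(ways)]


def burst_hits_per_codeword(burst_start, burst_len, ways=2):
    """Count symbols of each codeword hit by a contiguous symbol burst."""
    hits = [0] * ways
    for pos in range(burst_start, burst_start + burst_len):
        hits[pos % ways] += 1
    return hits


if __name__ == "__main__":
    T = 15  # correctable symbol errors per RS(544, 514) codeword
    frames = stripe(list(range(8)))
    stream = symbol_interleave(frames)  # encoder omitted in this sketch
    print(frames, stream)
    for ways in (1, 2):
        hits = burst_hits_per_codeword(burst_start=100, burst_len=30, ways=ways)
        print(f"{ways}-way: hits per codeword {hits}, "
              f"correctable: {all(h <= T for h in hits)}")
```

With this toy model, a 30-symbol burst exceeds the correction capability of a single codeword but is split evenly across two interleaved codewords.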

#### **FINAL REMARK**

- Based on the previous analyses and existing simulation results, we propose to narrow down the FEC code interleaving selections to option #8 (2-way interleaving), with option #6 (4-way interleaving) if needed for performance.



### **APPENDIX: CODEWORD INTERLEAVING**

- For 2-way (or 4-way) code interleaving, using 2x200G (or 4x100G) FEC has shorter latency and avoids extra memory compared to using 1x400G FEC.
- In the 2x200G case, 12ns of latency and about three 5k-bit memory buffers may be saved.

