# Interleaver Design for Concatenated Code with the (144,136) Code

Hao Ren, Xiang He

**Huawei Technologies** 

# Contributors:

Matt Brown, Huawei

### Background

- RS(544,514) has been adopted for 200G/lane AUIs (C2C and C2M).
  - See <u>dambrosia 3dj 01a 230116.pdf</u> and <u>motions 3dfdj 230117.pdf</u>.
- Concatenated code with 4x interleaved RS(544,514) as the outer code is under discussion.
  - bliss 3df 01b 2211, farhood 3df 02b 2211 both proposed BCH/Hamming inner codes with RS outer code.
- Interleaver between outer and inner code can randomize the errors from inner code decoders, improving overall coding gain, as analyzed in <u>bliss 3df 01a 220517</u>.
  - Convolutional interleaver is usually used for block codes to minimize latency for relatively high interleaving depth.
  - Convolutional interleaver with depth of 12 RS codewords was proposed in <u>farhood 3df 02b 2211</u> for Hamming(128,120).
- A convolutional interleaver for binary code (144,136) is proposed in this contribution.
  - Questions were raised during the Bangkok meeting on how to design a convolutional interleaver for this code.
  - Effective interleaver depth is over 12 RS codewords, with a latency of 76.8 ns (800 GbE).

## Things to be Considered when Designing Interleaver

#### Interleaving depth and performance

- Hamming(128,120) uses a convolutional interleaver based on number of RS-symbols in an inner code.
- Convolutional interleaver for (144,136) can work on blocks longer than RS-symbols.
- Both codes can have high interleaving depth, enough to randomize error distribution from inner code.

#### Supports 200/400/800/1600 GbE

- All Ethernet rates that could utilize 200G/lane PMDs should be supported.
- Interleaver design based on the common part across all rates can simplify implementation and specification.

#### Breakout support

Minimize the logic required to support breakout.

| Ethernet<br>Rate (GbE) | PMD Lane<br>Rate (in .3dj) | Number of RS Codewords |
|------------------------|----------------------------|------------------------|
| 200                    | 200G/lane                  | 2 (or 4?)              |
| 400                    | 200G/lane                  | 2 (or 4?)              |
| 800                    | 200G/lane                  | 4                      |
| 1600 <i>(TBD)</i>      | 200G/lane                  | 4                      |

#### Issues for (128,120) Interleaver: Designed over 25Gb/s PCS Lanes

- Interleaver and encoder per PCS lane design
  - Both interleaver and encoder are performed based on 25Gb/s PCS lanes, which is not forward-looking.
  - 1.6 TbE does not have any reason to use 25Gb/s PCS lanes. 100Gb/s PCS lane is more reasonable. (gustlin 3dj 01b 230206)
  - Padding is proposed to enable an integer PLL design. (<u>farhood 3dj 01a 230206</u>)
- Redesign is required to support potential 100Gb/s PCS lanes.
  - AM locking over 100G/lane PCS is different from 25G/lane.
  - Convolutional interleaver requires redesign for 100G/lane with different delay parameters.



#### Breakout Support of Inner Code Could Work at Per Lambda

- Each 200G/lane PMA/PMD in the module has its own inner code encoder(s)/decoder(s)/interleaver(s).
  - Advantage: Naturally supports breakout as no regrouping/distribution is required over multiple lambdas.
  - Works for both 100G/lane and 200G/lane AUIs, supporting $2\times100\text{G} \leftarrow 2\times100\text{G}$, $2\times100\text{G} \leftarrow 1\times200\text{G}$ and $1\times200\text{G} \leftarrow 1\times200\text{G}$ interop.



#### Convolutional Interleaver Design for Binary (144,136)



- A convolutional interleaver is a general interleaving method that can support any block code.
  - Different codes may use different values for N, D and the number of branches.
- A 4-branch convolutional interleaver is proposed for (144,136) code.
  - Round-robin distribution based on D = 34b blocks. N = 2720/34/4 = 20.
  - For each group of 4 codewords in PCS, each convolutional interleaver gets 4\*5440/8 = 2720b.
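
The parameter arithmetic in the bullets above can be checked with a short sketch. The variable names are illustrative, and reading the divisor 8 as the number of convolutional interleavers sharing one group of 4 codewords is an assumption drawn from the wording "each convolutional interleaver gets 4\*5440/8".

```python
# Parameter check for the proposed interleaver; names are illustrative only.
RS_CODEWORD_BITS = 544 * 10   # RS(544,514) with 10-bit symbols -> 5440 bits
CODEWORDS_PER_GROUP = 4       # 4x interleaved outer code
INTERLEAVERS = 8              # assumption: the divisor in "4*5440/8"
BLOCK_BITS_D = 34             # D, size of one round-robin block
BRANCHES = 4                  # branches of the convolutional interleaver

# Bits delivered to one interleaver per group of 4 RS codewords.
bits_per_interleaver = CODEWORDS_PER_GROUP * RS_CODEWORD_BITS // INTERLEAVERS

# N: number of D-bit blocks each branch receives per group.
blocks_per_branch_n = bits_per_interleaver // BLOCK_BITS_D // BRANCHES

# One block from each branch gives the inner-code message length.
message_bits = BRANCHES * BLOCK_BITS_D

print(bits_per_interleaver, blocks_per_branch_n, message_bits)
```

Both divisions are exact, so no padding is needed to fill the branches in this configuration.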

#### Convolutional Interleaver Design for Binary (144,136), continued



- D = 34b; each column of 4 blocks supplies the 136 information bits of one inner codeword (4 × 34 = 136).
  - If the PCS has only 2 codewords, D could be 17 bits, with the number of branches increased to 8.
  - Even in the worst case of trailing bits in each D block, an equivalent interleaving depth of more than 12 RS codewords is still guaranteed.
- Synchronization of the inner code guarantees successful de-interleaving.
  - Inner code synchronization can use a self-synchronization method similar to that of Clause 74.
  - This does not rely on AMs from the PCS or additional AMs inside the module, which simplifies the logic inside the module.
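
To illustrate why de-interleaving depends on block alignment, here is a minimal, generic model of a convolutional interleaver and its matching de-interleaver. It is a sketch, not the proposed design: the per-branch delay step `STEP` is a hypothetical value (the text does not state the branch delays), and integers stand in for D-bit blocks.

```python
from collections import deque


class ConvolutionalInterleaver:
    """Round-robin commutator over delay lines; one block per push."""

    def __init__(self, delays):
        # delays[i] is the delay of branch i, counted in visits to that branch.
        self.lines = [deque([None] * d) for d in delays]
        self.branch = 0

    def push(self, block):
        line = self.lines[self.branch]
        line.append(block)
        out = line.popleft()  # a zero-delay branch returns the block at once
        self.branch = (self.branch + 1) % len(self.lines)
        return out


BRANCHES = 4
STEP = 2  # hypothetical delay increment per branch; not given in the text


def round_trip(blocks, skip=0):
    """Interleave, then de-interleave; skip > 0 models lost block alignment."""
    tx = ConvolutionalInterleaver([b * STEP for b in range(BRANCHES)])
    rx = ConvolutionalInterleaver(
        [(BRANCHES - 1 - b) * STEP for b in range(BRANCHES)]
    )
    line = [tx.push(x) for x in blocks]
    return [rx.push(x) for x in line[skip:]]


blocks = list(range(100))  # integers stand in for 34-bit blocks
total_delay = BRANCHES * (BRANCHES - 1) * STEP  # end-to-end delay in blocks

aligned = round_trip(blocks)
misaligned = [x for x in round_trip(blocks, skip=1) if x is not None]

# Aligned: the original order returns after a fixed delay.
print(aligned[total_delay:] == blocks[: len(blocks) - total_delay])
# Misaligned by one block: the order is no longer restored.
print(misaligned == sorted(misaligned))
```

With the receiver commutator aligned to the transmitter, every branch sees the same total delay and the stream comes back in order. Offsetting the receiver by a single block gives the branches unequal total delays, which is why the de-interleaver must know the block boundary.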

### Performance Analysis

- Latency can be evaluated based on number of RS codewords.
  - We recommend bypassing the interleaver for low-latency applications.

| Ethernet<br>Rate (GbE) | # of RS CWs<br>in PCS | Interleaver<br>Throughput | # of RS CWs<br>Interleaved | Interleaver<br>Latency, ns | SNR, dB | Pre-FEC<br>BER |
|------------------------|-----------------------|---------------------------|----------------------------|----------------------------|---------|----------------|
| 200                    | 2                     | 200G                      | 16                         | 358.4                      | 14.96   | 4.6E-3*        |
| 200                    | 4                     |                           | 16                         | 307.2                      |         |                |
| 400                    | 2                     |                           | 16                         | 179.2                      |         |                |
| 400                    | 4                     |                           | 16                         | 153.6                      |         |                |
| 800                    | 4                     |                           | 16                         | 76.8                       |         |                |
| 1600 <i>(TBD)</i>      | 4                     |                           | 16                         | 38.4                       |         |                |

\* Using **<u>sub-optimal</u>** soft-decoding method for faster simulation.
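
As a consistency check on the latency column, each entry can be converted into a number of RS codeword durations. This is one reading of the table rather than a statement from the authors: it assumes the aggregate RS-encoded rate is the Ethernet rate times (257/256) × (544/514) = 1.0625 (for example, 106.25 Gb/s per 100G) and that one codeword is 5440 bits.

```python
from fractions import Fraction

RS_CODEWORD_BITS = 5440
# 256b/257b transcoding followed by RS(544,514) encoding (assumed rate factor).
RS_RATE_FACTOR = Fraction(257, 256) * Fraction(544, 514)

# (Ethernet rate in Gb/s, RS codewords in PCS, interleaver latency in ns)
rows = [
    (200, 2, "358.4"),
    (200, 4, "307.2"),
    (400, 2, "179.2"),
    (400, 4, "153.6"),
    (800, 4, "76.8"),
    (1600, 4, "38.4"),
]

latency_in_codewords = []
for rate_gbps, cws_in_pcs, latency_ns in rows:
    # ns * Gb/s = bits transferred during the latency interval.
    bits = Fraction(latency_ns) * rate_gbps * RS_RATE_FACTOR
    latency_in_codewords.append(bits / RS_CODEWORD_BITS)

for (rate_gbps, cws_in_pcs, _), n in zip(rows, latency_in_codewords):
    print(f"{rate_gbps} GbE, {cws_in_pcs} CWs in PCS: {n} codeword durations")
```

Under these assumptions every row is a whole number of codeword durations: 12 for the 4-codeword configurations and 14 for the 2-codeword configurations.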



### Performance Analysis

| Code                | Pre-FEC BER | SNR, dB | Code Rate R<br>(relative to 64B/66B) | NCG Penalty<br>10log(R), dB |
|---------------------|-------------|---------|--------------------------------------|-----------------------------|
| (144,136)           | 4.6E-3      | 14.96   | 103.125/112.5                        | -0.378                      |
| (128,120) + padding | 4.8E-3*     | 14.91   | 103.125/113.4375                     | -0.414                      |

<sup>\*</sup>From farhood\_3df\_02b\_2211

$$\text{NCG difference} = (14.96 - 14.91) - [-0.378 - (-0.414)] = 0.014\ \text{dB}$$
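
The figures in the table and the NCG difference can be reproduced with a few lines of arithmetic. This sketch uses only the values shown above; the function and variable names are illustrative.

```python
import math


def rate_penalty_db(numerator, denominator):
    """NCG penalty 10*log10(R) for code rate R = numerator / denominator."""
    return 10 * math.log10(numerator / denominator)


penalty_144 = rate_penalty_db(103.125, 112.5)      # (144,136)
penalty_128 = rate_penalty_db(103.125, 113.4375)   # (128,120) + padding

snr_gap_db = 14.96 - 14.91  # SNR difference at the listed pre-FEC BERs
ncg_difference_db = snr_gap_db - (penalty_144 - penalty_128)

print(round(penalty_144, 3), round(penalty_128, 3), round(ncg_difference_db, 3))
```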

- After accounting for its roughly 1% higher overhead, the NCG of "(128,120)+padding" is only 0.014 dB higher than that of (144,136).
  - 1% higher overhead degrades the performance of optical transceivers, as raised in welch 3df 01a 221011.
  - Given the bandwidth limitation, actual performance needs to be compared at 225 Gb/s for (144,136) and 226.875 Gb/s for (128,120)+padding.
- The 1% higher data rate also leads to higher power (optical, AD/DA, etc.).
  - This could affect future CPO and NPO applications, where the inner code could be integrated in the ASIC.
  - The additional optical transceiver power due to higher overhead could be significantly more than the inner FEC decoder power.
  - It is more economical to allocate this additional power to boosting the soft-decoding gain.
    - Using a <u>more optimized</u> soft-decoding method will <u>increase the overall coding gain by more than 0.014 dB</u>.
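
The two line rates quoted above follow from the code-rate denominators in the table, read here as the per-100G line rate in Gb/s (an interpretation of the table, stated as an assumption). The sketch also shows the size of the overhead gap that the slides round to 1%.

```python
from fractions import Fraction

# Per-100G line rates in Gb/s, taken from the code-rate column of the table.
PER_100G_RATE_144 = Fraction("112.5")      # (144,136)
PER_100G_RATE_128 = Fraction("113.4375")   # (128,120) + padding

# A 200G lane carries two 100G flows.
lane_rate_144 = 2 * PER_100G_RATE_144
lane_rate_128 = 2 * PER_100G_RATE_128

# Relative increase in line rate for (128,120) + padding.
extra_rate = lane_rate_128 / lane_rate_144 - 1

print(float(lane_rate_144), float(lane_rate_128))
print(f"{float(extra_rate) * 100:.2f}% higher line rate")
```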

### Future Integration of Concatenated Code Considerations

- The (128,120) code will result in a more complex design.
  - (128,120) has a factor of 3 in the divisor that requires a frac-N PLL.
    - For an oDSP for 4 or 8 lanes, it is not a big deal.
    - For highly integrated ASIC (e.g. 512-lane digital switching chip) it will complicate things.
      - Dividing reference clock (156.25MHz) by 3 will cause worse jitter.
    - Combined with higher power due to 1% higher overhead, it can be problematic for CPO or NPO.
- The (144,136) code enables integer PLL design and has lower power.
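
The PLL argument can be illustrated by dividing each candidate line rate by the 156.25 MHz reference clock. This is a simplified sketch: it assumes an RS-encoded rate of 212.5 Gb/s per 200G lane (2 × 106.25 Gb/s), uses the bit rate rather than the symbol rate, and takes the padded rate of 226.875 Gb/s from the slides.

```python
from fractions import Fraction

REF_CLOCK_MHZ = Fraction("156.25")
RS_LANE_RATE_MHZ = Fraction(212500)  # assumed 212.5 Gb/s per 200G lane

candidates = {
    "(144,136)": RS_LANE_RATE_MHZ * Fraction(144, 136),
    "(128,120), no padding": RS_LANE_RATE_MHZ * Fraction(128, 120),
    "(128,120) + padding": Fraction(226875),  # 226.875 Gb/s from the slides
}

# Ratio of line rate to reference clock; an integer ratio suits an integer PLL.
ratios = {name: rate / REF_CLOCK_MHZ for name, rate in candidates.items()}

for name, ratio in ratios.items():
    kind = "integer" if ratio.denominator == 1 else "fractional"
    print(f"{name}: line rate / reference = {ratio} ({kind})")
```

Under these assumptions the (144,136) rate is an integer multiple of the reference, the unpadded (128,120) rate leaves a factor of 3 in the denominator, and the padded rate restores an integer ratio at the cost of the higher line rate.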

#### Conclusions

- A convolutional interleaver for binary code (144,136) is proposed.
  - It does not rely on 25G/lane PCS lanes.
  - It does not rely on an additional alignment method to de-interleave.
  - It supports breakout.
  - The overall performance is on par with Hamming(128,120) + padding.
  - The overall power is lower than (128,120) + padding, due to simpler design and lower data rate.
- We propose to adopt binary code (144,136) as the inner code for concatenated code for 200G/lane optical PMDs.
  - The code supports integer PLL without additional padding.
  - The code is implementation-friendly for both oDSP (for pluggables) and host ASIC (for CPO).

# Thank you