# 512B/513B Transcoding and FEC for 100 Gb/s Backplane and Copper Links

IEEE 802.3 100 Gb/s Backplane and Copper Cable Study Group

Chicago, September 13-14, 2011

Roy Cideciyan - IBM

### Motivation for 512B/513B transcoding

- Transcoding (TC) is performed prior to forward error correction (FEC). Transcoding compresses data and therefore reduces total overhead.
- The 100 Gb/s backplane and copper cable task force is considering transcoding from 64B/66B encoded data into 64B/65B encoded data by dropping a header bit.
- 64B/65B transcoding has a line rate that is

(65/64) / (513/512) - 1 = 1.36%

higher than that of 512B/513B transcoding, which has TC Rate = 512/513.

- The main motivation for 512B/513B transcoding in conjunction with (N,K) Reed-Solomon (RS) coding with m-bit symbols is lower overclocking than with 64B/65B transcoding:

Overclocking = (64/66) / (TC Rate \* K/N) - 1

- The lower line rate of 512B/513B transcoding results in reduced channel loss and lower power consumption.
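The two formulas above can be checked with exact rational arithmetic. The following is a minimal sketch; the function name `overclocking` and the choice of the two t=6 codes used for comparison (both taken from the code comparison table later in this deck) are illustrative assumptions.

```python
from fractions import Fraction

TC_512_513 = Fraction(512, 513)  # TC Rate of 512B/513B transcoding
TC_64_65 = Fraction(64, 65)      # TC Rate of 64B/65B transcoding


def overclocking(tc_rate, k, n):
    """Overclocking = (64/66) / (TC Rate * K/N) - 1, as an exact fraction."""
    return Fraction(64, 66) / (tc_rate * Fraction(k, n)) - 1


# Line-rate penalty of 64B/65B relative to 512B/513B transcoding.
line_rate_penalty = (1 / TC_64_65) / (1 / TC_512_513) - 1
print(f"64B/65B vs 512B/513B line rate: +{float(line_rate_penalty):.2%}")

# Two codes with t=6 from the comparison table, one per transcoder.
oc_512 = overclocking(TC_512_513, 228, 240)  # RS(240,228), m=9
oc_64 = overclocking(TC_64_65, 260, 272)     # RS(272,260), m=10
print(f"512B/513B + RS(240,228): {float(oc_512):.2%}")
print(f"64B/65B   + RS(272,260): {float(oc_64):.2%}")
```

Using `Fraction` keeps the results exact, so they can be compared directly with the rounded percentages quoted on the slides.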

### 64B/66B block formats in 100GBASE-R

- 64B/66B coding used in 100GBASE-R (IEEE 802.3ba-2010, Clause 82)
  - 1 type of data block with 2-bit header 01
  - 11 types of control blocks (CB) with 2-bit header 10 where the 8-bit block type field indicates the type of control block format

|    | Input data | Sync (bits 0-1) | Block type field | Remaining block payload (up to bit 65) |
|----|------------|-----------------|------------------|----------------------------------------|
|    | **Data block format:** | | | |
|    | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> D<sub>4</sub> D<sub>5</sub> D<sub>6</sub> D<sub>7</sub> | 01 | (none) | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> D<sub>4</sub> D<sub>5</sub> D<sub>6</sub> D<sub>7</sub> |
|    | **Control block formats:** | | | |
| 1  | C<sub>0</sub> C<sub>1</sub> C<sub>2</sub> C<sub>3</sub> C<sub>4</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> | 10 | 0x1E | C<sub>0</sub> C<sub>1</sub> C<sub>2</sub> C<sub>3</sub> C<sub>4</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> |
| 2  | S<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> D<sub>4</sub> D<sub>5</sub> D<sub>6</sub> D<sub>7</sub> | 10 | 0x78 | D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> D<sub>4</sub> D<sub>5</sub> D<sub>6</sub> D<sub>7</sub> |
| 3  | O<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> Z<sub>4</sub> Z<sub>5</sub> Z<sub>6</sub> Z<sub>7</sub> | 10 | 0x4B | D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> O<sub>0</sub> 0x000\_0000 |
| 4  | T<sub>0</sub> C<sub>1</sub> C<sub>2</sub> C<sub>3</sub> C<sub>4</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> | 10 | 0x87 | C<sub>1</sub> C<sub>2</sub> C<sub>3</sub> C<sub>4</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> |
| 5  | D<sub>0</sub> T<sub>1</sub> C<sub>2</sub> C<sub>3</sub> C<sub>4</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> | 10 | 0x99 | D<sub>0</sub> C<sub>2</sub> C<sub>3</sub> C<sub>4</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> |
| 6  | D<sub>0</sub> D<sub>1</sub> T<sub>2</sub> C<sub>3</sub> C<sub>4</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> | 10 | 0xAA | D<sub>0</sub> D<sub>1</sub> C<sub>3</sub> C<sub>4</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> |
| 7  | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> T<sub>3</sub> C<sub>4</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> | 10 | 0xB4 | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> C<sub>4</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> |
| 8  | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> T<sub>4</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> | 10 | 0xCC | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> C<sub>5</sub> C<sub>6</sub> C<sub>7</sub> |
| 9  | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> D<sub>4</sub> T<sub>5</sub> C<sub>6</sub> C<sub>7</sub> | 10 | 0xD2 | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> D<sub>4</sub> C<sub>6</sub> C<sub>7</sub> |
| 10 | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> D<sub>4</sub> D<sub>5</sub> T<sub>6</sub> C<sub>7</sub> | 10 | 0xE1 | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> D<sub>4</sub> D<sub>5</sub> C<sub>7</sub> |
| 11 | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> D<sub>4</sub> D<sub>5</sub> D<sub>6</sub> T<sub>7</sub> | 10 | 0xFF | D<sub>0</sub> D<sub>1</sub> D<sub>2</sub> D<sub>3</sub> D<sub>4</sub> D<sub>5</sub> D<sub>6</sub> |

Figure 82-5: 64B/66B block formats

### Alignment markers in 100GBASE-R

- 66-bit alignment markers are inserted into n=20 PCS virtual lanes
- In each virtual lane there are 2<sup>14</sup>-1 = 16383 66-bit blocks between two alignment markers
- For 100 Gb/s transmission over 4 physical lanes, each physical lane contains 5 virtual lanes and therefore 5 types of alignment markers (AM).





Alignment markers are special types of control blocks

| Bit position: | 0-1 | 2-9 | 10-17 | 18-25 | 26-33 | 34-41 | 42-49 | 50-57 | 58-65 |
|---------------|-----|-----|-------|-------|-------|-------|-------|-------|-------|
| Field         | 10  | M<sub>0</sub> | M<sub>1</sub> | M<sub>2</sub> | BIP<sub>3</sub> | M<sub>4</sub> | M<sub>5</sub> | M<sub>6</sub> | BIP<sub>7</sub> |



| PCS lane number | Encoding<sup>a</sup> {M<sub>0</sub>, M<sub>1</sub>, M<sub>2</sub>, BIP<sub>3</sub>, M<sub>4</sub>, M<sub>5</sub>, M<sub>6</sub>, BIP<sub>7</sub>} | PCS lane number | Encoding<sup>a</sup> {M<sub>0</sub>, M<sub>1</sub>, M<sub>2</sub>, BIP<sub>3</sub>, M<sub>4</sub>, M<sub>5</sub>, M<sub>6</sub>, BIP<sub>7</sub>} |
|---|---|---|---|
| 0 | 0xC1, 0x68, 0x21, BIP<sub>3</sub>, 0x3E, 0x97, 0xDE, BIP<sub>7</sub> | 10 | 0xFD, 0x6C, 0x99, BIP<sub>3</sub>, 0x02, 0x93, 0x66, BIP<sub>7</sub> |
| 1 | 0x9D, 0x71, 0x8E, BIP<sub>3</sub>, 0x62, 0x8E, 0x71, BIP<sub>7</sub> | 11 | 0xB9, 0x91, 0x55, BIP<sub>3</sub>, 0x46, 0x6E, 0xAA, BIP<sub>7</sub> |
| 2 | 0x59, 0x4B, 0xE8, BIP<sub>3</sub>, 0xA6, 0xB4, 0x17, BIP<sub>7</sub> | 12 | 0x5C, 0xB9, 0xB2, BIP<sub>3</sub>, 0xA3, 0x46, 0x4D, BIP<sub>7</sub> |
| 3 | 0x4D, 0x95, 0x7B, BIP<sub>3</sub>, 0xB2, 0x6A, 0x84, BIP<sub>7</sub> | 13 | 0x1A, 0xF8, 0xBD, BIP<sub>3</sub>, 0xE5, 0x07, 0x42, BIP<sub>7</sub> |
| 4 | 0xF5, 0x07, 0x09, BIP<sub>3</sub>, 0x0A, 0xF8, 0xF6, BIP<sub>7</sub> | 14 | 0x83, 0xC7, 0xCA, BIP<sub>3</sub>, 0x7C, 0x38, 0x35, BIP<sub>7</sub> |
| 5 | 0xDD, 0x14, 0xC2, BIP<sub>3</sub>, 0x22, 0xEB, 0x3D, BIP<sub>7</sub> | 15 | 0x35, 0x36, 0xCD, BIP<sub>3</sub>, 0xCA, 0xC9, 0x32, BIP<sub>7</sub> |
| 6 | 0x9A, 0x4A, 0x26, BIP<sub>3</sub>, 0x65, 0xB5, 0xD9, BIP<sub>7</sub> | 16 | 0xC4, 0x31, 0x4C, BIP<sub>3</sub>, 0x3B, 0xCE, 0xB3, BIP<sub>7</sub> |
| 7 | 0x7B, 0x45, 0x66, BIP<sub>3</sub>, 0x84, 0xBA, 0x99, BIP<sub>7</sub> | 17 | 0xAD, 0xD6, 0xB7, BIP<sub>3</sub>, 0x52, 0x29, 0x48, BIP<sub>7</sub> |
| 8 | 0xA0, 0x24, 0x76, BIP<sub>3</sub>, 0x5F, 0xDB, 0x89, BIP<sub>7</sub> | 18 | 0x5F, 0x66, 0x2A, BIP<sub>3</sub>, 0xA0, 0x99, 0xD5, BIP<sub>7</sub> |
| 9 | 0x68, 0xC9, 0xFB, BIP<sub>3</sub>, 0x97, 0x36, 0x04, BIP<sub>7</sub> | 19 | 0xC0, 0xF0, 0xE5, BIP<sub>3</sub>, 0x3F, 0x0F, 0x1A, BIP<sub>7</sub> |

Table 82-2: 100GBASE-R alignment marker encodings

<sup>a</sup>Each octet is transmitted LSB to MSB.

In each physical lane carrying data at 25 Gb/s there are 11 CB + 5 AM = 16 types of 66-bit control blocks

### 512B/513B transcoding latency

- Transcoding maps eight 66-bit blocks with a payload of 512 bits into one 513-bit block
- 512B/513B transcoding can be done
  - across all four physical lanes: low transcoding latency of 5.1 ns
  - across a single physical lane: medium transcoding latency of 20.4 ns
  - across a single virtual lane: high transcoding latency of 102 ns (unacceptable)
- Negligible latency for inverse 512B/513B transcoding (TC) at the receiver as inverse TC can be combined with FEC decoding. We assume that there are an integer number of 512B/513B transcoded blocks per FEC block.

Total transcoding latency:

- 5.1 ns for TC across all 4 physical lanes
- 20.4 ns for TC across a single physical lane
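The latency figures above are consistent with the time needed to collect eight 66-bit blocks on the path being transcoded. The sketch below assumes an aggregate 64B/66B rate of 103.125 Gb/s (4 × 25.78125 Gb/s) shared evenly by 4 physical or 20 virtual lanes; the helper name `tc_latency_ns` is hypothetical.

```python
BLOCKS_PER_TC_BLOCK = 8        # eight 66-bit blocks per 513-bit transcoded block
BITS_PER_BLOCK = 66
AGGREGATE_RATE_GBPS = 103.125  # assumed: 4 physical lanes x 25.78125 Gb/s


def tc_latency_ns(share_of_aggregate):
    """Time to collect 8 x 66 bits on a path carrying this share of the aggregate rate."""
    rate_gbps = AGGREGATE_RATE_GBPS * share_of_aggregate
    # bits divided by Gb/s gives nanoseconds
    return BLOCKS_PER_TC_BLOCK * BITS_PER_BLOCK / rate_gbps


all_lanes = tc_latency_ns(1)          # across all four physical lanes
one_physical = tc_latency_ns(1 / 4)   # across a single physical lane
one_virtual = tc_latency_ns(1 / 20)   # across a single virtual lane

for label, value in (("all 4 physical lanes", all_lanes),
                     ("single physical lane", one_physical),
                     ("single virtual lane", one_virtual)):
    print(f"{label}: {value:.2f} ns")
```

The computed values (5.12 ns, 20.48 ns, 102.4 ns) agree with the slide figures of 5.1 ns, 20.4 ns and 102 ns up to truncation.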

### 512B/513B transcoding

- A 512B/513B transcoding scheme was proposed for 40 Gb/s Ethernet in trowbridge\_01\_0707.pdf
- We can use 512B/513B TC for 100GBASE-R if there are not more than 16 types of control blocks
- All control blocks in a 513-bit transcoded block are at the top after reshuffling



Figure: reshuffling of eight 66-bit blocks into one transcoded block. BEFORE: 8 × 66B (512-bit payload). AFTER: 513 bits.

### 4-bit encoding of control block type

For transcoding across a single physical lane there are 11 + 5 = 16 control block types per physical lane which can be encoded by 4 bits. In the table below, we assume that the five alignment markers in a physical lane are CB types 0xC1, 0xF5, 0xA0, 0x5C and 0xC4 from PCS virtual lanes 0, 4, 8, 12 and 16. Each physical lane has a separate 4-bit encoding table. However, all four 4-bit encoding tables agree in their first 11 entries.

| CB<br>type    | 0x1E | 0x78 | 0x4B | 0x87 | 0x99 | 0xAA | 0xB4 | 0xCC | 0xD2 | 0xE1 | 0xFF | 0xC1 | 0xF5 | 0xA0 | 0x5C | 0xC4 |
|---------------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|
| 4-bit<br>code | 0000 | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 | 1000 | 1001 | 1010 | 1011 | 1100 | 1101 | 1110 | 1111 |

The first 11 codes (0000 to 1010) identify the 11 control blocks; the last 5 codes (1011 to 1111) identify the 5 alignment markers.
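The per-lane table above can be generated from the 11 block type fields and the first octet M<sub>0</sub> of each alignment marker. The sketch below is illustrative only: the function name `encoding_table` is hypothetical, and the rule that physical lane p carries virtual lanes p, p+4, ..., p+16 is an assumption generalized from the PL0 example (virtual lanes 0, 4, 8, 12 and 16).

```python
# Block type fields of the 11 control block formats (Figure 82-5).
CONTROL_BLOCK_TYPES = [0x1E, 0x78, 0x4B, 0x87, 0x99, 0xAA,
                       0xB4, 0xCC, 0xD2, 0xE1, 0xFF]

# First octet M0 of the alignment marker of each PCS virtual lane (Table 82-2).
AM_FIRST_OCTET = [0xC1, 0x9D, 0x59, 0x4D, 0xF5, 0xDD, 0x9A, 0x7B, 0xA0, 0x68,
                  0xFD, 0xB9, 0x5C, 0x1A, 0x83, 0x35, 0xC4, 0xAD, 0x5F, 0xC0]


def encoding_table(physical_lane, num_physical_lanes=4):
    """Map each of the 16 control block types on one physical lane to a 4-bit code."""
    # Assumption: physical lane p carries virtual lanes p, p+4, p+8, p+12, p+16.
    virtual_lanes = range(physical_lane, len(AM_FIRST_OCTET), num_physical_lanes)
    cb_types = CONTROL_BLOCK_TYPES + [AM_FIRST_OCTET[v] for v in virtual_lanes]
    return {cb_type: code for code, cb_type in enumerate(cb_types)}


tables = [encoding_table(p) for p in range(4)]
for cb_type, code in tables[0].items():
    print(f"0x{cb_type:02X} -> {code:04b}")
```

For PL0 this reproduces the table above; the other three tables differ only in their last five entries.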

For transcoding across all four physical lanes (PL), a single 4-bit encoding table can be constructed. Each PL carries 1 primary alignment marker p-AM (e.g. Marker 0 in PL0) followed by 4 secondary alignment markers s-AM (e.g. Markers 4, 8, 12, 16 in PL0). The two header bits of the primary alignment markers are dropped. The first byte of all 20 - 4 = 16 secondary alignment markers can be mapped into a single byte value that indicates the last unused control block type. This gives a total of 11 CB + 4 \* 1 p-AM + 1 s-AM = 16 control block types, so 512B/513B transcoding can be used.

### Total latency of transcoding and FEC

- Total latency of transcoding and FEC has two contributors
  - At the transmitter: Encoding latency = Transcoding latency + FEC encoding latency
  - At the receiver: Decoding latency = Inverse transcoding latency + FEC decoding latency
- We assume the FEC code is an (N,K) block code over GF(2<sup>m</sup>) and all K m-bit symbols are available in a buffer. The minimum achievable latency for one-shot FEC encoding is 1 multiplication over GF(2<sup>m</sup>) + 4 \* four-input XOR gate delay for K < 258 and is therefore negligible.
- FEC across all four physical lanes as proposed in wang\_01\_0511



Total latency ~ Transcoding latency + 2x to 3x FEC block latency
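The rule of thumb above can be turned into numbers. The sketch below assumes that "FEC block latency" means the time to transmit one N-symbol code word striped over all four physical lanes; that reading, and the function names, are assumptions rather than definitions from the slides, although they reproduce the latency ranges quoted later for RS(240,228).

```python
def fec_block_latency_ns(n, m, line_rate_gbps, lanes=4):
    """Time to send one code word of n m-bit symbols over `lanes` parallel lanes."""
    return n * m / (lanes * line_rate_gbps)  # bits / (Gb/s) = ns


def total_latency_ns(tc_latency, n, m, line_rate_gbps):
    """Return (low, high) = TC latency + 2x and 3x the FEC block latency."""
    block = fec_block_latency_ns(n, m, line_rate_gbps)
    return tc_latency + 2 * block, tc_latency + 3 * block


# RS(240,228) with m=9 at a per-lane line rate of 26.3671875 Gb/s.
LINE_RATE = 26.3671875
TC_ALL_LANES_NS = 5.12       # 528 bits at 103.125 Gb/s
TC_SINGLE_LANE_NS = 20.48    # 528 bits at 25.78125 Gb/s

across_all = total_latency_ns(TC_ALL_LANES_NS, 240, 9, LINE_RATE)
across_one = total_latency_ns(TC_SINGLE_LANE_NS, 240, 9, LINE_RATE)
print("TC across all 4 physical lanes: %.1f to %.1f ns" % across_all)
print("TC across one physical lane:    %.1f to %.1f ns" % across_one)
```

Rounded to whole nanoseconds, the two ranges come out as 46 to 67 ns and 61 to 82 ns.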

### Transcoding across a physical lane: Part 1



- RS(240,228) code with t=6 error correction capability and m=9 symbol size
- Properties
  - 10240 FEC code words within 1 alignment period satisfying alignment proposal in gustlin\_02a\_0511
  - 4 transcoded blocks within 1 FEC code word
  - 2.27% overclocking and 156.25 MHz clock multiplier is 168 3/4
  - total latency of 61 to 82 ns
  - 513-bit FEC striping
- Other FEC options
  - RS(244,228) FEC code with t=8, m=9
  - RS(248,228) FEC code with t=10, m=9
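The structural properties listed above follow from the code parameters. The sketch below is a hedged check: it assumes an alignment period of 2<sup>14</sup> 66-bit blocks per virtual lane (16383 blocks plus one marker) over 20 virtual lanes and a base multiplier of 165 (25.78125 Gb/s = 165 × 156.25 MHz); the function name `codeword_properties` is hypothetical.

```python
from fractions import Fraction

VIRTUAL_LANES = 20
BLOCKS_PER_MARKER_PERIOD = 2**14  # 16383 blocks + 1 alignment marker per virtual lane
BASE_MULTIPLIER = 165             # 25.78125 Gb/s = 165 x 156.25 MHz


def codeword_properties(n, k, m):
    """Return (transcoded blocks per code word, code words per alignment period,
    multiplier of the 156.25 MHz clock) for RS(n,k) with m-bit symbols."""
    tc_blocks, remainder = divmod(k * m, 513)
    if remainder:
        raise ValueError("payload is not a whole number of 513-bit blocks")
    blocks_66b = 8 * tc_blocks  # 66-bit blocks carried by one code word
    codewords = Fraction(VIRTUAL_LANES * BLOCKS_PER_MARKER_PERIOD, blocks_66b)
    multiplier = (BASE_MULTIPLIER * Fraction(64, 66)
                  / (Fraction(512, 513) * Fraction(k, n)))
    return tc_blocks, codewords, multiplier


rs_240 = codeword_properties(240, 228, 9)
rs_472 = codeword_properties(472, 456, 9)
print(rs_240)
print(rs_472)
```

Both code word counts are integers, which is what an integer number of FEC code words per alignment period requires.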

### Transcoding across a physical lane: Part 2



- RS(472,456) code with t=8 error correction capability and m=9 symbol size
- Properties
  - 5120 FEC code words within 1 alignment period satisfying alignment criterion
  - 8 transcoded blocks within 1 FEC code word
  - 0.6% overclocking and 156.25 MHz clock multiplier is 165 15/16
  - total latency of 102 to 143 ns
  - 513-bit FEC striping
- Other FEC options
  - RS(476,456) FEC code with t=10, m=9

### Transcoding across all physical lanes: Part 1



- RS(472,456) code with t=8 error correction capability and m=9 symbol size
- Properties
  - 5120 FEC code words within 1 alignment period satisfying alignment criterion
  - 8 transcoded blocks within 1 FEC code word
  - 0.6% overclocking and 156.25 MHz clock multiplier is 165 15/16
  - total latency of 87 to 128 ns
  - header bit of 513B transcoded block rotated across all 4 physical lanes

### Transcoding across all physical lanes: Part 2



- RS(352,342) code with t=5 error correction capability and m=12 symbol size
- Properties
  - 5120 FEC code words within 1 alignment period satisfying alignment criterion
  - 8 transcoded blocks within 1 FEC code word
  - 0% overclocking and 156.25 MHz clock multiplier is 165
  - total latency of 87 to 128 ns
  - header bit of 513B transcoded block rotated across all 4 physical lanes
  - 2\*t = 10 parity symbols not divisible among 4 physical lanes
  - In this example PL0 and PL2 have 86 information symbols + 2 parity symbols whereas PL1 and PL3 have 85 information symbols + 3 parity symbols
  - 13-bit error burst at DFE output always corrupts at most two symbols in one FEC code word
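The symbol split and the burst claim for RS(352,342) can be checked by brute force. This is a minimal sketch that assumes the burst is contiguous and that consecutive bits seen at the DFE output belong to consecutive m-bit symbols of the same code word; the function name `max_symbols_hit` is hypothetical.

```python
def max_symbols_hit(burst_bits, m):
    """Worst-case number of m-bit symbols touched by a contiguous burst,
    taken over every possible start position within a symbol."""
    return max(
        len({(start + i) // m for i in range(burst_bits)})
        for start in range(m)
    )


# Per-lane split from the example above: (information symbols, parity symbols).
lane_split = {"PL0": (86, 2), "PL1": (85, 3), "PL2": (86, 2), "PL3": (85, 3)}
total_info = sum(info for info, _ in lane_split.values())
total_parity = sum(parity for _, parity in lane_split.values())

print("information symbols:", total_info, "parity symbols:", total_parity)
print("13-bit burst, m=12:", max_symbols_hit(13, 12), "symbols")
print("14-bit burst, m=12:", max_symbols_hit(14, 12), "symbols")
```

Under these assumptions 13 bits is the longest burst that is guaranteed to stay within two 12-bit symbols, and every lane carries 88 symbols.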



- RS(234,228) code with t=3 error correction capability and m=9 symbol size
- Properties
  - 10240 FEC code words within 1 alignment period satisfying alignment criterion
  - 4 transcoded blocks within 1 FEC code word
  - 0% overclocking and 156.25 MHz clock multiplier is 165
  - total latency of 61 to 82 ns
  - 513-bit FEC data striping, 15-bit FEC parity striping
  - 2\*t = 6 parity symbols not divisible among 4 physical lanes
  - end of FEC code word contains 2 parity symbols that are split into 4 physical lanes together with 6 dummy bits
    - e.g., PL0 and PL2 can have one 9-bit parity symbol + 4-bit split parity + 2 dummy bits = 15 bits
    - e.g., PL1 and PL3 can have one 9-bit parity symbol + 5-bit split parity + 1 dummy bit = 15 bits
  - 10-bit error burst at DFE output always corrupts at most 2 symbols in one FEC code word
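The parity striping for RS(234,228) can be verified with a simple bit budget. The sketch below only restates the example split given above; the variable names are illustrative, and the final comparison with 32 × 66 bits is offered as one way to read the 0% overclocking figure, not as a statement from the slides.

```python
M = 9              # symbol size in bits
N, K = 234, 228    # RS(234,228)
LANES = 4

parity_symbols = N - K                         # 2*t = 6
whole_per_lane = parity_symbols // LANES       # 1 whole parity symbol per lane
leftover_bits = (parity_symbols % LANES) * M   # 2 symbols = 18 bits to split

# Example split from the text: (split parity bits, dummy bits) per lane.
lane_tail = {"PL0": (4, 2), "PL1": (5, 1), "PL2": (4, 2), "PL3": (5, 1)}
tail_bits = {lane: whole_per_lane * M + split + dummy
             for lane, (split, dummy) in lane_tail.items()}

split_total = sum(split for split, _ in lane_tail.values())
dummy_total = sum(dummy for _, dummy in lane_tail.values())
bits_on_wire = N * M + dummy_total  # code word plus dummy bits

print(tail_bits)
print("bits per code word on the wire:", bits_on_wire)
```

Each lane ends with 15 bits, the split parity adds up to the 18 leftover bits, and the 2106-bit code word plus 6 dummy bits occupies 2112 bits, the same as the 32 original 66-bit blocks it carries.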

### Code Comparison

- (1): TC across all 4 physical lanes
- (2): TC across a single physical lane
- (3): 10GBASE-KR
- (4): bhoja\_01\_0911
- (5): proposed by John Ewen, IBM

10 GHz / 64 = 156.25 MHz

| TC        | FEC                 | K    | N    | t         | m  | Line Rate [Gb/s] | Latency [ns]                                         | Overclocking [%] | Multiplier of 156.25 MHz |
|-----------|---------------------|------|------|-----------|----|------------------|------------------------------------------------------|------------------|--------------------------|
| 512B/513B | RS                  | 228  | 240  | 6         | 9  | 26.36718         | 46 - 67 <sup>(1)</sup><br>61 - 82 <sup>(2)</sup>     | 2.3              | 168 3/4                  |
| 512B/513B | RS                  | 228  | 244  | 8         | 9  | 26.80664         | 46 - 67 <sup>(1)</sup><br>61 - 82 <sup>(2)</sup>     | 4                | 171 9/16                 |
| 512B/513B | RS                  | 228  | 248  | 10        | 9  | 27.24609         | 46 - 67 <sup>(1)</sup><br>61 - 82 <sup>(2)</sup>     | 5.7              | 174 3/8                  |
| 512B/513B | RS                  | 456  | 472  | 8         | 9  | 25.92773         | 87 - 128 <sup>(1)</sup><br>102 - 143 <sup>(2)</sup>  | 0.6              | 165 15/16                |
| 512B/513B | RS                  | 456  | 476  | 10        | 9  | 26.14746         | 87 - 128 <sup>(1)</sup><br>102 - 143 <sup>(2)</sup>  | 1.4              | 167 11/32                |
| 512B/513B | RS                  | 228  | 234  | 3         | 9  | 25.78125         | 46 - 67 <sup>(1)</sup><br>61 - 82 <sup>(2)</sup>     | 0                | 165                      |
| 512B/513B | RS                  | 342  | 352  | 5         | 12 | 25.78125         | 87 - 128 <sup>(1)</sup><br>102 - 143 <sup>(2)</sup>  | 0                | 165                      |
| 64B/65B   | Fire <sup>(3)</sup> | 2080 | 2112 | 11B burst | 1  | 25.78125         | 20                                                   | 0                | 165                      |
| 64B/65B   | RS <sup>(4)</sup>   | 208  | 224  | 8         | 10 | 27.34375         | 41 - 61                                              | 6.1              | 175                      |
| 64B/65B   | RS <sup>(4)</sup>   | 416  | 448  | 16        | 10 | 27.34375         | 82 - 123                                             | 6.1              | 175                      |
| 64B/65B   | RS <sup>(4)</sup>   | 104  | 112  | 4         | 10 | 27.34375         | 20 - 31                                              | 6.1              | 175                      |
| 64B/65B   | RS <sup>(5)</sup>   | 260  | 272  | 6         | 10 | 26.56250         | 51 - 77                                              | 3                | 170                      |
| 64B/65B   | RS <sup>(5)</sup>   | 260  | 280  | 10        | 10 | 27.34375         | 51 - 77                                              | 6.1              | 175                      |




- 512B/513B transcoding reduces line rate, overclocking, channel loss and power consumption when compared to 64B/65B transcoding
- Several FEC options for 512B/513B transcoding were proposed and compared with each other and with other FEC proposals in terms of total latency, line rate, overclocking, error correction capability and multiplier of the 156.25 MHz clock
  - The RS(352,342) code with m=12 correcting t=5 symbols has 0% overclocking, with a 156.25 MHz clock multiplier of 165. Transcoding can be done across all four physical lanes or across a single physical lane.
  - The low-latency RS(234,228) code with m=9 correcting t=3 symbols has 0% overclocking, with a 156.25 MHz clock multiplier of 165. Transcoding can be done across all four physical lanes or across a single physical lane.
- Performance evaluation/comparison of proposed coding schemes in terms of bit error rate at the output of FEC decoder for various types of channels will guide the selection of a suitable TC/FEC scheme for 100 Gb/s backplane and copper links