The Split-Band Proposal for Gigabit Ethernet over 100m UTP-5

Sailesh K. Rao
Silicon Design Experts, Inc.
Ph: (908)-972-0707 x11
e-mail: sailesh@sde.com

IEEE 802.3 GTF, Coeur d'Alene, ID. September 9-11, 1996.

1. Main Topics
- Overview of the Split Band Proposal
- Motivation for the Split Band Approach
- The Physical Coding Sublayer
- The Physical Medium Attachment Sublayer
- Emissions Compliance and EMI Susceptibility
- Towards a Consensus for the Gigabit Ethernet Copper PHY standard.

2. Overview of the Split Band Proposal

3. Objectives of the Split Band Proposal
- Achieve 1Gb/s Full-Duplex over 100m 4-pair UTP-5 with a BER of 10^-10
  - under John Creigh's extreme worst-case conditions
  - with reasonable complexity PHY transceiver
- Maximize reuse of Transceiver Blocks for 100BASE-Tx or 100BASE-T2 fallback implementations.
- Facilitate implementation of "no-brainer" plug-in features:
  - Polarity, pair-swap, differential delay etc.
- Maximize use of DSP components to ease future process migration:
  - Gigabit Ethernet NIC on a CMOS chip

4. Organization of the Split-Band Approach
- The Physical Coding Sublayer
  - Transmit Section: Converts raw 1Gb/s bits into 25MBaud Split-Band symbols
  - Receive Section: Converts Split-Band symbols back to properly aligned bits
- The Physical Medium Attachment Sublayer
  - Transmit Section: Converts Split-Band symbols to differential voltages on the wire.
  - Receive Section: Processes received differential voltages back to Split-Band symbols, while compensating for impairments.

5. PCS Main Features
- State-synchronized 4-pair Cipher-Text scrambler
- Simple descrambler state recovery
- Simple pair-swap detection and correction
- Simple polarity detection and correction
- Simple differential delay detection and compensation
- Gray-coded constellations throughout
- Nearest neighbor symbol-error leads to ONE bit error
- Loss of End-of-Packet delimiter leads to continuous invalid symbols during idle.
- Large excess code-space for creative uses during idle.
- Continuous receiver performance monitoring during idle.

6. PMA Sublayer Main Features
- Spectral energy on the line confined to below 62.5 MHz
- Less stringent hybrid bandwidth requirements
- Uses maximum clock rate of 125MHz
- Compatible with proposed MII
- Compatible with 100BASE-Tx
- Simplifies analog circuitry
- 4-pair average PSD comparable to a single 100BASE-Tx transmitter, assuming 4V peak-to-peak 1000BASE-T signal.
- Lower and Upper Bands operate at 25MBaud symbol rate.
- Compatible with 100BASE-T2.
- Digital Split-Band composition and decomposition.
- Uses exactly one 125MHz DAC and only one 125MHz ADC per wire-pair.

7. Block Diagram of Transmitter

8. Block Diagram of Receiver

9. Post San-Diego Changes
- Then: Optimized for SNR in the presence of self-NEXT
- Now: Optimized for System Complexity

10. Motivation for the Split-Band Approach

11. Factors that Necessitate Splitting
- EMC Considerations
- Ability to tailor upper band energy for emissions
- Dynamic Range of Line Equalization
- DSP Complexity of Line Equalization
- Echo Cancellation Requirements
- DSP Complexity of Echo Cancellation
- Self-NEXT Cancellation Requirements
- DSP Complexity of self-NEXT cancellation
- Complexity of Clock Recovery

12. Dynamic Range of Line Equalization
- Dynamic range dictates coefficient precision requirements and therefore translates to cost.
- Without splitting, we would need 27dB over 100MHz BW.
- In split-band approach, lower band needs (13+9) dB and upper band needs (7+3) dB.
[Figure for dynamic range]

13. DSP Complexity of Line Equalization
- Each band in the Split-Band approach has a much lower group-delay distortion across the band.
- Attenuation roughness should not be an issue in either band.
- Equalizer lengths can be reduced.

--------------------------------------------------------------------------------------
| Approach   | No. of Bands | Symbol Rate | Oversampling | Eq. Length | Complexity   |
--------------------------------------------------------------------------------------
| Split Band | 2            | 25 MHz      | 5X           | 4T         | 1 Billion    |
--------------------------------------------------------------------------------------
| PAM3x3     | 1            | 83.33 MHz   | 3X           | 9T         | 2.25 Billion |
--------------------------------------------------------------------------------------

- Assume 9T equalizer (from Wakefield/GEA presentation) is sufficient even in the presence of attenuation roughness.
- Difference of 1.25 Billion Units (multiply-adds) is significant.

14. Echo Cancellation Requirements
- Worst-case Echo cancellation requirements can be estimated in the frequency domain with
    ECR(f) = A(f) - ERL(f) + Required_SNR + Desired Margin
[Figure for Echo Cancellation Requirements]
- Need more bits to represent Echo canceller coefficients if we don't split.

15. DSP Complexity of Echo Cancellation
- ASSUMPTIONS:
  - Echo Canceller span must be at least 1.2 microseconds since worst-case reflection losses of far-end connectors cannot be ignored if attenuation of wiring happens to be LOW (e.g. Ultra-CAT5, CAT-6)
  - Complexity of Echo Canceller tap for split-band approach is less than 2X the complexity for PAM3x3

----------------------------------------------------------------------------------------
| Approach   | No. of Bands | Symbol Rate | Tap Complexity | EC. Length | Complexity   |
----------------------------------------------------------------------------------------
| Split Band | 2            | 25 MHz      | 2X             | 30T        | 3 Billion    |
----------------------------------------------------------------------------------------
| PAM3x3     | 1            | 83.33 MHz   | 3X             | 100T       | 8.33 Billion |
----------------------------------------------------------------------------------------

- Difference of 5.33 billion units (additions) is significant.

16. self-NEXT Cancellation Requirements
- Worst-case self-NEXT cancellation requirements can be estimated in the frequency domain with
    NCR(f) = A(f) - Next_Loss(f) + Required_SNR + Desired Margin
[Figure for Echo Cancellation Requirements]
- Need more stringent self-NEXT cancellation if we don't split.

17. DSP Complexity of self-NEXT Cancellation
- ASSUMPTIONS:
  - self-NEXT Canceller span is 0.4 microseconds for both Split-Band approach and PAM3x3 approach.
  - Complexity of NEXT Canceller tap for split-band approach is less than 2X the complexity for PAM3x3

----------------------------------------------------------------------------------------
| Approach   | No. of Bands | Symbol Rate | Tap Complexity | NC. Length | Complexity   |
----------------------------------------------------------------------------------------
| Split Band | 2            | 25 MHz      | 2X             | 3 X 10T    | 3 Billion    |
----------------------------------------------------------------------------------------
| PAM3x3     | 1            | 83.33 MHz   | 3X             | 3 X 33T    | 8.33 Billion |
----------------------------------------------------------------------------------------

- Difference of 5.33 billion units (additions) is significant.

18. Complexity of Clock Recovery
- Split-Band approach: Recover 25MHz symbol clock from lower band data using decision-directed algorithm.
- Lower Band SNR is positive even under worst-case conditions
[Figure to support claim]
- Trickier situation for 83.33MHz symbol clock recovery in PAM3x3 approach.
[Figure to support claim]

19. Conclusions
- From a technical standpoint, the split-band proposal achieves the desired objectives at a much lower cost.
  - half the complexity for line equalization.
  - one-third the complexity for echo cancellation.
  - one-third the complexity for self-NEXT cancellation.
  - much simpler timing recovery.
- The only drawback is the slight increase in transmitter complexity.
  - negligible compared to the advantages gained in receiver complexity.
- Technically, one can split to more than 2 bands and improve on the gains, but
  - this increases latency to unacceptable levels.

20. The Split Band Proposal for Gigabit Ethernet over 100m UTP-5
    The Physical Coding Sublayer

21. PCS in a Nutshell
- Transmit Section
  - Parallel-parallel conversion from 125MHz X 8bits MII word to 25MHz X 10bits X 4pairs.
  - Scramble 10-bit words with state-synchronized cipher-text scrambler and map to lower band and upper band symbols
  - Generate ESC symbols for SOP/EOP
  - Active Idle encoding
- Receive Section
  - Recover descrambler state, polarities, pair-identities, differential delays.
  - Align symbols, decode, descramble to get raw bits.
  - Parallel-parallel conversion from 25MHz X 10bits X 4pairs to 125MHz X 8bits MII word.

22. Parallel-parallel conversion
[Figure]

23. Symbol Encoding - DATA mode
[Figure]

24. Lower Band Constellation - DATA mode
[Figure]

25. Upper Band Constellation - DATA/IDLE mode
[Figure]

26. SOP/EOP Generation
- Generate Lower Band ESC symbols on appropriate wire-pairs for SOP/EOP.
- Assume 2-byte SOP/EOP is acceptable.
[Figure]

27. State-synchronized Cipher-Text scrambler
- Features:
  - Scramblers for wire-pairs A, B, C and D obey the generating polynomial:
    - 1 + x^13 + x^33 (Master PHY)
    - 1 + x^20 + x^33 (Slave PHY)
    - identical to 100BASE-T2.
  - The states are statistically uncorrelated with respect to each other, but have large fixed delay relationship between them
    - they are synchronized.
  - The fixed relationship ensures ease of differential delay compensation at the receiver.

28. State Synchronized Cipher-Text scrambler
[Figure]

29. Random Word Generator
- To scramble the 10-bit data.
[Figure]
- To randomize the sign of the lower-band symbols.
[Figure]

30. Active Idle Signalling
- In the upper band, simply encode the random word and transmit.
- In the lower band, send a small subset of points (as in 100BASE-T2) so that
  - the signal on the line is statistically equivalent to that sent in data mode.
  - the scrambler state is encoded and can be easily recovered at the receiver.
  - pair identity is transmitted continuously.
  - polarity is transmitted continuously.
  - Receiver OK/not_OK information is transmitted continuously.

31. Lower Band Constellation - IDLE Mode
[Figure]

32. Transmitting Scrambler State
[Figure]

33. Transmitting Other Info
[Figure]

34. Conclusions
- Simple PCS encoding and decoding scheme allows for
  - ease of descrambler state recovery
  - automatic polarity detection and correction
  - automatic differential delay detection and correction
  - automatic pair identity detection
  - continuous receiver status transmission
- Similar to 100BASE-T2 scheme
- Large excess code space for other creative uses during idle
- Ease of idle-mode performance monitoring

35. The Split Band Proposal for Gigabit Ethernet over 100m UTP-5
    The Physical Medium Attachment Sublayer

36. PMA in a Nutshell
- Transmit Section
  - Convert Lower Band and Upper Band symbols into differential voltages on the wire
    - digital square root raised cosine pulse shaping
    - 125MHz 8-bit DAC
    - 70MHz 4th order Butterworth filter for EMC
  - Hybrid transformer coupling onto UTP-5 cabling to ensure continuous full-duplex operation.
- Receive Section
  - Convert differential voltages to Lower Band and Upper Band symbols
    - receive filtering
    - 125MHz 7-bit ADC
    - adaptive DFE, self-NEXT and echo cancellation

37. PMA Transmit Block Diagram
- 20% excess bandwidth, 15MHz center frequency, square-root raised cosine filter in lower band
- 20% excess bandwidth, 47.5MHz center frequency, square-root raised cosine filter in upper band
[Figure]

38. Transmit Spectrum with Adjacent Channel Interference (ACI) revealed.
[Figure]

39. PMA Receive Block Diagram
[Figure]

40. Startup
- Blind startup in the lower band
  - widely separated points during idle
  - positive SNR even in the presence of worst-case impairments
  - ease of descrambler state recovery
    - little point received: descrambler bit is ONE
    - big point received: descrambler bit is ZERO
- Once descrambler state, polarity etc. are recovered, do reference-directed startup in upper band

41. Example of Startup Simulation
[Figure]

42. Received Lower Band Idle Constellation
[Figure]

43. Received Upper Band Idle Constellation
[Figure]

44. John Creigh's Extreme Worst Case Simulations
- Extreme Worst-case conditions
  - Temperature of Cable is 60 deg. C
  - 100m Worst-Case attenuation cable
  - 3 worst-case self-NEXT couplings
    - used SDE[0:2] curves
  - 1 worst-case echo coupling
    - used SDE echo curve
- Split-Band Receiver parameters
  - 4-pole 2-zero receive filter
  - 7-bit ideal 125MHz ADC
  - 5th order separation filters each using 5 adders
  - 4T T/5 DFE with 4T feedback taps, 30T echo cancellers, 10T self-NEXT cancellers
- Receiver parameters chosen to achieve 10^-10 BER under these conditions

45. Extreme Worst-Case Simulation - Lower Band
[Figure]

46. Extreme Worst-Case Simulation - Upper Band
[Figure]

47. Extreme Worst-Case Simulation - Data Mode
[Figure]

48. Extreme Worst-Case Simulation - BL NEXTs
[Figure]

49. Extreme Worst-Case Simulation - BL NEXTs - 16T NC
[Figure]

50. Conclusions
- While both the Split-Band receiver and PAM3x3 receiver can be made to have positive margin under extreme worst-case conditions,
  - Split-Band receiver uses half the complexity for line equalization.
  - Split-Band receiver uses one-third the complexity for echo cancellation.
  - Split-Band receiver uses one-third the complexity for NEXT cancellation.
- Simple startup
- Transmitter and Receiver can be employed to do 100BASE-Tx/T2 operation:
  - self-NEXT cancellation improves performance of 100BASE-Tx solution.

52. The Split Band Proposal for Gigabit Ethernet over 100m UTP-5
    Emissions Compliance and EMI Susceptibility

53. Approach
- Emissions Compliance
  - match transmit spectrum with respect to 100BASE-Tx
- EMI Susceptibility
  - Crane test

54. 4V pk-pk one-pair 1000BASE-T vs. 100BASE-Tx
[Figure]

55. 4V pk-pk four-pair 1000BASE-T vs. 100BASE-Tx
[Figure]

56. 3V pk-pk four-pair 1000BASE-T vs. 100BASE-Tx
[Figure]

57. EMI Susceptibility: Crane Test

58. Achievements in the Split Band Proposal
- Achieved 1Gb/s full-duplex over 100m 4-pair UTP-5 with a BER of 10^-10
  - under John Creigh's extreme worst-case conditions
  - with reasonable complexity transceiver
- Maximized reuse of Transceiver Blocks for 100BASE-Tx or 100BASE-T2 fallback implementations
- Facilitated implementation of "no-brainer" plug-in features:
  - polarity correction, pair-swap correction, differential delay compensation etc.
- Maximized use of DSP components to ease future process migration:
  - maximum clock rate of 125MHz
  - possible in standard production CMOS processes today.
- Gigabit Ethernet NIC on a CMOS chip is not a pie-in-the-sky concept.
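
Worked example: complexity figures of slides 13, 15 and 17

The complexity columns on slides 13, 15 and 17 appear to follow from multiplying the number of bands, the symbol rate, a per-tap cost factor and the filter span in symbol periods. The sketch below redoes that arithmetic under this reading. The class and function names are illustrative and do not come from the slides. One assumption is worth flagging: the 8.33 billion totals quoted for PAM3x3 are reproduced when its canceller tap is counted as the 1X baseline against which the split-band 2X tap is measured, so the sketch uses a factor of 1 there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterBudget:
    """One table row; field names are illustrative, not taken from the slides."""
    bands: int               # number of bands processed
    symbol_rate_mhz: float   # symbol rate of each band, in MHz
    cost_factor: float       # oversampling ratio (equalizer) or relative tap cost
    span_symbols: float      # filter span in symbol periods (T)

    def billions_per_second(self) -> float:
        """bands x symbol rate x cost factor x span, in billions of units per second."""
        units = self.bands * self.symbol_rate_mhz * 1e6 * self.cost_factor * self.span_symbols
        return units / 1e9


def span_in_symbols(span_us: float, symbol_rate_mhz: float) -> float:
    """Convert a time span in microseconds into symbol periods."""
    return span_us * symbol_rate_mhz


PAM_RATE_MHZ = 83.33  # as quoted on the slides

# Slide 13: line equalization (cost factor = oversampling ratio).
equalizer = {
    "Split Band": FilterBudget(2, 25.0, 5, 4),
    "PAM3x3": FilterBudget(1, PAM_RATE_MHZ, 3, 9),
}

# Slide 15: echo cancellation over a 1.2 microsecond span.
# ASSUMPTION: the PAM3x3 tap is the 1X baseline; the split-band tap costs 2X.
echo = {
    "Split Band": FilterBudget(2, 25.0, 2, span_in_symbols(1.2, 25.0)),
    "PAM3x3": FilterBudget(1, PAM_RATE_MHZ, 1, span_in_symbols(1.2, PAM_RATE_MHZ)),
}

# Slide 17: three self-NEXT cancellers per pair, each spanning 0.4 microseconds.
next_canceller = {
    "Split Band": FilterBudget(2, 25.0, 2, 3 * span_in_symbols(0.4, 25.0)),
    "PAM3x3": FilterBudget(1, PAM_RATE_MHZ, 1, 3 * span_in_symbols(0.4, PAM_RATE_MHZ)),
}

if __name__ == "__main__":
    tables = (("equalizer", equalizer), ("echo", echo), ("self-NEXT", next_canceller))
    for label, table in tables:
        for approach, row in table.items():
            print(f"{label:10s} {approach:10s} {row.billions_per_second():.2f} billion")
```

The span conversion also shows where the tap counts come from: 1.2 microseconds is 30 symbols at 25 MHz and about 100 symbols at 83.33 MHz, while 0.4 microseconds is 10 symbols and about 33 symbols respectively.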
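
Worked example: scrambler polynomials of slide 27 and state recovery of slide 40

Slide 27 gives the generating polynomials 1 + x^13 + x^33 (Master PHY) and 1 + x^20 + x^33 (Slave PHY), and slide 40 states that the receiver can read descrambler bits directly from the idle constellation. The sketch below is a generic linear feedback shift register for those polynomials, not the proposal's actual scrambler: the seed, the function name and the way received bits are obtained are all assumptions made for illustration. It shows the general property that makes state recovery simple: once 33 consecutive scrambler bits have been observed, a local generator loaded with them reproduces every later bit.

```python
from collections import deque
from itertools import islice

MASTER_TAPS = (13, 33)  # 1 + x^13 + x^33
SLAVE_TAPS = (20, 33)   # 1 + x^20 + x^33


def lfsr_bits(taps, state):
    """Yield s[n] = XOR of s[n - t] for t in taps.

    `state` holds the most recent max(taps) bits, oldest first.
    """
    history = deque(state, maxlen=max(taps))
    if len(history) != max(taps):
        raise ValueError("state must contain exactly max(taps) bits")
    while True:
        bit = 0
        for t in taps:
            bit ^= history[-t]
        history.append(bit)  # the oldest bit drops out automatically
        yield bit


# Hypothetical transmitter seed: any non-zero 33-bit state works for the demo.
seed = [1] + [0] * 32
transmitter = lfsr_bits(MASTER_TAPS, seed)

# Let the transmitter run for a while before the receiver starts listening.
for _ in range(500):
    next(transmitter)

# Receiver side: observe 33 consecutive scrambler bits (assumed error-free),
# load them as the local state, then free-run.
observed = list(islice(transmitter, 33))
receiver = lfsr_bits(MASTER_TAPS, observed)

actual = list(islice(transmitter, 200))
predicted = list(islice(receiver, 200))

if __name__ == "__main__":
    print("receiver tracks transmitter:", predicted == actual)
```

A real receiver would also have to cope with decision errors while collecting those 33 bits, which this sketch does not model.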
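
Worked example: rate conversion of slide 21 and band plan of slide 37

Two numerical claims in the deck can be checked with simple arithmetic. Slide 21 converts a 125MHz X 8-bit MII word stream into 25MHz X 10bits X 4pairs, and slide 6 states that the line spectrum stays below 62.5 MHz with the filters of slide 37. The sketch below assumes each band is a passband signal whose occupied width is (1 + excess bandwidth) times the symbol rate, centered on the stated center frequency. The function name is illustrative.

```python
def band_edges_mhz(center_mhz, symbol_rate_mbaud, excess_bw):
    """Return (low, high) edges of a raised-cosine shaped passband signal."""
    half_width = (1.0 + excess_bw) * symbol_rate_mbaud / 2.0
    return center_mhz - half_width, center_mhz + half_width


SYMBOL_RATE_MBAUD = 25.0  # both bands, per slide 6
EXCESS_BW = 0.20          # 20% excess bandwidth, per slide 37

lower_band = band_edges_mhz(15.0, SYMBOL_RATE_MBAUD, EXCESS_BW)
upper_band = band_edges_mhz(47.5, SYMBOL_RATE_MBAUD, EXCESS_BW)

# Slide 21: both sides of the parallel-parallel conversion, in Mb/s.
mii_rate_mbps = 125 * 8
line_rate_mbps = 25 * 10 * 4

if __name__ == "__main__":
    print("lower band edges (MHz):", lower_band)
    print("upper band edges (MHz):", upper_band)
    print("MII rate equals line rate:", mii_rate_mbps == line_rate_mbps)
```

Under these assumptions the lower band spans roughly 0 to 30 MHz and the upper band roughly 32.5 to 62.5 MHz, which is consistent with the 62.5 MHz limit on slide 6 and leaves a 2.5 MHz gap between the bands.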