
Re: [8023-CMTF] priority/class based "buffer credit" frame instead of PAUSE frame



Jian,
 
I very much appreciate your suggestion of credit-based flow control for Ethernet, instead of an ON/OFF protocol like class-based PAUSE.  The idea of preventing buffer overflows proactively rather than reactively is a powerful, widely recognized idea for data fabrics!  I'm glad to see it being discussed in the context of Ethernet.
 
I would add a note that credit-based flow control can be implemented with simple buffer sharing schemes that make very effective use of buffer memory while minimizing congestion spreading.  I've not talked about credit-based schemes for Ethernet flow control because of the pervasive PAUSE implementations in existing silicon.  It would be easier for the silicon designers to just extend the PAUSE concept up to the priority queues than it would be to introduce new functionality, no matter how effective it would be.
 
...Jeff
 


From: owner-stds-802-3-cm@LISTSERV.IEEE.ORG [mailto:owner-stds-802-3-cm@LISTSERV.IEEE.ORG] On Behalf Of Jian Liu
Sent: Sunday, July 31, 2005 7:57 AM
To: STDS-802-3-CM@LISTSERV.IEEE.ORG
Subject: [8023-CMTF] priority/class based "buffer credit" frame instead of PAUSE frame

Asif and Bob,

After reading your proposal at http://www.ieee802.org/3/ar/public/0507/brunner_1_0507.pdf, I wondered whether it would be better to change the semantics of the PAUSE frame and instead use it to pass buffer fill level information from the switch to the end-node:

A switch sends each end-node a special "PAUSE" frame that embeds buffer credits per priority/class. Initially each end-node has 0 "buffer credits". Each "PAUSE" frame increases these "buffer credits". An end-node consumes 1 "buffer credit" of the corresponding priority/class for each frame it sends, and is blocked on that particular priority/class once the corresponding "buffer credit" count reaches 0.
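The end-node side of this scheme can be sketched roughly as follows. This is only an illustration in Python, not a proposed implementation; the class name `CreditedSender` and its methods are hypothetical, and it assumes one credit is consumed per frame sent.

```python
NUM_CLASSES = 8  # priority/class 0..7, as in the frame layout of this proposal


class CreditedSender:
    """Hypothetical end-node model: per-class credit counters, starting at 0."""

    def __init__(self):
        # Initially each end-node has 0 buffer credits in every class.
        self.credits = [0] * NUM_CLASSES

    def on_credit_frame(self, new_credits):
        """Add the newly available credits carried by one credit frame."""
        if len(new_credits) != NUM_CLASSES:
            raise ValueError("expected one credit count per class")
        for cls, count in enumerate(new_credits):
            self.credits[cls] += count

    def try_send(self, cls):
        """Consume one credit and return True, or return False if blocked."""
        if self.credits[cls] == 0:
            return False  # blocked on this class only; other classes unaffected
        self.credits[cls] -= 1
        return True
```

Note that a class with no credits blocks only itself: a sender with credits in class 3 and none in class 5 can keep sending class 3 traffic.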

A well-behaved end-node that follows this scheme can use up all the buffers allocated to it in the switch without introducing loss. That is better than having the switch throttle the end-node once the corresponding buffer is close to full.

It's probably better for each buffer credit to allow the end-node to send up to 1522 bytes, rather than 9K, to make better use of the buffers in the switch.

A switch may need to accumulate "buffer credits" before sending any "buffer credit" frame to an end-node, to reduce the overhead of such frames.
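One possible way for the switch to batch credits is sketched below in Python. The threshold-based trigger, the name `CreditAccumulator`, and the default threshold of 4 are all assumptions made for illustration; the scheme itself does not prescribe when a switch should send.

```python
NUM_CLASSES = 8


class CreditAccumulator:
    """Hypothetical switch-side batching of freed buffers for one end-node."""

    def __init__(self, threshold=4):
        # threshold: total pending credits that trigger a credit frame (assumed policy)
        self.threshold = threshold
        self.pending = [0] * NUM_CLASSES

    def buffer_freed(self, cls, count=1):
        """Record freed buffers; return per-class credits to send, or None."""
        self.pending[cls] += count
        if sum(self.pending) < self.threshold:
            return None  # keep accumulating, no frame sent yet
        grant = self.pending
        self.pending = [0] * NUM_CLASSES
        return grant
```

A real switch would presumably also need a timer so that a small number of pending credits is not withheld indefinitely; that is left out here.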

One way to embed the information can be as follows:
DA=01_80_C2_00_00_01
SA=source MAC
Length/Type=88_08
MAC Control Opcode=00_02 // instead of PAUSE (00_01)
MAC Control Parameters=Total of newly available buffer credits
Class 0 newly available buffer credits // 2 bytes
...
Class 7 newly available buffer credits // 2 bytes
Reserved // 42-16=26 bytes
FCS
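As a rough sketch, the layout above could be packed like this in Python. The function name `build_credit_frame` is hypothetical. The 2-byte width of the "total" field is inferred from the reserved-byte arithmetic (it takes the place of the 2-byte pause_time, leaving 42-16=26 reserved bytes), and the FCS is left out on the assumption that the MAC appends it.

```python
import struct

DEST_MAC = bytes.fromhex("0180C2000001")  # DA=01_80_C2_00_00_01
ETHERTYPE_MAC_CONTROL = 0x8808            # Length/Type=88_08
OPCODE_BUFFER_CREDIT = 0x0002             # proposed opcode, instead of PAUSE (00_01)
RESERVED_BYTES = 26                       # 42-16=26


def build_credit_frame(src_mac, class_credits):
    """Return the 60-byte frame (without FCS) for eight per-class credit counts."""
    if len(src_mac) != 6:
        raise ValueError("source MAC must be 6 bytes")
    if len(class_credits) != 8:
        raise ValueError("expected credits for class 0..7")
    total = sum(class_credits)
    if min(class_credits) < 0 or total > 0xFFFF:
        raise ValueError("credit counts must fit in unsigned 16-bit fields")
    header = DEST_MAC + src_mac + struct.pack(
        "!HH", ETHERTYPE_MAC_CONTROL, OPCODE_BUFFER_CREDIT)
    # Total (assumed 2 bytes) followed by class 0..7, 2 bytes each, network order.
    params = struct.pack("!H8H", total, *class_credits)
    return header + params + bytes(RESERVED_BYTES)
```

With the 4-byte FCS added, the frame comes to 64 bytes, the Ethernet minimum, the same size as a PAUSE frame.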


By the way, this "buffer credit" scheme is pretty much how Fibre Channel works to avoid frame loss. It can easily be implemented in silicon and enables SCSI directly over Ethernet, just like SCSI over FC, hardware-wise.

Regards,
Jian