
Re: [EFM] RE: Pause frame usage in transport networks




Roy,

I'm not going to discuss your answer in full detail - at least not Friday 
night :-) I have a few points to make, though:

- I agree that using PAUSE in the local link is going to work better than 
PAUSE end-to-end. This is intuitive, in my opinion.

- In my opinion - which is based on both practical experience and 
theoretical understanding - PAUSE is only good for short bursts. It has 
little effect on long bursts of traffic, because the paused sender can hold 
frames for only a very short time before its transmission queue fills up, 
which forces the equipment to discard frames.

- For slightly congested links, any form of end-to-end flow control using 
closed loops will tend to be less effective than the simple use of PAUSE. 
That's because end-to-end flow control needs more time to react, so it is a 
little less efficient for brief periods of congestion, although it is 
(probably) more stable for longer periods of congestion.

- In some situations, PAUSE is going to be very effective - for example, with 
slow CPEs that need only a brief pause before they can receive one more 
frame. Interestingly enough, this is one of the common scenarios for EFM 
deployment (inexpensive and slow CPEs).

- And last, but not least... I'm not saying that PAUSE is not useful. Only 
that PAUSE should be regarded as the last safety device that keeps a 
congested link from becoming even more congested. Beyond that, a lot of care 
has to be taken when designing the network. Relying on PAUSE is not the way 
I would do it in my network (had I the money to deploy it :-).
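
To put a rough number on the short-burst point above, here is a minimal 
sketch. The 256 KiB queue, the 100 Mb/s arrival rate and the function name 
are all assumptions made up for illustration; they do not describe any 
particular equipment.

```python
def time_to_fill_queue(queue_bytes, arrival_bps, drain_bps=0.0):
    """Seconds until a transmit queue overflows.

    drain_bps=0 models a port that has been PAUSEd and sends nothing.
    Returns infinity when the queue drains at least as fast as it fills.
    """
    excess_bps = arrival_bps - drain_bps
    if excess_bps <= 0:
        return float("inf")
    return queue_bytes * 8 / excess_bps


# Hypothetical numbers: 256 KiB of queue, 100 Mb/s still arriving,
# egress fully paused.
hold_s = time_to_fill_queue(256 * 1024, 100e6)
print(f"queue absorbs the burst for about {hold_s * 1e3:.1f} ms")  # about 21.0 ms
```

Under these assumed numbers the paused port starts discarding frames after 
roughly 21 ms, so anything longer than a short burst ends in frame loss 
anyway.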

On Friday 21 February 2003 21:00, Roy Bynum wrote:
> Carlos,
>
> You said:
> "Roy, I'm really lost here. By definition, we are talking about a
> technology that is frame based. And as far as I am concerned, PAUSE is only
> effective for frame/packet[1] based services that are not time
> sensitive[2]. Most applications that demand leased-line style circuits are
> time sensitive. For these applications it is better to simply drop late
> frames - it does no good to deliver them late, as they will not be useful
> anymore. Once you lose a few frames it's up to the upper protocol to
> reconstruct or infer in some way the information lost to keep the
> application running.
> That leaves us with non-time sensitive applications that still require
> leased-line service. In this case, we're probably talking about premium
> services, that have to be carefully managed, and that probably will be
> configured with conservative bandwidth reservations. In this scenario,
> PAUSE is also probably not that useful, or at least not desirable as the
> main congestion management system. As I said, I think that PAUSE acts as a
> last safety net for abnormal situations - it is useful, it has to be
> present, but no engineer would ever be comfortable relying on it as the
> main safety device on the system.
> Am I missing something here?"
>
> Actually, you are.  We did a lot of testing of active flow control, both in
> the local link and in long haul end to end services.  What we found may not
> be intuitively apparent.
>
> In the local-only link for P2P leased line services, active flow control
> over the short distance of the link gracefully reduced the traffic
> transfer rate without affecting latency variance (the baseline latency
> variance of EOS is 125us because of the OAM frame window) and prevented
> data loss when attempting to push more traffic than the transmission line
> could support.  We tested this by turning flow control rate
> adaptation off and on and comparing the results at 90% utilization.  We did
> find that, depending on the distance between the transmission node and the
> customer equipment, the available transfer rate was slightly reduced with
> active flow control turned on.  Without active flow control rate
> adaptation, the transfer rate would go to 98-99% utilization and then
> fail dramatically due to frame loss as the available transfer rate was
> exceeded.  At that point, retransmissions reduced the effective transfer
> rate to well below the effective rate provided by the use of active flow
> control.
>
> The dropped packets and retransmissions also dramatically increased the
> latency variance relative to the end to end distance of the P2P leased line
> link.  Any form of rate adaptation that uses "dropped" packets greatly
> affects latency variance, which gets translated to apparent end to end
> latency by any application that is communicating across the P2P leased line
> link.
>
> Regardless of the protocol that is used, all TDM transmission facilities
> induce a 125us latency variance at the customer data link layer.  This is
> because the transmission convergence mapping is based on a specific payload
> window that occurs every 125us in SONET framing.  In testing we found that
> GbE has a baseline worst case latency variance of 12us regardless of the
> vendor and 100BASE-T has a latency variance of 120us regardless of the
> vendor.  We found that with multiple switches in series, and multiple
> traffic paths through the switches, the end to end latency variance
> tended to increase roughly as an RMS value instead of a directly
> additive value.
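
The difference between additive and RMS-style growth can be sketched as 
follows. This is only one reading of "somewhat RMS": it assumes each switch 
contributes an independent variance term and combines them as a 
root-sum-square. The 12us per-hop figure is the GbE value quoted above; the 
hop count and function names are hypothetical.

```python
import math


def additive_variance_us(per_hop_us):
    """Worst case if every hop's latency variance simply adds up."""
    return sum(per_hop_us)


def root_sum_square_variance_us(per_hop_us):
    """Combined variance if the per-hop contributions are independent."""
    return math.sqrt(sum(v * v for v in per_hop_us))


# Four GbE switches in series, 12us of latency variance each (assumed count).
hops = [12.0] * 4
print(additive_variance_us(hops))         # 48.0
print(root_sum_square_variance_us(hops))  # 24.0
```

With four identical hops the root-sum-square figure is half the additive 
one, and the gap widens as more switches are put in series.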
>
> Using pause frames end to end tended to greatly reduce the available data
> transfer rate in a P2P leased line link.  Compared to the 95%-97% that was
> achieved when using flow control only in the local link at each end, the
> end to end 60% utilization was a dramatic reduction.  But since there were
> no retransmissions, that 60% utilization was the true effective
> utilization.  Depending on the way that the pause frame operands are
> defined, additional latency variance that is directly related to the fixed
> distance latency can be induced.  Properly configured pause frame operands
> tended to provide the same latency variance as the local only use,
> 125us.  We can talk off line as to what those configuration differences
> are.
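
For reference, the pause frame operand is a 16-bit pause_time counted in 
quanta of 512 bit times, so the wall-clock pause it requests depends on the 
link speed. A minimal sketch of that conversion follows; the function name 
is made up for illustration, and this says nothing about the specific 
operand settings used in the tests described above.

```python
PAUSE_QUANTUM_BITS = 512   # one pause quantum is 512 bit times
MAX_PAUSE_QUANTA = 0xFFFF  # pause_time is a 16-bit field


def pause_duration_s(quanta, line_rate_bps):
    """Wall-clock time requested by a PAUSE frame with the given pause_time."""
    if not 0 <= quanta <= MAX_PAUSE_QUANTA:
        raise ValueError("pause_time must fit in 16 bits")
    return quanta * PAUSE_QUANTUM_BITS / line_rate_bps


# Longest pause a single frame can request on GbE and on 100 Mb/s Ethernet.
for name, rate_bps in (("GbE", 1e9), ("100 Mb/s", 100e6)):
    ms = pause_duration_s(MAX_PAUSE_QUANTA, rate_bps) * 1e3
    print(f"{name}: {ms:.2f} ms")
```

A pause_time of zero cancels an outstanding pause, which is why 
implementations typically send a long pause followed by an explicit zero 
rather than trying to guess the exact duration.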
>
> Basically, what I found in over 2 years of testing was that the use of
> 802.3x active flow control actually improved the performance and
> reliability of data communications more than any other type of rate
> adaptation.  For applications that are severely time critical, like the SNA
> scanning applications used by banks, the use of Ethernet as the Data Link
> protocol, with Ethernet over SONET and active flow control in the local
> link of P2P leased lines is perhaps the best possible solution.  Over a
> period of time, as more of the EOS technology gets rolled out, vendors that
> provide X.86 will be able to prove that they provide a dramatic improvement
> in performance and reliability over other EOS technologies and older
> technologies based on HDLC/SDLC/PPP as the Data Link protocol.
>
> Thank you,
> Roy Bynum
>
> At 07:59 PM 2/21/2003 -0300, Carlos Ribeiro wrote:
> >Roy, I'm really lost here. By definition, we are talking about a
> > technology that is frame based. And as far as I am concerned, PAUSE is
> > only effective for frame/packet[1] based services that are not time
> > sensitive[2]. Most applications that demand leased-line style circuits
> > are time sensitive. For these applications it is better to simply drop
> > late frames - it does no good to deliver them late, as they will not be
> > useful anymore. Once you lose a few frames it's up to the upper protocol
> > to reconstruct or infer in some way the information lost to keep the
> > application running.
> >
> >That leaves us with non-time sensitive applications that still require
> >leased-line service. In this case, we're probably talking about premium
> >services, that have to be carefully managed, and that probably will be
> >configured with conservative bandwidth reservations. In this scenario,
> > PAUSE is also probably not that useful, or at least not desirable as the
> > main congestion management system. As I said, I think that PAUSE acts as
> > a last safety net for abnormal situations - it is useful, it has to be
> > present, but no engineer would ever be comfortable relying on it as the
> > main safety device on the system.
> >
> >Am I missing something here?

-- 


Carlos Ribeiro
cribeiro@xxxxxxxxxxxxxxxx