
RE: Link Status thoughts




Rich,

(To other readers:
Fault sensing and reporting has a lot of layers of complexity so you'll
have to bear with the fact that this memo wanders over a number of
related topics.)

You state: "The Idle stream alone, with
no handshakes, is sufficient for link initialization when both DTE's
power up/reset simultaneously. "

In my opinion, each of three signals (the idle stream, the remote
fault stream, and, if adopted, the local fault stream) should be
sufficient for link initialization. At this point, all fault
signaling schemes on the table have that property. Lock can even be
obtained while receiving a mixed stream of idle and packets though it 
may take a bit longer.

Given that, I'm a bit puzzled by your bring-up scenario, particularly
step 6.
"1) The Local Device powers up first, resets, and sends Idles;
2) Since the Link Partner is powered down, it transmits nothing;
3) The Local Device receives nothing and indicates Remote Fault to its
Link Partner;
4) The Link Partner powers up, resets, and sends Idles;
5) The Local Device receives Idles and stops indicating Remote Fault to
its Link Partner. The Local Device sets Link Status to OK;
6) The Link Partner may see Remote Fault or Idle first. Reset timers
should ensure that the residual Remote Fault indication doesn't cause a
problem. The Link Partner essentially sees Idle after reset. The Link
Partner sets Link Status to OK;
7) MAC frames can now flow over the link."

Remote Fault indications should not cause any trouble, and reset
timers shouldn't be necessary to ensure that. Here is what I think the
scenario should be. (I'm going to use Device A and Device B; calling one
the local device and the other the link partner is confusing because they
are peers and those names are only relevant to a specific viewpoint.)
1) Device A powers up first and sends Idles
2) Since Device B is down, Device A receives nothing and transmits Remote
  Fault
3) Device B powers up. By the time it has finished reset, it may or may
  not have achieved sync. Therefore it may start sending Remote Fault
  or Idle. We don't know and it doesn't matter.
The following two steps may happen in either order
4A) Device A achieves sync and sends Idle
4B) Device B achieves sync and sends Idle
5) Devices A and B have now each detected that they are receiving Idle;
the link is up and everyone knows it is working.
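
The sequence above can be sketched as a small simulation. This is only an
illustrative model under simplifying assumptions that are not part of any
proposal: time advances in discrete rounds, any received stream (Idle or
Remote Fault) gives sync within one round, and a device that is up but not
in sync sends Remote Fault right away. The names Device, step, and
connected are made up for the sketch.

```python
from dataclasses import dataclass
from typing import Optional

IDLE = "Idle"
REMOTE_FAULT = "RF"


@dataclass
class Device:
    name: str
    powered: bool = False
    synced: bool = False           # receiver has locked onto an incoming stream
    last_rx: Optional[str] = None  # most recent signal received (None = nothing)

    def transmit(self) -> Optional[str]:
        # A powered-down device sends nothing. A powered device sends Idle
        # while its receiver is in sync and Remote Fault otherwise.
        if not self.powered:
            return None
        return IDLE if self.synced else REMOTE_FAULT

    def receive(self, signal: Optional[str]) -> None:
        # Assumption: either stream (Idle or RF) is enough to obtain sync,
        # and receiving nothing means sync is lost.
        self.last_rx = signal if self.powered else None
        self.synced = self.powered and signal is not None

    def link_ok(self) -> bool:
        return self.synced and self.last_rx == IDLE


def step(a: Device, b: Device, connected: bool = True):
    """Run one round: both sides transmit, then both sides receive."""
    tx_a, tx_b = a.transmit(), b.transmit()
    a.receive(tx_b if connected else None)
    b.receive(tx_a if connected else None)
    return tx_a, tx_b


# Device A powers up first; Device B is still down.
a, b = Device("A", powered=True), Device("B")
print(step(a, b))   # A sends RF, B sends nothing
b.powered = True
print(step(a, b))   # both send RF, and both gain sync from it
print(step(a, b))   # both send Idle
print(a.link_ok(), b.link_ok())
```

No reset appears anywhere in the model. Starting with both devices powered
and calling step(a, b, connected=False) before the connected rounds ends in
the same state, with both sides receiving Idle.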

The bring-up also has to work when both devices are up but not connected
and the connection is then made. It has to work when a nasty noise hit
causes one or both sides to lose lock. Therefore, it is important that it
not rely on one of them being reset. It has to come up when both sides have
detected a fault and are sending Remote Fault.

I have not been following Fibre Channel, so I don't know the details of the
proposal you made there. There appear to me to be two ways to insert a
fault indication with random /A/ spacing: either replace the /A/ with the
fault indication or put the fault indication in after the /A/. I slightly
prefer the latter because then the sync process only has to align on /A/
and it can align in the presence of the fault indication.
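
A toy sketch of the two placements may make the difference concrete. It is
not an encoding proposal: the "F" symbol standing in for the fault
indication, the 16 to 31 code-group gap between alignment slots, the random
K/R fill, and the choice to replace every /A/ in the first option are all
assumptions made for illustration.

```python
import random


def idle_with_fault(n_align, mode, fault="F", seed=0, min_gap=16, max_gap=31):
    """Build a toy code-group stream containing n_align alignment slots.

    mode="replace": the fault indication takes the place of each /A/.
    mode="after":   each /A/ is kept and the fault indication follows it.
    """
    if mode not in ("replace", "after"):
        raise ValueError("mode must be 'replace' or 'after'")
    rng = random.Random(seed)
    stream = []
    for _ in range(n_align):
        if mode == "replace":
            stream.append(fault)
        else:
            stream.extend(["A", fault])
        # Random spacing before the next alignment slot, filled with K/R.
        gap = rng.randint(min_gap, max_gap)
        stream.extend(rng.choice("KR") for _ in range(gap))
    return stream


replaced = idle_with_fault(4, "replace")
appended = idle_with_fault(4, "after")
print(replaced.count("A"), appended.count("A"))  # 0 4
```

With the fault indication placed after the /A/, every alignment slot still
carries an /A/ for the sync process to lock onto.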

I do not see any point to being able to send RF and LF at the same time.
To detect RF you need to have obtained sync to the incoming signal. 
Therefore, if you detect RF in band then an LF condition does not 
exist. If an LF condition exists you can't detect RF. They are mutually
exclusive.
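
The argument can be written down as a tiny check. This is a sketch of the
reasoning only; the function name fault_view and the two-valued rx_signal
are hypothetical, and LF is modeled simply as "no sync on the incoming
signal".

```python
def fault_view(in_sync: bool, rx_signal: str):
    """Return (local_fault, remote_fault) as seen by one receiver."""
    local_fault = not in_sync
    # The in-band RF indication can only be decoded once sync is obtained.
    remote_fault = in_sync and rx_signal == "RF"
    return local_fault, remote_fault


# Enumerate every combination: LF and RF are never both true.
for in_sync in (False, True):
    for rx_signal in ("Idle", "RF"):
        lf, rf = fault_view(in_sync, rx_signal)
        assert not (lf and rf)
```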

I can see one purpose to a Break Link signal. In a perfect world, it
wouldn't be needed. In an imperfect world, there may be a case where
something in the remote node has locked up and the remote node
isn't detecting that it has a problem, for instance a jabber problem.
I could see allowing a Break Link signal initiated by management to
cause a reset to the device at the other end of the link using the
lowest possible level of communication. I'm not convinced it is needed,
but if people want it, I'd be willing to allow a mechanism for it.

Toward the end of your memo, you assert that every link element must
be able to recognize a local fault condition and forward it and seem
to imply that it must be done in-band. I don't think it is reasonable
for 10GBASE-R PMDs and PMAs to do this in-band. They would need almost
all of the PCS layer to do it. Even for 10GBASE-X PMDs and PMAs it
seems an excessively burdensome requirement. We should specify a pin
at the XSBI for forwarding this information.

I generally support Shimon's proposal, but I would like to eliminate
the handshaking aspect. There just hasn't been time to get full consensus
on that.

Also, there has been a lot of focus on how to encode the signals, but
there are other issues that we are only beginning to understand, such
as where fault sensing lives.

These discussions, especially over the past week, have been very helpful,
and I'm happy to work with all parties on coming to a consensus
proposal.

Pat


-----Original Message-----
From: Rich Taborek [mailto:rtaborek@xxxxxxxxxxxxx]
Sent: Wednesday, November 01, 2000 10:39 PM
To: HSSG
Subject: Re: Link Status thoughts



Pat,

Once again, I agree with your general architecture for Link Status
reporting. I agree with your choices of candidate signals for the PHY.
To reiterate from your note, these consist of:

- Remote Fault
- Local Fault
- Idle = !(RF + LF)
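
Reading the "+" in the last line as a logical OR, the relationship between
the three signals can be written as a one-line predicate. The function name
is hypothetical and this is only a restatement of the list above.

```python
def idle(rf: bool, lf: bool) -> bool:
    """Idle = !(RF + LF): Idle is signaled only when neither fault is present."""
    return not (rf or lf)


# Print the full truth table.
for rf in (False, True):
    for lf in (False, True):
        print(f"RF={rf!s:5} LF={lf!s:5} Idle={idle(rf, lf)}")
```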

You also list "yet another signal" and I strongly agree that there is no
need for any other signals. This includes Break Link, a signal for which
I cannot determine a valid purpose. The initialization process works
just fine without Break Link. Break Link seems to be an artifact of
Auto-Negotiation. Since AN is not a 10GE objective, there is no
requirement for Break Link. 

Idle is issued during link initialization and continuously between
packets during normal operation in the absence of errors in the Local
and Remote DTEs as well as the connecting link. The Idle stream alone, with
no handshakes, is sufficient for link initialization when both DTEs
power up/reset simultaneously. If one DTE powers up before the
other, link initialization is a bit more complex and looks something
like this:

1) The Local Device powers up first, resets, and sends Idles;
2) Since the Link Partner is powered down, it transmits nothing;
3) The Local Device receives nothing and indicates Remote Fault to its
Link Partner;
4) The Link Partner powers up, resets, and sends Idles;
5) The Local Device receives Idles and stops indicating Remote Fault to
its Link Partner. The Local Device sets Link Status to OK;
6) The Link Partner may see Remote Fault or Idle first. Reset timers
should ensure that the residual Remote Fault indication doesn't cause a
problem. The Link Partner essentially sees Idle after reset. The Link
Partner sets Link Status to OK;
7) MAC frames can now flow over the link.

Note that Break Link is not needed during Link Initialization and only
serves to complicate the protocol.

However, Remote Fault serves a clear purpose during initialization.
Remote Fault is the report of absence of signal or sync at the DTE
receiver, sent over the transmitter of the same DTE. If you agree with the
initialization protocol above, then Remote Fault should be signaled
along with, or instead of, Idle whenever the opposite direction simplex
link is not operational.

Remote Fault is a signaling protocol. The protocol must be recognized at
the Link Partner and distinguished from the normal operational signal,
which is Idle (if you agree with the protocol above). I believe that
there is no disagreement that protocol synchronization at the receiver
is required in order to recognize any signal or message. If the Local
Device signals Remote Fault instead of Idle, the Link Partner will
either receive nothing or recognize the Remote Fault signal. "In
between" conditions, such as receiving a message on fewer than all lanes
(i.e. alignment not required), are not useful as an end-to-end signaling
protocol. The latter case denotes a non-operational link. I prefer that
the Remote Fault signaling protocol be either LSS or an alternating
Idle/Sequence. The Sequence protocol I've proposed for use for 10
Gigabit Fibre Channel is an example of an alternating Idle/Sequence
where the Sequence alternates with random AKR for random /A/ spacing
intervals.

On to Local Fault: Local Fault, as you suggest, is the condition and
signal associated with the Loss_of_Signal or Loss-of-Sync, etc.
conditions. Local Fault may be recognized and signaled up stream by any
link element including intermediate link elements such as an XGXS, PMD
retimer, WIS, etc. *** Note that this is a key decision point *** 10GE
architecture already specifies intermediate link elements. A 10GE link
may employ multiple such link elements. Each of these link elements
should be capable of both recognizing and forwarding Local Fault
conditions. If not, these elements become part of the problem rather
than the solution. The next question is whether the Local Fault signal
is in-band or out-of-band. Since pins=BAD, since this seems like an
implementation issue, and since Local Fault needs to be able to go over
the medium, it seems that Local Fault must at least be specified as an
in-band signal. The signaling requirements for Local Fault appear to be
identical to those of Remote Fault. Consequently, I propose that the
Local Fault signaling protocol be either LSS or an alternating
Idle/Sequence. 

The most complex scenario involving Remote Fault and Local Fault that I
can envision is one where both Remote Fault and Local Fault conditions
exist for the same link, in the same direction. For the sake of
completeness, the Remote Fault and Local Fault transport protocol should
allow for the indication of all combinations of these two conditions.
Note that this scenario is not covered in the latest RF/BL proposal from
Shimon Muller.

I'd be honored to work on a common Link Initialization and Link Status
Reporting protocol presentation with you for Tampa.

Best Regards,
Rich
    