Multiple Spanning Trees in 802.1Q

Norman Finn, Cisco Systems

1.0 Introduction

A number of means for adapting the spanning tree algorithm of 802.1D to the virtual LANs of 802.1Q have been discussed. At least three means have been proposed for employing one or more spanning trees in a single installation:

1. One spanning tree encompassing all physical links;
2. Any number of parallel spanning trees, up to the point of one spanning tree per VLAN; and
3. One super spanning tree for the "backbone" that connects all switches, with one or more sub-spanning trees for individual VLANs.

With a suitable choice of rules for the interaction between 802.1Q tags and 802.1D Bridge Protocol Data Units (BPDUs), the first two of these choices can be shown to be equivalent. No new protocols need be invented.

There are many ways of implementing multiple spanning trees in the context of 802.1Q. Section 2.0 gives the definition of "multiple spanning trees" used in this proposal. Section 3.0 lists the specific advantages that may be obtained if multiple spanning trees are allowed. Section 4.0 proposes a specific set of rules for the interaction between 802.1D and 802.1Q. Section 5.0 demonstrates how these rules allow a continuum of implementations and installations that range from one spanning tree per network to one spanning tree per VLAN. Section 6.0 discusses the costs of standardizing and implementing the rules of Section 4.0, and the work items for rev 2 of 802.1Q that these rules suggest.

2.0 Definition of Multiple Parallel Spanning Trees

The central idea driving this contribution is that the ability to operate multiple spanning trees in parallel offers advantages which outweigh a very small cost to the complexity of the standard.

An example of multiple parallel spanning trees is shown in Figure 1. In this diagram, three switches A, B, and C are connected via three physical links 1, 2, and 3. Two VLANs, "red" and "blue", are configured.
The red VLAN is blocked at switch B's port to physical link 1, and the blue VLAN is blocked at switch C's port to physical link 3. For traffic on the red VLAN to get from switch A to switch B, it must first traverse switch C. Blue traffic, however, can move directly from A to B. Similarly, red traffic can flow directly between A and C, while blue traffic must traverse switch B.

FIGURE 1. Example of Multiple Spanning Trees (PostScript only)

If one spanning tree were used on the physical links between switches, 802.1D would block all traffic at one physical port to one switch. Traffic between any two switches would take the same path, regardless of which VLAN it is using. This is the essential difference between one and more than one spanning tree: whether the port blocked by 802.1D is a physical port to a physical link, or a logical port to a VLAN.

3.0 Advantages of Multiple Spanning Trees

Before discussing how to make multiple spanning trees work, let us look at the reasons why they might be useful. The following examples are not intended to prove that multiple spanning trees must be used; a single spanning tree is certainly the right choice for many implementations and/or particular installations of 802.1Q. They do, however, provide very good reasons why multiple spanning trees should be supported by 802.1Q.

3.1 Simple Load Sharing

Figure 2 illustrates a simple case where multiple spanning trees provide relief from a common difficulty. Two switches A and B are connected via two physical links 1 and 2. Two VLANs, "red" and "blue", are configured to be carried on both physical links.

FIGURE 2. Simple Load Sharing (PostScript only)

If a single spanning tree is used, then one of the physical links must be unused for data traffic; it carries only 802.1D BPDUs. It serves as a hot standby for the other physical link.
If two spanning trees are used, the spanning trees for the red and blue VLANs can be configured so that one VLAN uses one physical link, and the other VLAN uses the second physical link. If either physical link fails, spanning tree will cause the other link to carry both VLANs.

There certainly exist other means of solving this problem. For example, an inverse multiplexing protocol could be used to meld links 1 and 2 into a single logical link, and share the traffic load between the two physical links. Presumably, a load sharing algorithm based on some principle other than VLANs could be more equitable than the multiple spanning tree approach. However, there is no standard algorithm for this purpose, and none has been proposed for 802.1Q version 1. Should an inverse multiplexing protocol be adopted, it would serve the multiple spanning tree model as well as any other. In the meantime, multiple spanning trees can provide one solution to this problem.

3.2 Multi-path Load Sharing

Figure 3 illustrates a type of load-sharing problem that an inverse multiplexing protocol cannot solve. With three switches, three links, and three VLANs, each physical link can carry two of the three VLANs. While this is a simplistic example, one can imagine the utility of this mechanism in more complex networks, especially in cases where many VLANs have a presence only in certain parts of the network.

FIGURE 3. Multi-path Load Sharing (PostScript only)

3.3 Path Optimization

In Figure 4, we see two switches connected to each other, via link 1, and to other switches in a VLAN cloud. Connected to both switches is a single LAN segment, link 2, to which is attached file server F. Let us suppose that the VLAN to which this file server is connected is used primarily to connect that file server to numerous clients behind both of the two switches. With one spanning tree, the loop {switch A, link 1, switch B, link 2} must be broken by cutting one of the switches' ports to link 2, let us say, B's.
Traffic from the file server to clients behind switch B must first flow through switch A and link 1 before reaching switch B. (The alternative, cutting the backbone between A and B, would usually be even worse.)

FIGURE 4. Path Optimization (PostScript only)

If link 2 is a 10 Mb Ethernet and link 1 a 1 Gb Ethernet, this is not a serious problem. If both are 100 Mb Ethernets in a computer room, the problem becomes more important.

3.4 Extending VLANs to Non-Trunk-Capable Switches

There are at least three reasons why an existing switch might be able to participate in VLANs with a software upgrade, but not be able to support an 802.1Q trunk:

1. The switch might require a hardware modification to its ports to support 802.1Q frame formats.
2. The switch might be incapable of supporting multiple VLANs on one physical port at all.
3. The switch might be incapable of supporting "baby giant" packets made longer than the physical link maximum by the length of the 802.1Q tag, in an installation where it is impractical to restrict the packet sizes sent by endstations.

In Figure 5, we see an example of a switch B connected by two simple 100 Mb Ethernet links to an 802.1Q-capable switch A that is part of a VLAN cloud. Switch B can, with a software upgrade, support a number of VLANs equal to the number of physical connections it has to switch A, but it cannot support an 802.1Q-tagged link to A. In this case, if a single spanning tree is used, then one of the two VLANs "red" or "blue" must be disconnected to break the loop {switch A, blue link, switch B, red link}. If the red and blue VLANs are on separate spanning trees, then both can operate and be part of the VLAN configuration.

FIGURE 5. Extending VLANs to Non-Trunk-Capable Switches (PostScript only)

3.5 Accidental Interruption of Backbone Connectivity

In Figure 6, we see two 802.1Q switches A and B both connected to a VLAN cloud.
Suppose two untagged ports, one from each switch, are connected by an 802.1D bridge X that is ignorant of 802.1Q. One would hope that the loop in the blue VLAN would be broken where it presumably should be broken, at one of the switches' ports to that blue VLAN, or perhaps at one of bridge X's ports. However, depending on the various 802.1D adjustable parameters programmed into the switches and bridge X, the loop might be broken on the VLAN cloud side of the bridges.

FIGURE 6. Accidental Interruption of Backbone Connectivity (PostScript only)

In the two examples in Figure 6, we see the result of an unfortunate blockage. If the blue VLAN shares a spanning tree with all other VLANs, the backbone is broken, and all VLANs are partitioned. If the blue VLAN is on a separate spanning tree, then the worst that can happen is that all blue traffic is funneled through bridge X. The path through the blue VLAN may be sub-optimal, but connectivity is maintained.

3.6 Security versus Connectivity

For any number of reasons, it may be necessary to disallow a particular VLAN from flowing through a certain switch or across a certain physical link. Reasons might include:

1. Security: One might not want to pass the Human Resources net through Engineering.
2. Volatility: One might not want to pass production networks through a development laboratory, though connectivity to the laboratory is necessary.

As a trivial example, Figure 7 shows two switches, A and B, connected by two physical links 1 and 2. The red VLAN is restricted to physical link 1, and the blue to physical link 2. If one spanning tree is employed, then one of the two VLANs must be disconnected. If the two VLANs are in different spanning trees, both may remain connected.

FIGURE 7. Security versus Connectivity (PostScript only)

3.7 Isolation from Spanning Tree Reconfiguration

Whenever the gain or loss of a link, port, or bridge causes a change in the configuration of a spanning tree, some disruption of service is possible.
In the case of the loss of a component, some interruption is extremely likely. If all switches and all VLANs are running one spanning tree, any change to any part of the spanning tree affects all VLANs. Multiple spanning trees improve this situation in two ways:

1. One may isolate certain VLANs and their spanning tree(s) to one part of the network. Changes in that spanning tree have no effect whatever on other parts of the network.
2. Having multiple spanning trees even in common areas of the network makes it less likely that any given physical backbone link carries the VLANs of any given spanning tree, and thus less likely that a failure will affect any given VLAN.

This is a corollary of Section 3.6, "Security versus Connectivity". This separation is particularly useful if one considers the reasons for security listed in Section 3.6. Isolation of the production VLANs from the laboratory VLANs is clearly desirable.

4.0 Rules for Interaction Between 802.1D and 802.1Q

The specific rules required to coordinate 802.1D and 802.1Q in order to allow (but not require) multiple spanning trees are:

1. A Spanning Tree Group (STG) comprises one or more VLANs which share the same instance of the Spanning Tree Protocol (STP) of 802.1D.
2. A separate instance of the 802.1D STP runs in each switch for each STG enabled in that switch.
3. If a given physical switch port P is enabled for carrying traffic for any VLAN belonging to STG S (and if STP is enabled for STG S on port P), then a BPDU transmitted for STG S on port P may be transmitted exactly once on some one of the STG S VLANs enabled for port P. (The robustness of the model can be improved by constraining all BPDUs for STG S to be sent on one particular VLAN belonging to STG S on port P.)
4.
If a given physical switch port P is enabled for carrying traffic for any VLAN belonging to STG S (and if STP is enabled for STG S on port P), then the switch must be able to receive BPDUs for STG S on any of the STG S VLANs enabled for port P.
5. A given physical switch port P has one 802.1D blocking state (blocked, learning, etc.) per STG (STP instance) enabled on that port. That state applies to all of the VLANs in the STG, but affects no VLANs in any other STG.

It is helpful to list some of the corollaries of these rules that can be easily derived:

6. A typical endstation port might carry exactly one VLAN, with no 802.1Q tagging. If STP is run on this port, it carries one BPDU per hello time from the STP instance associated with its STG.
7. If all of the VLANs on a "trunk port" carrying multiple VLANs are in the same STG, then that trunk carries only one BPDU per hello time.
8. If the VLANs on a "trunk port" belong to more than one STG, then there will be one BPDU passing through that port per STG per hello time.
9. A switch's total BPDU load is thus not the number of STGs times the number of physical ports. It is, typically, the same as the single spanning tree case, increased by the number of extra STGs times the number of "trunk ports".

5.0 Continuum Between ST per Network and ST per VLAN

Given the rules of Section 4.0, one may configure a network with one spanning tree or many spanning trees. If a network with 100 VLANs is configured with one STG for all 100 VLANs, it is employing the single spanning tree model. If it is configured with 100 STGs, one for each VLAN, then it is at the opposite pole. In-between configurations, with 10 VLANs in each of 10 STGs, or one STG with 90 VLANs and 5 STGs with 2 VLANs each, are equally possible. Any trade-off between flexibility and management load is possible.
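The rules and corollaries of Section 4.0 can be illustrated with a minimal sketch. The VLAN names, STG numbers, port names, and function names below are hypothetical, chosen only for illustration; nothing here is part of 802.1D or 802.1Q. The sketch counts one BPDU per hello time for each STG present on a port (rule 3, corollaries 6 through 8) and looks up a VLAN's forwarding state through its STG (rule 5).

```python
# Hypothetical configuration, invented for illustration only.
# Rule 1: each VLAN belongs to exactly one Spanning Tree Group (STG).
VLAN_TO_STG = {"red": 1, "blue": 2, "green": 2}

# VLANs enabled on each physical port of one hypothetical switch.
PORT_VLANS = {
    "trunk1": {"red", "blue", "green"},  # carries two STGs
    "trunk2": {"blue", "green"},         # carries one STG
    "host1": {"red"},                    # endstation port
    "host2": {"blue"},                   # endstation port
}

# Rule 5: one 802.1D state per (port, STG), not per VLAN.
PORT_STG_STATE = {("trunk1", 1): "blocking", ("trunk1", 2): "forwarding"}


def stgs_on_port(vlans, vlan_to_stg):
    """Return the set of STGs with at least one VLAN enabled on the port."""
    return {vlan_to_stg[v] for v in vlans}


def bpdus_per_hello(port_vlans, vlan_to_stg):
    """One BPDU per hello time for each STG present on each port."""
    return {
        port: len(stgs_on_port(vlans, vlan_to_stg))
        for port, vlans in port_vlans.items()
    }


def is_forwarding(port, vlan, vlan_to_stg, port_stg_state):
    """A VLAN forwards on a port only if its STG is forwarding there."""
    return port_stg_state[(port, vlan_to_stg[vlan])] == "forwarding"


load = bpdus_per_hello(PORT_VLANS, VLAN_TO_STG)
total_load = sum(load.values())
naive_load = len(set(VLAN_TO_STG.values())) * len(PORT_VLANS)

print(load)
print("total:", total_load, "naive STGs x ports:", naive_load)
print("red on trunk1 forwarding:",
      is_forwarding("trunk1", "red", VLAN_TO_STG, PORT_STG_STATE))
print("blue on trunk1 forwarding:",
      is_forwarding("trunk1", "blue", VLAN_TO_STG, PORT_STG_STATE))
```

In this hypothetical switch the total is 5 BPDUs per hello time: the 4 of the single spanning tree case plus 1 for the extra STG on the one trunk port that carries it, rather than the 8 that "STGs times ports" would suggest. Blocking STG 1 on trunk1 stops the red VLAN there while blue and green continue to forward.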
We attempt to illustrate the compatibility of the rules in Section 4.0 with some common current assumptions about how BPDUs should be carried between 802.1Q switches:

5.1 Physical BPDUs and Logical VLANs

One common view of the relationship between 802.1D and 802.1Q is that BPDUs and the STP apply to the physical link. STP establishes a loop-free topology of physical links over which 802.1Q-tagged VLAN frames are carried.

To map this view to the definitions and rules in Section 4.0, we simply establish one extra VLAN, an "STP VLAN". There are no ports assigned exclusively to the STP VLAN. No data frames are carried on the STP VLAN. The STP VLAN is never tagged on any port. All other VLANs are grouped with the STP VLAN into one STG, the only STG configured.

On a "trunk port", the "normal" VLANs are tagged with 802.1Q tags (or not, if implicit tagging is used). The BPDUs, and only BPDUs, are sent on the STP VLAN. (This is the reason for the last restriction in point 3 of Section 4.0.) Since this VLAN is not tagged, the BPDUs traverse the port in native mode. If STP blocks the port, all VLAN traffic except the BPDUs stops moving through the port.

On an "endstation port" carrying only one untagged VLAN V, if STP is enabled, the BPDUs are sent on VLAN V. Since VLAN V is untagged on this port, there is no difference between transmitting the BPDU on VLAN V or the STP VLAN.

5.2 Spanning Tree per VLAN

At the opposite extreme, if each VLAN is the only member of its own STG, every VLAN carries its own BPDUs. On any port (such as a "trunk port") on which the VLAN is tagged, the BPDU is tagged. On any port where the VLAN has no tag, the BPDU is untagged. If two implicitly-tagged VLANs (with no 802.1Q tag or equivalent proprietary or semi-standard tag) share the same wire, and both are in different STGs, then both BPDUs are transmitted and received on that port. (See Section 6.2, "Consequences of Misconfiguration".)
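The two poles described in Sections 5.1 and 5.2 can be sketched with one small function that decides, for each STG on a port, which VLAN carries the BPDU and whether that BPDU is tagged. All names here are hypothetical. The fallback of picking the alphabetically first enabled VLAN when the preferred BPDU VLAN is not on the port is an assumption made only to keep the sketch deterministic; rule 3 of Section 4.0 merely requires that some one enabled VLAN of the STG be used.

```python
def bpdu_plan(port_tagging, vlan_to_stg, preferred_bpdu_vlan):
    """For one port, map each STG to (VLAN carrying the BPDU, tagged?).

    port_tagging: {vlan: True if the VLAN is 802.1Q-tagged on this port}
    vlan_to_stg: {vlan: stg}
    preferred_bpdu_vlan: {stg: vlan that should carry BPDUs when enabled}
    """
    plan = {}
    for stg in sorted({vlan_to_stg[v] for v in port_tagging}):
        members = sorted(v for v in port_tagging if vlan_to_stg[v] == stg)
        preferred = preferred_bpdu_vlan.get(stg)
        # Assumption: fall back to the first enabled VLAN of the STG.
        vlan = preferred if preferred in members else members[0]
        # The BPDU is tagged exactly when its VLAN is tagged on this port.
        plan[stg] = (vlan, port_tagging[vlan])
    return plan


# Section 5.1: one STG containing an untagged "stp" VLAN plus data VLANs.
single_stg = {"stp": 0, "red": 0, "blue": 0}
trunk_51 = bpdu_plan({"stp": False, "red": True, "blue": True},
                     single_stg, {0: "stp"})
host_51 = bpdu_plan({"red": False}, single_stg, {0: "stp"})

# Section 5.2: every VLAN is the only member of its own STG.
per_vlan = {"red": 1, "blue": 2}
trunk_52 = bpdu_plan({"red": True, "blue": True},
                     per_vlan, {1: "red", 2: "blue"})

print(trunk_51)
print(host_51)
print(trunk_52)
```

Under the Section 5.1 configuration the trunk sends a single untagged BPDU on the STP VLAN and the endstation port sends it untagged on its only VLAN; under the Section 5.2 configuration the trunk sends one tagged BPDU per VLAN.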
5.3 Intermediate Configurations

Each of the subheadings in Section 3.0 may be viewed as an advantage of using multiple spanning trees, or as a problem with using a single spanning tree. Between VLANs in separate STGs, all of the advantages are realized. Within the VLANs composing a single STG, all of the corresponding problems are realized. With configurations intermediate between the poles, a system administrator can balance needs against cost.

Note that this model allows interoperability between switches that assume the one spanning tree model and switches that support multiple spanning trees. VLANs which traverse switches supporting only one spanning tree must be bundled into the same STG. (This is not to suggest that such mixtures are a good idea, but to illustrate the flexibility of the model.)

6.0 Costs, Problems, and Rev 2 Work Items

The direct costs of this proposal are minimal. The multiple spanning tree model does generate some long-term desires for additional requirements on any forthcoming VLAN coordination protocol. Although the model does not generate any new problems, it does aggravate certain known problems of spanning trees.

6.1 Tagged BPDUs

Clearly, tagging a BPDU with an 802.1Q tag must be allowed for this proposal to work.

6.2 Consequences of Misconfiguration

One may ask what happens if two switches have a different idea about which VLANs belong to which STG, or even how many STGs exist. Clearly, the perceived benefits of the multiple spanning tree model cannot be realized in the face of such misconfigurations. Fortunately, the robust nature of the 802.1D STP mitigates the severity of the problems generated by misconfiguration.

Differences in the configured relationships between VLANs and STGs result in the same kinds of errors as occur when two spanning trees are linked with a wire; the two spanning trees are merged.
Similarly, when VLAN/STG relationships differ among switches, one spanning tree is formed over the union of all VLANs linked together into the same STG by any switch. In fairness, this concatenation is not without risk: the resultant combined spanning tree could exceed the maximum STP diameter.

6.3 Non-VLAN-Aware Switches

If non-VLAN-aware switches are mixed with VLAN-aware switches on untagged links, multiple parallel spanning trees introduce no new problems.

If a non-VLAN-aware switch is placed on a trunk carrying tagged BPDUs, there will be a problem. The BPDUs will be stopped, because their destination MAC addresses will be recognized by the non-VLAN-aware switch. They will not, however, be interpreted properly. The data packets, however, will be passed by the switch. This problem can be avoided by using the technique of Section 5.1, "Physical BPDUs and Logical VLANs" in sections of the network in which non-VLAN-aware switches are used.

If tagged BPDUs are accidentally directed (by mis-configuration or mis-wiring) to a non-VLAN-aware switch, then the VLAN-aware switches will be made aware of the problem when they receive the untagged BPDUs.

6.4 Ensuring Consistent VLAN/STG Relationships

There are better ways to ensure that each switch has the same idea about which VLANs belong together in which spanning tree group than to trust that each switch is configured the same way. Two of them are:

1. Arbitrary association. Allow an arbitrary association of VLAN with STG. That is, the definition of an STG is a list of VLAN-IDs that may contain anything from one to all of the VLANs, in any combination. To ensure consistency, this approach requires that the VLAN/STG associations be distributed among the switches. The "distribution protocol(s)" assumed by 802.1Q/D2 should be sufficient to distribute this information.
2. Subdivide the VLAN-ID field. We can break the VLAN-ID into two pieces, one of which identifies the STG, and one the VLAN within the STG.
For example, a 12-bit VLAN-ID field could be split into a 4-bit STG-ID and an 8-bit sub-VLAN-ID within the STG. (Presumably, two VLANs with the same sub-VLAN-ID in two different STGs would be considered as two different VLANs.)

Some ports, especially "trunk ports", may carry more than one STG, and said STGs may be in different states on that port. Both of these approaches therefore require that a switch have the ability to use the VLAN/STG association information to condition the treatment of an incoming packet by its VLAN-ID, and not merely by the bridge port on which it was received.

The subdivided VLAN-ID field may well be easier to implement in hardware than the arbitrary association. It reduces the size of the VLAN-ID-to-STP-state translation table to the number of STGs accommodated by the division (16 in the above example) instead of the number of VLANs supported (likely either 4k or 32k).

The method of subdividing the VLAN-ID is relatively inflexible. (If the dividing line between the two parts of the VLAN-ID can vary, then some means of ensuring that all switches agree on the division is required, and much of the attraction of a sub-divided VLAN-ID is lost.) It may be possible to find a division of the VLAN-ID that satisfies enough requirements to set it in the standard.

Clearly, this use of VLAN-IDs cannot be discussed in isolation from the question of the scope of VLAN-IDs. Whether VLAN-IDs are global to a network, local to an administrative domain within a network, local to a LAN segment, or local to a set of ports on a LAN segment, must be related to other definitions of VLAN-ID formats.

7.0 Summary

It is not the primary purpose of this proposal to claim that the multiple spanning tree model is invariably "better" than the single spanning tree model. The management load incurred by a configuration with 4000 VLANs running on 4000 separate spanning trees, for example, would be difficult to justify. The essence of this proposal is that:

1.
by defining the appropriate relationship between 802.1D and 802.1Q, any configuration on the continuum from one spanning tree per network to one spanning tree per VLAN is possible, and
2. that this flexibility incurs no significant cost to those who choose to tailor their implementations to one or the other pole of the continuum.

8.0 Acknowledgments

Many thanks to Paul Frantz, Richard Hausman, Keith McCloghrie, and Luc Pariseau for their comments and suggestions made in response to earlier drafts of this contribution.