Designing a Fault-Tolerant Network

This document provides guidelines for designing a fault-tolerant network using Netra CP3140 switches and Netra CP3240 switches.



ATCA Network Overview

ATCA is designed to be used in highly available (HA) environments. Many HA principles, such as redundancy and fault tolerance, are designed into the ATCA specification. ATCA systems must be connected to external networks in such a way that the HA principles applied inside the shelf also apply to those networks, because a system is only as available as its connections to end users. This document outlines techniques for maximizing the uptime of an ATCA system when using Sun’s Netra switches.

The following shows a simplified partial-network diagram of an ATCA network.

FIGURE 1 ATCA Network Diagram (Partial)


Illustration of a simplified ATCA network diagram.


Design Principles and Guidelines

This section covers fault-tolerant design principles and guidelines.

Basic

Independent of the software used to increase availability, a system should be redundantly cabled, preferably at both the board level and the link level.

ATCA uses a dual-star topology for backplane connections, so inside the ATCA shelf every node (anything that is a network endpoint) is connected to both switch blades. (For shelf managers, this statement is true only for PICMG 3.0 R2.0 ECN-002 “cross-connect” enabled shelves.)

Smart End Points: Channel Bonding

The simplest and most powerful way to implement an HA network is by using channel bonding drivers. Because every ATCA node is connected to two hub blades, the node has at least two network interfaces. While these interfaces can be treated as separate interfaces logically, it makes sense to treat them as a single interface. Channel bonding drivers provide that abstraction. Higher level software uses a single virtual-network interface, and the channel bonding driver handles the complex choice of which physical-network interface to use.

Channel bonding drivers choose which port to use at what time, through various decision algorithms. Most of these algorithms put bandwidth at a higher priority than availability. However, the active-standby algorithm is specifically designed for HA, and therefore it is the best algorithm to use in ATCA networks.
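
The active-standby behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the logic of any particular bonding driver: the class name, the interface names eth0 and eth1, and the choice not to fail back automatically when a repaired link returns are all assumptions made for the example.

```python
class ActiveStandbyBond:
    """Toy model of an active-standby bond (hypothetical, for illustration)."""

    def __init__(self, links):
        # Links are listed in preference order; all are assumed usable at start.
        self.links = list(links)
        self.usable = {name: True for name in self.links}
        self.active = self.links[0]

    def report(self, name, usable):
        """Record a link-state change and return the link now in use."""
        self.usable[name] = usable
        # Switch only when the current active link is gone or unusable.
        # A repaired link does not preempt a working active link.
        if self.active is None or not self.usable[self.active]:
            self.active = next(
                (n for n in self.links if self.usable[n]), None
            )
        return self.active


bond = ActiveStandbyBond(["eth0", "eth1"])  # one link per hub blade
print(bond.report("eth0", False))  # eth1: failover to the standby link
print(bond.report("eth0", True))   # eth1: no failback while eth1 still works
print(bond.report("eth1", False))  # eth0: fail over again
```

Higher-level software would see only the bond; which physical link carries traffic changes underneath it without the virtual interface going away.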

The true power of a channel bonding driver is not in its decision algorithm for choosing ports, but rather its decision algorithm that defines whether a port is usable. Only two common algorithms define if a port is usable:

One of the great features of channel bonding drivers is that they are generally topology, layer, and protocol independent (decision algorithms are not topology independent, but the overall channel bonding driver is). They work well in complex networks, simple networks, Layer 2 switched networks, and Layer 3 routed networks.

Layer 2 Methods: VLANs and MSTP

Layer 2 switching is used often because it is simple. The simplicity of switching makes it fast and inexpensive, but it puts limits on the design of the network.

A Layer 2 network, without additional protocols, must follow a strict tree structure. No loops are allowed in the network. This limitation is in direct contrast to the key HA principle of redundancy. However, using the VLAN and MSTP protocols on top of a Layer 2 switched network resolves this limitation and allows loops.

With these rules and tags, traffic can be limited and controlled so that the network loop does not exist on any single VLAN.

STP (spanning tree protocol) is designed specifically to deal with network loops. STP traverses a network, finds all loops, and disables the links that create them. It effectively turns the network graph into a tree graph (no node connects to any other node by more than one path), hence the name.
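
The effect of STP on a looped topology can be illustrated with a short Python sketch. It is a simplification under stated assumptions: real STP elects a root bridge and compares path costs through BPDU exchange, whereas this sketch simply builds a breadth-first tree from a root chosen by hand. The node names are hypothetical, and parallel links between the same two nodes are not modeled.

```python
from collections import deque


def spanning_tree(links, root):
    """Return (forwarding, blocked) links for a breadth-first tree from root."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    visited = {root}
    forwarding = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbors.get(node, [])):
            if peer not in visited:
                # First link that reaches a new node stays in the tree.
                visited.add(peer)
                forwarding.add(frozenset((node, peer)))
                queue.append(peer)

    # Every link that is not part of the tree would close a loop.
    blocked = [link for link in links if frozenset(link) not in forwarding]
    return forwarding, blocked


# Hypothetical triangle: two switch blades that both reach the same uplink.
links = [("switch1", "switch2"), ("switch1", "uplink"), ("switch2", "uplink")]
forwarding, blocked = spanning_tree(links, root="uplink")
print(blocked)  # [('switch1', 'switch2')]
```

With the uplink as root, both switch-to-uplink links stay active and the inter-switch link is the one that gets disabled, which is the situation the VLAN and MSTP configuration is meant to manage.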

MSTP (multiple spanning tree protocol) is designed to improve on STP in two ways. First, MSTP is VLAN aware. Regular STP ignores VLAN settings, so even if a network loop has been properly segmented with VLANs, STP disables the loop link. MSTP recognizes that although the loop exists, it has been contained by the VLAN settings, and does not disable the loop link. Second, MSTP can converge (reconfigure) more quickly than STP when the network changes. Of the two redundant links in any network loop, one is active and the other is inactive. When the active link fails, the network should switch to the inactive link as quickly as possible.
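
To make the idea of a loop that exists physically but not on every VLAN concrete, the following Python sketch checks each VLAN's links for a cycle using a union-find structure. The topology is an assumption invented for the example (two switches and one external network, with VLAN IDs 101 and 202 borrowed from the configuration examples in this document), and the function names are hypothetical.

```python
def has_loop(links):
    """Return True if the undirected links contain a cycle (union-find)."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True  # a and b were already connected: this link closes a loop
        parent[root_a] = root_b
    return False


def loops_by_vlan(tagged_links):
    """Map each VLAN ID to True if the links carrying that VLAN form a loop."""
    per_vlan = {}
    for a, b, vlans in tagged_links:
        for vlan in vlans:
            per_vlan.setdefault(vlan, []).append((a, b))
    return {vlan: has_loop(links) for vlan, links in sorted(per_vlan.items())}


# Assumed topology: the inter-switch link carries both VLANs,
# while the two uplinks to the external network carry VLAN 202 only.
tagged_links = [
    ("switch1", "switch2", {101, 202}),
    ("switch1", "external", {202}),
    ("switch2", "external", {202}),
]
print(has_loop([(a, b) for a, b, _ in tagged_links]))  # True
print(loops_by_vlan(tagged_links))  # {101: False, 202: True}
```

In this assumed layout the physical network has a loop, but only VLAN 202 contains it. A VLAN-unaware protocol would block a link for all traffic, whereas a per-instance spanning tree only needs to act on the instance that carries VLAN 202.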

The following shows an example Layer 2 network configured with VLAN and MSTP.

FIGURE 2 Layer 2 Network Configured With VLAN and MSTP


Illustration of an example Layer 2 network configured with VLAN and MSTP.

In this network, three node blades do not need to communicate with the external network (RED) and three do (BLUE). One of the BLUE nodes needs to communicate with the RED nodes. VLANs are used to logically separate the RED and BLUE nodes. MSTP is used so that blocking the inter-switch link because of a loop in VLAN BLUE does not affect VLAN RED.

Layer 2 VLAN Configuration

The following code example shows how to configure the example network shown in FIGURE 2.


CODE EXAMPLE 1 Configuring a Layer 2 VLAN Network With MSTP
vlan database
vlan  101
vlan  202
exit
 
configure
 
!interswitch link needs to be in both VLANs
interface  0/2
!The port cost for MST instance 2 is set lower than normal here
!to signify that using this port is preferred over others
spanning-tree mst 2 cost 1800
vlan participation exclude 1
vlan participation include 101
vlan tagging 101
vlan participation include 202
vlan tagging 202
exit
 
!node blade A is in 202
interface  0/3
vlan participation exclude 1
vlan participation include 202
vlan pvid 202
exit
 
!node blade B is in 202
interface  0/4
vlan participation exclude 1
vlan participation include 202
vlan pvid 202
exit
 
!node blade C is in 202 and 101
interface  0/5
vlan participation exclude 1
vlan participation include 202
vlan pvid 202
vlan participation include 101
vlan tagging 101
exit
 
!node blade D is in 101
interface  0/6
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!node blade E is in 101
interface  0/7
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!node blade F is in 101
interface  0/8
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!external network
interface 0/20
vlan participation exclude 1
vlan participation include 202
vlan pvid 202
exit
 
!MSTP is enabled such that 101 and 202 are in different
!MSTP instances
spanning-tree
spanning-tree configuration name "MSTPexample"
spanning-tree configuration revision 0
spanning-tree mst instance 1
spanning-tree mst vlan 1 1
spanning-tree mst vlan 1 101
spanning-tree mst instance 2
spanning-tree mst vlan 2 202
 
exit
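
When a listing like this grows, it can help to tabulate which VLANs each port ends up in. The following Python sketch does that for the handful of commands used in CODE EXAMPLE 1. It is not a parser for the full switch CLI: it recognizes only the interface, vlan participation include, vlan tagging, vlan pvid, and exit lines, and the function name and result layout are assumptions made for illustration.

```python
def vlan_membership(config_text):
    """Return {interface: {"member": set, "tagged": set, "pvid": int or None}}."""
    ports = {}
    current = None
    for raw in config_text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("!"):
            continue  # skip blank lines and comments
        if words[0] == "interface":
            current = ports.setdefault(
                words[1], {"member": set(), "tagged": set(), "pvid": None}
            )
        elif words[0] == "exit":
            current = None
        elif current is not None and words[0] == "vlan" and len(words) >= 3:
            if words[1:3] == ["participation", "include"] and len(words) >= 4:
                current["member"].add(int(words[3]))
            elif words[1] == "tagging":
                current["tagged"].add(int(words[2]))
            elif words[1] == "pvid":
                current["pvid"] = int(words[2])
    return ports


# Two interface stanzas copied from the listing above.
sample = """
interface  0/2
vlan participation exclude 1
vlan participation include 101
vlan tagging 101
vlan participation include 202
vlan tagging 202
exit
interface  0/5
vlan participation exclude 1
vlan participation include 202
vlan pvid 202
vlan participation include 101
vlan tagging 101
exit
"""
for port, info in vlan_membership(sample).items():
    print(port, sorted(info["member"]), sorted(info["tagged"]), info["pvid"])
```

For the two stanzas shown, the summary reports port 0/2 as a tagged member of both VLANs and port 0/5 as an untagged member of 202 and a tagged member of 101, which matches the comments in the listing.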

Layer 3 Methods: VRRP and VRRP Tracking

VRRP (virtual router redundancy protocol) lets multiple routers appear as a single virtual router. With VRRP, nodes on the Layer 2 side of the hub blades can be configured with the single virtual IP, and network elements on the Layer 3 side see two routes to the same subnet. One of the VRRP routers becomes the master of the virtual IP and does the routing. If it fails, the backup router takes over.

Traditional VRRP defines a router failure as a series of missing checkpoint packets sent between VRRP routers. VRRP tracking adds other fail-over conditions. With VRRP tracking, multiple “tracks” can be set up to monitor the status of a link, a local route, or a remote IP. If any track fails, the router forces a fail-over immediately, without waiting for the checkpoint packets.

Layer 3 routing is more complex than Layer 2 switching but is more robust. Loops are expected in Layer 3 networks. In ATCA, hub blades are used as gateway routers: a hub blade connects a Layer 3 external network to a Layer 2 internal network. Because an ATCA shelf has two hub blades, it has two gateways. The node blades could be configured to handle two separate gateways; however, VRRP provides a more elegant solution.

The following shows an example VRRP network configured with VRRP tracking.

FIGURE 3 Layer 3 Network Configured with VRRP


Illustration of a Layer 3 network configured with VRRP and VRRP tracking.

In this example, the two switch blades are redundant gateways between the node blades and the external network. The node blades are in the 12.55.67.x subnet, whereas the external network is in the 22.50.1.x subnet. An instance of VRRP is configured so that if either switch blade fails, the node blades can still reach the external network. VRRP tracks are added to provide more robust failover conditions.

Layer 3 VRRP Configuration

The following code example shows how to configure the example network shown in FIGURE 3.


CODE EXAMPLE 2 Configuring a Layer 3 Network With VRRP
vlan database
vlan  101
vlan routing 101
exit
 
configure
ip routing
ip vrrp
 
!interswitch link needs to be in VLAN 101
interface  0/2
vlan participation exclude 1
vlan participation include 101
vlan tagging 101
exit
 
!node blade A is in 101
interface  0/3
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!node blade B is in 101
interface  0/4
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!node blade C is in 101
interface  0/5
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!node blade D is in 101
interface  0/6
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!node blade E is in 101
interface  0/7
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!node blade F is in 101
interface  0/8
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!external network
interface 0/20
ip address  22.50.1.5  255.255.255.0
exit
 
interface  4/1
!the other switch is set to 12.55.67.19
ip address  12.55.67.18  255.255.255.0
ip vrrp 102
ip vrrp 102 ip 12.55.67.20
!the other switch is set to 253
ip vrrp 102 priority 250
ip vrrp 102 mode
exit
!
!track remote server
track 1  ip route 22.50.1.15/24 reachability
 
!track local link state. If any link goes down, fail over
track 2 interface 0/3 line-protocol
track 3 interface 0/4 line-protocol
track 4 interface 0/5 line-protocol
track 5 interface 0/6 line-protocol
track 6 interface 0/7 line-protocol
track 7 interface 0/8 line-protocol
track 8 interface 0/20 line-protocol
 
!track local route
track 9  interface 4/1 ip routing
 
!Assign values to each track and assign the tracks to the vrrp instance
vrrp 102 track 1 decrement 40
vrrp 102 track 2 decrement 40
vrrp 102 track 3 decrement 40
vrrp 102 track 4 decrement 40
vrrp 102 track 5 decrement 40
vrrp 102 track 6 decrement 40
vrrp 102 track 7 decrement 40
vrrp 102 track 8 decrement 40
vrrp 102 track 9 decrement 40
 
exit
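
The arithmetic behind the priority and decrement values in this listing can be worked through with a small Python model. The base priorities (250 on this switch, 253 on the other) and the decrement of 40 come from the listing. Everything else is an assumption for illustration: the router names, flooring the result at 1, and the premise that the router with the highest effective priority takes over (that is, that preemption is in effect). Tie-breaking is not modeled.

```python
DECREMENT = 40  # per-track decrement used in the listing


def effective_priority(base, failed_tracks, decrement=DECREMENT):
    """Base priority minus one decrement per failed track, floored at 1."""
    return max(1, base - decrement * len(failed_tracks))


def elect_master(priorities):
    """Return the router with the highest effective priority."""
    return max(priorities, key=priorities.get)


# Hypothetical names: switch_a is the switch configured in the listing,
# switch_b is "the other switch" mentioned in its comments.
base = {"switch_a": 250, "switch_b": 253}

healthy = {name: effective_priority(p, []) for name, p in base.items()}
print(elect_master(healthy))  # switch_b

# Track 8 (the external link, interface 0/20) fails on switch_b only.
degraded = dict(healthy, switch_b=effective_priority(base["switch_b"], [8]))
print(degraded["switch_b"], elect_master(degraded))  # 213 switch_a
```

Because the two base priorities differ by only 3 and each track subtracts 40, a single failed track on the current master is enough to drop it below its peer, which is how the tracks translate into an immediate failover.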

Best Practices

The fault-tolerant network design methods presented (channel bonding drivers, Layer 2 methods, and Layer 3 methods) are best used together to achieve maximum availability.

The following shows an example of all methods combined into a single network configuration.

FIGURE 4 Fault-Tolerant Network Combining All Design Methods


Illustration of a fault-tolerant network with all design methods integrated.

All Methods Integrated Configuration

In the following example, channel bonding, VLANs with MSTP, VRRP, and VRRP tracking are integrated.


CODE EXAMPLE 3 Configuring a Network With All Design Methods Integrated
vlan database
vlan  101
vlan routing 101
vlan 202
vlan routing 202
exit
 
configure
ip routing
ip vrrp
 
!interswitch link needs to be in both VLANs
interface  0/2
!The port cost for MST instance 2 is set lower than normal here
!to signify that using this port is preferred over others
spanning-tree mst 2 cost 1800
vlan participation exclude 1
vlan participation include 101
vlan tagging 101
vlan participation include 202
vlan tagging 202
exit
 
!node blade A is in 202
interface  0/3
vlan participation exclude 1
vlan participation include 202
vlan pvid 202
exit
 
!node blade B is in 202
interface  0/4
vlan participation exclude 1
vlan participation include 202
vlan pvid 202
exit
 
!node blade C is in 202 and 101
interface  0/5
vlan participation exclude 1
vlan participation include 202
vlan pvid 202
vlan participation include 101
vlan tagging 101
exit
 
!node blade D is in 101
interface  0/6
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!node blade E is in 101
interface  0/7
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!node blade F is in 101
interface  0/8
vlan participation exclude 1
vlan participation include 101
vlan pvid 101
exit
 
!external network
interface 0/20
vlan participation exclude 1
vlan participation include 202
vlan pvid 202
exit
 
interface  4/1
!the other switch is set to 12.55.67.19
ip address  12.55.67.18  255.255.255.0
ip vrrp 102
ip vrrp 102 ip 12.55.67.20
!the other switch is set to 253
ip vrrp 102 priority 250
ip vrrp 102 mode
exit
 
interface  4/2
!the other switch is set to 22.50.1.4
ip address  22.50.1.3  255.255.255.0
ip vrrp 202
ip vrrp 202 ip 22.50.1.5
!the other switch is set to 253
ip vrrp 202 priority 250
ip vrrp 202 mode
exit
 
!
!track remote server
track 1  ip route 22.50.1.15/24 reachability
 
!track local link state. If any link goes down, fail over
track 2 interface 0/3 line-protocol
track 3 interface 0/4 line-protocol
track 4 interface 0/5 line-protocol
track 5 interface 0/6 line-protocol
track 6 interface 0/7 line-protocol
track 7 interface 0/8 line-protocol
track 8 interface 0/20 line-protocol
 
!track local route
track 9  interface 4/1 ip routing
track 10  interface 4/2 ip routing
 
!Assign values to each track and assign the tracks to the vrrp instance
vrrp 102 track 1 decrement 40
vrrp 102 track 2 decrement 40
vrrp 102 track 3 decrement 40
vrrp 102 track 4 decrement 40
vrrp 102 track 5 decrement 40
vrrp 102 track 6 decrement 40
vrrp 102 track 7 decrement 40
vrrp 102 track 8 decrement 40
vrrp 102 track 9 decrement 40
vrrp 102 track 10 decrement 40
vrrp 202 track 1 decrement 40
vrrp 202 track 2 decrement 40
vrrp 202 track 3 decrement 40
vrrp 202 track 4 decrement 40
vrrp 202 track 5 decrement 40
vrrp 202 track 6 decrement 40
vrrp 202 track 7 decrement 40
vrrp 202 track 8 decrement 40
vrrp 202 track 9 decrement 40
vrrp 202 track 10 decrement 40
 
!MSTP is enabled such that 101 and 202 are in different
!MSTP instances
spanning-tree
spanning-tree configuration name "MSTPexample"
spanning-tree configuration revision 0
spanning-tree mst instance 1
spanning-tree mst vlan 1 1
spanning-tree mst vlan 1 101
spanning-tree mst instance 2
spanning-tree mst vlan 2 202
 
exit


Related Documentation

The following table lists related documentation. The online documentation is available at:

http://docs.sun.com/app/docs/prod/cp3240.switch?l=en#hic
http://docs.sun.com/app/docs/prod/n900.srvr#hic

Application         Title                                                    Part Number  Format   Location
Latest information  Sun Netra CP32x0 Product Notes                           820-3260-xx  PDF      Online
Pointer doc         Sun Netra CP3240 Switch Getting Started Guide            820-3254-xx  Printed  Shipping kit
Usage               Sun Netra CP3240 Switch User’s Guide                     820-3252-xx  PDF      Online
Reference           Sun Netra CP3240 Switch Software Reference Manual        820-3253-xx  PDF      Online
Safety              Sun Netra CP3x20 Switch Safety and Compliance Manual     820-3505     PDF      Online
Latest information  Netra CT 900 Server Product Notes                        819-1180-xx  PDF      Online
Pointer doc         Netra CT 900 Server Getting Started Guide                819-1173-xx  Printed  Shipping kit
Overview            Netra CT 900 Server Overview                             819-1174-xx  PDF      Online
Installation        Netra CT 900 Server Installation Guide                   819-1175-xx  PDF      Online
Administration      Netra CT 900 Server Administration and Reference Manual  819-1177-xx  PDF      Online
Reference           Netra CP3140 Switch Software Reference Manual            819-3774-xx  PDF      Online




Note - Documentation for the Netra CP3140 switch is included in the Netra CT 900 server documentation, as listed in the table.