Chapter 9. Token Ring and Switched Environments

The Token Ring architecture was developed in the early 1980s. Throughout the early phases of development, many companies participated in evaluating different design and circuitry possibilities for a LAN based on deterministic operation. The Token Ring topology was created and released by IBM and Texas Instruments, although other companies, such as Madge, were also directly involved in the final evolution of the design.

In 1985, the Institute of Electrical and Electronics Engineers (IEEE) released the 802.5 design specification for the Token Ring LAN. The Token Ring topology is laid out as a physical star but operates as a logical ring. Devices connect across an electrical ring that is interconnected through a device called a Multi-Station Access Unit (MSAU). The network devices connect to the MSAUs through lobe cables.

As noted, the actual layout of the Token Ring LAN is a physical star. At the time of its introduction, this set Token Ring apart from Ethernet, which was initially introduced in a bus layout configuration.

The Token Ring network configuration also has inherent fault redundancy, which is built in to both the physical wiring scheme and the NIC-to-NIC LAN operations. For this reason, Token Ring is considered a fault-redundant and comparatively stable LAN topology. This fault redundancy does add a small amount of overhead to the overall data traffic load. If a Token Ring LAN is operating in a normal and stable mode, however, the fault-redundancy circuitry and associated operations usually are not active, and the traffic overhead is minimal and has little impact on utilization.

On a Token Ring network, data is transmitted in a baseband transmission mode, similar to Ethernet. One of the main differences is that the access method for Token Ring is deterministic, as compared to the contention-based CSMA/CD method used in the Ethernet LAN environment.

The original Token Ring architecture was introduced to the industry in 1985 at a 4Mbps data rate. In 1989, 16Mbps Token Ring was introduced. In the mid-1990s data rates of 32Mbps and even higher (specifically, 100Mbps) were introduced for industry implementation. However, these higher Token Ring data rates were not well received. This was because by the time they were available for Token Ring, there were already more advanced data rates in operation with other network topologies. These topologies included Fast 100Mbps Ethernet, 100Mbps FDDI, and 155Mbps and 622Mbps ATM. There were also high-cost issues associated with implementing the higher data rates in Token Ring topologies and in upgrading existing Token Ring LANs.

Token Ring Design and Layout

The physical attributes of the Token Ring design are based on a star layout, and the design defines the physical communication medium. The upper-layer protocols and application data are packaged within frames defined by the Token Ring specification; these frames flow on top of the Token Ring physical network, and together they create the final networking architecture. Certain components are standard as originally introduced in the Token Ring design, and are still present in the more advanced Token Ring architecture products available today. To introduce the main concepts of the Token Ring, I will use the initial design specifications to explain the physical layout.

The Token Ring network architecture is based on operations that occur on the Token Ring NIC. Essentially, the Token Ring NIC is the network. The NIC contains a complex set of circuits that operate together and are known as the Token Ring agent. The agent chipset was designed to interpret and process data for inbound and outbound transmission. It processes all Token Ring packets and separates the actual Protocol Data Unit (PDU) from the Token Ring frame, passing the PDU up to the upper-layer protocol chain of the workstation or server connected to the Token Ring LAN. As a matter of terminology, note that a Token Ring station is also referred to as a ring station. In summary, the Token Ring NIC agent performs all the handling and interpretation of data communicated inbound to and outbound from the Token Ring NIC.

In the Token Ring LAN connection scheme, the Token Ring NIC is a ring station and is connected to the Token Ring hub through a cable run called the lobe cable. As mentioned earlier, the hub scheme is centralized and based on an MSAU. The initial design of the MSAU was released in an IBM product called the IBM 8228. The IBM 8228 MSAU was built on an 8-port design that allowed 8 devices to be connected. This was because in 1985, the internetwork community was still progressing at a fairly slow pace, and many LANs required only 8 to 20 devices for interconnection.

For example, a site could connect seven workstations and one file server to a single MSAU. Inside the MSAU was an internal circuitry path called the main ring path (MRP), which created an electrical ring inside the MSAU itself. The cabling of each port had a direct run out to a specific ring station. This cable, called the lobe cable, was the link between the ring station and the centralized MSAU wiring scheme (see Figure 9.1).

Figure 9.1. A Token Ring MSAU.

If the centralized MSAU wiring scheme needed to support more devices because of expansion, each IBM 8228 provided a Ring-In (RI) port and a Ring-Out (RO) port for cabling links to additional MSAUs. These ports allowed the wiring scheme to be expanded through a very simple process: the RO port of one MSAU connects to the RI port of the next MSAU, and thus extends the MRP of the Token Ring LAN. When this process is used, however, the RO port of the last MSAU should link back to the RI port of the first MSAU, and thus close the MRP into a complete loop or electrical ring. This final link is required because the Token Ring cabling scheme is built on a four-wire cabling path that gives the MRP a backup path. Two of the wires form the primary MRP path; the other two form the backup, or secondary, data path. The backup path is not invoked unless fault redundancy is required. The fault redundancy of the cabling process is further described later in this chapter.

As noted earlier, the actual main ring path specification allows for a physical ring and an electrical ring to be in place from an overall design standpoint. The devices connected to the network through the lobe cables communicate with each MSAU port, and all the data seen on one port is also replicated through the MSAU and the complete MRP to other ports on the Token Ring network.

The Token Ring network is a shared medium based on baseband transmission. When the physical wiring scheme was initially designed, one of the first goals was a centralized wiring scheme within the building infrastructure, so that troubleshooting would be simple if an issue occurred on the physical medium. Because of this, all the hubs or MSAUs were to be placed at a patch panel, and the patch panel was placed within the wiring closet. This layout was the common approach in actual Token Ring implementations. Most of the Token Ring networks implemented between 1985 and 1990 were introduced through the IBM product sales network. As a result, many of the specifications for laying out the physical Token Ring came from procedures developed by IBM engineering teams and from associated IBM documentation. Some of this documentation is noted in Appendix B, "Reference Material."

With all the MSAUs being physically placed within a patch panel and also being physically positioned within a wiring closet, this created a centralized wiring process within a building that was extremely structured from a cabling system standpoint. All the lobe cables would run out to the physical workstations or ring stations placed throughout a facility.

The lobe cables, when connected to a ring station, came back directly to the patch panel within the wiring closet. The patch panel would then have patch cables that would link down directly into the MSAU ports that would be utilized for device connection. Each one of the mounted MSAU hubs in the rack would then be interconnected to each other through the RI and RO ports in the Token Ring MSAU rack. Again, the last MSAU would link back to the initial MSAU to complete the ring. This is based on the four-wire inherent cabling fault-redundant path (see Figure 9.2).

Figure 9.2. A standard Token Ring patch panel.

The fault redundancy in the MRP cabling scheme comes into play when the MRP primary ring path loop is broken for any reason, such as a cabling break in the RO-to-RI links between MSAUs. This type of issue could be resolved by just removing the bad cable. The RI and RO ports of all the initially designed MSAUs included a self-shorting data connector that was built in to the loop design of the physical port. The initial IBM 8228 MSAUs were simple in overall circuitry and based on a mechanical relay process. Therefore, if a cable between any of the MSAUs was damaged or bad, a technician could remove it, and the RO and RI ports would automatically self-short and use the backup secondary path of the MSAU MRP cabling scheme (see Figure 9.3).

Figure 9.3. The fault redundancy built in to the Token Ring wiring scheme.

Throughout the industry, many companies implemented a Token Ring LAN without using the proper cabling for a true four-wire scheme, or without linking the last MSAU in the MRP back to the first MSAU so that the backup path would be available. When this occurred, removing a bad cable would not allow the backup path circuitry to be invoked by the fault-redundancy cabling and the RI and RO MSAU port operation. This caused problems in the overall recovery of the physical Token Ring medium.
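The effect of leaving out the closing link can be pictured with a deliberately simplified model. The sketch below treats the rack as a numbered chain of MSAUs and asks which MSAUs can still reach each other after one RO-to-RI cable is pulled. The function name, the link numbering, and the idea that a break in an open chain leaves isolated groups are illustrative assumptions, not behavior documented for the IBM 8228.

```python
def msau_groups_after_removal(msau_count, loop_closed, removed_link):
    """Return the groups of MSAUs that remain cabled together.

    Assumed numbering: link i joins the RO port of MSAU i to the RI port
    of MSAU i + 1; when loop_closed is True, the final link joins the RO
    port of the last MSAU back to the RI port of MSAU 0.
    """
    link_count = msau_count if loop_closed else msau_count - 1
    if not 0 <= removed_link < link_count:
        raise ValueError("no such RO-to-RI link in this rack")

    if loop_closed:
        # The loop becomes a chain, but every MSAU is still cabled to the
        # others; the self-shorting ports at the break can wrap the
        # primary path onto the backup path.
        order = list(range(removed_link + 1, msau_count)) + list(range(removed_link + 1))
        return [order]

    # No closing link: pulling a cable leaves two groups with no cabling
    # between them.
    return [list(range(removed_link + 1)), list(range(removed_link + 1, msau_count))]


# Three MSAUs, loop closed, middle cable pulled: one group of three.
print(msau_groups_after_removal(3, True, 1))
# Three MSAUs, loop left open, first cable pulled: two separate groups.
print(msau_groups_after_removal(3, False, 0))
```

The model says nothing about signal paths inside an MSAU; it only shows why the closing link matters for keeping every MSAU on one ring after a cable is removed.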

Overall, the Token Ring network is based on a physical star and a logical ring concept that is one of the most fault-redundant and unique LANs in the market (see Figure 9.4).

Figure 9.4. The Token Ring network is both a physical star and a logical ring.

Understanding Token Ring Network Operation

Now that the Token Ring network cabling layout and structure have been discussed, the following is a discussion of the operation of the Token Ring network. The technical discussion that follows specifically describes how ring stations gain access to the Token Ring LAN deterministic medium along with how ring stations (RS) can communicate with other stations.

Because the Token Ring network was introduced in direct parallel to compete with the Ethernet LAN environment, certain features are built in to Token Ring and are considered enhancements to the Ethernet topology. This, of course, depends on the view of the industry analyst.

Note that the Token Ring LAN-structured centralized wiring scheme and fault redundancy are considered a plus in the local area networking environment. It is important to remember that the mainframe processing environment and mini-computer environment offered stability and a certain amount of guaranteed uptime as related to Mean Time To Repair (MTTR) and stabilization in the computing environment.

When the Token Ring LAN was initially introduced, LANs were considered unreliable in terms of stabilization and performance. Many MIS managers were concerned as to what the correct LAN topology should be for deployment. Many of the large banking and conservative institutions that utilized IBM host mainframes and mini-computers for processing business data moved forward with the Token Ring network for a LAN topology because of the inherent fault redundancy of the physical operation.

The following is a discussion of the physical Token Ring medium operations that are extensive as compared to Ethernet. Particular operations inherent to the design allow a Token Ring network to still offer a stable foundation for upper-layer protocol data transfer even when physical errors occur in communication. These features allow for a higher level of reliability and stability for upper-layer protocols to operate on. In other words, the Token Ring foundation was extremely stable as compared to the Ethernet topology upon initial release.

In today's environment, note that the Ethernet environment has advanced to where it can be completely managed and is considered stable. But again, this is because of the implementation of other technologies used in parallel with the Ethernet standards, such as Simple Network Management Protocol (SNMP) and RMON specifications, to manage the Ethernet internetwork technology.

The Token Ring topology, upon initial release, was without question considered to be the most fault-redundant and stable LAN technology, and was widely implemented throughout the industry.

The following section discusses the physical Token Ring architecture operations that allow the Token Ring network to function.

Token Passing Theory

In a Token Ring network design, a token circles around a physical electrical ring. The token is a 3-byte frame composed of a Starting Frame Delimiter field, an Access Control field, and an Ending Frame Delimiter field. This token frame circles around the ring on a continuous basis after the Token Ring network is active and operating. To activate a Token Ring network, two devices with inserted Token Ring NICs must be active on a physical Token Ring LAN. The initial device starting communication on the Token Ring will release the token. This process is described in detail later. After the token has actually been released, it circles the ring. Every device on the ring sees the token as it passes its NIC point on transfer. The normal timing cycle for a token to circle the ring is usually 10ms at a minimum.
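As an illustration of the middle byte of the token, the following sketch unpacks an Access Control field into its parts. The bit layout used here (three priority bits, a token bit, a monitor bit, and three reservation bits, written PPP T M RRR) is taken from the IEEE 802.5 specification rather than from this chapter, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccessControl:
    priority: int      # PPP bits, 0 to 7
    is_token: bool     # T bit: 0 marks a token, 1 marks a frame
    monitor: int       # M bit, set by the active monitor
    reservation: int   # RRR bits, 0 to 7


def parse_access_control(ac: int) -> AccessControl:
    """Split one Access Control byte into its four subfields."""
    if not 0 <= ac <= 0xFF:
        raise ValueError("the Access Control field is a single byte")
    return AccessControl(
        priority=(ac >> 5) & 0b111,
        is_token=((ac >> 4) & 1) == 0,
        monitor=(ac >> 3) & 1,
        reservation=ac & 0b111,
    )


# A free token at priority 0 with no reservation.
print(parse_access_control(0x00))
# The same byte with the T bit set belongs to a frame, not a token.
print(parse_access_control(0x10))
```

A station that captures the token changes the T bit from 0 to 1, which is how the 3-byte token becomes the start of a data frame.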

A device wants to transmit on the ring when an upper-layer protocol process invokes a request for communication. The upper-layer protocol process is engaged by the application or network operating system on a workstation or a server that wants to communicate. The protocol chain processing in the upper layers of the device causes a Token Ring frame to be composed with data and queued for transmission. At this point, the device generates a vector command for its Token Ring NIC to capture the token when it passes the device's node point on the ring. This allows for the start of access and data transfer on the physical medium. The NIC then takes the token and appends the data to it to form the actual Token Ring data frame. The data frame is then sent out onto the Token Ring network MRP in an outbound transmission cycle. The Token Ring frame contains fields that carry the source and destination station addresses.

The Token Ring frame will be interrogated by every Token Ring station as it passes through the ring. A certain interrogation process occurs, during which each Token Ring station's NIC interrogates a portion of the Token Ring frame—the starting delimiter, and then the frame control field, and next the associated destination address of the actual Token Ring frame.

If the Token Ring frame has certain valid attributes in the frame control field (discussed later), and its destination address matches the address of the Token Ring node examining it, that node recognizes that it is the destination station for which the frame is intended (see Figure 9.5).

Figure 9.5. Token passing.

The Token Ring NIC in the destination device then processes the Token Ring frame received from the source device to extract the upper-layer protocol or physical data communication. Upon completing the processing of the frame, the destination station releases the original Token Ring frame back out onto the ring, where it is forwarded on to the source station. When the source station receives the frame, it automatically reviews certain fields within it, called the addressing and frame status fields. The frame status field identifies whether the destination station recognized the frame as addressed to it and whether it copied the frame in for processing. After the source station has completed this final review of the transmission cycle, it releases the token back out onto the ring so that another station can grab the token to transmit.

In summary, the token passing process of the Token Ring architecture is as follows:

A token is passed around the ring. A source station that wants to transmit grabs the token, appends data to the token, and creates a data frame. The data frame is transmitted from the source station to the destination station. The destination station then examines the frame, copies the frame in for processing, and then releases and forwards the frame back to the source station, which then investigates the frame to ensure that it was processed correctly. The frame is then stripped of all data, and a simple token is released back out on the ring so that another station can transmit—thus, the operation of a deterministic medium.
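The cycle summarized above can be sketched as a small simulation. This is a minimal model under stated assumptions: station names, the `pending` queue structure, and the rule that a station sends exactly one frame each time it captures the token are all illustrative choices, and timing, priorities, and error handling are ignored.

```python
def simulate_token_passing(stations, pending, max_rotations=10):
    """Simulate token passing on a ring.

    stations: station names in downstream order.
    pending:  dict mapping a source name to a list of (destination, payload).
    Returns a list of (source, destination, payload, copied) tuples in the
    order the frames were stripped by their source stations.
    """
    n = len(stations)
    queues = {name: list(frames) for name, frames in pending.items()}
    delivered = []
    holder = 0  # index of the station the free token is currently passing

    for _ in range(max_rotations * n):
        if not any(queues.values()):
            break
        source = stations[holder]
        if queues.get(source):
            # The source grabs the token and appends data to form a frame.
            destination, payload = queues[source].pop(0)
            copied = False
            # The frame travels downstream past every other station.
            position = (holder + 1) % n
            while position != holder:
                if stations[position] == destination:
                    copied = True  # the destination copies the frame in
                position = (position + 1) % n
            # The frame is back at the source, which checks it and strips it.
            delivered.append((source, destination, payload, copied))
        # A free token is released to the next station downstream.
        holder = (holder + 1) % n
    return delivered


log = simulate_token_passing(
    ["A", "B", "C", "D"],
    {"A": [("B", "a1"), ("D", "a2")], "C": [("A", "c1")]},
)
print(log)
# [('A', 'B', 'a1', True), ('C', 'A', 'c1', True), ('A', 'D', 'a2', True)]
```

Station A has two frames queued, but station C transmits between them because the token must travel the ring before A sees it again; that ordering is the deterministic behavior the summary describes.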

The ability of any ring station, whether a workstation or a server, to communicate on a Token Ring network is determined by its ability to gain access to the network via the token.

It should now be clear that if there is extremely high utilization in place on a Token Ring network and because of the inherent deterministic operation, eventually the speed of data transfer may be affected.

Specifically, the speed at which each Token Ring device has the capability to receive a token and transmit may be slowed down by high utilization levels and load on a Token Ring network.

The original design of the Token Ring network architecture allowed a node count of 255 or more on a ring. In today's network operating system environment, with the high-loading applications being deployed, only much lower node counts per ring can be used if the layout is to sustain high data transfer rates.

Dataflow Direction and NAUN Process

The Token Ring network operates on a process in which data is always transmitted in a downstream or clockwise fashion. This is called downstream Token Ring data transmission. Because transmission is always in a downstream mode, a device communicating on the ring in an upstream mode would be improper.

But there is a concept called the nearest active upstream neighbor (NAUN); this concept is very important for fault-redundancy troubleshooting and associated physical Token Ring management.

For any given device, the closest device that is counterclockwise, or upstream, from it and is active or inserted into the MRP is referred to as its NAUN.

If one device is attached one port directly upstream from another, and both devices are operating, for example, the upstream device is the NAUN of the downstream device. If the device one port upstream is turned off and inactive, the next active device upstream is considered the NAUN (see Figure 9.6).

Figure 9.6. Token Ring NAUN.

The NAUN is an important concept for troubleshooting and for the fault-redundancy processes of the management cycle. This is due to an inherent operation called neighbor notification (explained later), during which each device on the Token Ring network maintains a register of its upstream neighbor's address for troubleshooting purposes.
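The NAUN rule can be expressed as a short search upstream from a station. The sketch below assumes the ring is given as a list of station names in downstream order plus a set of the stations currently inserted; both structures and the function name are illustrative.

```python
def find_naun(ring, active, station):
    """Return the nearest active upstream neighbor of `station`.

    ring:    station names in downstream (transmission) order.
    active:  set of names currently inserted in the ring.
    Returns None when no other station is active.
    """
    if station not in ring:
        raise ValueError(f"{station!r} is not cabled to this ring")
    n = len(ring)
    index = ring.index(station)
    # Walk upstream, one port at a time, skipping inactive stations.
    for step in range(1, n):
        candidate = ring[(index - step) % n]
        if candidate in active:
            return candidate
    return None


ring = ["A", "B", "C", "D"]
print(find_naun(ring, {"A", "B", "C", "D"}, "C"))  # B
print(find_naun(ring, {"A", "C", "D"}, "C"))       # A, because B is powered off
```

Because the ring is circular, the station at the head of the list has the last active station in the list as its NAUN.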

Token Ring Address Schemes

In the Token Ring network architecture, the addressing schemes differ from those of the Ethernet architecture. They are more complex, and they offer more flexibility from an overall design standpoint.

The three types of addressing are as follows:

  • Individual

  • Group

  • Functional

Individual Addressing

Individual addressing refers to the individual address of the Token Ring NIC card. This is somewhat similar to the Ethernet addressing design, although local administration of the physical Token Ring address is possible.

The Token Ring individual physical address is based on a 6-byte addressing scheme, in which the leftmost 3 hexadecimal bytes represent the manufacturer code, and the 3 rightmost hexadecimal bytes represent the actual physical hexadecimal address.

In certain cases, in the individual addressing scheme of Token Ring, manufacturers such as IBM or Madge may release a card to a company and just implement the NIC with the physical address assigned by the manufacturer, which is also verified with the IEEE. This is considered the universally assigned address.

Some companies required the ability to locally administer the address to create unique addressing schemes for management purposes. This requirement was addressed by the Token Ring specification, so manufacturers would allow customers purchasing a Token Ring NIC to assign a unique address that could correspond to phone numbers or cubicle numbers throughout company layouts. This was considered extremely flexible from an addressing system standpoint, but could easily pose a problem if not maintained properly. Specifically, if there is a duplicate address on a Token Ring network, a station cannot enter the ring. This can only be a problem with locally administered addresses. Because of this, many companies have chosen to use universal administration.
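A short sketch shows how a 6-byte address can be split and checked. It assumes the Token Ring convention in which addresses are written most significant bit first, so that the top bit of the first byte (0x80) marks a group address and the next bit (0x40) marks a locally administered address; that bit placement comes from the IEEE 802.5 addressing rules rather than from this chapter. The function names and sample addresses are hypothetical.

```python
def describe_address(addr_hex):
    """Split a 12-digit hexadecimal Token Ring address into its parts."""
    raw = bytes.fromhex(addr_hex.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError("a Token Ring address is 6 bytes long")
    return {
        "manufacturer_code": raw[:3].hex().upper(),
        "station_part": raw[3:].hex().upper(),
        "group": bool(raw[0] & 0x80),
        "locally_administered": bool(raw[0] & 0x40),
    }


def find_duplicates(addresses):
    """Return addresses that appear more than once in a ring inventory."""
    seen, duplicates = set(), set()
    for address in addresses:
        key = address.replace(":", "").replace("-", "").upper()
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return sorted(duplicates)


print(describe_address("10005A123456"))   # universally assigned example
print(describe_address("400000001234"))   # locally administered example
print(find_duplicates(["400000000101", "400000000102", "400000000101"]))
```

Running a duplicate check over the address inventory before cards are deployed is one way to avoid the insertion failure described above.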

Group Addressing

Group addressing allows for a group to be defined on a Token Ring network.

A set of stations on the ring could be the local post office, for example; another set of stations on the ring could be the local legal office; and another set of stations on the ring could be the local medical office. A network operating system or an application could invoke a process where a broadcast to a specific group of devices was created.

Obviously, these would be technical group address assignments for a specific technical function in the application or the operating system. But the capability was there for group addressing to be assigned. In this case, a Token Ring station, if part of the group, would have a combined or parallel function running in which it would be assigned as part of the group. If this group addressing function were active, the station could then understand the group addressing destination address and process the frame for the cycle.

Functional Addressing

Functional addressing involves the functions that were built in to the initial Token Ring card design operation—that is, a function would be operating in a station that would be important. The functional assignment could be compared to a function such as legal or medical operations. In technical terms, there could be a specific function that the device is running parallel to its individual operation, such as a LAN manager, an active monitor, or an error monitor. These functions are considered standard in the Token Ring physical functional addressing system.

The main motivation for functional addressing was that the initial design vendors wanted to be able to manage the physical ring.

When certain devices were implemented on the ring, for example, they would be able to run combined or parallel management functions in direct operation with their normal individual function. In other words, a device could have a unique physical address that would be considered the individual address. The same device could also have an assigned functional address where the device would perform a function that would be supported by or support the rest of the ring. In this case, some standard functional addresses were common (see Table 9.1).

Table 9.1. Examples of Standard Devices with Their Corresponding Functional Addresses

Standard Device                 Functional Address
Active monitor                  C00000000001
Ring error monitor              C00000000008
Ring parameter server           C00000000002
Configuration support server    C00000000010
Bridge                          C00000000100
LAN manager                     C00000002000

All these addresses are common functional addresses. Many other functional addresses are reserved for future use; "reserved for future use" is a common term in IBM technology.
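The values in Table 9.1 each set a single bit after the C000 prefix, which suggests a simple way to decode them. The sketch below maps the table entries to names and lists every function whose bit is set in a given address. Treating the address as a bit mask that may name several functions at once reflects my understanding of the 802.5 functional address format and is not stated in this chapter; the function name is hypothetical.

```python
# Bit values taken from Table 9.1 (the low 32 bits of each address).
FUNCTIONAL_BITS = {
    0x00000001: "Active monitor",
    0x00000002: "Ring parameter server",
    0x00000008: "Ring error monitor",
    0x00000010: "Configuration support server",
    0x00000100: "Bridge",
    0x00002000: "LAN manager",
}


def decode_functional_address(addr_hex):
    """List the Table 9.1 functions named by a functional address."""
    value = int(addr_hex, 16)
    if value >> 32 != 0xC000:
        raise ValueError("functional addresses in Table 9.1 start with C000")
    bits = value & 0xFFFFFFFF
    return [name for bit, name in sorted(FUNCTIONAL_BITS.items()) if bits & bit]


print(decode_functional_address("C00000000001"))  # ['Active monitor']
print(decode_functional_address("C00000000108"))  # ['Ring error monitor', 'Bridge']
```

Bits that are not listed in the table are ignored here, which matches the note that many functional addresses are reserved.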

Functional addressing allowed manufacturers and NOS application manufacturers to develop functions on the ring for design where they could utilize unique management systems.

What is being introduced here is the capability for manufacturers of network operating systems and other key products in the industry to use a design technique to manage the ring that could be inherent to their own particular operation or capability.

Token Ring Signaling Methods

The Token Ring physical medium signaling method uses differential Manchester encoding, which differs from the Manchester encoding used in the Ethernet LAN environment. The coding occurs at the physical layer. Simply put, the binary 1s and 0s of a nonreturn-to-zero bit stream are combined with the clock so that the signal reverses in the middle of every bit cell, and the value of each bit is carried by whether the signal also reverses at the start of the cell rather than by the absolute signal level. The result is the differential Manchester code, and thus the Token Ring physical signal used on the cabling medium (see Figure 9.7).

Figure 9.7. The Token Ring physical signal method.
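The encoding can be made concrete with a small encoder and decoder that work on half-bit signal levels. The sketch assumes the convention in which a 0 is signaled by a transition at the start of the bit cell and a 1 by the absence of one, with a transition in the middle of every cell; this is the convention I understand 802.5 to use, and the function names and the `start_level` parameter are illustrative.

```python
def dm_encode(bits, start_level=1):
    """Encode bits as a list of half-bit line levels (0 or 1)."""
    level = start_level  # line level at the end of the previous bit cell
    halves = []
    for bit in bits:
        if bit == 0:
            level ^= 1        # a 0 reverses the signal at the start of the cell
        halves.append(level)
        level ^= 1            # every cell reverses the signal at mid-bit
        halves.append(level)
    return halves


def dm_decode(halves, start_level=1):
    """Recover bits by comparing each cell's start with the previous level."""
    bits = []
    previous = start_level
    for i in range(0, len(halves), 2):
        first, second = halves[i], halves[i + 1]
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(1 if first == previous else 0)
        previous = second
    return bits


signal = dm_encode([1, 0, 1, 1])
print(signal)             # [1, 0, 1, 0, 0, 1, 1, 0]
print(dm_decode(signal))  # [1, 0, 1, 1]
```

Because only transitions carry information, inverting every level (for example, by swapping the two wires of a pair) decodes to the same bits, which plain Manchester encoding does not guarantee.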

Token Ring 4Mbps Technology Versus 16Mbps Technology

One of the last areas to discuss before moving into the internal operations of the Token Ring architecture is the comparison between 4Mbps technology and 16Mbps technology. Because there was not a heavy implementation of 32Mbps and higher Token Ring data rates, this discussion concentrates on comparing 4Mbps and 16Mbps technology.

The main difference between 4Mbps and 16Mbps technology is that a 4Mbps Token Ring frame has a maximum frame size of approximately 4500 bytes. In a 16Mbps technology a NIC has the capability to process a frame up to 18000 bytes in size. It is common to see Token Ring frame sizes at the 4Mbps size (4500 bytes), even in a 16Mbps environment. Increasing the packet size allows for the creation of a much larger PDU for transmission (see Figure 9.8). Most of the network operating system vendors along with the application vendors that utilize Token Ring for physical transmission never saw a strong requirement for increasing the PDU size, because most endpoint Token Ring nodes did not have the capability to process the data at such a large size because of data rate-handling restrictions. This is the reason that the PDU remained in the 4500 byte frame size area, even on a 16Mbps data rate network (see Figure 9.9).

Figure 9.8. A Token Ring network not overutilized and not in need of a capacity upgrade.

Figure 9.9. A Token Ring network overutilized and in need of a capacity upgrade.

The other key difference is that the 16Mbps data rate has a much higher frame-per-second rate-generating capability on board the NIC. The actual frame-per-second rate capability between a 4Mbps and 16Mbps card varied from vendor to vendor. The fact is that a 16Mbps data rate card has a much faster processing rate compared to a 4Mbps data rate card, as related to frame-per-second rate.
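The relationship between frame size, data rate, and frame rate can be checked with simple arithmetic. The sketch below computes how long a frame occupies the medium and the theoretical ceiling on frames per second at line rate. It ignores token rotation, ring latency, and NIC processing limits, so the results are upper bounds and not vendor figures; the function names are illustrative.

```python
def serialization_time_ms(frame_bytes, rate_mbps):
    """Time in milliseconds to clock one frame onto the ring."""
    return frame_bytes * 8 * 1000 / (rate_mbps * 1_000_000)


def max_frames_per_second(frame_bytes, rate_mbps):
    """Upper bound on frames per second if the ring carried nothing else."""
    return rate_mbps * 1_000_000 / (frame_bytes * 8)


for frame_bytes, rate in [(4500, 4), (18000, 16), (4500, 16)]:
    print(
        f"{frame_bytes:>6} bytes at {rate:>2}Mbps: "
        f"{serialization_time_ms(frame_bytes, rate):.2f} ms per frame, "
        f"up to {max_frames_per_second(frame_bytes, rate):.0f} frames/s"
    )
```

A 4500-byte frame at 4Mbps and an 18000-byte frame at 16Mbps both take 9 ms to transmit, so the larger maximum frame keeps the time a station may hold the ring roughly constant. Keeping the 4500-byte size on a 16Mbps ring cuts that time to 2.25 ms per frame.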

Another noticeable difference is the implementation of Early Token Release (ETR) technology, which was introduced around the same time as the 16Mbps technology. ETR allows two frames to travel on the ring at the same time. These are not two data frames, but two frames of any kind, such as a data frame and the released token.

In an ETR transmission, for example, the sending station grabs a token off the ring, forms a data frame through the append operation, and sends the frame to the destination station. The destination station at that point starts to process the frame. During this time cycle, the source station immediately releases the token. This differs from the process engaged in the original 4Mbps design, in which the token was held until the original frame had traveled back to the source for interrogation, and only then was the token released. The early released token is sent out to the ring for any station waiting to transmit. A station wanting to transmit can then grab the token and start to build a frame for transmission on the ring. This is similar to a bank drive-in operation with multiple drive-through lanes, where one client is processing a transaction with a bank teller at the window while another person is preparing a financial transaction in the drive-through system. Moving to the technical aspects, the original destination station receiving the frame processes the transmission and then sends the frame back to the original source station. If the original source station receives the frame within a time period considered valid, it does not move into an abort sequence cycle to stop the transmission of the original frame that was sent outbound upon the early token release cycle. Therefore, the station that received the token early can append data and transmit a frame for a faster process (see Figure 9.10).

Figure 9.10. The Token Ring ETR concept.

The ETR process effects an increase in speed of the overall Token Ring operation. Note, however, that if the original source station does not receive the frame back from the destination station for any reason, such as ring latency or the fact that the frame was not received by the destination station, the source station aborts the initial ETR transmission. This event is called an abort sequence cycle. Timers called Token Ring protocol timers are active and manage this complete cycle for general operations.
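Why ETR helps more at 16Mbps than at 4Mbps can be shown with a rough model. The sketch assumes that, without ETR, the source cannot release the token until its frame has come back around the ring, so the ring is occupied for the longer of the frame transmission time and the ring latency; with ETR the ring is occupied only for the frame transmission time. The model, its function name, and the sample latency of 100 microseconds are assumptions for illustration, and token transmission time and protocol timers are ignored.

```python
def ring_efficiency(frame_bits, rate_mbps, ring_latency_us, early_release):
    """Fraction of ring time spent carrying frame bits under the model."""
    frame_time_us = frame_bits / rate_mbps  # bits / (Mbit/s) gives microseconds
    if early_release:
        occupied_us = frame_time_us
    else:
        # The token is held until the frame returns to the source.
        occupied_us = max(frame_time_us, ring_latency_us)
    return frame_time_us / occupied_us


for rate in (4, 16):
    without = ring_efficiency(800, rate, 100, early_release=False)
    with_etr = ring_efficiency(800, rate, 100, early_release=True)
    print(f"{rate:>2}Mbps, 100-byte frame: {without:.2f} without ETR, {with_etr:.2f} with ETR")
```

With these sample numbers, a 100-byte frame at 4Mbps lasts longer than the assumed ring latency, so holding the token costs nothing; at 16Mbps the same frame is finished in half the latency, and the ring sits idle for the rest unless the token is released early.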

Note also that in a network architecture using ETR, all the Token Ring cards should be set the same way, with early token release either active on every card or inactive on every card, for ETR to operate as a true enhancement. In many implementations, some of the NICs in a Token Ring network have early token release active and other cards do not. Such a situation can cause an imbalance in Token Ring timing that can cause problems at the physical layer, such as ring recovery or ring purge operations and unstable neighbor notification rates.

Note that in most cases, so long as the split in ring balance is not excessive, such as when one or two stations are using early token release and the rest are not, this is not a major issue. A Token Ring network with a mixed ETR implementation develops a timing problem when the split approaches 50/50 between stations with ETR active and stations with ETR not active.

To explain further, the reason for the mix of early token release technology is vendor release cycles. In 1989 many of the Token Ring NICs were released with early token release as optional. In 1991 through 1992, many cards were released with the early token release option as active and set as the default. In mid-1994 to 1995, many cards were released with early token release as active and set for hard default—that is, the configuration was permanent and could not be changed. In this case, the industry just implemented Token Ring cards as they purchased them, and the phenomenon occurred in which Token Ring networks were deployed with NICs applied with a mixed setting of ETR both on and off.

Through many different troubleshooting processes in the late 1990s, the LAN Scope analysis team has addressed this concern by advising many clients to examine neighbor notification timing and ring recovery processes to determine whether there is an early token release mismatch. If neighbor notification timing drifts off its 7.00-second center, showing variances in the 6.5- to 7.5-second range, and there are high counts of ring recovery occurrences, these may indicate an ETR mismatch. If an early token release mismatch is suspected, verifying it requires a complete audit of the Token Ring NICs throughout the Token Ring network area.
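The check described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the 0.25-second tolerance, and the ring recovery threshold are assumptions chosen for the example, because the text gives the 7.00-second center and the 6.5- to 7.5-second spread but no exact cutoffs. The input is assumed to be a list of capture timestamps, in seconds, of successive Active Monitor Present frames.

```python
from typing import List

NOMINAL_INTERVAL_S = 7.0  # neighbor notification center value from the text
TOLERANCE_S = 0.25        # assumption: drift beyond this counts as "varied"


def notification_intervals(amp_times: List[float]) -> List[float]:
    """Return the gaps between consecutive Active Monitor Present timestamps."""
    return [later - earlier for earlier, later in zip(amp_times, amp_times[1:])]


def suspect_etr_mismatch(amp_times: List[float],
                         ring_recoveries_per_hour: int,
                         recovery_threshold: int = 50) -> bool:
    """Flag a possible ETR mismatch; this is a hint to audit NICs, not proof.

    recovery_threshold is a hypothetical cutoff for "high" ring recovery counts.
    """
    intervals = notification_intervals(amp_times)
    drifting = any(abs(gap - NOMINAL_INTERVAL_S) > TOLERANCE_S for gap in intervals)
    return drifting and ring_recoveries_per_hour > recovery_threshold
```

A positive result only suggests that the NIC audit is worth doing; the audit itself remains the verification step.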

Token Ring DTR and Switching

Token Ring technology advanced quickly in the late 1990s. Today, Token Ring switching provides a low-cost way to allow a Token Ring network to be enhanced through higher performance.

The Token Ring switching products used today were developed through a migration of Dedicated Token Ring (DTR) switching. A DTR switching platform allows capacity to be increased by allowing a certain ring station to have a dedicated port for access. A single ring can then be increased further by engaging full-duplex adapter technology. Token Ring switching modules within an intelligent hub architecture allow for creating internal Token Rings to the specific hub. The switch process design allows an implementer to design more rings, by dividing ring stations across separate and internal rings. This process enables a designer to manage bandwidth allocation. The DTR process allows for a separate ring to be designed for one station. The lobe-cabling link can be engaged as a separate Token Ring loop between a station NIC and a port on the Token Ring switch. In this mode of Token Ring switching, stations normally operate in half duplex. A device can also use a full-duplex NIC if the switch port has a full-duplex adapter circuit. The full-duplex adapter can engage all four wires in the lobe cable and allow the dedicated loop to operate at 32Mbps.

Advancements in Token Ring switching and DTR have introduced the capability for overloaded rings to be divided into much smaller rings. The Token Ring product vendors have now introduced the capability to increase the data rate for Token Ring to more than 100Mbps. Implementing this type of Token Ring technology is very costly and requires certain management schemes.

Main Token Ring Frame-Type Structures

Another important feature of the Token Ring topology is that the Token Ring frame structure is built on three different frame types that can interact with each other on a consistent basis. This differs from Ethernet, which has four different data frame types that can be involved in communication from one point to another.

One of the frame types in the Token Ring topology is the Token Frame, which is a 3-byte frame that circles the ring for ring station access and control.

There is also a Token Ring data frame type that is divided into two subcategories:

  • Token Ring data frame with vector Data category

  • Token Ring data frame with vector TR MAC category

The Token Ring data frame carries actual upper-layer data, and the Token Ring Data frame with an internal MAC header carries actual data related to the physical Token Ring medium for control. The third frame type is a Token Ring Abort Sequence frame, which can be used for an abort operation such as the ETR abort cycle.

Devices That Can Manage the Physical Ring

In the initial release of the Token Ring network, certain devices were deployed with functional addressing active for physical management on the ring. This was prior to the implementation of SNMP and RMON technology in the Token Ring environment, and represented a communication method for managing the physical ring. Much of the functionality of these roles was based on the use of a management system introduced by IBM, called the LAN Network Manager System. Note, however, that two of the initial physical functional address assignments are still common throughout the Token Ring network today: the Standby Monitor and the Ring Error Monitor. Many of the other initially assigned functional addresses are not present in today's operation, unless implemented in a proprietary environment such as an IBM host structure.

For the ring to operate in a consistent and stable fashion, certain management roles were applied to the Token Ring technology. Local management roles were roles considered important to the local ring. Ring management server roles were roles of management that would synchronize and operate throughout different rings that were communicating with each other through Token Ring bridges or routers.

On a local ring, there was always the requirement for a standby monitor (SM) role. An SM is a device with a functional operation that runs in tandem with a local individual addressing scheme. All devices on each ring are called standby monitors. This is because almost any device actively inserted and running on a Token Ring network is active in the SM role and is always "standing by" to become the active monitor (AM).

The next important role is the AM. The AM is the most important role of a physical local ring management process. A device running as an AM can also function as a general ring station. The AM functional address can be assigned and run in parallel with the individual address of the local device. Specifically, this means that a server, workstation, router, or even a LAN printer could act in the AM role. The active monitor is the main communication manager on the local ring and is designated the AM role through an operation called the token-claiming process. This process allows the AM role to be assigned dynamically; the role goes to the first device active on the ring with the highest address. In most cases, this is the server. The token-claiming process is described later in this chapter.

The device assigned the AM role will have seven responsibilities. These include maintaining the ring master clock, initiating neighbor notification, monitoring neighbor notification, maintaining proper ring delay, monitoring token and frame transmission, detecting lost tokens or frames, and purging the ring. This is obviously an extremely important process for overall Token Ring management cycles. The processes related to the discussion of these roles will be clear as we move further in this chapter and discuss the architecture cycles of the ring (see Figure 9.11).

Figure 9.11. The main responsibilities of the Token Ring active monitor.

The next role that was important in the initial release of the Token Ring physical management processes was called configuration report server (CRS). This CRS implementation was introduced to allow for certain configurations to be immediately cross-loaded from and to certain ring stations when management was required on the ring. Some of these configurations were items that would be collected for statistical purposes, such as NAUN changes or new active monitor MAC frame transmissions.

Another management function also in the past was the ring parameter server (RPS). An RPS is a special server that would run in combination with a local assigned address where one device would be the RPS. The RPS had the capability during ring initialization cycles, when a workstation would insert on a ring, to download to a new station a logical ring number, a soft error report time value, or a ring parameter version level. It could also monitor the ring station by querying the address, the microcode level of the NIC, and the NAUN's address. This information was combined within the RPS function.

Another function still used today and also assumed by many protocol analyzer management systems is the ring error monitor (REM). The ring error monitor has the express purpose of collecting physical Token Ring errors. This is why it is so important to the management system and the functionality of the network protocol analyzers. The REM can collect Token Ring soft and hard errors generated by the Token Ring inherent finite state machine operation of a Token Ring NIC. During certain processes in a Token Ring NIC, soft errors are generated prior to a hard error failure. These processes are described later in this book. The point here is that the REM is a functional address that is assigned. After this address has been assigned, it is combined directly with an individual device. Any device assuming the REM function can capture all errors on the ring, whether they are soft or hard. Again, this is an address that is still important to today's operation of the Token Ring environment.

The LAN bridge server (LBS) function was implemented in the early Token Ring operational stages to allow all Token Ring bridges to exchange bridge statistics, such as the number of frames transmitted through a bridge and the number of frames lost or discarded during processing.

Another function is the LAN reporting mechanism (LRM). This function can almost be compared to a Management Information Base (MIB) in today's SNMP environment. This was a function that could be implemented on any device throughout a Token Ring physical network. If a device was running an LRM function, it could collect certain statistics and communicate back and forth with a LAN manager console (thus, the comparison with an SNMP console communicating with a MIB). In this case, an LRM agent was considered for implementation as a functional operation to be designed into all Token Ring NICs and management hubs. This advancement in technology was never really brought to its true potential.

The next main function is the LAN manager function. It is considered the pinnacle of all the management functions of Token Ring architecture and was used for a period of time, but in today's environment it is not implemented heavily. The IBM LAN network management concept was extremely innovative and allowed for a device to be assigned the LAN manager functional address. This device would be the centralized management system and would communicate with the configuration report servers, ring parameter servers, ring error monitor, LAN bridge servers, and any devices running a LAN reporting mechanism.

Therefore, a complete inherent spider architecture was developed comparable to the SNMP or RMON technologies to allow for physical management in the Token Ring. The interesting fact about this complete phenomenon is that back in 1985, the developers of Token Ring had the ability to physically manage the complete Token Ring environment within the physical layer operations of the overall network topology without involving the upper layers. Many of the network operating system vendors and application vendors did not see the inherent strength of this design. If the Token Ring vendors along with the network operating system and application vendors worked together closely, it is very possible that this management could have reached a higher potential through the implementation of LRMs. Thus, Token Ring networks could have achieved a higher level of respect in terms of network management capability in today's operating system environment. This is a definite fact, and one of the weak spots in Token Ring development as the product started to move forward during the early 1990s. It is therefore my technical opinion that if the network management cycles of the physical Token Ring capabilities of the Token Ring topology were developed in a more aggressive manner, the Token Ring architecture would be much more popular from an implementation and usage standpoint.

Because of the lack of attention to this area as well as other design areas, such as intelligence in the Token Ring hub architecture, other LAN topology products advanced more rapidly in terms of deployment against the Token Ring topology. The other major concern was the speed of Token Ring, which also would have had to be considered by the design teams while also concentrating on physical Token Ring management.

The Token Ring architecture is obviously a very stable and fault-redundant operational technology. To illustrate how strong the fault operation is, the following is a discussion of the actual architecture cycles and processes that occur at the Token Ring physical communication level (see Figure 9.12).

Figure 9.12. The Token Ring management scheme that was designed for the physical layer.

Understanding Token Ring Physical Communications

Certain communication processes occur at the Token Ring physical level. These processes occur as part of the Token Ring physical management cycles inherent to the Token Ring topology fault-redundant design.

The following Token Ring processes are described here:

  • Ring insertion

  • Token claiming

  • Priority access

  • Neighbor notification

  • Ring purge

  • Beaconing

  • Finite state machines

  • Token Ring protocol timers

Ring Insertion

Ring insertion is a five-phase process that occurs to ensure the attachment of a new ring station to the physical ring. Although there are five steps, they are known as Phase 0 through Phase 4 (as part of an IBM naming convention).

In Phase 0, the Token Ring NIC sends a physical signal to the Token Ring MSAU port. This signal activates a mechanical relay in the port, and the port opens. At this point, an active lobe link is considered in place between the Token Ring NIC and the Token Ring MSAU port. A simple Lobe Test MAC frame is released from the NIC, transmitted up to the port, and looped back down to the connecting NIC. This signals that the NIC port is active and available for communication. The Lobe Test MAC frame cannot be seen by a protocol analyzer on the main ring path (see Figure 9.13).

Figure 9.13. Phase 0 of the Token Ring insertion process.

In the second step, Phase 1, the Token Ring NIC listens or attempts to sense an Active Monitor Present (AMP) MAC frame. This is one of the 25 MAC frames described later in this chapter. In this phase, a new Token Ring NIC upon insertion onto the ring listens for an AMP frame. This frame indicates that an AM is present on the ring. If there is no AM device present, the new station's Token Ring NIC vectors into a state in which it eventually generates a new Active Monitor Present frame and becomes the AM. If an AM is already operating on the ring, the new device logs the vector to not send an AMP frame, and it moves to Phase 2.

In the third step, or Phase 2, the Token Ring NIC transmits a frame called a Duplicate Address Test MAC frame up the lobe path and onto the MRP of the Token Ring. This frame has a source address that equals its own address, and a destination address that also equals its address. Therefore, the frame is transmitted on the ring, circles the ring in the downstream direction, and should be received back by the source station attempting to insert on the ring. This ensures that no other station on the ring has a duplicate address. If the frame does not come back to the source station, another station must have received the frame and is the device with the duplicate address. That device copies the frame in error and generates a soft error frame called a Frame Copied Error. The device attempting to insert on the ring then stops the ring insertion process and attempts to reinsert. If the duplicate address is not resolved, this can become an ongoing cycle, and the device will never insert on the ring. This can be viewed with a protocol analyzer just by monitoring the ring insertion process (see Figure 9.14).

Figure 9.14. Phase 2 of the Token Ring insertion process.

The fourth process, or Phase 3, of inserting on the ring is referred to as participating in neighbor notification. This process occurs every seven seconds and is described later in this chapter. When neighbor notification is active on the ring, the new station just sends out a Standby Monitor Present MAC frame and lets all the other stations know that it is active on the ring. It also records its nearest active upstream neighbor address in its UNA buffer during the cycle (see Figure 9.15).

Figure 9.15. Phase 3 of the Token Ring insertion process.

The last process of ring insertion, or Phase 4, is called request initialization. During this process, the new ring station inserts onto the ring and sends a Request Initialization MAC frame outbound. This frame was used heavily in the early phases of Token Ring operation for ring insertion cycles, in which a ring parameter server would download certain initialization parameters to the station. In today's environment, hub management systems can take advantage of this particular Token Ring specification outbound cycle; in most cases, however, they will not invoke any retransmission back to the station. In this case, most of the new Token Ring NICs just continue to operate on the ring as normal. At the end of Phase 4, the station is now considered active and operating on the ring (see Figure 9.16).

Figure 9.16. Phase 4 of the Token Ring insertion processes, which indicates that a new ring station is now actively inserted into the ring.
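To make the order of the five phases concrete, the following Python sketch walks a station through Phase 0 to Phase 4. It is a simplified model and not NIC firmware: the names `InsertionPhase` and `insert_station` are hypothetical, the lobe test is assumed to pass, and the duplicate address test is reduced to a set lookup.

```python
from enum import IntEnum
from typing import List, Set, Tuple


class InsertionPhase(IntEnum):
    LOBE_TEST = 0               # Phase 0: Lobe Test MAC frame looped at the MSAU port
    MONITOR_CHECK = 1           # Phase 1: listen for Active Monitor Present
    DUPLICATE_ADDRESS_TEST = 2  # Phase 2: frame addressed to the station's own address
    NEIGHBOR_NOTIFICATION = 3   # Phase 3: send Standby Monitor Present, record upstream
    REQUEST_INITIALIZATION = 4  # Phase 4: send Request Initialization


def insert_station(address: str,
                   ring_addresses: Set[str],
                   am_present: bool) -> Tuple[List[InsertionPhase], str]:
    """Return the phases completed and a short outcome string."""
    # Assumption: the lobe test passes, so Phase 0 always completes here.
    completed = [InsertionPhase.LOBE_TEST]
    # Phase 1 completes either way; without an AM the station will later contend.
    completed.append(InsertionPhase.MONITOR_CHECK)
    # Phase 2 fails when another station already uses the same address.
    if address in ring_addresses:
        return completed, "duplicate address: reinsert"
    completed.append(InsertionPhase.DUPLICATE_ADDRESS_TEST)
    completed.append(InsertionPhase.NEIGHBOR_NOTIFICATION)
    completed.append(InsertionPhase.REQUEST_INITIALIZATION)
    if am_present:
        return completed, "inserted"
    return completed, "inserted, contending for active monitor"
```

A station that stops at Phase 2 on every attempt matches the repeating insertion pattern that a protocol analyzer shows for a duplicate address.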

Token Ring Claiming or Active Monitor Contention

A standard physical process occurs called Token Ring claiming, which, for simplicity purposes, could be called active monitor contention. The actual specification, however, is referred to as Token Ring claiming.

This process occurs when a new device on the ring, or an existing device on the ring, contends to become the AM. This is the process of how a device is assigned the dynamic active monitor role.

Token claiming takes place when one of the following three conditions occurs:

  • When a new station attaches to the ring and does not detect the active monitor

  • If a standby monitor detects the absence of an active monitor on the ring and cannot detect an Active Monitor Present frame for a certain amount of time as related to two timers: the T-Good Token or T-Receive Notification

  • If an active monitor cannot detect any frames on the ring for a period of time related to a timer called T-Receive Notification

When any one of these three circumstances occurs, the active monitor contention cycle starts, and the token-claiming process is active.

A station can operate in two main modes as active when in the token-claiming process:

  • Claim token transmit mode

  • Claim token repeat mode

In a claim token transmit mode, a device is contending to become the active monitor and is transmitting Claim Token frames for that purpose.

In a claim token repeat mode, a station is aware that the active monitor contention cycle is occurring, but is not contending to become the active monitor.

The token-claiming process determines whether a station is in a transmit or a repeat mode and is based on the following operation. An originating station detects that there is an Active Monitor Present frame problem, based on one of the three conditions previously described. In this case, the originating station generates a Claim Token MAC frame on the MRP, which is one of the standard 25 MAC frames. This frame includes its address in a Data field that has a Claim Token MAC vector.

Every station receiving the frame must investigate a Frame Control field with a MAC vector, and investigate the type of process and data inside the Data field. Therefore, when the frame is transmitted outbound onto the ring, the next downstream station investigates the frame. It compares its address to the address of the Source Station field of the frame transmitting the Claim Token MAC frame. In this case, if its own station address is lower, it drops into a claim token repeat mode. The next station down the ring performs the same cycle. If its station address is lower, it also drops into a repeat mode and appears passive in the process. If the third station downstream compares the source Claim Token MAC address to its address and determines that it is higher, it engages a process in which it is active in a claim token transmit mode and is going to contend for the role. At this point, however, it waits and lets the frame continue to be passed around the ring.

Considering that all the rest of the stations on the ring do not have a higher address, eventually the original frame reaches the station that transmitted the original Claim Token frame. This original source station then releases the token for one cycle. This station remains active in the claim token transmit mode. This token then circles the ring, and all devices in the claim token repeat mode just pass the token on for processing. The one device that did move into a claim token transmit mode takes the token and activates a claim token transmit process. The Claim Token frame generated by the contending device will eventually reach the source station that originally sent the Claim Token MAC frame. This device then compares its address to that of the contending station, sees that its own address is lower, and drops into a claim token repeat mode. Thus, it is lowering its priority and is clearly not going to win the role of active monitor. This process causes the frame to continue to move back to the second station that contended with a claim token transmit mode, and this station releases the token for one cycle.

If no stations contend for the role after three cycles of this process, eventually this station becomes the active monitor on the ring. Upon doing so, it generates a frame called New Active Monitor Present, which is one of the standard 25 MAC frames (see Figure 9.17).

Figure 9.17. The token-claiming concept.
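The address comparison at the heart of token claiming can be illustrated with a short Python sketch. It is deliberately simplified: it compresses the token releases and repeated cycles described above into a single pass around the ring, and the function name and the use of integers for station addresses are assumptions made for the example. The outcome matches the rule in the text, in which the station with the highest address ends up in claim token transmit mode and wins the role.

```python
from typing import Dict, List, Tuple


def claim_token(ring: List[int], originator: int) -> Tuple[int, Dict[int, str]]:
    """Simulate one simplified claim cycle.

    ring lists station addresses in downstream order; originator is the
    station that first detects the missing active monitor.
    Returns the winning address and each station's final claim mode.
    """
    modes = {address: "repeat" for address in ring}
    modes[originator] = "transmit"
    claimant = originator
    start = ring.index(originator)
    # Walk downstream once; each station compares its address to the claimant's.
    for step in range(1, len(ring)):
        station = ring[(start + step) % len(ring)]
        if station > claimant:
            # A higher address takes over; the previous claimant drops to repeat.
            modes[claimant] = "repeat"
            modes[station] = "transmit"
            claimant = station
    return claimant, modes
```

For example, on a ring ordered `[0x10, 0x05, 0x30, 0x20]` with `0x10` starting the claim, station `0x30` finishes as the only one in transmit mode.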

Priority Access

Within the Token Ring architecture, there is always an Access Control field contained within the token frame and the data frame. In the Access Control field are three priority bits of note; these have the capability to assign a priority to a Token Ring transmission. Note from the outset that this process is normally not invoked by application or upper-layer operating system manufacturers because priority is normally maintained in the upper-layer protocol sequencing. Note also that it was very innovative in the design that priority is available at the Token Ring physical level.

If this is invoked and captured on a protocol analyzer as active, it is possible that this could be part of a problem if other applications are not using priority access.

The process is as follows: A basic token frame transmitted on the network has an active Access Control field. The Access Control field contains a set of bits called the priority bits. The three priority bits encode a priority from 0 to 7 in binary. If a station on the ring had an upper-layer process that wanted to invoke priority, it would grab the token and reserve the priority of the token. For example, priority 5 would be a 101. In this case, the token would be released on the ring. If no other station contended for the token at a level higher than 5, the station would receive the token back, take the reserved priority, and append data to it to create a data frame, and the Access Control field in the data frame would have priority 5 as active. In this case, this station could transmit and receive back and forth to a destination station at priority 5 on a consistent basis. It would have to consistently release the token on every cycle. If there were another device on the ring that had a priority 6 or 7 contending capability, it could possibly reserve the token and then take the token control away from the two devices transmitting at priority 5. This is not a normal cycle and is not common in normal communications, due to the inherent problems that could occur based on this cycle. But it is a fact that this process can take place. With that said, normally the priority is set at 000 consistently at the physical level.

Another priority feature available in Token Ring architecture is the monitor bit. The monitor bit is also in the Access Control field. This bit is set to zero on outbound transmission from any Token Ring source device. When the frame passes the active monitor on the ring, which always will occur on one particular ring cycle, the active monitor has the responsibility of setting this bit to a one. When this happens, the frame is then passed on through the ring. If the original source station receives the frame back, the token is released on the ring.

But if for any reason the original source station is turned off or has a problem, and the frame is passed on the ring for a second cycle and the active monitor detects a frame when the Access Control field monitor bit is set to one, it discards the frame and then purges the ring. This allows for each station on the ring to have equal priority and to prevent data frames from continuously circling the ring.
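The priority bits and the monitor bit can be illustrated by decoding the Access Control byte in Python. The sketch assumes the IEEE 802.5 bit order PPPTMRRR (three priority bits, the token bit, the monitor bit, three reservation bits); the token and reservation bits come from that layout and are not covered in the text above. The names `AccessControl`, `parse_access_control`, and `active_monitor_pass` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

MONITOR_BIT = 0b0000_1000  # M bit position in the assumed PPPTMRRR layout


@dataclass
class AccessControl:
    priority: int     # PPP: 0 to 7, for example 0b101 for priority 5
    is_frame: bool    # T bit: False for a token, True for a frame
    monitor: bool     # M bit: set by the active monitor
    reservation: int  # RRR: 0 to 7


def parse_access_control(byte: int) -> AccessControl:
    """Split one Access Control byte into its four fields."""
    return AccessControl(
        priority=(byte >> 5) & 0b111,
        is_frame=bool(byte & 0b0001_0000),
        monitor=bool(byte & MONITOR_BIT),
        reservation=byte & 0b111,
    )


def active_monitor_pass(byte: int) -> Tuple[int, bool]:
    """Model a frame passing the active monitor.

    Returns (updated byte, purge flag). A frame arriving with the monitor
    bit already set has circled the ring once before, so it is flagged
    for discard and ring purge.
    """
    if byte & MONITOR_BIT:
        return byte, True
    return byte | MONITOR_BIT, False
```

A data frame at priority 5 with the monitor bit clear is `0xB0`; its first pass through the active monitor yields `0xB8`, and a second pass of `0xB8` raises the purge flag.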

Neighbor Notification

Neighbor notification is one of the more important processes in the Token Ring physical operation, and it ties fault redundancy to the troubleshooting processes of Token Ring. The process is on a seven-second cycle. Driven by a timer called T Neighbor Notification, the assigned active monitor sends out an Active Monitor Present frame. This is a consistent operation engaged from the early release days of the Token Ring and is still consistent on a local Token Ring.

The Active Monitor Present frame invokes a cycle in which the Token Ring MAC vector on the next downstream station notes the occurrence of the neighbor notification. In this case, it will record in its upstream neighbor address buffer the source address of the Active Monitor Present frame. The Active Monitor Present frame then circles the ring, and a token is eventually released. The next station down the ring then takes the token and transmits a Standby Monitor Present frame. This just indicates its particular address to its downstream neighbor, which will record in its Upstream Neighbor Address buffer the address from the Source Address field of the Standby Monitor Present frame. This continues around the ring until each device records the address of its upstream neighbor on the ring. This causes a neighbor notification event to occur in a seven-second cycle on every ring.
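The bookkeeping in this cycle is easy to model. The Python sketch below takes the ring order and the active monitor and returns the frames in the order they would appear, plus the upstream neighbor that each station records. The function name and the use of short strings for station addresses are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def neighbor_notification(ring: List[str],
                          active_monitor: str) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Model one neighbor notification cycle.

    ring lists stations in downstream order. Returns a map of
    station -> recorded upstream neighbor, and the (frame type, sender)
    sequence starting with the active monitor.
    """
    start = ring.index(active_monitor)
    order = ring[start:] + ring[:start]  # cycle begins at the active monitor
    upstream: Dict[str, str] = {}
    frames: List[Tuple[str, str]] = []
    for position, sender in enumerate(order):
        kind = "Active Monitor Present" if position == 0 else "Standby Monitor Present"
        frames.append((kind, sender))
        # The next station downstream records the sender as its upstream neighbor.
        receiver = order[(position + 1) % len(order)]
        upstream[receiver] = sender
    return upstream, frames
```

On a ring `["A", "B", "C"]` with `"B"` as active monitor, the cycle is one Active Monitor Present frame from B followed by Standby Monitor Present frames from C and A, and each station ends up holding the address of the station immediately upstream.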

If a protocol analyzer is set for a MAC capture filter and it is connected to a Token Ring physical ring, the neighbor notification cycle is seen immediately on seven-second intervals. It is then possible to determine which is the active monitor and how long it takes for the neighbor notification cycle to occur.

On a healthy Token Ring, a certain amount of time is specified for a neighbor notification cycle to occur. This amount of time is called ring poll time. Specifically, it should take no more than 2 to 2.5 seconds for a physically healthy operating Token Ring to perform neighbor notification. It should then recur on a seven-second interval.

If a protocol analyzer detects that neighbor notification is not occurring in seven-second intervals—specifically, longer or shorter intervals than seven seconds—it is possible that a major physical problem exists on the ring. Most likely in this occurrence, other frames will be present, such as physical soft errors and a high number of Ring Purge MAC frames. There may also be a physical Beaconing frame condition present. Some of these conditions usually coexist when there is an unstable neighbor notification cycle (see Figure 9.18).

Figure 9.18. An analyzer data trace that presents the Token Ring neighbor notification process.
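Ring poll time can be measured from a MAC-filtered trace by taking the time between the Active Monitor Present frame and the last Standby Monitor Present frame of the same cycle. The following Python sketch assumes the trace has already been reduced to `(timestamp, frame_kind)` pairs for one cycle, with the hypothetical labels `"AMP"` and `"SMP"`; the 2.5-second limit is the upper figure quoted in the text.

```python
from typing import List, Tuple


def ring_poll_time(cycle: List[Tuple[float, str]]) -> float:
    """Return seconds from the AMP frame to the last frame of one cycle."""
    if not cycle or cycle[0][1] != "AMP":
        raise ValueError("a cycle must start with an Active Monitor Present frame")
    return cycle[-1][0] - cycle[0][0]


def poll_is_healthy(cycle: List[Tuple[float, str]], limit_s: float = 2.5) -> bool:
    """True when the cycle completes within the expected ring poll time."""
    return ring_poll_time(cycle) <= limit_s
```

A cycle that runs past the limit is a reason to look for the soft errors, ring purges, or beaconing mentioned above, not a diagnosis in itself.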

Ring Purge Process

Another important operation in the Token Ring architecture communication cycle is the ring purge process. The ring purge process is an inherent part of the fault-redundancy cycle. The Token Ring architecture was built in such a way that the device assigned as the active monitor has the capability to purge the ring and to cause a reset across the physical NICs connected to the ring. The reset is a very quick process that resets the physical buffers and usually does not disrupt the upper-layer protocol operation of applications or network operating system functions. This is usually a very rapid process that occurs in less than a one-second interval, during which the complete ring can restabilize. Most upper-layer protocols will not time out or disconnect if the ring purge process does not occur too frequently.

If ring purge processes occur at a rate of more than 50 ring purges per hour, this is considered a possible unstable condition. If they start to occur at 100–200 times per hour, this is an even more unstable situation. When ring purge counts approach 1000–2000 per hour, the physical ring is usually unstable and other upper-layer protocols are disconnecting and applications are generally affected. When ring purges do occur at high levels, it is always a physical condition that has caused the problem and a physical problem should be troubleshot in conjunction with this occurrence. In some cases, excessive upper-layer application loading can affect the process, but usually the physical layer is the main concern of this particular type of occurrence.
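These rate bands can be turned into a simple classifier for analyzer counts. In the Python sketch below, the function name and the labels are hypothetical, and the handling of the gaps the text leaves open (50 to 100 and 200 to 1000 purges per hour) is an assumption: each gap is assigned to the band below it.

```python
def classify_ring_purge_rate(purges_per_hour: int) -> str:
    """Map an hourly ring purge count to a rough stability label."""
    if purges_per_hour >= 1000:
        # Text: the ring is usually unstable and upper-layer protocols disconnect.
        return "unstable"
    if purges_per_hour >= 100:
        # Text: 100 to 200 per hour is an even more unstable situation.
        return "increasingly unstable"
    if purges_per_hour > 50:
        # Text: more than 50 per hour is a possible unstable condition.
        return "possibly unstable"
    return "normal"
```

Whatever the label, the text's guidance still applies: a high purge count points first to a physical problem.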

The ring purge process generally occurs for certain physical conditions. One condition is when the active monitor has its Any Token Timer expired, noted at 10ms. This would mean that no token or frame has been transmitted by the active monitor in 10ms. Therefore, the active monitor should purge the ring and reset the physical state.

Another occurrence is when the ring recovery process continually occurs and has to be set back to a normal repeat mode after a ring purge process. When the active monitor detects a monitor bit set to one, and a frame has cycled the ring more than once, it is normal to see the ring purged. Another occurrence is any of the error conditions that take place due to the active monitor present role, such as a disruption of timing on the ring and proper execution of a Token Ring process, lost tokens or frames, or other error conditions considered excessive.

The discussion now turns to how the soft error process of fault redundancy of Token Ring is designed and operates.

Soft Error Counting and Fault Redundancy Operation

As noted earlier, the Token Ring architecture is extremely fault-redundant. Built within the Token Ring chipset of every Token Ring NIC is the capability to perform a soft error assembly process when an error occurs and to generate the error out onto the ring. After this process occurs, the Token Ring NIC is supposed to recover and continue operating.

This is an intelligent cycle for error recovery and is an enhanced version of the Token Ring topology features as compared to other LAN topologies such as Ethernet. In other words, the Token Ring NIC can detect an error, package the error for transmission to a station that can log the error, and then recover and continue operating. This is an extremely enhanced feature for fault redundancy.

The process occurs as follows: If any Token Ring station detects an error, it immediately invokes a Token Ring timer called T-Soft Error Report. This timer is normally set at two seconds.

For two seconds, the timer counts, and all the errors that can possibly be assembled are logged into a register for insertion into a Token Ring Soft Error Report packet. At the end of the two seconds, a Token Ring Soft Error Report MAC packet is assembled, and a Token Ring data frame is set on the ring with a MAC vector called Soft Error Report. The Soft Error Report MAC frame is generated onto the ring with an outbound address to C00000000008, called the ring error monitor. Any device assigned as the ring error monitor on the ring, such as a protocol analyzer or a management system, can capture the error.
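The two-second collection window can be modeled as follows. This Python sketch is a simplified illustration, not NIC microcode: the class name, the report layout, and the use of caller-supplied timestamps in place of a hardware timer are all assumptions. The two-second value and the ring error monitor address C00000000008 come from the text.

```python
from collections import Counter
from typing import Dict, Optional

REM_FUNCTIONAL_ADDRESS = "C00000000008"  # ring error monitor, from the text
T_SOFT_ERROR_REPORT_S = 2.0              # T-Soft Error Report, from the text


class SoftErrorReporter:
    """Collect soft errors for one timer window, then emit a single report."""

    def __init__(self, station: str) -> None:
        self.station = station
        self._window_start: Optional[float] = None
        self._counts: Counter = Counter()

    def record(self, now: float, error_type: str) -> None:
        """Log an error; the first error in a window starts the timer."""
        if self._window_start is None:
            self._window_start = now
        self._counts[error_type] += 1

    def poll(self, now: float) -> Optional[Dict]:
        """Return a report once the timer has expired, otherwise None."""
        if self._window_start is None:
            return None
        if now - self._window_start < T_SOFT_ERROR_REPORT_S:
            return None
        report = {
            "destination": REM_FUNCTIONAL_ADDRESS,
            "source": self.station,
            "errors": dict(self._counts),
        }
        # After reporting, the station resets and resumes normal operation.
        self._window_start = None
        self._counts = Counter()
        return report
```

Two errors recorded at 0.0 and 1.0 seconds produce no report when polled at 1.5 seconds and one combined report when polled at 2.0 seconds.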

After the source station has logically transmitted the error, it then resets its physical buffers, recovers, and actually starts operating again as a normal ring station. In certain cases, the active monitor may react to this occurrence depending on level, and purge the ring so that all the stations on the physical ring reset. This is a ring recovery cycle as associated with the soft error generation.
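The reporting cycle just described can be sketched in a few lines of Python. This is only an illustrative model, not NIC firmware: the class name `SoftErrorReporter`, its fields, and the dictionary layout of the report are assumptions made for the example. The two-second duration and the ring error monitor address come from the discussion above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

RING_ERROR_MONITOR = "C00000000008"  # functional address named in the text
T_SOFT_ERROR_REPORT = 2.0            # seconds, the normal setting per the text


@dataclass
class SoftErrorReporter:
    """Hypothetical model of one station's soft error reporting cycle."""
    station: str
    naun: str
    counters: Dict[str, int] = field(default_factory=dict)
    started_at: Optional[float] = None  # None means the timer is idle

    def record(self, error_type: str, now: float) -> None:
        # The first error of a cycle starts the timer; later errors only count.
        if self.started_at is None:
            self.started_at = now
        self.counters[error_type] = self.counters.get(error_type, 0) + 1

    def poll(self, now: float) -> Optional[dict]:
        # Nothing is reported until the timer has run its full duration.
        if self.started_at is None or now - self.started_at < T_SOFT_ERROR_REPORT:
            return None
        report = {
            "destination": RING_ERROR_MONITOR,
            "source": self.station,
            "naun": self.naun,
            "errors": dict(self.counters),
        }
        # After transmitting, the station clears its counters and recovers.
        self.counters.clear()
        self.started_at = None
        return report
```

A station that records two burst errors and one line error within the window would return a single report at the two-second mark and then go back to an idle timer.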

Within the packet, there could be 10 different types of soft errors (described later in the troubleshooting section of this chapter). These errors need to be analyzed closely, because it is possible to isolate a problem before a total Token Ring failure occurs by capturing the error, decoding it properly, and taking action prior to the occurrence. If certain types of Token Ring errors continue to occur, eventually a Token Ring hard error may occur, called a Token Ring beacon. When this takes place, the complete Token Ring physical NIC operation ceases, and upper-layer protocols cannot flow.

When Token Ring soft errors are minimal (certain types are not considered high-impact errors), the Token Ring recovers and upper-layer protocols are not interrupted and can continue to operate normally. This is obviously considered a positive situation.

When this complete process takes place and the Report Soft Error MAC frame is communicated on the ring, the frame contains important informational items. These items include the address of the device generating the error, the error type, and the address of the upstream neighbor. This is where the neighbor notification cycle is so important: the device generating the error is in effect stating that it has seen an error, that its own address and its upstream neighbor address are involved, and what the error type is. When interpreting this occurrence, it can be seen that this process draws a circle around the possible area of fault in the ring. This is called a fault domain and is described later in this book.

This illustrates the importance of the Upstream Neighbor Address buffer logging during the neighbor notification cycle. If dataflow is normal from station to station in a clockwise fashion, and eventually one station device has an error generation outbound, the problem is most likely within that device or the device upstream (its NAUN) in the area of fault. Specific fault domain troubleshooting techniques are discussed later in this chapter.
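As a minimal sketch of this reasoning, the following Python derives a station's NAUN and its fault domain from the order of active stations on the ring. The function names and the list representation of the ring are assumptions for illustration; a real station learns its NAUN through the neighbor notification cycle rather than from a list.

```python
from typing import List, Tuple


def naun_of(ring_order: List[str], station: str) -> str:
    """Return the nearest active upstream neighbor of `station`.

    `ring_order` lists the active stations in the direction of data flow,
    so the upstream neighbor is the previous entry, wrapping around.
    """
    index = ring_order.index(station)
    return ring_order[index - 1]  # index 0 wraps to the last station


def fault_domain(ring_order: List[str], reporter: str) -> Tuple[str, str, str]:
    """Return (reporting station, its NAUN, the medium between them)."""
    naun = naun_of(ring_order, reporter)
    return (reporter, naun, f"medium between {naun} and {reporter}")
```

For a ring ordered A, B, C, D, an error reported by C places the fault domain at C, its NAUN B, and the cabling between them.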

Beaconing

The last process to be discussed before moving into the physical Token Ring frame structure, Token Ring protocol timers, and the troubleshooting and associated baselining techniques, is the beaconing process.

The beaconing process was designed into the Token Ring architecture as a way for Token Ring cards to automatically recover from major physical Token Ring problems and remove a physically bad NIC from the ring. Note, however, that the process was never truly enhanced through network design features that were possible in the engineering of Token Ring cards from certain vendors.

Had this area been addressed, the Token Ring installed node count would have increased and the architecture would eventually have become more popular. This is another weak point from the Token Ring design camp and is considered a major weakness in the overall troubleshooting process of the Token Ring architecture. It is my position that if the vendors and associated engineering teams had paid more attention to the operation of this particular feature, they could have designed the most fault-redundant network in the history of local area technology.

The specific process is that there is a cycle in which a Token Ring NIC can detect a hard error. To do this, the Token Ring NIC has an internal operation occur during which its finite state machine and internal operations detect an error that is nonoperational or considered hard. This particular error is detected through an excessive count of soft errors or other events that take place on the Token Ring NIC, or its associated NIC, or within a fault domain around a particular NIC generating the hard error. This hard error is called a beacon. What occurs is that the Token Ring NIC, when moving into this state, generates a packet called a Physical MAC packet with a Beacon MAC vector. There are four beacon MAC types, and all four usually generate a Token Ring failure occurrence on the ring that will cause an outage.

The Token Ring Beacon MAC frame is formed, and a data frame is transmitted onto the ring, with the Frame Control field indicating a MAC frame and a beacon indication inside the Data MAC field. When any station downstream on the ring receives the Frame Control field with MAC active, it investigates the field and sees the beacon as active.

At this point, it moves into an immediate beacon repeat state and stops operating any upper-layer protocol transmissions. This obviously interrupts all upper-layer protocols and operations on the ring, and stops the ring from operating. This continues for every ring station on the complete ring, and they all move into a beacon repeat state. The original source station, in beacon transmit mode, transmits the Beacon MAC frame eight times; eight transmissions are considered the completion of the cycle. After these eight transmissions, an isolation process occurs in the circuitry and the overall operation of the Token Ring mechanism. What takes place is that the beaconing station identifies within its transmission which station is its NAUN. The NAUN detects the eight cycles of the beacon transmission and immediately removes itself from the ring and starts a testing process. It runs a Lobe Test MAC frame and a duplicate address test frame process that are part of the normal ring insertion process, as noted earlier.

These two frames test the lobe path and the capability of the station to transmit one frame around the ring. Then this station, if it completes the tests, puts itself back on the ring in a beacon repeat mode (see Figure 9.19).

Figure 9.19. The Token Ring beaconing concept.

Next, the downstream station that transmitted the beacon frame performs the exact same process. If it passes the Lobe Test MAC frame and the duplicate address test without failure, both of the stations appear to be stable, and the station that originally transmitted the beacon frame also inserts itself back on the ring as normal. At this point, the active monitor activates a ring purge process and the ring is cleared, restabilized, and starts to transmit a token as normal. In 99% of all cases, the beaconing process recurs almost immediately and continues on an ongoing cycle. This is because the testing cycle in almost all cases is not extensive enough to actually detect the error. If one of the stations had detected an error during the Lobe Test MAC frame or the duplicate address test, it would have removed itself from the ring as part of the normal beaconing operation. In this case, the station that is bad would have been removed and the beaconing condition would have been physically isolated. The ring would continue to operate on its own, and a physical device would have been troubleshot and removed by an automatic process of the physical medium.
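The isolation sequence can be summarized as a small decision sketch. The Python below is a simplified model under stated assumptions: the function name `beacon_cycle`, its two Boolean inputs, and the event strings are hypothetical, and the real process is carried out by the NIC finite state machines and MAC frames rather than by a single routine.

```python
from typing import Dict, List

BEACON_TRANSMISSIONS = 8  # completion count given in the text


def beacon_cycle(naun_passes_tests: bool, beaconer_passes_tests: bool) -> Dict:
    """Walk one beacon isolation cycle and report which station, if any, is removed."""
    events: List[str] = [
        f"beaconing station transmits {BEACON_TRANSMISSIONS} Beacon MAC frames naming its NAUN",
        "NAUN removes itself and runs the lobe test and duplicate address test",
    ]
    removed: List[str] = []

    if not naun_passes_tests:
        removed.append("naun")
        events.append("NAUN fails its tests and stays off the ring")
    else:
        events.append("NAUN reinserts in beacon repeat mode")
        events.append("beaconing station removes itself and runs the same tests")
        if not beaconer_passes_tests:
            removed.append("beaconer")
            events.append("beaconing station fails its tests and stays off the ring")
        else:
            events.append("beaconing station reinserts; active monitor purges the ring")

    return {
        "removed": removed,
        # When both stations pass, nothing was isolated and beaconing may recur.
        "fault_isolated": bool(removed),
        "events": events,
    }
```

The case in which both stations pass their tests is the one described above as recurring almost immediately, because no faulty station has actually been taken off the ring.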

This is a very positive fault-redundant feature. For the record, in most cases, this is not enough troubleshooting of physical circuitry. More diagnostic tests should have been performed by the two stations in the physical engineering mode of the beaconing process—such as internal diagnostic runs and different cycles that may have been available—to truly remove the station from the ring so that the process would not have to have been troubleshot manually.

In most cases, an engineer or analyst will actually have to associate the two stations that are beaconing and remove them from the ring. This can be done by using a protocol analyzer. In most cases during the period from 1985 through the early 1990s, this was the only way to isolate the beaconing stations on the ring.

The only other way to troubleshoot the issue was by physically locating the devices that were removing themselves from the ring, and just removing the cables from the MSAU.

In the mid 1990s, intelligent hub-based systems were introduced. These have the capability to detect the two stations involved in the beaconing process, automatically generate a Remove Ring Station MAC frame, and wrap the Token Ring ports that were causing the beaconing condition. Through this process, the Token Ring management vendors found a way to address the concern by wrapping the ports quickly so that a technician would not have to find the Token Ring stations causing the beaconing condition. Although this addressed the issue, for the record this was a little bit late. The community that was utilizing Token Ring products from 1985 through 1995 had to endure many troubleshooting problems and outages prior to the implementation of automatic port fault wrapping for beaconing conditions. This is again a major weakness in the evolution of Token Ring design.

Token Ring Timers

The Token Ring architecture uses 14 protocol timers. This discussion does not examine the interoperation of each Token Ring protocol timer. Note that within the design of a Token Ring card there is an operation called a finite state machine. The finite state machine allows the Token Ring card to relate the mode of a Token Ring communication process directly to a Token Ring failure mode. Many different finite state machines exist, and each Token Ring vendor can invoke a finite state machine differently. This is the mechanism by which a Token Ring card vectors from one state to another in response to an error or a transmission.

With that said, it should be mentioned that there are different protocol timers that allow the cards to operate. These timers vary in data-rate design as related to different timing intervals. For the purposes of this book, the following timers are active. Refer to Appendix B for other sources that explain timer operation. A top protocol analyst in a Token Ring analysis session should understand the operation of the internal cycles of these timers. They are extensive and require considerable reading and study to truly understand their operation and effect on Token Ring network operations.

The following are the key timers:

  • T_Attach

  • T_Claim Token

  • T_Any Token

  • T_Physical Trailer

  • T_Good Token

  • T_Response

  • T_Soft Error Report

  • T_Transmit Pacing

  • T_Beacon Transmit

  • T_Escape

  • T_Ring Purge

  • T_Neighbor Notification

  • T_Neighbor Notification Response

  • T_Receive Notification

Note that all the timers have one common thread. Each timer activates at a certain point, and each timer has an action. There is always a condition that cancels the timer and a certain duration during which the timer runs.

For general discussion purposes, a few examples of the key timers are presented. The T_Any Token is a timer used to set the amount of time that an active monitor can wait before it detects a starting delimiter sequence from either a token or a frame on the ring. If the active monitor does not see any type of starting delimiter from a token or a frame, it assumes that the ring is not operating. It times out after 10 ms and activates an automatic ring purge condition. Another example is the T_Attach timer, which is used to set the amount of time that a station can stay in the ring insertion process. This timer is activated in Phase 1 when the monitor check process occurs in ring insertion. The timer times out in 18 seconds if the ring insertion process encounters any problems and is not completed. The timer is cancelled earlier if the process completes. When the timer times out, this reactivates the ring insertion process. These are just a few examples of how the Token Ring timers operate.
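The common thread among the timers (an activation point, a duration, a cancelling condition, and an action on expiry) can be modeled with one small class. This is a sketch only: `ProtocolTimer` and its method names are hypothetical, and time is passed in explicitly so the example stays deterministic. The 10 ms and 18 second durations are the ones given above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProtocolTimer:
    """Hypothetical model of a Token Ring protocol timer."""
    name: str
    duration: float                     # seconds the timer runs before expiring
    on_expire: Callable[[], None]       # action taken when the timer times out
    started_at: Optional[float] = None  # None means the timer is not running

    def activate(self, now: float) -> None:
        self.started_at = now

    def cancel(self) -> None:
        # The cancelling condition occurred, for example insertion completed.
        self.started_at = None

    def tick(self, now: float) -> bool:
        """Fire the expiry action if the duration has elapsed; return True if fired."""
        if self.started_at is not None and now - self.started_at >= self.duration:
            self.started_at = None
            self.on_expire()
            return True
        return False
```

Under these assumptions, T_Any Token would be `ProtocolTimer("T_Any Token", 0.010, start_ring_purge)` and T_Attach would be `ProtocolTimer("T_Attach", 18.0, restart_ring_insertion)`, where both callbacks are placeholders for the actions described in the text.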

Token Ring Frame Structure

Three types of frames are used for communication on the Token Ring network:

  • The token frame

  • The data frame

  • The abort sequence frame

The Token Frame

Three fields are engaged in the token frame: the Starting Delimiter field, the Access Control field, and the Ending Delimiter field. The starting delimiter is a sequence of 8 bits, 0–7. These bits allow the code to be sensed on inbound and outbound transmissions on the Token Ring NIC. The ending delimiter is the ending portion of a token frame and is likewise an 8-bit sequence built on a 0–7 bit cycle for transmission inbound and outbound of a NIC (see Figure 9.20).

Figure 9.20. A token frame.

The Access Control field includes 8 bits. Bits 0 to 2, the left three first positions, are called priority bits. The middle or fourth bit is called the token bit. The fifth bit is called the monitor bit, and the last three bits are called the priority reservation bits (see Figure 9.21).

Figure 9.21. The internal field of an Access Control field.
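The bit layout of the Access Control field lends itself to a short decoding sketch. The Python below follows the layout described above, with bit 0 as the leftmost (most significant) bit. The names `AccessControl` and `decode_access_control` are hypothetical, and the reading of the token bit (0 for a token, 1 for a frame) is the IEEE 802.5 convention rather than something stated in this section.

```python
from typing import NamedTuple


class AccessControl(NamedTuple):
    priority: int     # bits 0-2, the three leftmost bits
    is_frame: bool    # token bit; assumed 0 = token, 1 = frame (IEEE 802.5 convention)
    monitor: bool     # monitor bit, set by the active monitor
    reservation: int  # last three bits, the priority reservation bits


def decode_access_control(octet: int) -> AccessControl:
    """Split one Access Control byte into its four subfields (layout PPPTMRRR)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("Access Control field is a single byte")
    return AccessControl(
        priority=(octet >> 5) & 0b111,
        is_frame=bool((octet >> 4) & 1),
        monitor=bool((octet >> 3) & 1),
        reservation=octet & 0b111,
    )
```

For example, the byte `0b01011010` decodes to priority 2, token bit set, monitor bit set, and reservation 2.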

The Data Frame

The Token Ring data frame has multiple fields: Field 1 is the first byte and is called the Starting Delimiter. Field 2 is the Access Control field. Field 3 is called the Frame Control field and defines whether the frame is a data frame with Token Ring physical Medium Access Control data, or upper-layer data called Logical Link Control data. Field 4 is the 6-byte Destination Address field of the Token Ring NIC. Field 5 is the 6-byte Address field for the source address of the Token Ring frame transmitted. Field 6 is the Routing Information field. This is a variable-length field from 2–18 bytes. Field 7 is the variable-length Information field that carries the data, such as the actual upper-layer protocol data or the physical Medium Access Control data. Field 8 is the 32-bit CRC field. Field 9 is a 1-byte sequence indicating the Ending Delimiter field, and Field 10 is called the Frame Status field, which indicates whether the destination address was recognized and whether the frame was copied (see Figure 9.22).

Figure 9.22. A Token Ring data frame.
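The field order can be illustrated with a small parser. This is a sketch under several assumptions that are not stated in this section: the frame is presented as a byte string with one byte each for the delimiters (as an analyzer might show it), the Routing Information field is present only when the high-order bit of the first source address byte is set, its length is carried in the low five bits of its first byte, and the top two bits of Frame Control distinguish MAC (00) from LLC (01) frames. These follow IEEE 802.5 source routing conventions; the names `DataFrame`, `parse_data_frame`, and `frame_kind` are hypothetical.

```python
from typing import NamedTuple


class DataFrame(NamedTuple):
    starting_delimiter: int
    access_control: int
    frame_control: int
    destination: bytes
    source: bytes
    routing_info: bytes  # empty when no Routing Information field is present
    information: bytes
    crc: bytes
    ending_delimiter: int
    frame_status: int


def frame_kind(frame_control: int) -> str:
    """Classify by the top two Frame Control bits (assumed 00 = MAC, 01 = LLC)."""
    return {0: "MAC", 1: "LLC"}.get((frame_control >> 6) & 0b11, "reserved")


def parse_data_frame(raw: bytes) -> DataFrame:
    # 3 header bytes + 2 addresses of 6 bytes + 4 CRC bytes + 2 trailer bytes
    if len(raw) < 21:
        raise ValueError("too short for a Token Ring data frame")
    destination, source = raw[3:9], raw[9:15]
    pos = 15
    routing_info = b""
    if source[0] & 0x80:  # routing information indicator (assumption, see lead-in)
        length = raw[pos] & 0x1F
        if length < 2 or length > 18 or length % 2:
            raise ValueError("invalid Routing Information length")
        routing_info = raw[pos:pos + length]
        pos += length
    if len(raw) - pos < 6:
        raise ValueError("frame truncated before CRC and trailer")
    return DataFrame(raw[0], raw[1], raw[2], destination, source, routing_info,
                     raw[pos:-6], raw[-6:-2], raw[-2], raw[-1])
```

The parser deliberately does not verify the CRC or interpret the Frame Status bits; it only separates the fields in the order listed above.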

The Abort Sequence Frame

The abort sequence frame is a simple starting delimiter and ending delimiter tagged together and is considered a 2-byte sequence for general transmission (see Figure 9.23).

Figure 9.23. An abort sequence frame.

Description of 25 MAC Frames

The Token Ring protocol architecture has 25 MAC frames that are used for communication at the physical layer, as follows:

  • Standby Monitor Present. This frame is used during neighbor notification by a ring station to announce its address and to log its upstream neighbor address.

  • Active Monitor Present MAC. This frame is used by the active monitor to start the neighbor notification cycle.

  • Ring Station Initialization MAC. This is a frame sent outbound on the ring during the station insertion process by a new ring station entering the ring.

  • Initialize Ring Station MAC. This is a frame sent by the ring parameter server if a ring station inserting on the ring requires certain ring initialization parameters.

  • Lobe Test MAC. This is the frame used in the ring insertion process to test the lobe path.

  • Duplicate Address Test MAC. This is the frame sent during the ring insertion process and during the beaconing process to test the ring for duplicate addresses. During the beaconing process, it also tests the overall capability of a station to transmit around the ring in one cycle.

  • Beacon MAC. This is the frame transmitted to indicate a hard error or fault occurrence.

  • Claim Token MAC. This is the frame used by a station that detects the loss of an active monitor and wants to contend for the role of active monitor.

  • Ring Purge MAC. This frame is transmitted by the active monitor. It causes every physical station on the ring to reset its physical layer and causes ring recovery to take place.

  • Report Neighbor Notification Incomplete MAC. This frame is transmitted on the ring during the neighbor notification process and indicates that the T_Neighbor Notification timer has expired. This means that the neighbor notification cycle could not complete, and a fault domain isolation process may be needed to indicate the error.

  • Transmit Forward MAC. This is a frame that can be transmitted outbound from a LAN management console or new management-based systems to test a station's availability on the ring.

  • Report Transmit Forward MAC. This is the frame with which a NIC must respond to a Transmit Forward MAC frame sent by a LAN manager console or a new management station. The two previous frames work together in a manner comparable to a ping process in the IP environment.

  • Report Active Monitor Error MAC. This is a frame generated by the active monitor when it detects a problem in its own operation. This would indicate a problem in the active monitor cycle.

  • Report Soft Error MAC. This is the frame generated by any ring station when it encounters an error and packages the error for generation on the ring. This frame includes valuable information and must be decoded by an analyst as to the type of soft error.

  • Change Parameter MAC. This is a frame generated by the ring station onto the ring, normally when a CRS is active (which is not used in the industry at this time).

  • Remove Ring Station MAC. This is a frame that can be transmitted by a management station, a CRS, or many of the new management systems to automatically remove a ring station from operation. This causes a Token Ring NIC to automatically deactivate, or certain port adapters to wrap on an intelligent Token Ring hub architecture.

  • Request Ring Station State MAC. This is a frame that can be transmitted by LAN management stations, a CRS, and new management stations to request the state of a ring station. If a Token Ring NIC receives this frame, it responds with the following frame.

  • Report Ring Station State MAC. This frame is transmitted by the ring station in response to the Request Ring Station State frame. This frame responds with the state of operation, such as functional addresses and other operations active in the ring.

  • Request Ring Station Attachment MAC. This frame is transmitted on the ring by a new LAN management station or an old LAN manager to request the attachment of a ring station. This would directly relate to the type of functions run by the station.

  • Report Ring Station Attachment MAC. This is the response from the NIC to a Request Ring Station Attachment frame from the management station.

  • Request Ring Station Address MAC. This is the twenty-first MAC frame in the subset, in which a management station or a LAN manager can request the address of a station. This is used by new management systems to quickly query an address of a Token Ring NIC.

  • Report Ring Station Address MAC. A Token Ring NIC must respond with the twenty-second MAC frame type, called Report Ring Station Address, and the address is returned through this transmission.

  • Report NAUN Change MAC. This is the twenty-third MAC frame type. It is sent when a station near a ring insertion sees a new NAUN active and reports a NAUN change from its nearest Upstream Neighbor Address buffer.

  • Report New Active Monitor MAC. This particular frame is transmitted by the active monitor when it becomes the active monitor through the active monitor contention or token-claiming process.

  • Response MAC. This frame is normally transmitted from one station to another when there is a syntax error in the respective transmission. This normally occurs upon transmission and ring insertion in abnormal conditions, or under abnormal physical conditions.

Troubleshooting Token Ring Physical Faults and Errors

When troubleshooting a Token Ring network generating soft or hard errors, it is important that an analyst capture all Soft Error Report MAC frames, Beacon MAC frames, and Ring Purge frames. These types of frames carry the information that holds the main error type and information that may need to be further decoded. These types of frames point to the fault domain. The Token Ring fault domain can be defined as the assigned logical area of hard fault. This area is assigned by the Token Ring NIC-to-NIC communication and architecture. It is inherent in Token Ring architecture for NICs to utilize the beaconing process to isolate a hard failure to a hard fault domain. A Token Ring NIC can also use the report soft error generation to create a soft area of fault.

Analysis and Troubleshooting of Token Ring Fault Areas

A logical area of soft or hard error fault or a fault domain will include three subareas:

  • The station transmitting the Beacon or Soft Error Report MAC frame

  • The beaconing or soft error reporting station's recorded NAUN address

  • The medium used for connection between the error-generating ring station and its NAUN

When examining a Token Ring fault domain area, it is critical to analyze and isolate the actual location of failure cause. To capture, analyze, and isolate a point of failure within a fault domain, an analyst must focus on both Beacon and Soft Error Report MAC frame types via protocol analysis in a network baselining session.

In certain cases, the internal data areas within a Beacon or Soft Error Report MAC frame may point an analyst to a specific network device or failure area on the Token Ring. In other cases, an analyst may have to perform certain manual troubleshooting steps, interleaved with rapid post analysis, to isolate an issue identified to a fault domain. When troubleshooting a fault domain, an analyst must remember to examine the Beacon or Soft Error Report MAC frames through a protocol analysis session. To troubleshoot the internals of the frame, the analyst must record the reporting address and the NAUN address to assign a fault domain. Next, the analyst can start the manual troubleshooting by narrowing the fault domain to two lobe areas as possible sources of failure. The analyst should remove one lobe area cable link from the assigned fault domain by actually detaching one of the lobe cables from the Token Ring hub. Next, the analyst should re-analyze the ring. If the Beacon or Soft Error Report MAC frames are gone, the problem is in the removed lobe area. If the frames are still present, the analyst should reconnect the first suspect lobe area, disconnect the second lobe area, and re-analyze the ring. At that point, the problem frames should no longer be present in the trace.
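The two-lobe elimination procedure can be sketched as a short routine. The function `isolate_lobe` and its callback `errors_present_without` are hypothetical: the callback stands in for the manual steps of detaching a station's lobe cable, re-analyzing the ring with a protocol analyzer, and reconnecting the cable afterward.

```python
from typing import Callable, Optional


def isolate_lobe(reporter: str, naun: str,
                 errors_present_without: Callable[[str], bool]) -> Optional[str]:
    """Return the lobe whose removal clears the error frames, or None.

    `errors_present_without(station)` should answer: with that station's lobe
    detached, are Beacon or Soft Error Report MAC frames still on the ring?
    """
    for suspect in (reporter, naun):
        if not errors_present_without(suspect):
            return suspect
    # Neither lobe cleared the errors; the medium between the two stations
    # (the third part of the fault domain) is the remaining suspect.
    return None
```

A result of `None` does not mean the ring is healthy. It means that neither lobe area accounts for the errors, which leaves the connecting medium to be examined.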

If a group of devices or a complete Token Ring network is experiencing problems or issues, it is possible that the MRP has an internal failure. If all the MSAUs or hubs are properly configured in the wiring racks and the MRP cross-hub Ring In (RI) and Ring Out (RO) cables are connected properly, the analysis of the MRP is relatively simple.

The proper way to troubleshoot and analyze a suspected MRP problem is to start by isolating the ring to a certain MSAU or hub area. It is important to understand that the main MSAUs, hubs, or the RI and RO cabling can be the cause of failure in an MRP outage. The analyst should start by isolating the first MSAU or hub in the rack by disconnecting its RI and RO cables. Next, the analyst should re-analyze the removed and isolated hub area by verifying that the stations connected to it can properly perform basic ring insertion and communicate across the ring. If the MSAUs or hubs in the MRP being used in the Token Ring network have specific diagnostics, they should also be engaged. If the first MSAU or hub area tests positive, the next MSAU or hub area should be reconnected to the hub area just analyzed by re-attaching the RO cable from the first MSAU or hub to the second MSAU's or hub's RI port. Next, the analyst should re-analyze the new ring with the two hubs interconnected. This analysis and troubleshooting process should continue until a problem in the MRP is encountered in post analysis. The problem MRP hub or cabling area will be located when an area that is reconnected causes a high number of Report Soft Error MAC frames or a single Beacon MAC frame to be generated. This troubleshooting cycle involves an elimination cause-and-analysis process (see Figure 9.24).

Figure 9.24. An analyzer data trace showing a MAC beacon event affecting the Token Ring network main ring path.
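The hub-by-hub reconnection cycle amounts to growing the ring one MSAU at a time until errors appear. The sketch below models that loop. The function `find_faulty_hub` and the callback `ring_is_clean` are hypothetical; the callback represents an analysis pass that checks the currently connected hubs for a high count of Report Soft Error MAC frames or any Beacon MAC frame.

```python
from typing import Callable, List, Optional, Sequence


def find_faulty_hub(hubs: Sequence[str],
                    ring_is_clean: Callable[[List[str]], bool]) -> Optional[str]:
    """Reconnect hubs in rack order and return the first one that introduces errors.

    The returned name identifies a hub area, which includes the hub itself
    and the RI/RO cabling used to attach it to the ring built so far.
    """
    connected: List[str] = []
    for hub in hubs:
        connected.append(hub)  # re-attach RO of the previous hub to this hub's RI
        if not ring_is_clean(list(connected)):
            return hub
    return None  # every hub area tested clean
```

Starting from a single isolated hub keeps each analysis pass small, so the area that introduces the errors is identified at the step where it is reconnected.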

Analyzing Token Ring Error Reporting and Decoding

When performing a network baseline, one of the key analysis issues is to troubleshoot the physical medium of the LAN. This is usually the fourth or fifth step in the baselining process, as mentioned earlier in this book under "Workload Characterization Measurements."

When involved in a Token Ring environment, certain Token Ring physical errors can be encountered, such as the 10 types of soft errors in a Soft Error Report MAC packet, or a physical Token Ring beaconing condition encountered when decoding a Beacon MAC frame. Troubleshooting these errors requires the use of a protocol analyzer. The protocol analyzer has to be connected to the network and must be capable of capturing Token Ring error types and displaying the errors for decode. There is a certain method for capturing and decoding Token Ring errors with an analyzer in such a condition. To actually decode the error types, an analyst needs to understand each error type, the occurrence of each error, and the vector for action if this error occurs.

The following is a description of key Token Ring soft errors along with key Token Ring beaconing errors. These error analysis descriptions should be used in direct association with the baseline process, which is introduced later in this chapter as related to baselining and performing workload characterization measurements in Token Ring.

Report Soft Error MAC Frame Analysis

When a ring station encounters a soft error, it increments its internal soft error counter and starts a Token Ring protocol timer (T_Soft Error Report). After two seconds, the timer expires and the ring station generates a Report Soft Error MAC frame. High occurrences of certain soft errors can cause ring performance and connectivity problems. Certain soft errors are considered more serious than others. When recording soft errors, an analyst must be aware of the level of seriousness and the possible failure cause for each type of soft error. Certain soft errors can point to an actual network component failure cause. The Report Soft Error MAC frame contains the soft error type, the reporting ring station's NIC address, and its respective NAUN's NIC address. When performing an analysis session for error recording, the proper way to decode a Report Soft Error MAC frame is to note the type of soft error and any associated addresses in the MAC frame, such as the reporting address and the NAUN. The soft error internal field in the MAC frame is a 12-byte field, and 10 of the bytes actually represent soft error types. The soft error types are divided into two main subtypes:

  • Isolating error types

  • Nonisolating error types

Isolating error types report ring station internal error counters as collected that can be isolated to final cause. Nonisolating error types are errors that cannot be easily decoded. The following is a brief description of the main error types and the associated analysis methods:

  • Internal error. This error type identifies that the sending ring station has encountered an internal error. If this error type is recorded frequently, the reporting ring station NIC may be encountering a close-to-failure error. It is recommended to remove and replace the NIC in the sending station, and the ring should be re-analyzed.

  • Burst error. This error type identifies that the sending ring station has encountered a signal transition error. This occurs frequently during ring insertion. If this error type is recorded frequently, the reporting ring station NIC may be encountering a bad lobe cable or bad port on the MSAU. It is recommended to remove and replace any questionable components in the sending station connection, and next the ring should be re-analyzed.

  • Line error. This error type identifies that the sending ring station has detected a frame or token with a code violation or a checksum (CRC) error. This occurs frequently during ring insertion. If this error type is recorded frequently, the reporting ring station NIC may have encountered an internal checksum hardware error, and the NIC may be at fault. It is recommended to remove and replace the NIC in the sending station connection, and next the ring should be re-analyzed (see Figure 9.25).

    Figure 9.25. An example of a MAC soft error captured from LAN protocol analysis sessions during a network baseline study.

  • Abort delimiter transmitted error. This error type identifies that the sending ring station has encountered a recoverable internal error that forced it to transmit an Abort Delimiter frame. If this error type is recorded frequently, the reporting ring station NIC may be encountering a close-to-failure error. It is recommended to remove and replace the NIC in the sending station, and the ring should be re-analyzed.

  • AC error. This error type identifies that the sending ring station has encountered a condition in which a frame received from the transmission cycle could not set the address recognized or frame copied bits. If this error is occurring, it is possible that the reporting station's NAUN has a failure. The station's NAUN can be removed, and the reporting station NIC can also be removed to isolate the failure. It is recommended to remove and replace the NICs as required in the devices, and then the ring should be re-analyzed (see Figure 9.26).

    Figure 9.26. An example of a MAC soft error captured from LAN protocol analysis sessions during a network baseline study.

  • Lost frame error. . This error type identifies that the sending ring station has encountered an error that indicates that an originating ring station generated a frame onto the ring to a specific address and did not receive the frame back from the destination device. If this error type is recorded frequently, the reporting ring station NIC may not be copying frames properly, or the destination device may be the failure point. It is recommended to remove and replace the NICs involved, and then the ring should be re-analyzed.

  • Receiver congestion error. . This error type identifies that the sending ring station has encountered a situation in which it could not copy a frame addressed to its NIC address. This occurs because of lack of buffer space within the destination NIC and because of low processing resources in a destination station. There can also be low resource design issues on routers and bridges that cause this error. This error is also common when an application is flooding data too frequently to an under-resourced endpoint. If this error type is recorded frequently, the reporting station NIC may require a resource upgrade as to memory, NIC, or NIC driver. The ring design should be closely examined and re-analyzed.

  • Frame copied error. This error type identifies that the sending ring station has encountered a situation in which it has copied a frame that may have the same address as its own address, like a duplicate address. If this error type is recorded frequently, the reporting ring station NIC may not be copying frames properly, or a duplicate address is attempting ring insertion and is failing. The ring should be analyzed for any device attempting ring insertion with a possible duplicate address assigned. It is recommended to remove duplicate assignments and to replace any suspect NICs involved, and then the ring should be re-analyzed.

  • Frequency error. This error type identifies that the sending ring station has encountered an attempt to process a frame that does not contain the proper ring clock frequency. It may indicate either a bad active monitor or ring electrical problems. If this error type is recorded frequently, the AM should be replaced and the ring should be re-analyzed. If the issue continues, the ring electrical grounding for the hub racks should be checked, and the complete MRP should be analyzed.

  • Token error. This error type identifies that the sending ring station has encountered a token error. This error is generated by the active monitor in the event of other ring issues. Usually, the active monitor initiates ring recovery and issues a new token. This error type is common when other ring stations detect and generate burst and line errors onto the ring. Usually, token errors are not an issue. If token errors are generated continuously, it is recommended to remove and replace the NIC in the station acting as the AM, and then the ring should be re-analyzed (see Figure 9.27).

    Figure 9.27. An example of a MAC soft error captured from LAN protocol analysis sessions during a network baseline study.

Technical Notes on Hard Beacon MAC Frame Analysis

Beacon-based MAC errors are the more critical type of errors and are considered hard. When a hard error occurs, the Beacon MAC frame usually indicates the fault domain and the devices involved. When a beacon hard error occurs, a Token Ring network engages the beacon repeat process, as noted earlier. It is important to bypass the fault area for the ring to operate. The bypass may occur dynamically because of the beaconing cycle built in to the architecture.

Most intelligent Token Ring hubs isolate the beaconing devices and wrap the ports involved. Most protocol analysis tools along with ring monitoring tools enable an analyst to troubleshoot and quickly identify any hard beacon errors causing a ring to experience failure-based issues. The analyst should just record the beaconing device and the NAUN address and assign a fault domain. The devices can be removed and the ring can be re-analyzed. Troubleshooting of the MRP may also apply when the beacon process occurs on a network.
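The recording step described above can be sketched in a few lines of Python. This is only an illustration: the BeaconFrame record and its field names are hypothetical stand-ins for whatever an analyzer exports, and the addresses are made-up sample values. The only idea taken from the text is that each fault domain is identified by the beaconing station and its NAUN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BeaconFrame:
    # Hypothetical fields; a real analyzer export will name these differently.
    source: str  # address of the station transmitting the beacon
    naun: str    # nearest active upstream neighbor reported in the beacon


def fault_domains(frames):
    """Return the distinct (beaconing station, NAUN) pairs in first-seen order."""
    seen = []
    for frame in frames:
        pair = (frame.source, frame.naun)
        if pair not in seen:
            seen.append(pair)
    return seen


# Made-up sample capture: the same station beacons twice.
frames = [
    BeaconFrame("10:00:5a:00:00:02", "10:00:5a:00:00:01"),
    BeaconFrame("10:00:5a:00:00:02", "10:00:5a:00:00:01"),
]
domains = fault_domains(frames)
```

Each pair in the result names the two devices to remove before the ring is re-analyzed.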

Network Baselining in a Token Ring Environment

Because the Token Ring network is extremely fault-redundant in its own inherent operation, various techniques can be used to immediately isolate a Token Ring physical problem. This is an important fact, because of the general methodology of workload characterization network baseline measurement cycles.

The Token Ring inherent fault-redundant operation enables an analyst to move quickly with a protocol analyzer and capture certain types of packet traffic dataflow. The analyst can then examine the dataflow in such a way as to isolate whether the problem is in the physical Token Ring network or in the upper-layer protocol areas.

In all LANs, it is important that the LAN topology deployed be stable and operate with a strong foundation for upper-layer protocol dataflow to operate properly. This is critical for applications being used on the network along with the operating systems controlling the applications.

When starting a network baseline in a Token Ring environment, the following general processes can be used. This discussion also presents some specific processes that directly relate to some of the Token Ring troubleshooting processes already discussed.

The following sections discuss five steps that outline the methodology for baselining a Token Ring environment.

Step 1: Token Ring Workload Characterization Baselining Methodology

An analyst should immediately deploy a network protocol analyzer to perform a utilization characterization measurement process against a shared Token Ring network area. In this case, as long as it is a shared Token Ring and not a switched Token Ring environment, the ring will show an average and peak utilization that can be quickly measured and noted for general network baseline statistics. The procedures discussed earlier in this book under "Workload Characterization Measurements" fully apply.

In the Token Ring environment, a key factor to take into account is that the topology is deterministic. Because it is not contention-based, it can sustain higher peak utilization for a longer duration before upper-layer protocols time out. Even so, an analyst should not simply conclude that peak utilization levels are never a problem. The point here is that saturation levels of 95% and above could possibly be sustained for peaks of five to six seconds without upper-layer protocols timing out. How long such a peak can be tolerated depends somewhat on the operating systems and applications deployed.

When considering segmentation and redesign, an analyst should generally attempt to keep peak utilization on a Token Ring network at a maximum of 65% to 75%. For the record, however, higher peak saturation levels can be sustained on a deterministic medium.

Higher utilization levels do slow down access to the medium for each device that has to wait for the token. For that reason, if Token Ring is going to be implemented in a shared design rather than a switched design, it proves beneficial to design it with smaller segments so that it can sustain higher application loading levels.
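As a rough illustration of the average and peak measurement in this step, the following Python sketch converts per-interval byte counts into utilization percentages and compares the peak against the design ceiling. The function names, the one-second sampling interval, and the sample byte counts are assumptions made for the example; the 75% ceiling is the upper end of the 65% to 75% guideline above.

```python
def utilization_percent(bytes_in_interval, interval_s=1.0, ring_mbps=16):
    """Percentage of ring capacity consumed during one sampling interval."""
    bits = bytes_in_interval * 8
    capacity_bits = ring_mbps * 1_000_000 * interval_s
    return 100.0 * bits / capacity_bits


def summarize(byte_samples, interval_s=1.0, ring_mbps=16, design_peak=75.0):
    """Average and peak utilization for a series of byte counts."""
    util = [utilization_percent(b, interval_s, ring_mbps) for b in byte_samples]
    peak = max(util)
    return {
        "average": sum(util) / len(util),
        "peak": peak,
        "over_design_peak": peak > design_peak,
    }


# Made-up byte counts for three one-second intervals on a 16Mbps ring.
report = summarize([200_000, 400_000, 1_800_000])
```

With these sample values the ring averages 40% and peaks at 90%, so the peak exceeds the design ceiling even though the average looks comfortable.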

Step 2: Token Ring Workload Characterization Baselining Methodology

When performing node-by-node utilization measurements in a Token Ring environment, it is always important to monitor the actual utilization levels on a device-by-device basis. This is because a shared medium is being monitored; the same concern would not apply to a switched link. On a shared medium, if one device on a Token Ring is absorbing an extremely high level of bandwidth, it is possible that it could negatively affect the deterministic token-passing cycle and cause a physical problem or an upper-layer problem. The analyst will be able to investigate the device further by performing the third typical workload characterization measurement of protocol percentage breakout.
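A node-by-node breakdown of this kind can be approximated from a capture as follows. This is a minimal sketch, assuming the capture has already been reduced to (source address, frame length) pairs; the function name and the sample station labels are hypothetical.

```python
from collections import Counter


def node_shares(frames):
    """Rank stations by their share of captured bytes.

    frames: iterable of (source_address, frame_length_bytes) pairs.
    Returns a list of (source_address, percent_of_bytes), largest first.
    """
    totals = Counter()
    for source, length in frames:
        totals[source] += length
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    return [(src, 100.0 * n / grand_total) for src, n in totals.most_common()]


# Made-up capture: station A sends 700 bytes, station B sends 300 bytes.
shares = node_shares([("A", 600), ("B", 300), ("A", 100)])
```

A station at the top of the list with a disproportionate share is the candidate for the protocol percentage breakout described next.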

In a switched Token Ring environment, it is also important to understand that by roving a Token Ring switch and measuring each one of the dedicated Token Ring switched links, it is possible to understand whether proper distribution is applied against the switch. In certain cases, it is possible that a small Token Ring network connected to a switch may have a higher level of traffic on it and may still require further segmentation across another additional switched port. Keep in mind that Token Ring switched ports are not just used for connecting one particular device, but are at times used for segmenting rings; and in some cases, the rings may need to be partitioned further. Therefore, it is important to always perform a node-by-node utilization comparison against a Token Ring switched pattern when using a protocol analysis system or a management system.

Step 3: Token Ring Workload Characterization Baselining Methodology

Protocol percentage measurements are very important in Token Ring. If the 802.5 percentage exceeds 4% to 5%, it is very possible that the Token Ring medium is having a recovery problem. This can be seen very early in a Token Ring monitoring session from a management system or a protocol analyzer. An analyst should understand that the 802.5 percentage is composed primarily of Token Ring MAC frames. In a normal healthy Token Ring network, a Token Ring MAC trace should just show neighbor notification every seven seconds. Even in the busiest Token Ring environment, MAC traffic normally would account for a maximum of only 2% to 3% of traffic.

If the 802.5 protocols are seen in a protocol measurement screen of a management system or protocol analyzer at percentages of 8% to 10% or higher, this is an immediate flag that there are problems in the physical Token Ring level. Further isolation with the next step needs to be performed for error isolation.
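The percentage bands in this step can be expressed as a simple classifier. The sketch below is an assumption-laden illustration: it measures the 802.5 share by frame count (an analyzer may instead measure by bytes), and it uses the lower end of each quoted range (4% and 8%) as the cutoff.

```python
def classify_mac_share(mac_frames, total_frames, watch_pct=4.0, alarm_pct=8.0):
    """Return (percentage, status) for the 802.5 MAC share of a capture."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    pct = 100.0 * mac_frames / total_frames
    if pct >= alarm_pct:
        status = "physical-layer problem likely"
    elif pct > watch_pct:
        status = "possible ring recovery problem"
    else:
        status = "normal"
    return pct, status


# Made-up frame counts out of 1,000 captured frames.
healthy = classify_mac_share(5, 1000)
recovering = classify_mac_share(50, 1000)
failing = classify_mac_share(98, 1000)
```

A result in the highest band is the cue to move on to the error isolation step that follows.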

Step 4: Token Ring Workload Characterization Baselining Methodology

Error isolation and ring purge review are the next key areas that require analysis. After performing the first three steps, the analyst has to closely examine the Token Ring physical operation by applying a MAC capture filter on a protocol analyzer or a management system. This enables the analyzer or management system to perform the role of remote error monitoring, capturing Report Soft Error packets along with Ring Purge packets. Another parameter may need to be adjusted on certain analyzers to capture Ring Purge packets. The key factor is to first monitor the ring purge count on the ring. If the ring purge count exceeds 200 to 300 ring purges per hour, the physical MAC traffic requires immediate further investigation at a detailed level. Such a count means that the ring is possibly having problems with physical ring recoveries that are affecting upper-layer protocol fluency.
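Because a capture rarely runs for exactly one hour, the ring purge count usually has to be scaled to an hourly rate before it is compared with the threshold. A minimal Python sketch follows; the function names are hypothetical, and the default limit of 200 is the lower end of the 200 to 300 range given above.

```python
def purge_rate_per_hour(purge_count, capture_seconds):
    """Scale a ring purge count from a capture of any length to a per-hour rate."""
    if capture_seconds <= 0:
        raise ValueError("capture_seconds must be positive")
    return purge_count * 3600.0 / capture_seconds


def needs_mac_investigation(purge_count, capture_seconds, limit_per_hour=200.0):
    """True when the hourly purge rate exceeds the investigation threshold."""
    return purge_rate_per_hour(purge_count, capture_seconds) > limit_per_hour


# Made-up example: 1,000 purges seen in a 15-minute (900-second) capture.
rate = purge_rate_per_hour(1000, 900)
```

In this example the rate works out to 4,000 ring purges per hour, far above the threshold.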

The next level is to investigate any Token Ring Report Soft Error packets. As noted earlier, many different Report Soft Error packets may be encountered; however, the 10 specific types that occur for various reasons are important for analysis. The ones that occur at low levels during normal ring operations are line, burst, and token errors. Other errors, such as internal errors and abort errors, are serious and may be an indication that a Token Ring NIC is about to fail completely, move into a beaconing state, and cause the complete ring to go down. An analyst should examine any such errors immediately.
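The triage described above can be sketched as a small lookup. The grouping of line, burst, and token errors as low-level, and of internal and abort errors as serious, comes from the text; the third "review" bucket for the remaining types, the function name, and the sample counts are assumptions made for illustration.

```python
LOW_LEVEL = {"line", "burst", "token"}   # seen at low levels on a healthy ring
SERIOUS = {"internal", "abort"}          # may precede a complete NIC failure


def triage(error_counts):
    """Group nonzero soft error counts by how urgently they need attention."""
    result = {"expected at low levels": {}, "serious": {}, "review": {}}
    for name, count in error_counts.items():
        if count == 0:
            continue
        key = name.lower()
        if key in SERIOUS:
            bucket = "serious"
        elif key in LOW_LEVEL:
            bucket = "expected at low levels"
        else:
            bucket = "review"  # assumed bucket for all other error types
        result[bucket][name] = count
    return result


# Made-up counts taken from a Report Soft Error summary.
buckets = triage({"line": 4, "burst": 2, "internal": 1, "receiver congestion": 7, "abort": 0})
```

Any entry in the "serious" bucket points at a station whose NIC should be examined first.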

The way to perform a MAC capture is to start the analyzer with a MAC capture filter applied, run the analyzer for at least 10 to 15 minutes, stop the capture, and save the trace. The analyst should then open up the trace and examine all the packets. If the physical layer is clean of soft errors and beacon errors, there should be a simple neighbor notification sequence occurring every seven seconds, and the ring poll should take no more than 2.5 seconds. This is a normal cycle. If a high number of Report Soft Error packets are seen along with a high number of ring purges, this indicates nonfluency in the Token Ring physical area.
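The seven-second neighbor notification check lends itself to a quick calculation on the saved trace. The sketch below assumes the analyst has extracted the timestamp, in seconds, of the frame that starts each ring poll (the Active Monitor Present frame); the function names and the 0.5-second tolerance are illustrative assumptions, not values from the text.

```python
def poll_intervals(timestamps):
    """Time between consecutive ring poll start frames, in seconds."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def abnormal_intervals(timestamps, expected_s=7.0, tolerance_s=0.5):
    """Intervals that deviate from the expected ring poll period."""
    return [
        delta
        for delta in poll_intervals(timestamps)
        if abs(delta - expected_s) > tolerance_s
    ]


# Made-up timestamps: one ring poll starts only 3.5 seconds after the previous one.
starts = [0.0, 7.0, 14.0, 17.5, 24.5]
suspect = abnormal_intervals(starts)
```

An empty result suggests a normal cycle; short intervals suggest that the poll is being restarted by ring recovery events.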

If a high number of ring management frames such as Request Ring Station State or Address are also seen, this indicates that management systems are active and may be disrupting traffic. In today's environment, the management systems of new intelligent hubs still introduce some management frames from the older Token Ring management processes. These frames may at times absorb bandwidth unnecessarily if a client is also using coexisting SNMP RMON-based systems.

Either way, when focusing on the physical layer, an analyst should troubleshoot any Report Soft Error packets that are considered serious and take action as described earlier in this chapter.

If a beaconing condition is found, an analyst should immediately locate the beaconing device and its nearest active upstream neighbor and remove these devices. The medium should then be immediately re-analyzed.

All physical soft error analysis steps apply as noted earlier in this chapter for isolation and cause analysis.

If the physical layer is found to be a problem, the issue should be troubleshot and the network should be re-analyzed starting from step 1 of the workload measurement process. The goal is a clean physical Token Ring that shows clear neighbor notification and only minor ring insertion events.

The Token Ring rotation time (TRT) timing measurement is also a good indication of how fast Token Ring frames are being processed. The definition of TRT is the amount of time it takes for a single token to circle the ring. The normal TRT levels should be between 5 and 150 microseconds. TRT is an excellent measurement statistic to use if available on an analyzer when performing physical analysis for troubleshooting the general health of a Token Ring network.
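Where an analyzer exports individual TRT samples, the range check can be sketched as follows. The function name and sample values are hypothetical; the 5 and 150 microsecond bounds are simply the range quoted above, exposed as parameters so they can be adjusted.

```python
def trt_summary(samples_us, low_us=5.0, high_us=150.0):
    """Summarize rotation time samples (microseconds) against a normal range."""
    out_of_range = [s for s in samples_us if not (low_us <= s <= high_us)]
    return {
        "min": min(samples_us),
        "max": max(samples_us),
        "mean": sum(samples_us) / len(samples_us),
        "out_of_range": out_of_range,
    }


# Made-up TRT samples in microseconds.
summary = trt_summary([12.0, 48.0, 210.0])
```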

Step 5: Token Ring Workload Characterization Baselining Methodology

In this step, the analyst should closely monitor the ring for general Token Ring frame MAC communication processes. This involves basically using the same MAC data trace that was taken in step 4 and further examining the interaction of the Token Ring physical MAC communication. Neighbor notification is all that should normally be seen in a clean Token Ring operation. There will be ring insertion processes when new devices insert on the ring, and possibly a brief set of frames showing a line and burst error sequence along with small occurrences of token errors from the ring active monitor. Other than that, a clean physical Token Ring should just show neighbor notification and ring insertion cycles, and possibly token claim events when new active monitors are assigned. This should not be an ongoing occurrence.

After this step is completed and everything is found to be operating in a solid Token Ring physical state, the next layer to move on to is upper-layer protocol analysis. In this case, if the steps noted earlier were followed properly, this would conclude the methodology for analyzing a physical Token Ring architecture. It should be mentioned that other innovative steps can also be deployed, such as analyzing Token Ring rotation time. Token Ring rotation time should be no more than 125 microseconds and is typically seen in most networks today between 5 and 150 microseconds, maximum.

Closing Statement on Baselining Token Ring Environments

The key methodology in analyzing and baselining Token Ring environments is really based on an analyst's understanding of the inherent operation of the Token Ring architecture. That is why so much time has been spent in this chapter explaining the Token Ring architecture. If an analyst understands how a Token Ring network works, a protocol analyzer will be a valuable way to quickly monitor the physical medium, because Token Ring network operations clearly generate the required information for analysis when failures occur. In other words, the Token Ring Soft Error packets, the Token Ring general MAC communication, and other Token Ring packets such as the Ring Purge packet actually indicate how the Token Ring is operating.

An analyst who understands the inherent internal operations of the Token Ring architecture can quickly determine whether the Token Ring physical layer is a problem.

The analyst can then determine whether the topology needs to be addressed as a physical problem requiring isolation and cause analysis, or whether a ring partitioning or resegmentation design is required.

An analyst can feel comfortable after analyzing the physical Token Ring layer through step 5. It is important to next move through the baselining process with the standard upper-layer protocol procedures, as noted earlier in this book. These procedures include analysis and baseline steps such as analyzing the size of the packets, effective throughput, response-time analysis, and analyzing the phases of network communication. The next step also includes performing trace analysis at a deep data-decoded level that allows for data extraction of upper-layer problems.

Note that if a Token Ring network is operating in a physically stable manner, any problems being exhibited on the ring can clearly be associated as problems related to upper-layer occurrences on the ring, such as design, layout of application, or NOS operation.

Case Study 12: Token Ring Analysis Problem

The Token Ring architecture is based on an extremely fault-redundant topology design. Fault redundancy is built in to the wiring scheme, inside the NIC design, along with the Token Ring frame communication processes that are also present to allow for Token Ring physical NIC card–to–NIC card end-to-end communication. The complete architecture is considered an intelligent topology as compared to other topologies such as standard Ethernet, in which no inherent physical frame management protocols are engaged, such as Simple Network Management Protocol (SNMP) or RMON technologies.

One of the most interesting Token Ring cases encountered by the LAN Scope analysis team involved a Token Ring internetwork that was troubleshot by our team several years ago. Through the general troubleshooting process as well as consistent site visits for consecutive network baselining, the Token Ring internetwork, which had a significant number of performance problems, is now considered stable and reliable, and is performing at a high level.

The client was a large medical practitioners' office in Washington, D.C. The facility location included different medical offices as well as administrative support offices for the practice. The Token Ring internetwork was based on a three-floor design. Each ring had a specific Token Ring network assigned to each area. The first-floor ring, for instance, was based on the main area where the medical practitioners operated. The second-floor ring was dedicated to general administration, including finance, accounting, and other varying functions. The third-floor ring was dedicated to the executive offices and other general support areas for the medical practitioner's operation.

The original Token Ring configuration for the facility was implemented in 1989. At that time, Token Ring was still considered fairly new in terms of enterprise internetwork design. In fact, this was one of the first 16Mbps architectures to be implemented in the Washington, D.C. area. Not only were the rings based on a 16Mbps architecture design, but they also included a partial implementation of Early Token Release (ETR). Multiple speed ratings of Token Ring are currently implemented against the industry architecture. At the time of this particular implementation in 1989, there were only two speeds available in Token Ring: 4Mbps and 16Mbps. The ETR design is an additional feature where the Token Ring NICs allow for the token to be released earlier after a data transmission, thus speeding up the overall performance of the physical topology for the Token Ring network.

The original configuration in this facility was based on an approximate 90-user design, with 30 users per ring. In 1996, the site began to experience extensive performance problems. When our analysis team was contacted regarding this issue, we immediately deployed our team to baseline the network and troubleshoot any performance problems to cause.

When we arrived at the site, we conducted an entrance briefing with the MIS team members at the site to review the internetwork design. The first thing we noticed was that the user counts had increased to a level where each ring now handled approximately 80 to 120 users. The rings were separated by a somewhat unique bridging/routing process called internal file server routing. Specifically, some of the file servers were used as bridges between the specific rings. The file servers were also being used for general access for logon services, file and print services, and certain application access processes. Upon reviewing the configuration, we also noticed that the site had a mixed set of Token Ring hub types, including the main IBM 8228 design as well as a third-vendor product based on active port operation and other design specifications. It was also noted that the site was using varied types of cabling schemes throughout the facility. This was a result of changes over an extended period of time, during which various types of cabling were installed on an "as-needed" basis to allow additional nodes to be connected to the main ring.

One of the primary complaints that was discussed during the entrance briefing was that users were experiencing sluggish performance and frequent disconnects when using the network. The problem was noted as intermittent, but consistent throughout most application usage at the site.

Based on information gathered during the site briefing, the LAN Scope analysis team immediately developed a project plan to use a rapid baseline process to review each one of the three rings. The initial project plan also called for a thorough review of the performance of the users' workstation applications and the workstations themselves, based on a brief application-characterization exercise from the user area.

On the second-floor ring, we noted that some of the site file servers were placed on the ring for general access, and that two additional file servers were engaged to separate floors one, two, and three via internal server-based routing.

Figure CS12.1. The medical practitioners' office in Washington, D.C.

The initial baseline process was engaged against all three rings. Three specific network protocol analyzers were used in parallel on all three rings. Each ring was first investigated for overall utilization notations for average, peak, and historical bandwidth. Node-by-node bandwidth was next investigated, and then the protocols were measured for percentage distributions on each ring. The physical errors were next monitored based on the Token Ring Soft Error Report (SRE) MAC category, along with a review for possible physical beaconing conditions related to the Token Ring architecture. The Token Ring physical Medium Access Control (MAC) layer frame communication was next investigated using a capture MAC filter on each ring. This chapter presented a detailed discussion of the Token Ring architecture processes.

Based on these initial tests, the following information was found while baselining the environment. Floors one and three showed extremely high ring insertion rates for the expected user counts of 100+ users. Many of the user node-by-node utilization measurements showed fluctuating usage of the shared Token Ring bandwidth. The average utilization on the first floor was noted at 38%; however, the peak percentage was noted at the 89% level for a duration of 4.2 seconds. The second floor also showed an extremely high utilization level, at 96% for a 9.1-second interval. The third-floor ring showed an average utilization of 17%, but with a peak percentage of 94% for a 6.4-second interval. These notations were extremely critical, taking into account the peak saturation levels and the long time durations noted. These high saturation levels were an immediate warning flag that connection-based protocols could possibly time out and negatively affect application fluency upon application access from any user on the first or third floor.
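Peak durations such as the ones noted above can be derived from a utilization series by finding the longest unbroken run of samples at or above a saturation level. The following Python sketch shows one way to do that; the function name, the one-second sampling interval, and the sample series are assumptions for illustration and are not the measurements taken at the site.

```python
def longest_run_above(samples, threshold, interval_s=1.0):
    """Duration, in seconds, of the longest unbroken run at or above threshold."""
    best = run = 0
    for utilization in samples:
        run = run + 1 if utilization >= threshold else 0
        best = max(best, run)
    return best * interval_s


# Made-up per-second utilization percentages.
series = [40, 90, 92, 95, 30, 91]
peak_duration = longest_run_above(series, threshold=89)
```

Here the longest stretch at or above 89% lasts three seconds. A finer sampling interval gives a more precise duration.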

An interesting finding from the workload characterization baseline process was that all three rings showed a fluctuating MAC percentage protocol level of 3% to 9.8% on the protocol layer percentage review. Further investigation yielded that the physical SRE MAC error rate was also abnormal, showing MAC SRE packets generating excessively on the second-floor ring at 8%, and at levels on the first and third ring ranging from 2% to 3%.

Through investigation of the MAC layer on all three floors, the following was noted. Floor one showed an extremely high number of line, burst, and receiver congestion error rates throughout the ring. Ring three showed the same type of condition. The receiver congestion rates on both floors one and three were extremely high as associated with the file server NIC cards that were connecting floors one and three. Specifically, most of the congestion appeared to be on traffic passing in and out of the ring for these two user rings.

When examining the second-floor ring, the Medium Access Control soft error packet analysis along with the general MAC layer review showed an extremely high line and burst error rate, also coinciding with high receiver congestion rates on the internal routing server channels acting as bridges between floors two to one and two to three. It was also noted that there was an extremely high internal error rate generating from three specific stations on the ring.

Based on the workload characterization baseline measurements and the error analysis and MAC percentage findings, the LAN Scope analysis team immediately focused its attention on examining the physical layer.

Through a more discrete analysis, we found on the second-floor ring that even the general process of neighbor notification, which should occur at a seven-second interval, was occurring at an abnormal interval of three to four seconds. The ring purge rate was noted at over 4,000 ring purges per hour. Such an excessive rate usually indicates a ring that will not operate or handle stabilization for normal connectivity or upper-layer application flow. The ring purge rates on rings one and three were also high, at approximately 500 to 600 ring purges per hour.

Through further analysis, we observed that with the neighbor notification rates, even when falling within the normal seven-second interval range, the timing intermittently appeared to fluctuate between 6.9 and 7.1 seconds. This immediately was identified as a possible mismatch related to ETR. It is typically best to have all cards either configured with ETR on, or configured with ETR off. When there is a high percentage of cards split between the two settings, timing mismatches can occur on the physical layer.

Based on these findings, we immediately focused on a way to segment the rings by a percentage level that would allow for a lower utilization level capacity at the peak level, and thus allow for a more stable situation for further analysis.

The internal errors that were associated with the three devices on the second floor ring were identified as NICs that needed to be replaced, with the ring then being re-analyzed.

After reviewing our Level 1 findings from our troubleshooting process after approximately two days into the network baselining cycle, the LAN Scope analysis team presented the following short-term recommendations:

  • We recommended that the customer have all rings split by a 50% ratio. The second-floor ring, which had approximately five servers connected to the ring, was targeted for an additional ring, noted as a server backbone ring. This would introduce a new site-wide Token Ring layout configuration with a total of seven rings.

  • We recommended an immediate migration change for the three NICs generating an internal error. This migration was critical, because it was possible that these cards were causing intermittent beaconing conditions or other failures in the Token Ring area of the second floor. This was a major concern, because one of the NICs was present on a main application server.

Based on our immediate recommendations, the customer moved forward with the following design configuration. A Token Ring internetwork switch was redesigned and brought into the facility; this allowed for separation of the three rings via a Token Ring switched uplink design. All the three main rings were split by a 50% ratio. Each floor in the facility was split into a two-ring configuration. An uplink via Category 5 channeling design was brought to the second floor. A rack-mounted Token Ring switch was configured, and each one of the rings was connected via a proper configuration to the Token Ring switched ports.

On the second floor, the two rings were also brought directly into the switch. A server ring was created off another port for two of the file servers that were used minimally. Some of the site intelligent hubs that were based on standard IBM technology were used for staging the configurations. By reviewing the site layout and documentation, we noted that some of the older non-IBM-type hubs in the facility that were not compliant with the IBM architecture were found to be injecting DC Phantom current abnormally on the main ring path. They were removed from the design.

The critical main file servers at the facility, which were previously routing, were redesigned to a one-card configuration operation and were brought directly into the switch. As a result, these rings had no contention for the token and effectively had a Dedicated Token Ring (DTR) loop off the switch.

The planned design appeared to be positive. The network did initially operate in a stable fashion in a cutover testing session that was engaged prior to the next business week.

The LAN Scope analysis team arrived early on the Monday of the second week of the study, and reviewed the configuration prior to cutover for business that particular day. The ring appeared to show stabilization on all rings at the general physical characterization levels and showed very low error rates across the ring. There were no soft error report packets. All file servers were checked for general operation and configuration. All the file servers, with the exception of file server Main2A, were operational.

We began troubleshooting the file server Main2A with a protocol analyzer connected to the proper switched port. We installed another MSAU and examined the traffic between the server and the Token Ring switched port. It appeared as though the server in question was experiencing problems during the ring insertion process. We contacted the vendor and were informed that the Token Ring card had to be set for a specific speed and that the port on the Token Ring switch should not be set for Auto Speed Detect. Based on this configuration change and the hard-coding of 16Mbps, the server immediately began to function properly on the ring. We were therefore able to certify the ring for operation.

The users began to access the new Token Ring infrastructure at approximately 10 a.m. on the second business week. All application levels appeared to be operating at a higher performance level. The users stated that the environment appeared to be more rapid in terms of logon response and general application access. Overall, the environment appeared to be extremely stable.

We redeployed our protocol analyzers across all three rings. It was noted immediately that the ring purge rate levels had significantly decreased. The first-floor and third-floor rings showed ring purge levels no higher than 10 ring purges per hour. The second floor ring showed a rate of 50 ring purges per hour, which is minor.

After seeing stabilization, we then proceeded with our normal baseline process. The utilization levels on Ring 1A showed an average utilization of 6% with a peak of 38%. Ring 1B showed an average utilization of 18% with a peak utilization of 52%. On the second-floor rings, 2A and 2B, comparable uses were noted, with Ring 2A averaging 14% utilization and peaking at 39%, and Ring 2B averaging 17% utilization and peaking at 42%. The small server backbone ring for the two minor file servers showed an average utilization of 9% with a peak of 22%. Rings 3A and 3B also showed normal utilization levels, with 3A showing an average utilization of 11% and a peak level of 31%, and Ring 3B showing an average utilization of 19% and a peak level of 61%.

All the other main file servers were monitored for interpacket response time by monitoring their response to requests from the other rings being monitored. There appeared to be no problems present in this area, so these rings were not closely baselined at that point. With all utilization levels appearing to be in check, we noted that protocol percentages at the MAC level were well below 1% on all seven of the site's rings.

Our next step was to examine the Token Ring physical error rate on all rings across the facility. All the rings appeared to be stable, with the exception of an intermittent line and burst error rate that ranged between 2% and 3% on all seven rings.

After a further review of the facility cabling, we immediately identified that most of the cabling in the facility was based on IBM cabling Type 1, but there had been some introduction of UTP wiring schemes with media filters that appeared to be noncompliant. Upon further review, we concluded that some of the cabling needed verification.

We continued our testing and noted that the Token Ring frame communication appeared to be extremely stable. We also ran standard application testing from certain rings from specific user areas. On floor one, we tested users accessing the general medical database application for the practitioner operation. The application appeared to launch within a 10-second period, which was noted by the user to be extremely rapid; before LAN Scope's implementations, the same type of launch took almost two minutes. This was just a general note by the user; however, our post-protocol analysis review did show a definite 12-second interval for the launch sequence. Upon accessing the application operating the database, all the general cumulative bytes, relative time sequences, and utilization effects appeared to be normal. We also noted this same positive condition on all other applications that were tested, such as word processing and accounting-based applications on other floors. Overall, we noted a successful outcome for this phase of the baseline process.

By the following business evening, the practitioner MIS team had a cabling audit performed at the facility and found that approximately 13 cables were out of specification. These cables were immediately replaced.

On our final day at the site, we closely reviewed all the rings again for physical testing and found that the line and burst error rate was nearly nonexistent: it remained at a very low level and was present only upon ring insertions.

Overall, this facility was stabilized and brought to a much higher performance level. It is still based on a Token Ring topology design. New protocols have been introduced into the facility based on new operating system deployment and new application deployment. As the site continues to grow, it may become necessary to implement a center backbone network based on a higher-speed platform for capacity channel design. Any future migration changes will be based on application deployment and not on an urgent requirement to stabilize the facility. In other words, now that the facility is operating in a reliable fashion, the only migrations that may be required will be a result of application growth.

We are continuing to work with the facility as new applications are introduced, to measure the impact of each application and the network's capability to accept the application. If an eventual migration is required for higher capacity, other directions may have to be considered, such as Fast Ethernet or other design architecture modifications such as an ATM backbone. Either way, the client had a requirement to stabilize the facility still using the Token Ring fault-redundant features. These requirements were met through the network baselining exercises. We considered this a successful project, in which the network baseline process was used to troubleshoot, stabilize, and increase the performance in the facility.
