Convergence

Changes -- planned and unplanned -- are normal in any network:

  • A serial link breaks

  • A new serial link is added to a network

  • A router or hub loses power or malfunctions

  • A new LAN segment is added to a network

Not all routers in the routing domain will reflect these changes right away. This is because RIP routers rely on their direct neighbors for routing updates, and those neighbors in turn rely on another set of neighbors. The routing process that is set into motion from the time of a network change (such as the failure of a link) until all routers correctly reflect the change is referred to as convergence. During convergence, routing connectivity between some parts of the network may be lost. Hence, a frequently asked question is “How long will the network take to converge after such-and-such failure in the network?” The answer depends on a number of factors, including the network topology and the timers that have been defined for the routing protocol.

The following list defines the four timers that are key to the operation of any DV protocol, including RIP:

Update timer (default value: 30 seconds)

After sending a routing update, RIP sets the update timer to 0. When the timer expires, RIP issues another routing update. Thus, RIP updates are sent every 30 seconds.

Invalid timer (default value: 180 seconds)

Every time a router receives an update for a route, it sets the invalid timer to 0. The expiration of the invalid timer indicates that six consecutive updates were missed -- at this time, the source of the routing information is considered suspect. Even though the route is declared invalid, packets are still forwarded to the next hop specified in the routing table. Note that prior to the expiration of the invalid timer RIP would process any updates received by updating the route’s timers.

Hold-down timer (default value: 180 seconds)

When the invalid timer expires, the hold-down timer is started and the route automatically enters the hold-down phase. During hold-down, all updates regarding the route are disregarded -- it is assumed that the network may not have converged and that there may be bad routing information circulating in the network. A route may also go into hold-down state when an update is received indicating that the route has become unreachable -- this is discussed further later in this section.

Flush timer (default value: 240 seconds)

The flush timer is set to 0 when an update is received. When the flush timer expires, the route is removed from the routing table and the router is ready to receive an update with this route. Note that the flush timer overrides the hold-down timer.
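The way the invalid, hold-down, and flush timers interact can be sketched in a few lines of Python. This is only an illustration under the default values quoted above, not a description of how any router implements it; the function name route_state and the state labels are made up for this example.

```python
# Illustrative only: classify a RIP route by the number of seconds that
# have elapsed since the last valid update was received for it.
INVALID = 180    # route is declared invalid after this many seconds
HOLD_DOWN = 180  # length of the hold-down phase (starts when invalid expires)
FLUSH = 240      # route is removed from the table after this many seconds


def route_state(age, invalid=INVALID, hold_down=HOLD_DOWN, flush=FLUSH):
    """Return a (hypothetical) state label for a route of the given age."""
    if age >= flush:
        return "flushed"            # the flush timer overrides hold-down
    if age < invalid:
        return "valid"              # updates are still processed normally
    if age < invalid + hold_down:
        return "hold-down"          # updates for the route are disregarded
    return "hold-down expired"      # only reachable if flush > invalid + hold_down


for age in (41, 180, 239, 240):
    print(age, route_state(age))
```

With the default timers the flush timer always fires before the hold-down timer runs out, so the last branch is reached only with non-default values.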

Let’s consider Figure 2-3. Here is a snapshot of A’s routing table (when all entities are up):

Figure 2-3. Three routers connected using Ethernet segments

A>sh ip route
...

C       192.168.1.0 is directly connected, Ethernet1
     172.17.0.0/16 is subnetted, 6 subnets
C       172.17.1.9 is directly connected, Ethernet0
C       172.17.250.0 is directly connected, Ethernet1
C       172.17.251.0 is directly connected, Ethernet2
R       172.17.50.0 [120/1] via 172.17.250.2, 0:00:11, Ethernet1
R       172.17.100.0 [120/1] via 172.17.251.2, 0:00:19, Ethernet2
R       172.17.252.0 [120/1] via 172.17.250.2, 0:00:11, Ethernet1
                     [120/1] via 172.17.251.2, 0:00:19, Ethernet2

This table shows that 11 seconds ago A received an update for 172.17.50.0 from 172.17.250.2 (B). The update and invalid timers for a route are reset (set to 0) every time a valid update is received for the route. At the moment this routing-table snapshot was taken, A’s invalid timer for 172.17.50.0 and B’s update timer for 172.17.50.0 would both be 11 seconds.

Let’s say that at this very time, B was disconnected from its LAN attachment to A. A would now stop receiving updates from B. 30 seconds after the cut, the routing table would look like this:

A>sh ip route
...

C       192.168.1.0 is directly connected, Ethernet1
     172.17.0.0/16 is subnetted, 6 subnets
C       172.17.1.9 is directly connected, Ethernet0
C       172.17.250.0 is directly connected, Ethernet1
C       172.17.251.0 is directly connected, Ethernet2
R       172.17.50.0 [120/1] via 172.17.250.2, 0:00:41, Ethernet1
R       172.17.100.0 [120/1] via 172.17.251.2, 0:00:19, Ethernet2
R       172.17.252.0 [120/1] via 172.17.250.2, 0:00:41, Ethernet1
                     [120/1] via 172.17.251.2, 0:00:19, Ethernet2

The invalid timer for 172.17.50.0 is now at 41 seconds. A would still continue to forward traffic for 172.17.50.0 via Ethernet1. The assumption RIP makes is that an update was lost or damaged in transit from B to A, even though the route is still good. This assumption holds until the invalid timer expires (180 seconds or 6 update intervals from the last update). Before the invalid timer expires, A will receive and process any updates received regarding 172.17.50.0. Once the invalid timer expires, the route is placed in hold-down and subsequent updates about 172.17.50.0 are suppressed under the assumption that the route has gone bad and that bad routing information may be circulating in the network. The route will go into hold-down 180 seconds from the last update, or 169 seconds after the cut. At this time, the routing table would look like this:

A>sh ip route
...

C       192.168.1.0 is directly connected, Ethernet1
     172.17.0.0/16 is subnetted, 6 subnets
C       172.17.1.9 is directly connected, Ethernet0
C       172.17.250.0 is directly connected, Ethernet1
C       172.17.251.0 is directly connected, Ethernet2
R       172.17.50.0 is possibly down,
          routing via 172.17.250.2, Ethernet1
R       172.17.100.0 [120/1] via 172.17.251.2, 0:00:19, Ethernet2
R       172.17.252.0 [120/1] is possibly down,
          routing via 172.17.250.2, Ethernet1
                     [120/1] via 172.17.251.2, 0:00:19, Ethernet2

The route remains in hold-down until the hold-down timer expires or until the route is flushed, whichever happens first. Using default timers, the flush timer would go off first, 229 seconds after the cut. Router A would then learn the route to 172.17.50.0 when the next update arrived from C, which could be between 0 and 30 seconds after the route has been flushed, or 229 to 259 seconds from the cut.
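The arithmetic behind these figures can be checked with a short Python sketch. It assumes, as in the example, that the last update arrived 11 seconds before the cut and that the flush timer fires before the hold-down timer expires; the function name failure_timeline and its dictionary keys are invented for illustration.

```python
# Illustrative only: times are in seconds, measured from the moment of the cut.
def failure_timeline(age_at_cut, update=30, invalid=180, flush=240):
    """Return when the route enters hold-down, is flushed, and may be relearned."""
    hold_down_at = invalid - age_at_cut   # invalid timer expires
    flushed_at = flush - age_at_cut       # flush timer expires
    return {
        "hold_down_at": hold_down_at,
        "flushed_at": flushed_at,
        # the next periodic update from another neighbor arrives within one
        # update interval after the route has been flushed
        "relearned_between": (flushed_at, flushed_at + update),
    }


print(failure_timeline(age_at_cut=11))
```

For an age of 11 seconds this gives hold-down at 169 seconds, flush at 229 seconds, and relearning between 229 and 259 seconds after the cut, matching the values in the text.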

The events just described are illustrated in Figure 2-4.

Figure 2-4. Route convergence after a failure

Speeding Up Convergence

When a router detects that an interface is down, it immediately flushes all routes it knows via that interface. This speeds up convergence, avoiding the invalid, hold-down, and flush timers.

Can you now guess the reason why the case study used earlier (routers A, B, and C connected via Ethernet segments) differs slightly from TraderMary’s network in New York, Chicago, and Ames?

We couldn’t illustrate the details of the invalid, hold-down, and flush timers in TraderMary’s network because if a serial link is detected in the down state, all routes that point through that interface are immediately flushed from the routing table. In our case study, we were able to pull B off its Ethernet connection to A while keeping A up on all its interfaces.

Split horizon

Consider a simple network with two routers connected to each other (Figure 2-5).

Figure 2-5. Split horizon

Let’s say that router A lost its connection to 172.18.1.0, but before it could update B about this change, B sent A its full routing table, including 172.18.1.0 at one hop. Router A now assumes that B has a connection to 172.18.1.0 at one hop, so A installs a route to 172.18.1.0 at two hops via B. A’s next update to B announces 172.18.1.0 at two hops, so B adjusts its route 172.18.1.0 to three hops via A! This cycle continues until the route metric reaches 16, at which stage the route update is discarded.
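The cycle can be traced with a small Python simulation. It is a simplified model of the two-router exchange just described, assuming each router installs the advertised metric plus one; the function name bounce_route is hypothetical.

```python
# Illustrative only: A and B keep re-advertising a dead route to each other,
# each adding one hop, until the metric reaches 16.
INFINITY = 16


def bounce_route(start_metric=1):
    """Return the (router, installed metric) pairs as the stale route bounces."""
    history = []
    advertised = start_metric   # B's stale advertisement to A
    receiver = "A"
    while True:
        installed = min(advertised + 1, INFINITY)
        history.append((receiver, installed))
        if installed == INFINITY:
            break               # metric 16 means unreachable; the cycle ends
        advertised = installed
        receiver = "B" if receiver == "A" else "A"
    return history


for router, metric in bounce_route():
    print(router, metric)
```

In this model the route bounces 15 times (metrics 2 through 16) before it is finally treated as unreachable.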

Split horizon solves this problem with a simple rule: when a router sends an update through an interface, it does not include in its update any routes that it learned via that interface. Using this rule, the only network that A would send to B in its update would be 172.18.1.0, and the only network that B would send to A would be 172.18.2.0. B would never send 172.18.1.0 to A, so the previously described loop would be impossible.
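The split-horizon rule amounts to a filter applied when an update is built. The following Python sketch shows the idea; the interface names, the metric values, and the function name build_update are assumptions made for this example and are not taken from Figure 2-5.

```python
# Illustrative only: each table entry maps a network to a tuple of
# (metric to advertise, interface on which the route was learned).
def build_update(routing_table, out_interface):
    """Return the routes to advertise out of out_interface under split horizon."""
    return {
        network: metric
        for network, (metric, learned_via) in routing_table.items()
        if learned_via != out_interface   # never advertise a route back
    }


# Hypothetical table for router A: 172.18.1.0 is local, 172.18.2.0 came from B.
table_a = {
    "172.18.1.0": (1, "Ethernet0"),
    "172.18.2.0": (2, "Serial0"),
}
print(build_update(table_a, "Serial0"))   # update sent toward B
```

With this table, the update A sends toward B contains only 172.18.1.0.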

Counting to infinity

Split horizon works well for two routers directly connected to each other. However, consider the following network (shown in Figure 2-6).

Figure 2-6. Counting to infinity

Let’s say that router A stopped advertising network X to its neighbors B and E. Routers B, D, and E will finally purge the route to X, but router C may still advertise X to D (without violating split horizon). D, in turn, will advertise X to E, and E will advertise X to A. Thus, the router (C) that did not purge X from its table can propagate a bad route.

This problem is solved by equating a hop count of 16 to infinity and hence disregarding any advertisement for a route with this metric.

In Figure 2-6, when B finally receives an advertisement for X with a metric of 16, it will consider X to be unreachable and will disregard the advertisement. The choice of 16 as infinity limits RIP networks to a maximum diameter of 15 hops between nodes. Note that the choice of 16 as infinity is a compromise between convergence time and network diameter -- if a higher number were chosen, the network would take longer to converge after a failure; if a lower number were chosen, the network would converge faster but the maximum possible diameter of a RIP network would be smaller.

Triggered updates

When a router detects a change in the metric for a route and sends an update to its neighbors right away (without waiting for its next update cycle), the update is referred to as a triggered update. The triggered update speeds convergence between two neighbors by as much as 30 seconds. A triggered update does not include the entire routing table, but only the route that has changed.
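The difference between a full update and a triggered update can be sketched as a comparison of metrics before and after a change. This is only an illustration; the function name triggered_update and the sample metrics are hypothetical.

```python
# Illustrative only: a triggered update carries just the routes whose
# metric changed, not the entire routing table.
def triggered_update(old_metrics, new_metrics):
    """Return the routes whose metric differs between the two snapshots."""
    return {
        network: metric
        for network, metric in new_metrics.items()
        if old_metrics.get(network) != metric
    }


before = {"172.17.50.0": 1, "172.17.100.0": 1}
after = {"172.17.50.0": 16, "172.17.100.0": 1}
print(triggered_update(before, after))
```

Here only 172.17.50.0, whose metric changed to 16, would be included.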

Poison reverse

When a router detects that a link is down, its next update for that route will contain a metric of 16. This is called poisoning the route. Downstream routers that receive this update will immediately place the route in hold-down (without going through the invalid period).

Poison reverse and triggered updates can be combined. When a router detects that a link has been lost or the metric for a route has changed to 16, it will immediately issue a poison reverse with triggered update to all its neighbors.

Neighbors that receive unreachability information about a route via a poison reverse with triggered update will place the route in hold-down if their next hop is via the router issuing the poison reverse. The hold-down state ensures that bad information about the route (say from a neighbor that may have lost its copy of the triggered update or may have issued a regular update just before it received the triggered update) does not propagate in the network.
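The rule in the preceding paragraph can be expressed as a short Python sketch. It models only the poison case and the suppression of updates during hold-down; ordinary metric comparison is left out, and the route dictionary layout and the function name receive are assumptions made for illustration.

```python
# Illustrative only: how a neighbor might react to an update for one route.
INFINITY = 16


def receive(route, sender, metric):
    """Return the route entry after an update from sender with the given metric."""
    if route["state"] == "hold-down":
        return route    # updates are disregarded during hold-down
    if metric >= INFINITY and route["next_hop"] == sender:
        # the router we forward through says the route is unreachable
        return dict(route, metric=INFINITY, state="hold-down")
    return route        # other cases are outside the scope of this sketch


route = {"next_hop": "172.17.250.2", "metric": 1, "state": "valid"}
print(receive(route, "172.17.250.2", 16))
```

A poisoned update from a router that is not the current next hop leaves the entry unchanged in this model.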

Triggered updates and hold-downs can handle the loss of a route and prevent bad routing information from spreading. Why, then, do we need the count-to-infinity limit? Triggered updates may be dropped, lost, or corrupted. Some routers may never receive the unreachability information and may inject a path for a route into the network even when that path has been lost. Count to infinity would take care of these situations.

Setting timers

The value of RIP timers on a Cisco router can be seen in the following example:

Chicago>sh ip protocol

Routing Protocol is "rip"
Sending updates every 30 seconds, next due in 24 seconds
Invalid after 180 seconds, hold down 180, flushed after 240

These timers could be modified to allow faster convergence. The following command:

timers basic 10 25 30 40

would send RIP updates every 10 seconds instead of every 30 seconds. The other three timers specify the invalid, hold-down, and flush timers, respectively. These timers can be configured as follows:

NewYork#config
NewYork-config#router rip
NewYork-config#timers basic 10 25 30 40
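The order of the four values is easy to mix up, so the following Python sketch labels them explicitly. It is a helper written for this discussion, not part of any Cisco tool, and it handles only the four-value form of the command shown above; the function name parse_timers_basic is hypothetical.

```python
# Illustrative only: name the four values of a "timers basic" line.
def parse_timers_basic(line):
    """Return the update, invalid, hold-down, and flush values in seconds."""
    fields = line.split()
    if fields[:2] != ["timers", "basic"] or len(fields) != 6:
        raise ValueError("expected: timers basic <update> <invalid> <holddown> <flush>")
    update, invalid, hold_down, flush = (int(value) for value in fields[2:])
    return {
        "update": update,
        "invalid": invalid,
        "hold_down": hold_down,
        "flush": flush,
    }


print(parse_timers_basic("timers basic 10 25 30 40"))
```

For the command above this yields an update interval of 10 seconds, an invalid timer of 25, a hold-down timer of 30, and a flush timer of 40.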

However, RIP timers should not be modified without a detailed understanding of how RIP works. Potential problems with decreasing the timer values are that updates will be issued more frequently and can cause congestion on low-bandwidth networks, and that congestion in the network is more likely to cause routes to go into hold-down; this, in turn, can cause route flapping.

Warning

Do not modify RIP timers unless absolutely necessary. If you modify RIP timers, make sure that all routers have the same timers.

If an interface on a router goes down, the router sends a RIP request out its other interfaces that are still up. This speeds up convergence if any of the other neighbors can reach the destinations that were lost with the failed interface.
