Chapter 14. BGP High Availability

The following topics are covered in this chapter:

BGP Graceful-Restart

BGP SSO and Nonstop Routing

BFD

Fast External Failover

Route Dampening

BGP Add-Path

BGP Prefix-Independent Convergence

BGP Graceful-Restart

The BGP Graceful-Restart (GR) feature allows a BGP speaker to signal its ability to preserve forwarding state during a Border Gateway Protocol (BGP) restart or Route Processor (RP) switchover. In other words, it is a capability exchanged between BGP speakers to indicate that the sender can perform Nonstop Forwarding (NSF). This helps minimize the service impact of a BGP restart. Especially in large network deployments, where BGP carries a large number of prefixes, a BGP restart on a route reflector (RR) can have a severe performance and service impact and can lead to major outages.

Examine the network topology shown in Figure 14-1. R1 is acting as the RR and is peering with multiple clients. If there is a BGP restart or RP switchover on R1, the peers detect the session flap and propagate routing updates throughout the network. This can lead to increased CPU utilization if the RR is holding a large BGP table. Traffic destined to the prefixes that were removed is impacted.

Figure 14-1 Impact of Node Failure in a Network with BGP Route Reflectors

RFC 4724 defines the GR mechanism for BGP. BGP GR was developed with the following motivations:

Avoid widespread routing changes.

Decrease control plane overhead throughout the network.

Enhance overall stability of routing.

A GR-capable device announces its ability to perform GR to its BGP peers. It initiates the graceful-restart process when an RP switchover occurs, and it also acts as a GR-aware device. A GR-aware device, also said to operate in GR helper mode, understands that a peer router is transitioning and takes appropriate actions based on the configuration or default timers.

GR capability should always be enabled for all routing protocols, especially when the routers run dual route processors (RPs) and perform a switchover when a failure occurs. Because BGP runs on Transmission Control Protocol (TCP), GR should be enabled on both peering devices. After GR is configured or enabled on both peering devices, reset the BGP session to exchange the capability and activate the GR feature.

Note

GR is always on by default for non-TCP-based protocols such as Interior Gateway Protocols (IGPs). These protocols start operating in GR mode as soon as the other side is configured with GR capability.

BGP GR is an optional feature and is not enabled by default. BGP peers announce GR capability in the BGP OPEN message. Within the OPEN message, the following information is negotiated:

Restart Flag: This bit indicates if a peer sending the GR capability has just restarted. This is used to prevent deadlocks if both peers restart at the same time.

Restart Time: Indicates the length of time that the sender of the GR capability requires to complete a restart. The restart timer also helps in speeding up convergence in the event the peer never comes back up after a restart.

Address Family Identifier (AFI)/Subsequent Address Family Identifier (SAFI): The address family for which GR is supported.

AFI Flags: Contains a Forwarding State bit. This bit indicates whether the peer sending the GR capability preserved its forwarding state during the previous restart.

Peers can include GR capability without including any address-families. This implies GR awareness (nonrestarting support for GR) without the ability to perform a GR.
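The fields listed above can be illustrated with a minimal Python sketch that packs and unpacks the GR capability value. It assumes the layout given in RFC 4724: capability code 64, a 16-bit field holding 4 bits of restart flags (the Restart State bit is the most significant) and a 12-bit Restart Time, followed by one 4-byte AFI/SAFI/flags entry per address family, with the Forwarding State bit as the most significant flag bit. The class and method names are hypothetical and exist only for illustration.

```python
import struct
from dataclasses import dataclass, field

GR_CAPABILITY_CODE = 64  # Graceful Restart capability code (RFC 4724)


@dataclass
class GracefulRestartCapability:
    """Illustrative model of the GR capability; not a vendor API."""
    restart_state: bool   # Restart State bit: sender has just restarted
    restart_time: int     # seconds, 12-bit field (0..4095)
    # (afi, safi, forwarding_state_preserved); empty list = GR-aware only
    families: list = field(default_factory=list)

    def encode(self) -> bytes:
        if not 0 <= self.restart_time <= 0x0FFF:
            raise ValueError("restart_time must fit in 12 bits")
        header = (0x8000 if self.restart_state else 0) | self.restart_time
        value = struct.pack("!H", header)
        for afi, safi, fwd in self.families:
            # Forwarding State bit is the top bit of the per-family flags
            value += struct.pack("!HBB", afi, safi, 0x80 if fwd else 0)
        return struct.pack("!BB", GR_CAPABILITY_CODE, len(value)) + value

    @classmethod
    def decode(cls, data: bytes) -> "GracefulRestartCapability":
        code, length = struct.unpack("!BB", data[:2])
        if code != GR_CAPABILITY_CODE:
            raise ValueError("not a Graceful Restart capability")
        value = data[2:2 + length]
        (header,) = struct.unpack("!H", value[:2])
        families = []
        for off in range(2, len(value), 4):
            afi, safi, flags = struct.unpack("!HBB", value[off:off + 4])
            families.append((afi, safi, bool(flags & 0x80)))
        return cls(bool(header & 0x8000), header & 0x0FFF, families)


# A restarting speaker advertising IPv4 unicast (AFI 1, SAFI 1)
restarting = GracefulRestartCapability(True, 120, [(1, 1, True)])
# A GR-aware (helper-only) speaker: no address families listed
helper_only = GracefulRestartCapability(False, 120)
```

A capability encoded with no address families carries only the 2-byte flags and timer field, which matches the GR-aware case described above.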

When a BGP restart happens on the peer router or when an RP switchover occurs, the routes currently held in the forwarding table (that is, in hardware) are marked as stale. The forwarding state is preserved because the control plane and the forwarding plane operate independently. On the restarting peer (where the switchover occurred), BGP on the newly active RP starts to establish sessions with all the configured peers. BGP on the other side, the nonrestarting side, sees new connection requests coming in while its BGP session is still in the Established state. Such an event indicates to the nonrestarting peer that the other peer has restarted. At this point, the restarting peer sends the GR capability with the Restart State bit set to 1 and the Forwarding State bit set to 1 for the AFI/SAFIs.

The nonrestarting peer at this point cleans up the old (dead) BGP sessions and marks all the routes in the BGP table that were received from the restarting peer as stale. If the restarting peer never reestablishes the BGP session, the nonrestarting peer purges all stale routes after the Restart Time expires. Otherwise, the nonrestarting peer sends an initial routing table update, followed by an End-of-RIB (EoR) marker. The restarting peer delays best-path calculation for an AFI until it has received the EoR from all peers, except for those that are not GR capable and those that have the Restart State bit set.
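The deferral rule on the restarting peer can be sketched as a small check. This is a simplified illustration under the assumption that each peer is described by three booleans for a single AFI; the PeerState type and the function names are hypothetical and do not come from any router implementation.

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    """Hypothetical per-peer view held by the restarting speaker."""
    name: str
    gr_capable: bool        # peer advertised the GR capability
    restart_state: bool     # peer's own Restart State bit is set
    eor_received: bool = False


def pending_eor(peers):
    """Names of peers whose EoR the restarting speaker still waits for.

    Peers that are not GR capable, or that are themselves restarting
    (Restart State bit set), are excluded from the wait.
    """
    return [
        p.name
        for p in peers
        if p.gr_capable and not p.restart_state and not p.eor_received
    ]


def can_run_best_path(peers):
    """Best-path selection may start once no EoR is outstanding."""
    return not pending_eor(peers)


peers = [
    PeerState("R2", gr_capable=True, restart_state=False),
    PeerState("R3", gr_capable=False, restart_state=False),  # never waited on
    PeerState("R4", gr_capable=True, restart_state=True),    # also restarting
]
waiting_before = pending_eor(peers)
peers[0].eor_received = True
ready_after = can_run_best_path(peers)
```

Excluding peers with the Restart State bit set is what prevents two simultaneously restarting speakers from waiting on each other indefinitely.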

The restarting peer finally generates updates for its peers and sends the EoR marker for each AFI after the initial table is sent. The nonrestarting peers receive the routing updates from the restarting peer and remove the stale marking from any refreshed route. They purge any remaining stale routes after the EoR is received from the restarting peer or the Stale Path Timer expires.

GR can be configured either globally or on a per-neighbor basis. Use the command bgp graceful-restart to enable GR globally. Example 14-1 demonstrates the global configuration of GR on Cisco IOS, IOS XR, and NX-OS platforms. Use the command bgp graceful-restart restart-time value to set the GR restart timer and the command bgp graceful-restart stalepath-time value to set the maximum time for which the router maintains the stale path entries if it does not receive an EoR from the restarting peer. In IOS XR, the command bgp graceful-restart stalepath-time sets the maximum time to wait for the restart of GR-capable peers, and a separate command takes care of purging the stale paths from the peer: bgp graceful-restart purge-time value.
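The stale-route handling on the nonrestarting (helper) side can be sketched as follows. This is a minimal model, assuming a single restarting peer and treating timer expiry as an explicit method call rather than a real clock; HelperRib and its method names are hypothetical.

```python
class HelperRib:
    """Illustrative table of routes learned from one restarting peer."""

    def __init__(self):
        self.routes = {}  # prefix -> {"next_hop": str, "stale": bool}

    def learn(self, prefix, next_hop):
        # A new or refreshed route is never stale
        self.routes[prefix] = {"next_hop": next_hop, "stale": False}

    def peer_restarted(self):
        # Keep the routes for forwarding, but mark every one as stale
        for entry in self.routes.values():
            entry["stale"] = True

    def _purge_stale(self):
        purged = sorted(p for p, e in self.routes.items() if e["stale"])
        for prefix in purged:
            del self.routes[prefix]
        return purged

    def end_of_rib(self):
        # EoR from the restarting peer: anything not refreshed is removed
        return self._purge_stale()

    def timer_expired(self):
        # Restart Time or Stale Path Timer expiry has the same effect
        return self._purge_stale()


rib = HelperRib()
rib.learn("10.1.0.0/16", "192.0.2.1")
rib.learn("10.2.0.0/16", "192.0.2.1")

rib.peer_restarted()                    # both routes are now stale
rib.learn("10.1.0.0/16", "192.0.2.1")   # refreshed by the restarted peer
purged = rib.end_of_rib()               # unrefreshed route is removed
```

In this sketch the route that was re-advertised survives, and the one the restarted peer no longer announces is removed when the EoR arrives.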

Example 14-1 Global Configuration for Graceful-Restart
