Chapter 18. MLAG

Multichassis Link Aggregation (MLAG) is the open-standard (and thus, Arista) term for linking a port-channel or Link Aggregation Group (LAG) to multiple switches instead of just one. The technology accomplishes the same basic goal as Cisco’s Virtual Port Channel (vPC), although, in my experience, MLAG is simpler to configure and more forgiving when used.

MLAG Overview

As just mentioned, LAG stands for Link Aggregation Group, an open-standard way of describing the bonding of multiple physical links into a single logical link. In Cisco parlance, this technology is called EtherChannel. Different vendors use different terms for similar solutions, but the term LAG has become an accepted cross-vendor way of describing the idea. Why would you want to do this? Let’s take a look.

With a traditional network design, interconnecting three switches at Layer 2 (L2) results in a loop. Loops are bad, so Spanning Tree Protocol (STP) blocks the interface on the link farthest from the root. Figure 18-1 shows an example of this.

Figure 18-1. Traditional STP-blocked network loop

In this scenario, there is a LAG connecting switch A to switch B. Switch C connects to both A and B switches, forming a loop. STP has blocked the interface on switch C that leads to switch B in order to break said loop. This design will allow for failover if the link between switches A and C were to fail, but the failover can take 30 seconds or more (substantially less if rapid STP is used). Not only that, but only one-half of the available bandwidth to and from switch C is available for use. Wouldn’t it be nice if we could use that extra link? Even better, if we used LAG technology, a single link failure wouldn’t incur an outage because the second link would already be active.

With MLAG, two Arista switches fool the third switch (or any other Link Aggregation Control Protocol [LACP]–capable device) into thinking that it is connected to a single device. In other words, two Arista switches appear to be one Arista switch to LACP, as shown in Figure 18-2.

Figure 18-2. Simple MLAG design

With MLAG active using two 10 Gbps links, switch C sees a 20 Gbps logical interface to a single device, even though it is connected to two devices. Arista accomplishes this feat by advertising the same chassis-ID from both switch A and switch B. To do this, switch A and switch B must communicate over the A–B switch link, which must be configured with a VLAN that acts as a peer-link.

MLAG is configured within something called an MLAG Domain. The MLAG Domain ID identifies the switch to another switch that will share MLAGs. Let’s go ahead and build an MLAG pair.

Configuring MLAG

The first thing we need to do is make sure that both MLAG peers are on the same revision of code. Will it work if the switches have different code? Yes, but in a more limited fashion than it used to. Today if you try to peer incompatible revs of EOS, the switches will report an error and refuse to peer. I’m not a fan of this, but from a TAC point of view, it greatly lowers the number of possible permutations that they need to support. To determine what versions are compatible with what versions, look in the release notes.

This isn’t as bad as it might seem at first glance. Looking at the table illustrated in Figure 18-3, EOS 4.21.1F will pair with EOS 4.14.16M. The deal is generally that the last or most current revision of code will be supported from each major release, so if you’re upgrading from 4.14.5M to 4.21.1F, you won’t need to go from 4.14 to 4.15 to 4.16, and so on. You will need to go from 4.14.5M to 4.14.16M, and from there you can go to 4.21.1F. Note also that this is strictly an MLAG compatibility issue and has nothing to do with upgrading standalone switches. You can absolutely upgrade straight from 4.14.5M to 4.21.1F on a single switch that is not part of an MLAG pair. I’d test that in a lab first to see how it might affect your environment, but there is no limitation in the software outside of MLAG.

Figure 18-3. EOS 4.21.1F MLAG ISSU compatibility matrix

You can check the MLAG ISSU compatibility between an installed image and another image local to the switch by using the show mlag issu compatibility image command. Let’s look at an example of one that passes the test and one that does not. First, here’s the revision of code running on my switch:

Arista#sho ver | grep image
Software image version: 4.19.10M

The switch (a 7280R) is running EOS version 4.19.10M. Here are all of the EOS images that I have stored on flash:

Arista#dir EOS*
Directory of flash:/EOS*

       -rwx   613330599            Oct 8  2018  EOS-4.19.10M.swi
       -rwx   638234211           Mar 20 01:57  EOS-4.20.1F.swi
       -rwx   700978970           Nov 12  2018  EOS-4.21.1F.swi

3269361664 bytes total (85319680 bytes free)

First, I’m going to run the check against version 4.20.1F:

Arista#sho mlag issu compatibility flash:EOS-4.20.1F.swi
/mnt/flash/EOS-4.20.1F.swi (4.20.1F) is MLAG ISSU incompatible with
the current image (4.19.10M). A reboot with this image may cause
packet loss. Please consult the release notes to find a compatible
image.
The new image is compatible with these releases, which may also be
compatible with the current version:
EOS-4.16.6M-INT
EOS-4.16.6M
EOS-4.16.7FX-MLAGISSU-TWO-STEP
EOS-4.16.7M
EOS-4.16.8M
EOS-4.16.8FX-MLAGISSU-TWO-STEP
EOS-4.16.9M
EOS-4.16.10M
EOS-4.16.11M
EOS-4.16.12M
EOS-4.16.13M
EOS-4.16.13FX-MLAGISSU-TWO-STEP
EOS-4.17.0F-INT
EOS-4.17.0F
EOS-4.17.1F
EOS-4.17.1.1FX-MDP-INT
EOS-4.17.2F
EOS-4.17.3F
EOS-4.17.4M
EOS-4.17.5M
EOS-4.17.6M
EOS-4.17.7M
EOS-4.18.0F
EOS-4.18.0F-INT
EOS-4.18.1.1F
EOS-4.18.2.1F
EOS-4.18.2-REV2-FX
EOS-4.18.3.1F
EOS-4.18.4F
EOS-4.18.4.2F
EOS-4.18.5M
EOS-4.19.0F
EOS-4.19.1F
EOS-4.19.2F
Arista#

Well, that certainly threw a lot of output! As I tell network engineers all the time, when you see a page of output, step away from the keyboard and read it! In this case, when I first encountered this output, I had to read it probably six times before it sank in due to an odd bit of technically accurate grammar. Let me run that command again using grep to include only the lines that I want to highlight:

Arista#sho mlag issu compatibility flash:EOS-4.20.1F.swi | grep -A3 incompatible  
/mnt/flash/EOS-4.20.1F.swi (4.20.1F) is MLAG ISSU incompatible with
the current image (4.19.10M). A reboot with this image may cause
packet loss. Please consult the release notes to find a compatible
image.

The phrase that trips me up is “is MLAG ISSU incompatible.” I would probably prefer something like “is NOT MLAG ISSU compatible,” or something even simpler like, “Nope—WON’T WORK!” but then I suppose that’s why I’m not a developer. When running this command, I look for the list of versions, because if the output of the command spits out a long list of versions, it’s telling you that you should probably use one of those instead of the one you tried.
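If you would rather let a script do the squinting, here is a minimal Python sketch that reads the text of this command and reports the verdict plus any suggested versions. It is not an Arista tool; the function name parse_issu_check is made up for illustration, and it assumes the output looks like the 4.19.10M examples in this chapter (the samples are trimmed copies of that output).

```python
import re

# Trimmed copies of the outputs shown in this chapter.
INCOMPATIBLE = """\
/mnt/flash/EOS-4.20.1F.swi (4.20.1F) is MLAG ISSU incompatible with
the current image (4.19.10M). A reboot with this image may cause
packet loss. Please consult the release notes to find a compatible
image.
The new image is compatible with these releases, which may also be
compatible with the current version:
EOS-4.16.6M-INT
EOS-4.16.6M
"""

COMPATIBLE = """\
/mnt/flash/EOS-4.21.1F.swi (4.21.1F) is MLAG ISSU compatible with the
current image (4.19.10M).
"""


def parse_issu_check(output):
    """Summarize 'show mlag issu compatibility' text (hypothetical helper)."""
    # Collapse line wraps so the verdict phrase is always on one line.
    flat = " ".join(output.split())
    match = re.search(r"is MLAG ISSU (compatible|incompatible)\b", flat)
    if match is None:
        raise ValueError("no MLAG ISSU verdict found in output")
    # Suggested releases are printed one per line, each starting with "EOS-".
    suggestions = [line.strip() for line in output.splitlines()
                   if line.strip().startswith("EOS-")]
    return {"compatible": match.group(1) == "compatible",
            "suggestions": suggestions}


bad = parse_issu_check(INCOMPATIBLE)
good = parse_issu_check(COMPATIBLE)
```

Matching the whole word is the point: a naive search for "compatible" would also match "incompatible" and report the wrong answer.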

Let’s try that with a different version:

Arista#sho mlag issu compatibility flash:EOS-4.21.1F.swi
/mnt/flash/EOS-4.21.1F.swi (4.21.1F) is MLAG ISSU compatible with the
current image (4.19.10M).

Not only does it say, “is MLAG ISSU compatible,” but there is a notable absence of suggested versions to try instead of the one I checked against. This means that we’re good to go. For the rest of the chapter, both of my switches will be running 4.21.1F, so let’s build a simple MLAG setup using the network shown in Figure 18-4.

Figure 18-4. MLAG network detail

We need to create a peer-link over which the two switches can communicate. This link can be a single link, but for redundancy, it should always be a port-channel containing a minimum of two physical links. In this example, there are two 48-port switches, so let’s use the last two interfaces, e47 and e48:

Arista-A(config)#int e47-48
Arista-A(config-if-Et47-48)#channel-group 1000 mode active

Next, we configure the port-channel to be a trunk:

Arista-A(config-if-Et47-48)#int po 1000
Arista-A(config-if-Po1000)#switchport mode trunk

If you’re used to Cisco switches, you’ll notice that the switch did not bark at us about trunk encapsulation. Here’s what would happen on a Cisco switch:

Cisco-1(config)#int f1/0/7
Cisco-1(config-if)#switchport mode trunk
Command rejected: An interface whose trunk encapsulation is "Auto"
can not be configured to "trunk" mode.

Arista does not negotiate trunk encapsulation, because it supports only dot1q trunks. Older Cisco switches also support Inter-Switch Link (ISL), which is a Cisco proprietary protocol. But enough of my attention deficit issues; let’s continue.

Notice that there is absolutely nothing special about this link. It is a port-channel running as a trunk. This is not an MLAG; rather, it’s the link used to connect the two peers and, as such, is called the peer-link.

With the port-channel configured as a trunk, we need to create a VLAN that will be used only for MLAG peer-to-peer communication. The Arista examples use VLAN 4094, so let’s keep that tradition alive:

Arista-A(config)#vlan 4094
Arista-A(config-vlan-4094)#trunk group MLAG-Peer

The trunk group MLAG-Peer command creates a trunk group. A trunk group is a sort of inclusion (or exclusion depending on your point of view) group. When you create a trunk, all VLANs are included on that trunk by default unless you specify otherwise. When we put a VLAN into a trunk group, that VLAN is no longer included in trunks by default. As a result, we now need to assign the same group to the peer-link in order to include that VLAN:

Arista-A(config-vlan-4094)#int po 1000
Arista-A(config-if-Po1000)#switchport trunk group MLAG-Peer

VLAN 4094 will be included only on trunks that are also assigned to the MLAG-Peer trunk group. By doing this, when we create a new trunk, by default VLAN 4094 will not be included. This keeps the MLAG peer-link traffic on this link, and only on this link (unless you add the MLAG-Peer trunk group to another trunk, but don’t do that).
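The inclusion rule is easier to see as a few lines of logic. The following Python sketch models only the behavior described above (default allowed-VLAN list, no pruning); the function name vlans_carried and the data layout are my own invention for illustration, not anything EOS exposes.

```python
def vlans_carried(trunk_groups_on_port, vlan_trunk_groups, all_vlans):
    """Return the VLANs a trunk carries, per the trunk group rule.

    trunk_groups_on_port: set of trunk group names assigned to the trunk
    vlan_trunk_groups:    dict of VLAN ID -> set of trunk group names
    all_vlans:            iterable of VLAN IDs defined on the switch
    """
    carried = []
    for vlan in sorted(all_vlans):
        groups = vlan_trunk_groups.get(vlan, set())
        if not groups:
            # A VLAN with no trunk group rides on every trunk by default.
            carried.append(vlan)
        elif groups & set(trunk_groups_on_port):
            # A VLAN in a trunk group rides only on trunks sharing that group.
            carried.append(vlan)
    return carried


vlans = [1, 100, 4094]
groups = {4094: {"MLAG-Peer"}}

peer_link = vlans_carried({"MLAG-Peer"}, groups, vlans)  # Po1000 in this chapter
other_trunk = vlans_carried(set(), groups, vlans)        # any trunk without the group
```

With these inputs, the peer-link carries VLANs 1, 100, and 4094, while a trunk without the MLAG-Peer group carries only 1 and 100.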

The trunk group names for the peer VLAN should be configured to be the same on both switches. Although they are locally significant, do yourself a favor and keep them the same on the two peers. The configuration for VLANs and VLAN trunk groups must be identical in order to successfully establish an MLAG association between two switches.

Now that we know this VLAN is limited to the peer-link, we can disable spanning-tree on the VLAN:

Arista-A(config)#no spanning-tree vlan 4094

Note that this is a global command, and not an interface command. It will fail with a % Incomplete command message if run from interface configuration mode because the same syntax is used to set cost and port priority there.

Because Multiple Spanning Tree (MST) is the default on Arista switches, and MST is not VLAN based, this command will not have the same result that it would if Rapid-PVST (RPVST) were enabled. It is still a best practice to disable Spanning Tree on the MLAG peer VLAN in case RPVST is ever enabled.

Note

Disabling STP is almost always a bad idea. In this case, the MLAG peer-link always needs to be up in order to prevent a split-brain scenario. Because the peer-link is using a trunk group, a loop on this VLAN should never occur. The only way a loop could possibly occur would be (in this example) for the MLAG-Peer trunk group to be included on other links from the MLAG pair. So don’t do that. Ever.

With the physical link and trunk set up, we’re now going to make a Layer 3 (L3) connection between the two switches, as shown in Figure 18-5.

Because MLAG peers communicate with each other over L3, we must assign an IP address to the VLAN on each side:

Arista-A(config)#int vlan 4094
Arista-A(config-if-Vl4094)#ip address 10.255.255.1/30
Arista-A(config-if-Vl4094)#no autostate

The no autostate command keeps the L3 Switch Virtual Interface (SVI) up regardless of whether there are any interfaces active in the VLAN.

Now, we must configure MLAG itself:

Arista-A(config)#mlag
Arista-A(config-mlag)#local-interface vlan 4094
Arista-A(config-mlag)#peer-address 10.255.255.2
Arista-A(config-mlag)#peer-link port-channel 1000
Arista-A(config-mlag)#domain-id Arista-AB

The commands should be relatively obvious. We’ve assigned the MLAG local interface to be the VLAN SVI we just created (VLAN 4094); we’ve told the switch that the peer for this MLAG domain is at the IP address 10.255.255.2; the peer-link is riding over port-channel 1000; and the MLAG domain ID is Arista-AB (I try to make the domain ID somehow relate to both switch hostnames).
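If you want to double-check the addressing before committing it, Python’s standard ipaddress module will do the arithmetic. This is just a sanity-check sketch using the addresses from this example: a /30 leaves exactly two usable host addresses, one for each peer.

```python
import ipaddress

# The peer VLAN subnet used in this chapter.
peer_net = ipaddress.ip_network("10.255.255.0/30")

# Usable host addresses in the /30 (network and broadcast are excluded).
hosts = [str(h) for h in peer_net.hosts()]

# The SVI addresses for each side of the peer-link.
arista_a = ipaddress.ip_interface("10.255.255.1/30")
arista_b = ipaddress.ip_interface("10.255.255.2/30")

# Both peers must land in the same subnet to reach each other over VLAN 4094.
same_subnet = arista_a.network == arista_b.network == peer_net
```

Here hosts contains only 10.255.255.1 and 10.255.255.2, which is why each switch’s peer-address is simply the other host in the subnet.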

At this point the MLAG peers look like what is shown in Figure 18-6.

Figure 18-6. MLAG peers with domain ID

The domain ID is how the switch tells one MLAG group from another. I show this in more detail later in this chapter. The MLAG domain ID is case-sensitive and must match on both sides.

At this point, the status of the peer-link should be connected. This can be shown with the command show mlag:

Arista-A#sho mlag
MLAG Configuration:
domain-id           :           Arista-AB
local-interface     :            Vlan4094
peer-address        :        10.255.255.2
peer-link           :    Port-Channel1000
peer-config         :          consistent
                                        
MLAG Status:      
state               :              Active
negotiation status  :           Connected
peer-link status    :                  Up
local-int status    :                  Up
system-id           :   2a:99:3a:06:6f:37
                                        
MLAG Ports:        
Disabled            :                   0
Configured          :                   0
Inactive            :                   0
Active-partial      :                   0
Active-full         :                   0

The last section that begins with MLAG Ports shows all zeroes because we have not created any MLAG interfaces yet, so let’s go ahead and create a simple MLAG.
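Because show mlag prints simple "key : value" lines under section headings, it is easy to turn into a dictionary for scripted checks. The sketch below is one way to do that in Python, assuming the text layout shown above; the function names parse_show_mlag and mlag_healthy are hypothetical, and the three fields checked are my own choice of what counts as healthy.

```python
SAMPLE = """\
MLAG Configuration:
domain-id           :           Arista-AB
local-interface     :            Vlan4094
peer-address        :        10.255.255.2
peer-link           :    Port-Channel1000
peer-config         :          consistent

MLAG Status:
state               :              Active
negotiation status  :           Connected
peer-link status    :                  Up
local-int status    :                  Up
system-id           :   2a:99:3a:06:6f:37

MLAG Ports:
Disabled            :                   0
Configured          :                   0
Inactive            :                   0
Active-partial      :                   0
Active-full         :                   0
"""


def parse_show_mlag(output):
    """Parse 'show mlag' text into {section: {key: value}}."""
    sections = {}
    current = None
    for line in output.splitlines():
        if ":" not in line:
            continue  # blank or unrelated line
        # Split on the first colon only, so MAC addresses stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not value:
            current = sections.setdefault(key, {})  # a heading such as "MLAG Status:"
        elif current is not None:
            current[key] = value
    return sections


def mlag_healthy(parsed):
    """True when the peer relationship looks healthy (assumed criteria)."""
    status = parsed.get("MLAG Status", {})
    return (status.get("state") == "Active"
            and status.get("negotiation status") == "Connected"
            and status.get("peer-link status") == "Up")


parsed = parse_show_mlag(SAMPLE)
```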

To reiterate, here are the relevant MLAG configurations for both Arista-A and Arista-B:

  ------------------------------------- -------------------------------------
 | Arista-A                            | Arista-B                            |
  ------------------------------------- -------------------------------------
 | vlan 4094                           | vlan 4094                           |
 |    trunk group MLAG-Peer            |    trunk group MLAG-Peer            |
 | !                                   | !                                   |
 | interface Port-Channel1000          | interface Port-Channel1000          |
 |    description [ MLAG Peer-Link ]   |    description [ MLAG Peer-Link ]   |
 |    switchport mode trunk            |    switchport mode trunk            |
 |    switchport trunk group MLAG-Peer |    switchport trunk group MLAG-Peer |
 | !                                   | !                                   |
 | interface Ethernet47                | interface Ethernet47                |
 |    description [ MLAG Peer ]        |    description [ MLAG Peer ]        |
 |    channel-group 1000 mode active   |    channel-group 1000 mode active   |
 | !                                   | !                                   |
 | interface Ethernet48                | interface Ethernet48                |
 |    description [ MLAG Peer ]        |    description [ MLAG Peer ]        |
 |    channel-group 1000 mode active   |    channel-group 1000 mode active   |
 | !                                   | !                                   |
 | interface Vlan4094                  | interface Vlan4094                  |
 |    description [ MLAG Link ]        |    description [ MLAG Link ]        |
 |    no autostate                     |    no autostate                     |
 |    ip address 10.255.255.1/30       |    ip address 10.255.255.2/30       |
 | !                                   | !                                   |
 | mlag configuration                  | mlag configuration                  |
 |    domain-id Arista-AB              |    domain-id Arista-AB              |
 |    local-interface Vlan4094         |    local-interface Vlan4094         |
 |    peer-address 10.255.255.2        |    peer-address 10.255.255.1        |
 |    peer-link Port-Channel1000       |    peer-link Port-Channel1000       |
 |                                     |                                     |
  ------------------------------------- -------------------------------------

By the way, if you think that side-by-side output is cool, that’s from an eAPI script I wrote that allows me to compare any command from any two Arista switches, provided they’re running eAPI (and I have the passwords, of course). I use this in my classes all the time for troubleshooting. To learn more about eAPI, see Chapter 30.
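The formatting half of that idea is simple enough to sketch here. This is not my actual script, just a minimal Python illustration of lining up two blocks of text in columns; the function name side_by_side is hypothetical, and fetching the text from each switch (via eAPI or otherwise) is left out entirely.

```python
from itertools import zip_longest


def side_by_side(left_title, left_text, right_title, right_text):
    """Render two blocks of text as a two-column table."""
    left = left_text.splitlines()
    right = right_text.splitlines()
    # Both columns share the width of the longest line or title.
    width = max(len(s) for s in [left_title, right_title] + left + right)
    rule = "+" + ("-" * (width + 2) + "+") * 2

    def row(a, b):
        return f"| {a:<{width}} | {b:<{width}} |"

    lines = [rule, row(left_title, right_title), rule]
    # zip_longest pads the shorter side with empty cells.
    lines += [row(a, b) for a, b in zip_longest(left, right, fillvalue="")]
    lines.append(rule)
    return "\n".join(lines)


table = side_by_side(
    "Arista-A", "interface Vlan4094\n   ip address 10.255.255.1/30",
    "Arista-B", "interface Vlan4094\n   ip address 10.255.255.2/30",
)
print(table)
```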

In this example, I’ve set up two Arista switches (Arista-A and Arista-B) connected to a third Arista switch that’s been cleverly named Arista-C. The first two Arista switches will be forming an MLAG pair, while the C switch will view the link as a regular port-channel. Figure 18-7 depicts how the network looks before we continue.

Figure 18-7. Third switch connected via port-channel to MLAG pair

Take careful note that everything we’re doing on Arista-C has nothing to do with the MLAG configurations on the two MLAG peers (Arista A and B). This is a very important distinction because nothing MLAG-related “escapes” the MLAG domain. There is no MLAG negotiation outside of the two peers! The only thing Arista-C will see coming from the MLAG peers is LACP.

To further prove that point, here’s how I’ve configured Arista-C:

Arista-C(config)#int e7-8
Arista-C(config-if-Et7-8)#channel-group 999 mode active

That’s it! This switch has absolutely nothing to do with MLAG and has no idea that MLAG is in the mix. The only thing it sees is LACP. To Arista-C, the two MLAG peers appear to be a single chassis.

This forms a simple port-channel (Po999) comprising the physical links, Et7 and Et8. All ports are 10 Gbps. The port-channel will use LACP due to the mode active keywords in the channel-group commands.

The problem with the network configuration as it stands is that one of the interfaces in the triangle of network connections will be error-disabled. This is not due to Spanning Tree, but rather LACP on Arista-C, which will receive two different chassis-IDs on Et7 and Et8. Because those two interfaces are bonded together in a port-channel on Arista-C, LACP expects the remote devices to be a single device. To make that happen, we need to configure the two MLAG peers (Arista A and B) to do that. Luckily, this step is really quite simple.
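To make the chassis-ID problem concrete, here is a deliberately simplified Python sketch of the check Arista-C is effectively making. Real LACP compares more than the partner system ID (keys and priorities, for example), and the sketch does not model which port ends up disabled. The function name lacp_can_bundle is hypothetical, and the port-to-switch mapping and the use of each peer’s MAC address from this chapter’s output are assumptions for illustration.

```python
def lacp_can_bundle(partner_ids):
    """True when every member port reports the same partner system ID."""
    return len({mac.lower() for mac in partner_ids.values()}) == 1


# Before MLAG: each peer advertises its own ID (assumed mapping of ports to peers).
before_mlag = {"Et7": "28:99:3a:06:6e:0f", "Et8": "28:99:3a:06:6e:ed"}

# With MLAG: both peers advertise the shared MLAG system-id.
with_mlag = {"Et7": "2a:99:3a:06:6e:0f", "Et8": "2a:99:3a:06:6e:0f"}

ok_before = lacp_can_bundle(before_mlag)
ok_with_mlag = lacp_can_bundle(with_mlag)
```

The first case fails the check and the second passes, which is the whole trick behind MLAG from the downstream switch’s point of view.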

First, all ports to be bonded between MLAG peers must be in a port-channel. You cannot bond physical interfaces, even (as is the case here) if there is only one on each physical switch. Therefore, the first thing we need to do is to put the physical interface on each MLAG peer into a port-channel:

Arista-A(config)#int e33
Arista-A(config-if-Et33)#channel-group 1 mode active

You must do this on both MLAG peers:

Arista-B(config)#int e33
Arista-B(config-if-Et33)#channel-group 1 mode active

Do the interfaces and port-channel numbers need to match? No, but do yourself a favor and make them match.

Note

I strongly urge you to keep the port-channel assignments the same on the MLAG peers. I’ve worked on installations where the MLAG peers shared an MLAG using different port-channel interfaces, and it was a nightmare to debug during an outage. Keep it simple, and you’ll keep your job.

Now we need a way to bond these two port-channels together across the MLAG pair. To do that, we configure the port-channel itself and apply an MLAG number to the port-channel:

Arista-A(config-if-Et33)#int po 1
Arista-A(config-if-Po1)#mlag 1

And again, we must do this on both of the MLAG peers:

Arista-B(config-if-Et33)#int po 1
Arista-B(config-if-Po1)#mlag 1

That’s it! After all the peer-link stuff is done and the MLAG adjacency is formed, the creation and linking of port-channels is really all that needs to be done from a daily moves-adds-changes perspective. Figure 18-8 illustrates what we’ve built.

It is important to remember that, logically, Figure 18-9 shows how switch C sees the network with MLAG enabled on switches A and B. At this point, switch C has no idea that switches A and B are two different devices, at least so far as LACP is concerned. This is a very important thing to understand because at L3, there are still three devices in the mix. I’m not going to go into a lot of detail on that right now, but remember that MLAG is almost exclusively an L2 thing.

Figure 18-8. MLAG interface added to Arista-A and Arista-B
Figure 18-9. How switch C sees the network with MLAG enabled

Managing MLAG

To see the status of individual MLAG interfaces, use the show mlag interfaces command:

Arista-A(config)#sho mlag int
                                                                 local/remote
   mlag     desc                   state    local      remote          status
---------- ---------------- --------------- --------- --------- -------------
      1     [ Arista-C ]     active-full      Po1         Po1           up/up

Here is the output of the same switch with three configured MLAGs, of which only one is active:

Arista-A#sho mlag int
                                                                 local/remote
   mlag       desc             state       local       remote          status
---------- ---------------- --------------- --------- --------- -------------
      1     [ Arista-C ]     active-full      Po1         Po1           up/up
      3                      inactive         Po3         Po3       down/down
      5                      inactive         Po5         Po5       down/down

If MLAG is active, but the peer’s link (not the peer-link!) is down for whatever reason, the status of the MLAG will be Active-partial:

Arista-A#sho mlag int
                                                                 local/remote
   mlag       desc             state       local       remote          status
---------- ---------------- --------------- --------- --------- -------------
      1     [ Arista-C ]     active-partial   Po1         Po1         up/down

Here is the same output from the peer with the interface that’s down. Check out the local/remote status and how it’s flipped from the other side because the local interface is always shown first:

Arista-B#sho mlag int
                                                                 local/remote
   mlag       desc             state       local       remote          status
---------- ---------- -------------------- ----------- ------------ ------------
      1     [ Arista-C ]     active-partial   Po1         Po1         down/up

By the way, if you encounter a scenario in which someone has used nonmatching port-channel and MLAG numbers, the show mlag interfaces command will be where you’d look to figure it out. Also, smack them in the back of the head for doing that. It’s OK. They probably deserve it.
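If you have a lot of MLAGs to audit, a script can find the offenders faster than your eyes can. The following Python sketch parses the text of show mlag interfaces, assuming the column layout shown above; the function names are hypothetical, and the last row of the sample (MLAG 7) is a made-up mismatch added purely to exercise the check.

```python
SAMPLE = """\
                                                                 local/remote
   mlag       desc             state       local       remote          status
---------- ---------------- --------------- --------- --------- -------------
      1     [ Arista-C ]     active-full      Po1         Po1           up/up
      3                      inactive         Po3         Po3       down/down
      7     [ made-up ]      active-full      Po7        Po17           up/up
"""


def parse_mlag_interfaces(output):
    """Parse 'show mlag interfaces' rows into a list of dicts."""
    rows = []
    for line in output.splitlines():
        tokens = line.split()
        if len(tokens) < 5 or not tokens[0].isdigit():
            continue  # skip headings, rules, and blank lines
        # The last four columns are fixed; the description may be empty
        # or contain spaces, so it is whatever sits in between.
        state, local, remote, status = tokens[-4:]
        rows.append({"mlag": int(tokens[0]), "desc": " ".join(tokens[1:-4]),
                     "state": state, "local": local, "remote": remote,
                     "status": status})
    return rows


def mismatched(rows):
    """Rows where the port-channels differ from each other or from the MLAG ID."""
    return [r for r in rows
            if r["local"] != r["remote"] or r["local"] != f"Po{r['mlag']}"]


rows = parse_mlag_interfaces(SAMPLE)
offenders = [r["mlag"] for r in mismatched(rows)]
```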

The output of show mlag shows you totals as opposed to specific interface information. In this case there is one configured MLAG interface that is active-partial:

Arista-A#sho mlag
MLAG Configuration:  
domain-id              :           Arista-AB
local-interface        :            Vlan4094
peer-address           :        10.255.255.2
peer-link              :    Port-Channel1000
peer-config            :        inconsistent
                                            
MLAG Status:          
state                  :              Active
negotiation status     :           Connected
peer-link status       :                  Up
local-int status       :                  Up
system-id              :   2a:99:3a:06:6e:0f
dual-primary detection :            Disabled
                                            
MLAG Ports:          
Disabled               :                   0
Configured             :                   0
Inactive               :                   0
Active-partial         :                   1
Active-full            :                   0

To get some more detail regarding the state of MLAG in general, use the show mlag detail command:

Arista-A#sho mlag detail
MLAG Configuration:  
domain-id              :           Arista-AB
local-interface        :            Vlan4094
peer-address           :        10.255.255.2
peer-link              :    Port-Channel1000
peer-config            :        inconsistent
                                            
MLAG Status:          
state                  :              Active
negotiation status     :           Connected
peer-link status       :                  Up
local-int status       :                  Up
system-id              :   2a:99:3a:06:6e:0f
dual-primary detection :            Disabled
                                            
MLAG Ports:          
Disabled               :                   0
Configured             :                   0
Inactive               :                   0
Active-partial         :                   1
Active-full            :                   0

MLAG Detailed Status:
State                           :             primary
Peer State                      :           secondary
State changes                   :                   4
Last state change time          :         0:29:21 ago
Hardware ready                  :                True
Failover                        :               False
Last failover change time       :               never
Secondary from failover         :               False
Peer MAC address                :   28:99:3a:06:6e:ed
Peer MAC routing supported      :                True
Reload delay                    :         300 seconds
Non-MLAG reload delay           :         300 seconds
Peer ports errdisabled          :               False
Lacp standby                    :               False
Configured heartbeat interval   :             4000 ms
Effective heartbeat interval    :             4000 ms
Heartbeat timeout               :            60000 ms
Last heartbeat timeout          :               never
Heartbeat timeouts since reboot :                   0
UDP heartbeat alive             :                True
Heartbeats sent/received        :         22003/22004
Peer monotonic clock offset     :   24.884958 seconds
Agent should be running         :                True
P2p mount state changes         :                   1
Fast MAC redirection enabled    :               False

MLAG Consistency

You might have noticed that the output of show mlag and show mlag detail in the previous examples showed that the configuration was inconsistent. If you’ve ever used Cisco’s vPC, you know that it can be a bit finicky if things aren’t configured properly. In the early days of the Cisco Nexus, I went through some pretty terrible outages due to this behavior, so I was pretty happy to discover that Arista does not enforce config sanity. Of course, that can be a pretty sharp double-edged sword, but thankfully Arista has included the means to check config sanity between your MLAG peers using the show mlag config-sanity command.

Arista-A#show mlag config-sanity
No global configuration inconsistencies found.

Interface configuration inconsistencies:
    Feature           Attribute   Interfaces  Local value  Peer value
----------- -------------------  ----------- ------------  ----------
   bridging   access-vlan mlag1          Po1          100           -

Somewhere along the line, someone (I bet it was me) configured the MLAG’d port-channel to have the switchport access vlan 100 command on one side but not the other. Sure enough, here’s Arista-A:

Arista-A#sho run int po 1
interface Port-Channel1
   description [ Arista-C ]
   switchport access vlan 100
   mlag 1

Here’s Arista-B:

Arista-B#sho run int po 1
interface Port-Channel1
   description [ Arista-C ]
   mlag 1

After banging my head on the desk to celebrate my stupidity, I removed the offending command and checked again:

Arista-A(config)#int po 1
Arista-A(config-if-Po1)#no switchport access vlan
Arista-A(config-if-Po1)#sho mlag config-sanity
No global configuration inconsistencies found.

No per interface configuration inconsistencies found.

If you’d prefer to see all the sanity information instead of only what’s wrong, you can tack the all keyword on the end of the command. Be prepared for some verbosity, though:

Arista-A(config-if-Po1)#sho mlag config-sanity all
Global configuration:
   Feature                        Attribute       Local value         Peer value
------------ ------------------------------ ----------------- ------------------
  bridging               admin-state vlan 1            active             active
  bridging               admin-state vlan 5            active             active
  bridging             admin-state vlan 100            active             active
  bridging            admin-state vlan 4094            active             active
  bridging  MLAG-Peer trunk-group vlan 4094              True               True
       lag               lacp port-id range             [0,0]              [0,0]
       lag             lacp system-priority             32768              32768
      mlag              dual-primary-action    dualPriActNone     dualPriActNone
      mlag     dual-primary-detection-delay                 0                  0
      mlag               heartbeat-interval              4000               4000
      mlag         peer-mac-routing-enabled             False              False
      mlag                     reload-delay                 0                  0
      mlag           reload-delay lacp mode             False              False
      mlag            reload-delay non-mlag                 0                  0
       stp               4094 disabled-vlan              True               True
       stp    bpduguard rate-limit interval                 0                  0
       stp                 bridge assurance              True               True
       stp                     forward-time                15                 15
       stp                       hello-time              2000               2000
       stp                        loopguard             False              False
       stp                          max-age                20                 20
       stp                         max-hops                20                 20
       stp                             mode              mstp               mstp
       stp                  mst pvst border             False              False
       stp                portchannel guard             False              False
       stp              portfast bpdufilter             False              False
       stp               portfast bpduguard             False              False
       stp              transmit hold-count                 6                  6

Interface configuration:
    Feature                  Attribute    Interfaces    Local value   Peer value
------------- ------------------------ ------------- -------------- ------------
   bridging          access-vlan mlag1           Po1              -            -
   bridging      switchport-mode mlag1           Po1              -            -
   bridging   trunk-allowed vlan mlag1           Po1              -            -
   bridging    trunk-native-vlan mlag1           Po1              -            -
        lag        lacp fallback mlag1           Po1           none         none
        lag             lag mode mlag1           Po1           lacp         lacp
      vxlan            100 vlan-to-vni           Vx1          10100        10100
      vxlan       arp-ip-address-local           Vx1          False        False
      vxlan            multicast-group           Vx1        0.0.0.0      0.0.0.0
      vxlan           source-interface           Vx1      Loopback1    Loopback1
      vxlan                   udp-port           Vx1           4789         4789

I recommend that the show mlag config-sanity command be the last step in any change-controls that involve MLAG. It can’t hurt, and it might just save your job.

MLAG Failover

I’ve shown you how to configure MLAG, but I haven’t really explained how it all works. Let’s fix that.

When you connect two Arista switches by configuring MLAG, the two peers negotiate by comparing their system MAC addresses. The one with the lower system MAC address will be the winner and become the primary member of the pair. The loser (there are no trophies for second place in MLAG!) becomes the secondary peer. You cannot force this or override this behavior.

The primary peer “owns” the MLAG peer and is responsible for a bunch of stuff (technical term, that) that we’ll get to, but for now understand that the winner of the negotiation uses its system MAC address as the MLAG System ID. The MLAG System ID (MSI) is then used as the chassis-ID that’s sent to the devices that connect to the MLAG domain using port-channels. This is how MLAG tricks those devices into thinking that there is a single device: send an agreed-upon common chassis-ID from two different physical switches.

You can see the system MAC address by using the show version command:

Arista-A#sho ver | grep MAC
System MAC address:  2899.3a06.6e0f

You can see the MLAG System ID by using show mlag:

Arista-A#sho mlag
MLAG Configuration:  
domain-id              :           Arista-AB
local-interface        :            Vlan4094
peer-address           :        10.255.255.2
peer-link              :    Port-Channel1000
peer-config            :          consistent
                                            
MLAG Status:          
state                  :              Active
negotiation status     :           Connected
peer-link status       :                  Up
local-int status       :                  Up
system-id              :   2a:99:3a:06:6e:0f
dual-primary detection :            Disabled
                                            
MLAG Ports:          
Disabled               :                   0
Configured             :                   0
Inactive               :                   0
Active-partial         :                   1
Active-full            :                   0

If you have an eagle eye, you might have noticed that the two addresses are actually one bit off. Here I put them next to each other with the system-id slightly reformatted so that the numbers line up:

2899.3a06.6e0f
2a99:3a06:6e0f

Why this change? This has to do with the fact that the MLAG System ID remains unless both MLAG peers reboot or deconfigure MLAG. This can lead to a rare problem that goes something like this:

Imagine two switches in an MLAG pair. We’ll call them A and B. The two switches have the system MAC addresses of aaaa:aaaa:aaaa and bbbb:bbbb:bbbb, respectively. Because the lower MAC address wins, the MLAG System ID (MSI) becomes aaaa:aaaa:aaaa. Now, imagine that the primary switch fails. The secondary switch takes over (more about that in a bit), but the MSI is still aaaa:aaaa:aaaa. Now, let’s further imagine that switch A has failed and gets sent back to Arista for service. The new switch that takes its place has a system MAC address of cccc:cccc:cccc. When it joins the MLAG pair, the MSI is not renegotiated, and the new switch becomes the secondary in the pair. The MSI is still aaaa:aaaa:aaaa, even though that switch doesn’t physically exist in the pair.

Here’s where things get weird.

Suppose that the original switch A is returned and is then put in the network, not as one of the original peers, but as a third switch to be connected to the MLAG peers. This new third switch that we’ll call Switch C has a system MAC address of aaaa:aaaa:aaaa. Can you see the problem? Switch C (which used to be switch A, remember) will now try to connect to the MLAG pair using LACP, but LACP will see switch C’s own chassis-ID coming in from the MLAG pair and will err-disable the port.

To prevent this scenario from happening, the MLAG System ID is actually a derivation of the system-MAC address, and to accomplish that, the MSI is the winner’s system MAC address with the locally administered bit set. From Wikipedia:

The second bit of the first byte of a MAC address determines the type of OUI. If the bit is 0 then it is an OUI globally assigned by the IEEE; if the bit is 1 then it is a locally administered MAC address.
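
If you'd like to check that arithmetic yourself, here's a minimal Python sketch. The function name mlag_system_id is my own invention for illustration (it is not an EOS or Arista API), and the sketch assumes that the only transformation applied is setting the 0x02 bit in the first byte, which is what the two addresses shown earlier suggest:

```python
def mlag_system_id(system_mac):
    """Return the MAC with the locally administered bit (0x02 in the first byte) set.

    Hypothetical helper for illustration only; not part of EOS.
    """
    # Strip separators so 2899.3a06.6e0f and 28:99:3a:06:6e:0f both parse.
    digits = system_mac.replace(".", "").replace(":", "").replace("-", "").lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {system_mac!r}")
    octets = bytearray.fromhex(digits)
    octets[0] |= 0x02  # 0x28 becomes 0x2a
    return ":".join(f"{octet:02x}" for octet in octets)


print(mlag_system_id("2899.3a06.6e0f"))  # 2a:99:3a:06:6e:0f
```

Feeding it the system MAC address from show version gives back the system-id that show mlag reports.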

To see if your peer is primary or secondary, use the show mlag detail command, which also shows you the MLAG System ID:

Arista-A#sho mlag detail
MLAG Configuration:  
domain-id              :           Arista-AB
local-interface        :            Vlan4094
peer-address           :        10.255.255.2
peer-link              :    Port-Channel1000
peer-config            :          consistent
                                            
MLAG Status:          
state                  :              Active
negotiation status     :           Connected
peer-link status       :                  Up
local-int status       :                  Up
system-id              :   2a:99:3a:06:6e:0f
dual-primary detection :            Disabled
                                            
MLAG Ports:          
Disabled               :                   0
Configured             :                   0
Inactive               :                   0
Active-partial         :                   1
Active-full            :                   0

MLAG Detailed Status:
State                           :             primary
Peer State                      :           secondary
State changes                   :                   4
Last state change time          :        22:05:12 ago
Hardware ready                  :                True
Failover                        :               False
Last failover change time       :               never
Secondary from failover         :               False
Peer MAC address                :   28:99:3a:06:6e:ed
Peer MAC routing supported      :                True
Reload delay                    :         300 seconds
Non-MLAG reload delay           :         300 seconds
Peer ports errdisabled          :               False
Lacp standby                    :               False
Configured heartbeat interval   :             4000 ms
Effective heartbeat interval    :             4000 ms
Heartbeat timeout               :            60000 ms
Last heartbeat timeout          :               never
Heartbeat timeouts since reboot :                   0
UDP heartbeat alive             :                True
Heartbeats sent/received        :         41441/41441
Peer monotonic clock offset     :   24.456001 seconds
Agent should be running         :                True
P2p mount state changes         :                   1
Fast MAC redirection enabled    :               False
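
The lower-MAC-wins rule is easy to sanity-check against this output. The following Python sketch is only an illustration: elect_primary and mac_to_int are names I made up, and I'm assuming that the Peer MAC address shown by show mlag detail is Arista-B's system MAC address.

```python
def mac_to_int(mac):
    """Convert a MAC address in dotted or colon notation to an integer."""
    return int(mac.replace(".", "").replace(":", "").replace("-", ""), 16)


def elect_primary(mac_by_switch):
    """Return the name of the switch with the numerically lowest system MAC.

    Hypothetical helper; it models the initial negotiation only.
    """
    return min(mac_by_switch, key=lambda name: mac_to_int(mac_by_switch[name]))


peers = {
    "Arista-A": "2899.3a06.6e0f",     # System MAC address from show version
    "Arista-B": "28:99:3a:06:6e:ed",  # Peer MAC address from show mlag detail
}
print(elect_primary(peers))  # Arista-A
```

The two addresses differ only in the last byte (0f versus ed), so Arista-A wins, which agrees with the State line in the output.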

Because the output of show mlag detail is so verbose, I’m paring that output down in various ways from this point on because during failover scenarios, it’s used a lot, and I don’t want this book to be 700 pages. You’re welcome.

Again, there is no way to force one side to be primary short of rebooting the primary switch in order to force a failover. For someone like me who likes all of the devices on the left side of my Visio drawings to be active, this is maddening. There is also no command that you can use to force a failover (well, you could reboot one of them, but that seems excessive). Because I get to work at Arista, I asked the developers why they would deprive me of forced-failover joy, and the answer I received was basically that there is no reason or benefit to having one side be primary over the other. It’s taken me years to accept that, but in my experience, it’s true. I’ve moved on and let it go. Mostly.

When the primary switch reboots for whatever reason, the secondary switch becomes primary. Note that the MLAG System ID remains the same. Remember that in my lab Arista-A was primary, so I went and rebooted it. With it rebooting, I looked at Arista-B:

Arista-B#sho mlag det | grep State
State                           :             primary
Peer State                      :             primary
State changes                   :                   9

Curious as to why both sides are primary, I asked the developers, who said that this is by design: this peer last saw the other peer as primary and assumes that it still is, but because it has lost its connection, it has also assumed the role of primary. When the other side comes up and communicates again, the status will change. Indeed, after the other switch comes back up, we see a better status:

Arista-B#sho mlag det | grep State
State                           :                  primary
Peer State                      :                secondary
State changes                   :                        9

Remember, if Arista-A failed outright and I replaced it with a new switch, there would no longer be a physical switch with that MAC address in the mix, but the MLAG System ID would remain unchanged unless MLAG is completely deconfigured from both switches in the MLAG domain.

After the Arista-A switch comes back up it remains the secondary switch even though it has the lower system MAC address because there is no preemption. Again, it doesn’t matter which side is primary, so the switches don’t fail over unless there is an outage. Arista does not do preemption because that would just cause more potential network instability, so why force it?

Arista-A#sho mlag det | grep State
State                           :            secondary
Peer State                      :              primary
State changes                   :                    2
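
To make the no-preemption behavior concrete, here's a toy Python model of the roles. It is a sketch of the behavior described in this section and not of how EOS implements it; the MlagPair class and its method names are hypothetical, and the model deliberately ignores the temporary primary/primary view shown earlier.

```python
def _mac_value(mac):
    return int(mac.replace(".", "").replace(":", ""), 16)


class MlagPair:
    """Toy model: lowest MAC wins the first election, no preemption afterward."""

    def __init__(self, macs):
        self.macs = dict(macs)
        self.online = set(self.macs)
        # Initial negotiation: the lower system MAC becomes primary.
        self.primary = min(self.macs, key=lambda name: _mac_value(self.macs[name]))

    def fail(self, name):
        self.online.discard(name)
        if name == self.primary and self.online:
            # The surviving peer takes over the primary role.
            self.primary = next(iter(self.online))

    def recover(self, name):
        # No preemption: the returning switch joins as secondary,
        # even when its system MAC is the lower of the two.
        self.online.add(name)

    def role(self, name):
        return "primary" if name == self.primary else "secondary"


pair = MlagPair({"Arista-A": "2899.3a06.6e0f", "Arista-B": "2899.3a06.6eed"})
print(pair.role("Arista-A"))                         # primary
pair.fail("Arista-A")
pair.recover("Arista-A")
print(pair.role("Arista-A"), pair.role("Arista-B"))  # secondary primary
```

The roles only move when the current primary goes away, which is the behavior the developers described.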

When Arista-A (the original primary that failed) comes back online, all of the interfaces on that switch with the exception of L3 interfaces and the MLAG peer-link pairs are set to errdisabled:

Arista-A#sho int status
Port       Name               Status       Vlan      Duplex Speed  Type
Et1                           errdisabled  1         auto   auto   1000BASE-T
Et2                           errdisabled  1         full   10G    Not Present
Et3                           errdisabled  1         full   10G    Not Present
Et4                           errdisabled  1         full   10G    Not Present
Et5                           errdisabled  1         full   10G    Not Present
Et6                           errdisabled  1         full   10G    Not Present
Et7                           errdisabled  1         full   10G    Not Present
[--output removed--]
Et45                          errdisabled  1         full   10G    Not Present
Et46                          errdisabled  1         full   10G    Not Present
Et47       [ MLAG Peer ]      connected    in Po1000 full   10G    10GBASE-CR
Et48       [ MLAG Peer ]      connected    in Po1000 full   10G    10GBASE-CR
Et49/1                        errdisabled  1         full   100G   Not Present
Et50/1                        errdisabled  1         full   100G   Not Present
Et51/1                        errdisabled  1         full   100G   Not Present
Et52/1                        errdisabled  1         full   100G   Not Present
Et53/1                        errdisabled  1         full   100G   Not Present
Et54/1                        errdisabled  1         full   100G   Not Present
Ma1                           connected    routed    a-full a-1G   10/100/1000
Po1        [ Arista-C ]       notconnect   1         full   unconf N/A
Po1000     [ MLAG Peer-Link ] connected    trunk     full   20G    N/A

This is to protect your network. If there were something more seriously wrong and this switch were endlessly rebooting, you wouldn’t want all of your connected port-channels to bounce and rehash constantly. Think of this as a type of hold-down timer that lets the network stabilize after an outage or planned reboot.

How long do they stay errdisabled? The default reload-delay timer is 300 seconds for fixed-configuration switches, and 1200 or 1800 seconds for chassis-based switches depending on the hardware platform (starting around EOS 4.21 or so). You can change this behavior with the mlag configuration command reload-delay seconds. Any value configured below the default will result in a warning when a reload is done (see “MLAG In-Service Software Upgrade”):

Arista-A(config)#mlag configuration
Arista-A(config-mlag)#reload-delay 120

On newer EOS code you can actually define the behavior of non-MLAG interfaces separately from those that belong to an MLAG. A non-MLAG interface is one that does not participate in an MLAG, which I suppose is pretty obvious. The reason for this ability is really to allow L3 interfaces to come up before the L2 MLAG port-channels so that routing protocols can stabilize before MLAGs are rehashed. We do this by using the reload-delay non-mlag timer command:

Arista-A(config-mlag)#reload-delay non-mlag 60

Remember, all MLAG configurations should be the same on both sides:

Arista-B(config)#mlag configuration
Arista-B(config-mlag)#reload-delay 120
Arista-B(config-mlag)#reload-delay non-mlag 60

You can see how much time is left with the show mlag and show mlag detail commands during a reload:

Arista-A#sho mlag det | grep state
state                           :       Active/Reload (0:01:55 left)
Last state change time          :                        0:00:22 ago
P2p mount state changes         :                                  1

You can see what the configured delay is by using show mlag detail:

Arista-A#sho mlag detail | grep delay
Reload delay                    :         120 seconds
Non-MLAG reload delay           :          60 seconds

You can see whether MLAG is what’s holding down your interfaces with the show mlag detail command, as well:

Arista-A#sho mlag det | grep err
Ports errdisabled               :                               True

Again, this allows all of the upper-level protocols to stabilize before traffic is forwarded over the links. Additionally, ports don’t always come up in the order in which we might expect. For example, the peer-link should always come up first in order for MLAG to work properly, but I always configure the peer-link to be the last ports on the switch. If the switch were to initialize ports in the order in which they are shown in the configuration, the peer-link would come up last. The delay is applied to all non-peer-link ports to prevent that from happening.

Again, you can configure this interval by using the reload-delay command within the MLAG configuration, although you should take care when altering this value given that network instability can result when the delay is too short.
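
If you collect show mlag detail output from a lot of switches, a few lines of Python can flag any pair with a delay below the default. This is a rough sketch under stated assumptions: parse_fields and reload_delay_warnings are hypothetical helper names, the sample text is pasted from the output shown earlier, and the 300-second threshold is the fixed-configuration default (pass a different value for chassis switches).

```python
SAMPLE = """\
Reload delay                    :         120 seconds
Non-MLAG reload delay           :          60 seconds
"""


def parse_fields(text):
    """Split 'name : value' lines from show mlag detail into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first " :" so values such as 22:05:12 stay intact.
        name, separator, value = line.partition(" :")
        if separator:
            fields[name.strip()] = value.strip()
    return fields


def reload_delay_warnings(fields, recommended=300):
    """Return warning strings when the reload delay is below the recommended value."""
    warnings = []
    delay = int(fields["Reload delay"].split()[0])
    if delay < recommended:
        warnings.append(
            f"reload-delay {delay} is below the recommended {recommended} seconds"
        )
    return warnings


print(reload_delay_warnings(parse_fields(SAMPLE)))
```

With the 120-second value configured earlier, the sketch reports one warning, much as the switch itself does at reload time.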

Note

The time it takes for a switch to finish booting varies based on the number of ports in the switch and the complexity of the configuration. For example, a 7516R with more than 1,000 ports will take a bit longer to come up than a 7150 with only 24 ports. The 300-second timer value was chosen as a conservative value for a typical 1–rack unit (RU) switch. If you’re using chassis switches with hundreds of ports, the value might need to be higher, and Arista recommends 12 minutes (720 seconds) for big chassis deployments.

Remember that the other link in the MLAG interface (e33 on Arista-B in this example) is up and forwarding traffic during the Arista-A outage. So long as your devices are dual homed to both switches using MLAG, they should stay online while one of the switches in the MLAG pair reboots.

Dual-Primary Detection

Split-brain is the scenario in which the peer-link somehow fails completely and both MLAG peers become primary devices. That’s considered bad, though surprisingly in a truly dual-homed environment in which everything is working at L2, it might not be the end of the world. But let’s assume that it’s bad (because it usually is) and see what we can do to prevent it.

Arista calls a split-brain situation dual-primary and has thus created a feature in EOS 4.21 called dual-primary detection. This is similar in principle to that other vendor’s feature called the Peer Keepalive Link in vPC. To configure dual-primary detection you must set the peer-address heartbeat ip-address mlag configuration command on each side. Here is the configuration for Arista-A:

Arista-A(config)#mlag configuration
Arista-A(config-mlag)#peer-address heartbeat 10.0.0.8

Here is the matching configuration on Arista-B. For each switch, these IP addresses are the management interface IPs on the other peer.

Arista-B(config)#mlag configuration
Arista-B(config-mlag)#peer-address heartbeat 10.0.0.7

With that configured you can also alter the behavior should a dual-primary state be detected with the dual-primary command. The only real option here is the number of seconds of delay, which you can set from 1 to 1,000 seconds (the last keyword all-interfaces in the command has been abbreviated to make it fit on the page). I’ve configured it the same way on both sides:

Arista-A(config-mlag)#dual-primary detection delay 10 action errdisable all-int
Arista-B(config-mlag)#dual-primary detection delay 10 action errdisable all-int

To see whether dual-primary is configured, use the show mlag detail command:

Arista-B#sho mlag detail | grep -i  Dual
dual-primary detection :          Configured
Dual-primary detection delay    :                    10
Dual-primary action             :        errdisable-all

With this configured, if the peer-link should go down (you can’t shut down the peer-link with interface commands, so it would need to be a hard failure), whichever switch is secondary will take over as primary immediately but will then start dual-primary detection, which basically listens for heartbeats from the IP address configured in the heartbeat command. It does this only after the delay (if so configured). If dual-primary is detected, it will err-disable all interfaces. When and if the dual-primary state clears, normal MLAG operation should continue.
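
The decision logic in that paragraph can be written down as a small Python function. This is my reading of the behavior as described here, not EOS source code; the function name and the returned state names are hypothetical labels chosen for this sketch.

```python
def dual_primary_action(peer_link_up, heartbeat_alive, seconds_since_loss, delay=10):
    """Return what the former secondary does after the peer-link fails.

    The returned strings are labels invented for this sketch.
    """
    if peer_link_up:
        return "normal"
    if seconds_since_loss < delay:
        return "wait"            # detection has not started yet
    if heartbeat_alive:
        return "errdisable-all"  # the peer is alive, so both sides are primary
    return "stay-primary"        # no heartbeat: the peer really is gone


print(dual_primary_action(False, True, 3))    # wait
print(dual_primary_action(False, True, 12))   # errdisable-all
print(dual_primary_action(False, False, 12))  # stay-primary
```

The delay of 10 matches the value configured above. The point of the heartbeat is the last two cases: without it, the switch can't tell a dead peer from a dead peer-link.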

Bow Tie MLAG

What if you need to connect one MLAG pair to another MLAG pair (or a pair of Cisco switches using vPC, etc.)? Guess what? Wait for it…nothing changes. Well, you get to use the terrible phrase Bow Tie MLAG, so that’s something.

Remember, MLAG exists to trick LACP into working. MLAG does not need to be “compatible” with another vendor’s solution because the LACP implementation already works. Cisco’s vPC solution accomplishes much the same thing (though internally in very different ways), so all an Arista MLAG pair should see from vPC is LACP, and all a Cisco vPC pair should see from Arista is, again, LACP.

The two switches on the top (A and B) in Figure 18-10 are an MLAG pair, and the two switches on the bottom (C and D) are an MLAG pair. To connect them together as shown, each pair should have its own MLAG domain ID. Actually, that really doesn’t matter—they can be the same—which is contrary to what I wrote in the first edition. The MLAG domain is locally significant to the MLAG domain (it doesn’t leak out) unless you try to attach a third switch somehow, which isn’t allowed, anyway.

Multiple MLAG domain ID
Figure 18-10. Multiple MLAG domain ID

What you’ll find if you build this is that it will work if they all have the same MLAG domain. So why require an MLAG domain at all? To make sure that the two configured devices should really be peering. I don’t like having multiple pairs with the same MLAG domain name, but I’ve seen it more than once. Similarly, because the MLAG configuration is local to the peers, I’ve seen multiple MLAG pairs in an environment using the same IP addresses for the peer-links on each pair! I don’t recommend this, but it does seem to work, and I’ve seen many customers who have done this. If it were my network, I can tell you that you’d be fixing that, though. While it might work at L2, if you then migrate to an L3 dynamic environment and do something like redistribute-connected, you’ll get those IP addresses advertised from every pair.

MLAG In-Service Software Upgrade

MLAG In-Service Software Upgrade (ISSU) is a feature enabled on EOS version 4.9.3 and later, and at this point I really hope you’re using code that’s much later than 4.9.3. With MLAG ISSU, you can upgrade an MLAG switch pair with minimal (subsecond) packet loss and no STP reconvergence. Without MLAG ISSU or if you upgrade while ignoring the switch’s dire warnings regarding the state of MLAG ISSU, you’ll likely have one or more network topology changes that will result in one or more STP reconvergence events, and no one wants that.

The Arista documentation on MLAG ISSU indicates that the following steps need to be followed in this order to properly upgrade an MLAG ISSU switch pair:

  1. Verify primary/secondary state of MLAG on each switch using the show mlag detail command, or to be brief, the show mlag det | grep State (with a capital “S”) command.

  2. Ensure configuration consistencies.

  3. Resolve ISSU warnings (from the output of reload).

  4. Upgrade MLAG secondary switch.

  5. Monitor MLAG status using show mlag detail.

  6. Confirm MLAG secondary status.

  7. Upgrade MLAG primary peer switch.

  8. Confirm overall MLAG status.
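
If you script your upgrades, the ordering in these steps is worth encoding so that nobody upgrades the primary first by accident. Here's a minimal Python sketch; upgrade_order is a hypothetical helper, and it assumes that you've already collected each switch's State value from show mlag detail.

```python
def upgrade_order(state_by_switch):
    """Return switch names in upgrade order: MLAG secondary first, then primary."""
    roles = sorted(state_by_switch.values())
    if roles != ["primary", "secondary"]:
        # For example, both peers report primary while one is still rebooting.
        raise ValueError(f"expected one primary and one secondary, got {state_by_switch}")
    # False sorts before True, so the secondary comes out first.
    return sorted(state_by_switch, key=lambda name: state_by_switch[name] != "secondary")


print(upgrade_order({"Arista-A": "primary", "Arista-B": "secondary"}))
# ['Arista-B', 'Arista-A']
```

Refusing to continue when the roles aren't exactly one primary and one secondary is a design choice on my part. It forces you to go back to step 1 instead of guessing.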

Note

When upgrading chassis switch peers that contain dual supervisors, you’ll need to upgrade the standby supervisors on both switches, then upgrade the active supervisor on the MLAG secondary, and finally upgrade the last remaining supervisor.

By having switches running MLAG ISSU code, the switches will know whether they can be upgraded without causing an outage. If they cannot, the switch will give you a warning when rebooting. Here’s an example of such a warning on a switch running 4.21.1F:

Arista-A#reload
If you are performing an upgrade, and the Release Notes for the new
version of EOS indicate that MLAG is not backwards-compatible with the
currently installed version (4.21.1F), the upgrade will result in
packet loss.

Stp is not restartable. Topology changes will occur during the upgrade
process.

The following MLAGs are not in Active mode. Traffic to or from these
ports will be lost during the upgrade process:
                                                                local/remote
   mlag     desc                      state     local     remote      status
---------- ---------------- ---------------- --------- ---------- ----------
      1     [ Arista-C ]     active-partial       Po1        Po1     up/down


The configured reload delay of 120 seconds is below the recommended
value of 300 seconds. A longer reload delay allows more time to
rollback an unsuccessful upgrade due to incompatibility.
System configuration has been modified. Save? [yes/no/cancel/diff]:

As I often joke in my classes, network engineers seem genetically predisposed to being incapable of reading walls of text. If you see a bunch of text like this after typing reload, read the damn screen!

Warning

Using the reload now command will cause the switch to bypass these warnings, so don’t use the reload now command when doing an MLAG ISSU upgrade. This is not meant to be a trick to avoid walls of text, even if that’s why a bunch of us do it.

Here’s a list of common ISSU warnings and the ways to resolve them.

Compatibility check

The version to which you’re upgrading might not be compatible with the version you’re on. But then again, it might! Read the release notes to make sure that it is.

STP is not restartable

Usually waiting 30 to 120 seconds will reward you with this warning resolving itself. To see the status of STP restartability (I totally made that word up), use the show spanning-tree bridge detail command:

Arista-A#sho spanning-tree bridge detail | inc agent
   Stp agent restartable                      :         True

Active-partial MLAG warning

The MLAG shown is not active on the other switch in the MLAG pair. If it should be, bring it up. This is a warning that you’ll end up black-holing a device if you continue the reload, so make sure that this is what you’re expecting.

Reload delay too low

Remember the reload delay we talked about earlier in this chapter? Well, if the switch thinks that it’s too low (lower than the default of 300 seconds for top-of-rack switches and 600 seconds for modular switches), it will bark at you with this warning.

Peer has errdisabled interfaces

This is usually an indication that you’re impatient and haven’t waited long enough for the peer to reboot. Remember, the peer’s MLAG-enabled interfaces will stay in an errdisabled state for the duration of the reload delay after booting, assuming the other switch is up, and if you’re on a switch that shows this warning, that’s a good assumption.
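
For those of us who really can't read walls of text, here's a Python sketch that matches the reload output against the warnings just described and prints the remedy for each. The marker strings are taken from the 4.21.1F example shown earlier and might differ in other releases, so treat them as assumptions. The peer errdisabled warning is left out because its exact wording isn't shown in this chapter. The triage function and REMEDIES table are hypothetical names.

```python
# Marker text (from the 4.21.1F example above) mapped to the remedy in this chapter.
REMEDIES = {
    "MLAG is not backwards-compatible": "read the release notes for the target version",
    "Stp is not restartable": "wait, then recheck show spanning-tree bridge detail",
    "MLAGs are not in Active mode": "bring up the MLAG on the peer, or expect traffic loss",
    "configured reload delay": "raise reload-delay to at least the recommended value",
}

RELOAD_OUTPUT = """\
If you are performing an upgrade, and the Release Notes for the new
version of EOS indicate that MLAG is not backwards-compatible with the
currently installed version (4.21.1F), the upgrade will result in
packet loss.

Stp is not restartable. Topology changes will occur during the upgrade
process.

The following MLAGs are not in Active mode. Traffic to or from these
ports will be lost during the upgrade process:

The configured reload delay of 120 seconds is below the recommended
value of 300 seconds.
"""


def triage(reload_output):
    """Return the remedies for every known warning found in the reload output."""
    flat = " ".join(reload_output.split())  # undo the CLI line wrapping
    return [remedy for marker, remedy in REMEDIES.items() if marker in flat]


for remedy in triage(RELOAD_OUTPUT):
    print("-", remedy)
```

This doesn't replace reading the screen. It just makes it harder to miss one of the four.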

The biggest step you should take before considering an MLAG ISSU upgrade is to carefully read the release notes and Transfer of Information (TOI) documents found on the Arista support site. You can find them alongside the EOS binary images. Don’t be afraid to call or email your Arista sales engineer or open a TAC case either. Some shops don’t do upgrades often enough to remain sharp on the syntax and gotchas, and these folks love to help.

Layer 3 with MLAG

For an example of using Layer 3 with MLAG, check out Chapter 21, which builds an L3 Equal-Cost MultiPathing (ECMP) network including VXLAN terminating on an MLAG pair.

Spanning Tree and MLAG

When MLAG is configured, one of the switches in the MLAG cluster will become the primary switch. The MLAG primary switch will do all of the STP processing, and changes to the secondary will have no effect. There is a pretty big caveat to that statement, though, and that is that changes made to the secondary MLAG switch’s STP configuration will be accepted to the running-config, but they will not take effect unless, that is, the primary MLAG switch relinquishes its role as primary, at which point all of the commands entered on the secondary (now primary) switch will suddenly become active. What’s worse, you might not see this coming. Allow me to demonstrate.

I have two switches, Arista-A and Arista-B, configured as an MLAG pair. I have STP left to defaults, and Arista-A is the primary switch in the MLAG domain. I’ll be working on Arista-B, so here’s proof that it’s the MLAG secondary switch:

Arista-B(config)#sho mlag detail | grep State
State                           :           secondary
Peer State                      :             primary
State changes                   :                   2

And here’s the Spanning Tree status:

Arista-B(config)#sho spanning-tree
MST0
  Spanning tree enabled protocol mstp
  Root ID    Priority    32768
             Address     2899.3a06.6769
             Cost        0 (Ext) 5999 (Int)
             Port        100 (Port-Channel1)
             Hello Time  2.000 sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    32768  (priority 32768 sys-id-ext 0)
             Address     2a99.3a06.6e0f
             Hello Time  2.000 sec  Max Age 20 sec  Forward Delay 15 sec

Interface        Role       State      Cost      Prio.Nbr Type
---------------- ---------- ---------- --------- -------- --------------------
Et1              designated forwarding 20000     128.247  P2p Edge                      
Et34             alternate  discarding 2000      128.234  P2p                          
PEt1             designated forwarding 20000     128.1    P2p Edge                      
PEt34            alternate  discarding 2000      128.34   P2p                          
Po1              root       forwarding 1999      128.100  P2p

Now, I’ll go into that switch (Arista-B) and start mucking with STP. I want to make the priority lower to force it to be the root:

Arista-B(config)#spanning-tree root primary

When I make this change, nothing happens:

Arista-B(config)#sho spanning-tree | grep Priority
  Root ID    Priority    32768
  Bridge ID  Priority    32768  (priority 32768 sys-id-ext 0)

Frustrated because my change has no effect, I decide to hardcode the priority even lower:

Arista-B(config)#spanning-tree priority 4096

Huh–still no change:

Arista-B(config)#sho spanning-tree | grep Priority
  Root ID    Priority    32768
  Bridge ID  Priority    32768  (priority 32768 sys-id-ext 0)

Beyond frustrated, I start to drink heavily because nothing makes a network change go more smoothly than alcohol.

If I hardcoded the priority to primary (8192) and then 4096, why didn’t it show my change? Disgusted and impatient, I rebooted the other switch, because that was so much easier than reading the documentation. Imagine, though, that instead of me rebooting a switch in a lab that these switches were in production, and after my changes didn’t work, I gave up and walked away. You know, because that’s what happens in real data centers. Anyway, for whatever reason, maybe months later, Arista-A (the primary MLAG switch) reboots. I’ll simulate this with a hard reload of Arista-A:

Arista-A(config)#reload now

Broadcast message from root@Arista-A (Sat Jan 26 21:20:42 2019):

The system is going down for reboot NOW!

All of a sudden and without any real warning, Arista-B is now the root bridge with a priority of 4096:

Arista-B(config)#sho spanning-tree | grep Priority
  Root ID    Priority     4096
  Bridge ID  Priority     4096  (priority 4096 sys-id-ext 0)

This happened because Arista-B is now the MLAG primary, as evidenced by the output of show mlag detail | grep State:

Arista-B(config)#sho mlag det | grep State
State                           :             primary
Peer State                      :             primary
State changes                   :                   3

Warning

The fact that this happens like this is not really a problem; it is functioning by design. The problem is that when configuring STP on the secondary MLAG switch, there are no warnings that your changes are being saved, and no warnings that any changes made will take effect when and if this switch becomes the primary. Be very careful about making changes to STP when configuring the MLAG secondary switch.

This behavior was recorded on switches running EOS 4.21.1F. When I told Arista about it some six years ago, developers there told me that “the configuration should be the same on both peers.” Um...thanks.

To be fair, the developers have since added the show mlag config-sanity command, and had I followed my own advice from earlier in the chapter and issued that command at the end of my change control instead of walking away and not backing out my changes (honestly, I would probably have fired myself if I’d done that), the switch would have told me that I was in danger. Or would it?

Sadly, this is one of the few things that show mlag config-sanity does not check. I asked the developers about this, and they said that it was by design without any further explanation. Here’s proof of the fact that it’s not included. First, here’s the relevant configuration from Arista-A:

Arista-A#sho run section span
spanning-tree mode mstp
no spanning-tree vlan 4094

And here is Arista-B’s relevant configuration:

Arista-B#sho run section span
spanning-tree mode mstp
no spanning-tree vlan 4094
spanning-tree mst 0 priority 4096

As you can see, Arista-B has a Spanning Tree priority of 4096, which is a big change from the default of 32768 on Arista-A. Here’s what show mlag config-sanity says on Arista-A:

Arista-A#sho mlag config-sanity 
No global configuration inconsistencies found.

No per interface configuration inconsistencies found.

Here’s what show mlag config-sanity says on Arista-B:

Arista-B#sho mlag config-sanity 
No global configuration inconsistencies found.

No per interface configuration inconsistencies found.

The lesson to learn here is that the configurations should be the same on both peer switches, and you should always make sure that’s the case both with the show mlag config-sanity command and something like the sho run section span (or similar) command.
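
Because show mlag config-sanity won't catch this, a tiny script can do the comparison for you. The following Python sketch diffs the Spanning Tree sections of the two peers. The stp_config_diff name is hypothetical, and the sketch assumes that you've already captured the output of sho run section span from each switch as text (how you collect it is up to you).

```python
def stp_config_diff(local, peer):
    """Return (lines only on local, lines only on peer) for two config sections."""
    local_lines = {line.strip() for line in local.splitlines() if line.strip()}
    peer_lines = {line.strip() for line in peer.splitlines() if line.strip()}
    return sorted(local_lines - peer_lines), sorted(peer_lines - local_lines)


arista_a = """\
spanning-tree mode mstp
no spanning-tree vlan 4094
"""

arista_b = """\
spanning-tree mode mstp
no spanning-tree vlan 4094
spanning-tree mst 0 priority 4096
"""

only_a, only_b = stp_config_diff(arista_a, arista_b)
print(only_a)  # []
print(only_b)  # ['spanning-tree mst 0 priority 4096']
```

Run against the two configurations shown above, it points straight at the priority line that would have bitten me.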

Conclusion

One last note, because this comes up a lot: no, you should not disable STP if you’re using MLAG (or any vendor’s MLAG-like technology). Ask any networking consultant whether they’ve heard of a Spanning Tree event being caused by someone bringing in a home office switch and connecting it where it didn’t belong. I know I’ve seen that more than once. Hell, I had a client who refused to run more than two Ethernet runs to each cube, insisting that should anyone need more ports, they could just bring in a switch from home. This is an outage waiting to happen, and STP is the last line of defense against the loop-inducing server guy who needs 14 ports on his desk. Do yourself a favor and outlaw switches on (or under) desks. And keep STP running, because when you outlaw desktop switches, only outlaws will have desktop switches...or something.

MLAG works great if you’re in need of a multihomed L2 design. I’ve taught people who favor end-to-end L3 designs who seem to get angry that MLAG exists, which always kind of amuses me. Arista is not forcing anyone to use any one design over another. If you need an L2 solution, MLAG is great. If you need L3, go for it.
