Detailed Description of VL Arbitration

The Problem

At a given moment, a port may have a number of packets ready to transmit:

  • Data packets may be queued up in one or more of the data VL transmit buffers (VL[14:0]).

  • The port's SMI may need to transmit an SMP:

    - The SM may live behind this port's SMI and it needs to send an SMP request to another device in the fabric.

    - This port may need to send an SMP response or a SubnTrap(Notice) back to the SM.

  • One or more of the data VL receive buffers may need to send link-level Flow Control Packets (FCPs) to their respective data VL transmit buffers on the other end of the link.

Even in a port that implements only a single data VL (VL0), the question remains: in what order do these entities (the data VL, the SMP VL15, and the Flow Control logic) get to use the link transmitter, and how many packets may each one transmit on its turn before yielding to the next entity?

An arbitration scheme is therefore required. The following two sections define the arbitration scheme:

  • On a port with a single data VL (VL0)

  • On a port with multiple data VLs (2, 4, 8 or 15 VLs)

Arbitration on a Port with Single Data VL

In this case, the arbitration is as follows:

  1. SMPs. The transmission of SMPs from the VL15 transmit buffer has priority over the transmission of FCPs and data packets.

  2. FCPs. The transmission of FCPs has lower priority than SMPs, but higher priority than data packets.

  3. Data packets. The transmission of data packets from data VL0 has the lowest priority.

Arbitration on a Port with Multiple Data VLs

Overall Arbitration Scheme

In this case, the overall priority scheme is as follows:

  1. SMPs. VL15 (SMP MAD packets) has the highest priority.

  2. FCPs. FCPs have second-highest priority.

  3. Data VLs. The algorithm described in the next two sections (“The Arbitration Elements” on this page and “Data VL Arbitration Description” on page 633) describes arbitration when multiple data VLs have data packets to transmit.

The scheme is preemptive:

  • Whenever the port has FCPs to transmit, they take priority over data packets.

  • Whenever the port has SMPs to transmit, they take precedence over FCPs and data packets.
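
The strict ordering described above can be sketched as a selector that is re-evaluated before every packet. This is only an illustration of the priority rule, not a description of how any particular port is built; the queue names and the helpers `next_source` and `drain` are hypothetical.

```python
from collections import deque


def next_source(smp_queue, fcp_queue, data_queue):
    """Return which queue transmits next, or None if nothing is pending."""
    if smp_queue:      # VL15 SMPs take precedence over everything else
        return "SMP"
    if fcp_queue:      # FCPs take priority over data packets
        return "FCP"
    if data_queue:     # data packets have the lowest priority
        return "DATA"
    return None


def drain(smp_queue, fcp_queue, data_queue, arrivals=None):
    """Send one packet per step and return the transmit order.

    arrivals maps a step number to (queue_name, packet) and models a packet
    that becomes ready while the link is busy. Arrivals scheduled after all
    queues have emptied are ignored in this simple sketch.
    """
    queues = {"SMP": smp_queue, "FCP": fcp_queue, "DATA": data_queue}
    arrivals = arrivals or {}
    order = []
    step = 0
    while True:
        if step in arrivals:
            name, packet = arrivals[step]
            queues[name].append(packet)
        source = next_source(smp_queue, fcp_queue, data_queue)
        if source is None:
            return order
        order.append(queues[source].popleft())
        step += 1


order = drain(
    deque(),                      # no SMP pending at the start
    deque(["fcp0"]),              # one FCP pending
    deque(["d0", "d1", "d2"]),    # three data packets pending
    arrivals={2: ("SMP", "smp0")},
)
print(order)
```

Because the selector runs between packets, the SMP that arrives at step 2 is sent ahead of the remaining data packets, while the packet already sent is unaffected.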

The Arbitration Elements

The elements involved in arbitrating among multiple data VLs are:

  • The port's VLArbitrationTable attribute. This table is sub-divided into two subtables, each of which is divided into an upper and a lower half (each containing 64 entries; see Figure 25-7 on page 631):

    - High-Priority Table. This table defines the data VLs in the high-priority rotation and the order in which these VLs are serviced. Each entry (see Figure 25-8 on page 632) identifies a VL transmit buffer and also defines the amount of data (referred to as the VL weight; it is specified as a number of 64-byte blocks) that can be sent from that VL's transmit buffer each time its turn comes up.

    - The minimum number of valid High-Priority Table entries is one (an entry is valid if its weight value is non-zero).

    - The maximum number of valid High-Priority Table entries is 64.

    - The actual size of the High-Priority Table is indicated in the PortInfo.VLArbitrationHighCap attribute element.

    Figure 25-8. VL Table Entry Format

    - Low-Priority Table. This table defines the data VLs in the low-priority rotation and the order in which these VLs are serviced. It also defines the amount of data (referred to as the VL weight; it is specified as a number of 64-byte blocks) that can be sent from a VL's transmit buffer each time its turn comes up:

    Figure 25-7. Common Format of High- and Low-Priority VL Arbitration Tables

    - The minimum number of valid Low-Priority Table entries is equal to the number of data VLs the port implements.

    - The maximum number of valid Low-Priority Table entries is 64.

    - The actual size of the Low-Priority Table is indicated in the PortInfo.VLArbitrationLowCap attribute element.

  • VLHighLimit attribute. Refer to Figure 25-9 on page 632. This 8-bit, read/write attribute defines the total amount of data that can be transferred from the VLs listed in the High-Priority Table before switching to and servicing the next VL listed in the Low-Priority Table:

    - 00h: only one packet can be sent.

    - 01h-FEh: Value X 4KB (4KB-to-1016KB) can be sent.

    - FFh: no limit.

    - As long as the byte count has not exceeded the limit, additional high-priority packets can be sent.

    Figure 25-9. High-Priority Limit

  • High-Priority Transfer Counter (HighPriCounter). This strictly internal counter (a device-specific register, not an addressable attribute) tracks how many bytes have been transferred by the high-priority data VLs. Note that the count changes in 4-byte (dword) increments.
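
The units used by these elements can be made concrete with a small model. The sketch below is not a register layout: the class `VLArbEntry` and the helper functions are hypothetical names, and the 0-255 range of the Weight field is an assumption. The 64-byte weight unit, the 4KB VLHighLimit unit, the special values 00h and FFh, and the dword-granular counter come from the text above.

```python
from dataclasses import dataclass

BLOCK_BYTES = 64          # one unit of VL weight
LIMIT_UNIT_BYTES = 4096   # one unit of VLHighLimit
DWORD_BYTES = 4           # granularity of the HighPriCounter


@dataclass
class VLArbEntry:
    vl: int       # data VL whose transmit buffer this entry selects
    weight: int   # amount of data per turn, in 64-byte blocks (assumed 0-255)

    def is_valid(self):
        # An entry is valid only if its weight is non-zero.
        return self.weight != 0

    def weight_bytes(self):
        return self.weight * BLOCK_BYTES


def high_limit_bytes(vl_high_limit):
    """Decode VLHighLimit into bytes.

    Returns None for FFh (no limit). Returns 0 for 00h, which in this model
    means the counter is exhausted by the first packet, so only one packet
    is sent.
    """
    if not 0 <= vl_high_limit <= 0xFF:
        raise ValueError("VLHighLimit is an 8-bit value")
    if vl_high_limit == 0xFF:
        return None
    return vl_high_limit * LIMIT_UNIT_BYTES


def high_limit_dwords(vl_high_limit):
    """Initial HighPriCounter value in dwords, or None if unlimited."""
    limit = high_limit_bytes(vl_high_limit)
    return None if limit is None else limit // DWORD_BYTES


print(VLArbEntry(vl=6, weight=127).weight_bytes())
print(high_limit_bytes(4), high_limit_dwords(4))
```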

Data VL Arbitration Description

Assumptions

During the following discussions, refer to Figure 25-10 on page 636. This discussion assumes the following:

- Both tables were programmed (by the SM) as indicated in the illustration.

- No data packets have been transmitted since the tables were set up.

- All data packets to be transmitted have 4KB payloads (remember that the PMTU for the path a message's packets are to traverse can be 4096, 2048, 1024, 512, or 256 bytes).

- The VLHighLimit attribute has been programmed (by the SM) with the value four, indicating that 16,384 bytes (4 X 4KB) may be transmitted from the VLs in the High-Priority Table before the arbiter must switch to the Low-Priority Table and service its next entry.

- Because the VLHighLimit has been initialized to 16KB, the HighPriCounter register is initialized to a count of 4096 dwords (16KB). This counter counts down by the number of dwords transmitted by the VLs listed in the High-Priority Table.

- The High-Priority Table pointer and the Low-Priority Table pointer are each currently pointing at the first entry in their respective tables.

- The VL15 SMP transmit buffer is currently empty and none of the data VL receive buffers are ready to transmit FCPs to the respective data VL transmit buffers on the opposite end of the link.

- It is currently the High-Priority Table's turn to be serviced.

- All of the data VLs listed in the two tables currently have multiple packets to be transmitted.

- All of the VL transmit buffers listed in the two tables have received sufficient Flow Control credits from their respective remote receive buffers to transmit all of the packets they have buffered up awaiting transmit.

Figure 25-10. Example Data VL Arbitration Scenario


Conditions Necessary to Transmit from a High-Priority VL

In order to transmit one or more packets from the currently active entry in the High-Priority Table, the following conditions must all be true:

- The weight remaining for the currently active list entry is still positive.

- The HighPriCounter has not gone negative (a count of exactly zero still permits a packet to be sent).

- There is a packet available in the VL identified in the entry.

- Sufficient Flow Control credits have been received from the remote port's corresponding data VL receive buffer to transmit at least one packet. Note that each credit received indicates 64 bytes of buffer space available in the remote port's corresponding data VL receive buffer.
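
These conditions can be expressed as a single predicate. The following is a hedged sketch: the function name and parameters are hypothetical, and it treats a counter value of exactly zero as still permitting a packet, which is how the worked example later in this section behaves.

```python
CREDIT_BYTES = 64  # each Flow Control credit represents 64 bytes of remote buffer space


def can_transmit_high(weight_remaining, high_pri_counter, packet_bytes, credits):
    """Return True if the active High-Priority Table entry may send a packet.

    weight_remaining: weight left for the active entry, in 64-byte blocks
    high_pri_counter: current HighPriCounter value, in dwords
    packet_bytes:     size of the next queued packet, or None if the VL is empty
    credits:          credits received from the remote VL receive buffer
    """
    if weight_remaining <= 0:
        return False                 # the entry has used up its weight
    if high_pri_counter < 0:
        return False                 # the high-priority limit has been exceeded
    if packet_bytes is None:
        return False                 # nothing queued in this VL
    # Round up: a partial 64-byte block still consumes a whole credit.
    credits_needed = -(-packet_bytes // CREDIT_BYTES)
    return credits >= credits_needed


print(can_transmit_high(127, 4096, 4096, 64))
print(can_transmit_high(63, 1024, 4096, 63))
```

The second call fails only on the credit check: a 4096-byte packet needs 64 credits, and 63 are available.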

High-Priority Table Operation

In the example, the first entry in the High-Priority Table is currently active. This entry indicates that it's VL6's turn. The following actions are taken:

VL6 transmits two 4KB packets:

  1. The VL6 transmit buffer has multiple packets to transfer, each with a payload of 4KB. It also has sufficient credits to transfer all of the packets it has buffered up. The entry has been given a weight value of 127d (127 X 64 = 8128 bytes).

  2. VL6 transmits a 4KB packet and subtracts 64 from its Weight value (4096 bytes = 64 blocks of 64 bytes each have been transferred). Its new weight value is therefore 63 (127 – 64).

  3. 1024 (the size of the packet in dwords as defined by LRH:PktLen field) is deducted from the current value of the HighPriCounter, yielding 3072. As long as the value doesn't go negative, packets can be transferred by the VLs listed in the High-Priority Table.

  4. Since the HighPriCount is not negative, the arbiter continues to service the High-Priority Table entries.

  5. Since the current entry's Weight has not yet been exhausted, VL6 can transmit another packet.

  6. VL6 transmits another 4KB packet and subtracts 64 from its Weight value. Its new weight value is therefore 63 – 64 = –1. It has used up its Weight value so VL6 must stop transmitting packets.

  7. 1024 (the size of the packet in dwords as defined by the LRH:PktLen field) is deducted from the current value of the HighPriCounter (3072), yielding 2048. As long as the value doesn't go negative, packets can be transferred by the VLs listed in the High-Priority Table.

VL1 transmits one 4KB packet:

  1. Since the HighPriCount value is not negative (currently = 2048), the arbiter advances to the next entry in the High-Priority Table. This entry indicates that it's VL1's turn and the Weight (63) indicates that it can transmit up to 63 X 64 = 4032 bytes (assuming that it has a packet to transmit and has sufficient credits to do so).

  2. VL1 transmits a 4KB packet and subtracts 64 from its Weight value. Its new weight value is therefore –1 (63 – 64).

  3. 1024 (the size of the packet in dwords as defined by the LRH:PktLen field) is deducted from the current value of the HighPriCounter (2048), yielding 1024. As long as the value doesn't go negative, packets can be transferred by the VLs listed in the High-Priority Table.

VL7 transmits one 4KB packet:

  1. Since the HighPriCount value is not negative (currently = 1024), the arbiter advances to the next entry in the High-Priority Table. This entry indicates that it's VL7's turn and the Weight (254) indicates that it can transmit up to 254 X 64 = 16,256 bytes (assuming that it has a packet to transmit and has sufficient credits to do so).

  2. VL7 transmits a 4KB packet and subtracts 64 from its Weight value. Its new weight value is therefore 190 (254 – 64).

  3. 1024 (the size of the packet in dwords as defined by the LRH:PktLen field) is deducted from the current value of the HighPriCounter (1024), yielding 0. As long as the value doesn't go negative, another packet can be transmitted by a VL listed in the High-Priority Table.

VL7 transmits another 4KB packet:

  1. Since the current entry's Weight has not yet been exhausted (it's 190), VL7 can transmit another packet.

  2. VL7 transmits a 4KB packet and subtracts 64 from its Weight value. Its new weight value is therefore 126 (190 – 64).

  3. 1024 (the size of the packet in dwords as defined by the LRH:PktLen field) is deducted from the current value of the HighPriCounter (0), yielding a negative value. When the value goes negative, no more packets can be transferred by the VLs listed in the High-Priority Table. The HighPriCount is reinitialized to the value from the VLHighLimit attribute and the arbiter switches to the currently active entry in the Low-Priority Table (the example continues in the next section).

Note that the currently active High-Priority Table entry for VL7 still has some Weight left (126), so the arbiter will resume with this entry when control switches back to the High-Priority Table.
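
The arithmetic in this walk-through can be checked with a short simulation. The sketch below assumes what the example assumes: every packet is counted as 4096 bytes, every VL always has a packet and enough credits, and at least one entry has a non-zero weight. Only the three entries named in the text (VL6, VL1, VL7) are modeled, and the function name is hypothetical.

```python
PACKET_BYTES = 4096
BLOCK_BYTES = 64
DWORD_BYTES = 4


def run_high_priority(entries, counter_dwords):
    """Service (vl, weight) entries until the HighPriCounter goes negative.

    Returns (trace, index, weight), where trace holds one
    (vl, weight_after, counter_after) tuple per packet sent, and index and
    weight identify the entry that will resume on the next high-priority turn.
    """
    trace = []
    index = 0
    weight = entries[0][1]
    counter = counter_dwords
    while True:
        if weight <= 0:
            # Weight exhausted: advance to the next entry and reload its weight.
            index = (index + 1) % len(entries)
            weight = entries[index][1]
            continue
        vl = entries[index][0]
        weight -= PACKET_BYTES // BLOCK_BYTES    # 64 blocks per 4KB packet
        counter -= PACKET_BYTES // DWORD_BYTES   # 1024 dwords per 4KB packet
        trace.append((vl, weight, counter))
        if counter < 0:
            # Limit exceeded: the arbiter now switches to the Low-Priority Table.
            return trace, index, weight


trace, index, weight = run_high_priority([(6, 127), (1, 63), (7, 254)], 4096)
for vl, w, c in trace:
    print(f"VL{vl}: weight={w} HighPriCounter={c}")
```

The trace shows two packets from VL6, one from VL1, and two from VL7, with the VL7 entry left holding a weight of 126.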

Low-Priority Table Operation

The arbiter has now switched to the currently active entry in the Low-Priority Table. It indicates that it is VL3's turn and it has a Weight of two (128 bytes). Since this is a positive number, a packet can be transmitted.

  1. Because the Weight (2) is a positive number, VL3 can transmit a packet (assuming that it has a packet to transmit and has sufficient credits to do so). As in the high-priority example, the Weight only has to be positive when the packet starts; the packet may be larger than the remaining Weight.

  2. VL3 transmits a 4KB packet and subtracts 64 from its Weight value. Its new weight value is therefore –62 (2 – 64), so VL3's turn is over.

  3. The pointer for the Low-Priority Table advances to the next entry.

  4. Control is passed back to the High-Priority Table (to the entry pointed to by the pointer).

Specification Is Ambiguous

Please note that it is the author's opinion that the specification is ambiguous with regard to how much data can be transferred when the arbiter switches to the Low-Priority Table. The specification states:

“If the High-Priority table does not have an available packet for transmission (as defined above), or if the HighPriCounter has expired, then the HighPriCounter shall be reset, the Low-Priority table is said to be active and a packet may be sent from the Low-Priority table.”

This could be construed as meaning that only a single packet can be transmitted from the VL buffer indicated by the currently active Low-Priority Table entry before the arbiter switches back to the other table.

It is the author's opinion, however, that the currently active entry can transmit packets until it exhausts its Weight before the arbiter switches back to the High-Priority Table.
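
The difference between the two readings can be illustrated with a small sketch. The function and its `single_packet_only` flag are hypothetical, and the weight of 200 in the second pair of calls is an invented value chosen only to make the two readings diverge; it does not come from the example tables.

```python
BLOCK_BYTES = 64


def low_priority_turn(weight, packet_bytes, queued_packets, single_packet_only):
    """Return (packets_sent, weight_left) for one Low-Priority Table turn.

    single_packet_only=True models the reading that one packet is sent before
    control returns to the High-Priority Table; False models the reading that
    the entry keeps sending until its weight is exhausted.
    """
    blocks_per_packet = -(-packet_bytes // BLOCK_BYTES)  # round up to whole blocks
    sent = 0
    while queued_packets > 0 and weight > 0:
        weight -= blocks_per_packet
        queued_packets -= 1
        sent += 1
        if single_packet_only:
            break
    return sent, weight


# The example entry (VL3, weight 2): both readings send exactly one packet.
print(low_priority_turn(2, 4096, 5, True), low_priority_turn(2, 4096, 5, False))

# A hypothetical entry with weight 200: the readings now differ.
print(low_priority_turn(200, 4096, 5, True), low_priority_turn(200, 4096, 5, False))
```

With the weight of 2 used in the example, the ambiguity has no visible effect. It matters only when an entry's weight covers more than one packet.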

Additional Information on Multiple Data VL Arbitration

  • If a list entry is programmed for:

    - VL15, or

    - a VL that is not supported by this port's Link Layer, or

    - a VL implemented on the port but not enabled by software, then the port may either skip that entry or send from any other VL supported by the port.

  • The same data VL may be listed multiple times in the High- or Low-Priority Tables and may be listed in both tables.

  • Each enabled data VL should be listed in at least one of the tables. There is, however, no requirement for a device to check for this case.

  • Should an enabled data VL not appear in either table, packets for that data VL may be dropped, may be sent when the arbiter has no packets to send for other data VLs, or may never be sent.

  • If the VLHighLimit attribute is set to 255d, the High-Priority Table's bandwidth is unbounded. Note, however, that forward progress of the VLs listed in the Low-Priority Table is not guaranteed in this case.

  • A VLHighLimit value of zero indicates that only a single packet from the High-Priority Table may be sent before an opportunity is given to the Low-Priority Table.

  • The VLArbitrationTable may be modified while the port is actively accepting and transmitting packets. This modification must not result in fragmentation of any packet that is in transit. It should be noted, however, that the arbitration rules may be violated during this change.

  • When a CA, router, or switch is initialized, the VLArbitrationTable is not required to be initialized (i.e., its contents are undefined). The table should be initialized by the SM prior to use by data traffic.

  • An entry with a Weight of zero is a null entry and is skipped.
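
Since a device is not required to detect an enabled data VL that is missing from both tables, management software might run a check of its own before programming a port. The sketch below is one hypothetical way to do that. It models skipping as the port's response to VL15, unsupported, or disabled entries (the text also permits sending from another VL instead), and it assumes that a zero-weight entry does not count as listing its VL. All names and table contents are illustrative.

```python
def active_entries(table, enabled_vls):
    """Return the (vl, weight) entries an arbiter that skips would actually service."""
    return [
        (vl, weight)
        for vl, weight in table
        if weight != 0          # null entries are skipped
        and vl != 15            # VL15 is never arbitrated through these tables
        and vl in enabled_vls   # unsupported or disabled VLs may be skipped
    ]


def unlisted_vls(high_table, low_table, enabled_vls):
    """Return enabled data VLs that have no non-null entry in either table."""
    listed = {vl for vl, weight in high_table + low_table if weight != 0}
    return sorted(set(enabled_vls) - listed)


# Hypothetical table contents, for illustration only.
high = [(6, 127), (1, 63), (7, 254), (15, 10), (2, 0)]
low = [(3, 2), (9, 5)]
enabled = {0, 1, 3, 6, 7}

print(active_entries(high, enabled))
print(active_entries(low, enabled))
print(unlisted_vls(high, low, enabled))
```

Here VL0 is enabled but appears in neither table, so its packets might be dropped, delayed, or never sent.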
