Automatic Hardware Trigger of APM

The Causes

When both the local and remote QP or EEC are in the Arm state, they are primed for a path migration. As indicated earlier, a path migration is triggered when either of the following occurs:

  • Local software commands (via the Modify QP or Modify EEC verb) the local QP or EEC to transition from the Arm state to the Migrated state or

  • The QP or EEC experiences Retry Count Exhaustion.

An automatic path migration is triggered by Retry Count exhaustion, which can result from any of the following circumstances:

  • Repetitive PSN Sequence Error Naks. If either QP's or EEC's Send Logic receives a PSN Sequence Error Nak, it decrements its Retry Count. If the count is not exhausted, it retries the transmission of that request packet. If it receives repetitive PSN Sequence Error Naks for the same request and exhausts its Retry Count, and APM is enabled (i.e., the QP or EEC is in the Arm state), this triggers a migration. See “The Migration” on page 585 for a description of the migration process.

  • Repetitive Transport Timer timeouts. If either QP's or EEC's Send Logic experiences a Transport Timer timeout (due to an expected response not arriving), it decrements its Retry Count. If the count is not exhausted, it retries the transmission of the corresponding request packet. If it experiences repetitive Transport Timer timeouts for the same request and exhausts its Retry Count, and APM is enabled (i.e., the QP or EEC is in the Arm state), this triggers a migration. See “The Migration” on page 585 for a description of the migration process.

  • Repetitive missing RDMA Read response or Atomic response. If either QP's or EEC's Send Logic detects a missing RDMA Read or Atomic response packet (referred to as an implied Sequence Error Nak), it decrements its Retry Count. If the count is not exhausted, it retries the transmission of that RDMA Read or Atomic request packet. If it experiences repetitive implied Sequence Error Naks for the same request and exhausts its Retry Count, and APM is enabled (i.e., the QP or EEC is in the Arm state), this triggers a migration. See “The Migration” on page 585 for a description of the migration process.
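The three circumstances above share the same countdown, which can be sketched as follows. This is a minimal illustration, not an implementation of any real CA: the names SendLogic, RetryCause, MigState, and on_retry_event are hypothetical, and "exhausted" is assumed to mean that the Retry Count has been decremented to zero.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class MigState(Enum):
    ARM = "Arm"
    MIGRATED = "Migrated"


class RetryCause(Enum):
    PSN_SEQUENCE_ERROR_NAK = "PSN Sequence Error Nak"
    TRANSPORT_TIMER_TIMEOUT = "Transport Timer timeout"
    IMPLIED_SEQUENCE_ERROR_NAK = "implied Sequence Error Nak"


@dataclass
class SendLogic:
    """Hypothetical model of a QP's or EEC's Send Logic retry handling."""
    initial_retry_count: int
    mig_state: MigState = MigState.ARM
    last_cause: Optional[RetryCause] = None
    retry_count: int = field(init=False)

    def __post_init__(self) -> None:
        self.retry_count = self.initial_retry_count

    def on_retry_event(self, cause: RetryCause) -> str:
        """Handle one retry-causing event for the current request.

        Returns "retry" (resend the request), "migrate" (count exhausted
        while in the Arm state), or "error" (count exhausted, APM not armed).
        """
        # All three causes are treated identically: each one costs a retry.
        self.last_cause = cause
        self.retry_count -= 1
        if self.retry_count > 0:  # assumption: exhausted means zero
            return "retry"
        if self.mig_state is MigState.ARM:
            return "migrate"
        return "error"
```

In this sketch, a Send Logic created with an initial count of 3 answers "retry" to the first two events and "migrate" to the third, provided it is still in the Arm state. The same sequence on a QP or EEC that is not armed ends in "error" instead.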

The Migration

When a migration is triggered (either by Retry Count exhaustion or by a software command transitioning a QP or EEC from the Arm to the Migrated state), the following actions are taken:

  1. The triggered QP or EEC transitions from the Arm state to the Migrated state.

  2. That QP or EEC copies the alternate path information in its context into its primary path variables.

  3. The next packet transmitted by that QP or EEC has BTH:MigReq = 1 and is transmitted over the new path.

  4. If the CA in which the QP or EEC resides is an HCA, the migration event causes the Asynchronous Event Handler to be called (typically via an interrupt) to report the path migration.

  5. The QP or EEC reloads its Retry Count with its initial value and begins retrying the request in question again. If it once again exhausts its Retry Count, the QP or EEC takes the actions described in one of the following:

    - For a PSN Sequence Error Nak, see “On Exhaustion” on page 402.

    - For a Transport Timer timeout, see “On Retry Count Exhaustion” on page 396.

    - For an implied Sequence Error Nak, see “On Exhaustion” on page 402.

  6. When the remote QP or EEC receives the packet (over the new path) with BTH:MigReq = 1, it compares the received packet's SLID and DLID (and, if it's a global packet, the packet's SGID and DGID) to its alternate path information:

    - Miscompare. If the information doesn't match, the packet is silently dropped, and the QP or EEC neither changes its migration state nor accepts the migration request. The CA calls the Affiliated Asynchronous Event handler and reports a Path Migration Request Failed Affiliated Asynchronous Error for that QP or EEC.

    - Match. If the information does match, the QP or EEC continues with the migration.

  7. The remote QP or EEC transitions from the Arm state to the Migrated state.

  8. The remote QP or EEC copies its alternate path information to its primary path variables.

  9. The remote QP or EEC sends its next packet using the new path and sets BTH:MigReq = 1 in the packet.

  10. If the CA in which the remote QP or EEC resides is an HCA, the migration event causes the Asynchronous Event Handler to be called (typically via an interrupt) to report the path migration.
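The migration steps above can be sketched as a small model of the two ends of a connection. It is illustrative only: the names Path, Packet, QP, migrate, next_packet, and receive are hypothetical, the event strings are placeholders for the asynchronous events described above, and two simplifications are assumed. First, each path is stored from its owner's point of view, so an incoming packet's SLID is compared with the alternate path's remote LID and its DLID with the alternate path's local LID. Second, BTH:MigReq is modeled as 1 whenever the sender is in the Migrated state.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    """Path information, stored from the owning QP's point of view."""
    local_lid: int
    remote_lid: int
    local_gid: Optional[str] = None   # set only for global paths
    remote_gid: Optional[str] = None


@dataclass
class Packet:
    slid: int
    dlid: int
    sgid: Optional[str]
    dgid: Optional[str]
    mig_req: bool                     # models BTH:MigReq


@dataclass
class QP:
    primary: Path
    alternate: Path
    initial_retry_count: int = 7
    state: str = "Arm"
    retry_count: int = field(init=False)
    events: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.retry_count = self.initial_retry_count

    def migrate(self) -> None:
        """Steps 1, 2, 4, 5 (requester) and 7, 8, 10 (remote side)."""
        self.state = "Migrated"
        self.primary = self.alternate                # alternate becomes primary
        self.retry_count = self.initial_retry_count  # reload the Retry Count
        self.events.append("path migrated")          # async event (HCA case)

    def next_packet(self) -> Packet:
        """Steps 3 and 9: send over the current primary path."""
        p = self.primary
        return Packet(p.local_lid, p.remote_lid, p.local_gid, p.remote_gid,
                      mig_req=(self.state == "Migrated"))

    def receive(self, pkt: Packet) -> bool:
        """Step 6: return False if the packet is silently dropped."""
        if not pkt.mig_req or self.state != "Arm":
            return True  # ordinary packet handling is outside this sketch
        alt = self.alternate
        match = pkt.slid == alt.remote_lid and pkt.dlid == alt.local_lid
        if pkt.sgid is not None:  # global packet: also compare the GIDs
            match = (match and pkt.sgid == alt.remote_gid
                     and pkt.dgid == alt.local_gid)
        if not match:
            self.events.append("path migration request failed")
            return False
        self.migrate()
        return True
```

For example, if the local QP is built with alternate path Path(3, 4) and the remote QP with the mirror-image alternate path Path(4, 3), calling migrate() on the local QP and passing its next_packet() to the remote QP's receive() leaves both ends in the Migrated state with the former alternate path as primary. A remote QP whose alternate path does not mirror the packet's addresses drops the packet, stays in the Arm state, and records the failure event.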
