RQ Logic's Error Detection and Handling

Table 17-7 on this page defines the types of errors that can be detected by the RQ Logic and how each is handled.

Table 17-7. RC RQ Logic Error Types and Handling
Error | Description | Handling
Out of sequence request packet | PSN of the inbound request packet does not match the ePSN of the QP's RQ Logic. | Class B error handling:
  • Return a PSN Sequence Error Nak to the remote QP's SQ Logic.

  • No other action is taken.
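The PSN comparison in this row can be sketched as follows. This is a minimal illustration, not the adapter's actual logic: the function names are hypothetical, and the treatment of a PSN that falls behind the ePSN as a duplicate rather than an out-of-sequence packet is an assumption that the table row itself does not spell out. The only fixed fact used is that the BTH PSN field is 24 bits wide, so the comparison must wrap.

```python
PSN_BITS = 24                      # width of the PSN field in the BTH
PSN_MASK = (1 << PSN_BITS) - 1
HALF_SPACE = 1 << (PSN_BITS - 1)   # assumption: split the PSN space in half


def classify_psn(psn: int, epsn: int) -> str:
    """Compare an inbound PSN with the expected PSN, modulo 2**24."""
    delta = (psn - epsn) & PSN_MASK
    if delta == 0:
        return "expected"
    if delta < HALF_SPACE:
        # PSN is ahead of ePSN: packets are missing, so the responder
        # would return a PSN Sequence Error Nak (Class B handling).
        return "out_of_sequence"
    # Assumption: a PSN behind ePSN is a duplicate, handled elsewhere.
    return "duplicate"


def next_epsn(epsn: int) -> int:
    """Advance the expected PSN by one, wrapping at 24 bits."""
    return (epsn + 1) & PSN_MASK
```

For example, with an ePSN of 0xFFFFFF an inbound PSN of 0 is one packet ahead after wraparound, so it is classified as out of sequence.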

Malformed RQ WQE | Responder detected a malformed RQ WQE while processing an inbound Send or RDMA Write with Immediate request packet. | Class A error handling:
  • Remote Operational Error Nak returned.

  • Destination QP transitions to the Error state.

  • Destination QP's RQ WQE is retired and an error CQE is created indicating a “Local QP Operation Error.”

  • All remaining RQ WQEs are retired and a CQE is created for each indicating it was flushed and not executed.

  • All SQ WQEs are retired and a CQE is created for each indicating it was flushed and not executed.

Resources Not Ready Error | WQE or other resource is not currently available, so the QP's RQ Logic has returned an RNR Nak. | Class B error handling:
  • Return an RNR Nak to the remote QP's SQ Logic.

  • No other action is taken.

Unsupported or Reserved Opcode | Inbound request packet's BTH:Opcode was either reserved or was for a function not supported by this QP (e.g., an RDMA or Atomic request on a QP that doesn't support it). | Class C error handling:
  • Invalid Request Nak returned to the remote QP's SQ Logic.

  • The QP enters the Error state.

  • If the message transfer currently in progress uses a RQ WQE:

    - Current WQE is retired.

    - Error CQE created indicating a “Remote Invalid Request” error.

    - Remaining RQ WQEs are retired and a CQE is created for each indicating that it was flushed due to the error on the earlier WQE.

  • If the message transfer currently in progress does not use a RQ WQE, then an Affiliated Asynchronous Error is generated and the Affiliated Asynchronous Event handler is called (typically by generating an interrupt).
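The Class C steps listed above can be sketched as a small model. Everything here is hypothetical and only mirrors the bullet list: the `QueuePair` class, its fields, and the string labels are illustrative names, and the sketch assumes that the WQE at the head of the RQ is the one in use by the current message.

```python
from dataclasses import dataclass, field


@dataclass
class QueuePair:
    """Toy responder-side QP model (hypothetical, for illustration only)."""
    rq_wqes: list = field(default_factory=list)       # pending RQ WQE ids, oldest first
    state: str = "RTS"
    naks: list = field(default_factory=list)          # Naks sent to the remote SQ Logic
    cqes: list = field(default_factory=list)          # (wqe id, completion status)
    async_errors: list = field(default_factory=list)


def class_c_error(qp: QueuePair, message_uses_rq_wqe: bool) -> None:
    """Apply the Class C steps from the table to the toy QP."""
    qp.naks.append("Invalid Request")
    qp.state = "Error"
    if message_uses_rq_wqe:
        # Assumption: the head WQE is the one the current message is using.
        current = qp.rq_wqes.pop(0)
        qp.cqes.append((current, "Remote Invalid Request"))
        # Every remaining RQ WQE is retired with a flush completion.
        while qp.rq_wqes:
            qp.cqes.append((qp.rq_wqes.pop(0), "Flushed"))
    else:
        # No RQ WQE to carry the error, so it is reported asynchronously.
        qp.async_errors.append("Affiliated Asynchronous Error")
```

The table does not say what happens to pending RQ WQEs in the asynchronous case, so the sketch leaves them untouched there.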

Misaligned semaphore start address | In an Atomic request packet's AtomicETH, the VA is not quadword-aligned.

Class C error handling. See the description of Class C error handling earlier in this table.
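A quadword is eight bytes, so the alignment test on the AtomicETH virtual address reduces to checking the low three address bits. The helper below is a minimal sketch with a hypothetical name; it shows only the check, not the surrounding Class C handling.

```python
QUADWORD_BYTES = 8


def is_quadword_aligned(va: int) -> bool:
    """Return True when the virtual address is a multiple of 8 bytes."""
    return (va & (QUADWORD_BYTES - 1)) == 0
```

An address such as 0x1000 passes, while 0x1004 fails and would be treated as a misaligned semaphore start address.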

Too many RDMA Read or Atomic Requests | The remote QP's SQ Logic transmitted more RDMA Read or Atomic request packets than the responder QP's RQ Logic can handle. Any request received after the RQ's queue becomes full is not responded to.

Class C error handling. See the description of Class C error handling earlier in this table.

Current request packet is “First” or “Only” and should have been “Middle” or “Last” | The responder was expecting a request packet with a “Middle” or “Last” opcode and received a “First” or an “Only.” Indicates either:
  • One or more “Middle” packets and the “Last” packet of the current message were lost in the fabric.

  • The “Last” packet of the current message was lost in the fabric.

Class C error handling. See the description of Class C error handling earlier in this table.

Current request packet should have been a “First” or “Only” | The responder was expecting a request packet with a “First” or “Only” opcode, but received a “Middle” or a “Last.” Indicates that one or more request packets were lost in the fabric.

Class C error handling. See the description of Class C error handling earlier in this table.
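The two opcode-sequencing rows above amount to tracking whether the responder is at a message boundary. The sketch below models that with a single flag; the function name and the string labels for packet position are hypothetical, and real opcodes also encode the operation type, which is ignored here.

```python
POSITIONS = ("first", "middle", "last", "only")


def check_packet_position(expecting_start: bool, position: str):
    """Check a packet's position against what the responder expects.

    Returns (ok, expecting_start_for_next_packet).
    """
    if position not in POSITIONS:
        raise ValueError(f"unknown packet position: {position!r}")
    starts_message = position in ("first", "only")
    if starts_message != expecting_start:
        # Either of the two sequencing errors in the table (Class C).
        return False, expecting_start
    # "Last" and "Only" close the message, so a new one must start next.
    ends_message = position in ("last", "only")
    return True, ends_message
```

On a failed check the returned flag is of little practical use, because Class C handling moves the QP to the Error state.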

R_Key Violation | The QP's RQ Logic detects an R_Key violation while executing an RDMA request.

Class C error handling. See the description of Class C error handling earlier in this table.

Packet Header Violation | The QP's RQ Logic detected a header violation requiring that the request packet be silently dropped. Figure 17-25 on page 438 illustrates the header validation process. | Class D error handling:
  • Silently drop request packet.

  • Don't generate an ACK or Nak.

  • Don't retire a RQ WQE for the current message.

  • Wait for first packet of a new message.

  • The new message must begin at the ePSN.

  • If a RQ WQE was in use, reset it to accept the next incoming Send or RDMA Write with Immediate.

Please note that an approved change to the 1.0a specification has deleted the last four actions from this bullet list.
Length errors
  • Inbound message “Send” operation exceeded the Scatter Buffer List in the RQ WQE.

  • RDMA Write operation contained too much or too little payload data compared to the transfer length advertised in the first or only packet.

  • Payload length was not consistent with the opcode:

    - “Only” must contain 0-to-PMTU bytes.

    - “First” or “Middle” must contain PMTU bytes.

    - “Last” must contain 1 to PMTU bytes.

Class C error handling. See the description of Class C error handling earlier in this table.
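The payload-length rules in the last bullet above translate directly into a per-packet check. This is a minimal sketch with hypothetical names; it covers only the consistency between packet position and payload length, not the scatter-list or RDMA Write total-length checks.

```python
def payload_length_ok(position: str, length: int, pmtu: int) -> bool:
    """Check a packet's payload length against its position in the message."""
    if position == "only":
        return 0 <= length <= pmtu
    if position in ("first", "middle"):
        # Non-final packets of a multi-packet message carry a full PMTU.
        return length == pmtu
    if position == "last":
        return 1 <= length <= pmtu
    raise ValueError(f"unknown packet position: {position!r}")
```

With a PMTU of 2048 bytes, a “Middle” packet carrying 1024 bytes fails the check, while a “Last” packet carrying 1024 bytes passes.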

Invalid duplicate Atomic Request | Duplicate Atomic request received, but its PSN does not match the PSN of a previously executed Atomic request whose results were saved. | Class D error handling. See the description of Class D error handling under “Packet Header Violation” in this table.
CQ overflow | Message was fully executed and Ack'd, but the CQE could not be written to the CQ. Occurs when the CQ is inaccessible or full and an attempt is made to complete a WQE. | Class G error handling:
  • The affected QP transitions to the Error state.

  • No WQEs are retired and no CQEs are created.

  • The current WQE and any subsequent WQEs are left in an unknown state.

  • If the CA is an HCA, the Asynchronous Event Handler is called (typically via an interrupt) and the error is reported as an Asynchronous Affiliated Error.
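As a compact summary, the table can be restated as a lookup from error type to error class, together with whether that class moves the QP to the Error state. The dictionary keys are hypothetical short names for the table rows; the class letters and state transitions are taken from the table itself.

```python
# Error type (hypothetical short names) -> error class from Table 17-7.
ERROR_CLASS = {
    "out_of_sequence_request": "B",
    "malformed_rq_wqe": "A",
    "resources_not_ready": "B",
    "unsupported_or_reserved_opcode": "C",
    "misaligned_atomic_address": "C",
    "too_many_rdma_read_or_atomic": "C",
    "unexpected_first_or_only": "C",
    "unexpected_middle_or_last": "C",
    "r_key_violation": "C",
    "packet_header_violation": "D",
    "length_error": "C",
    "invalid_duplicate_atomic": "D",
    "cq_overflow": "G",
}

# Whether each class moves the responder QP to the Error state.
QP_ENTERS_ERROR_STATE = {"A": True, "B": False, "C": True, "D": False, "G": True}


def qp_enters_error_state(error: str) -> bool:
    """Look up whether handling this error puts the QP in the Error state."""
    return QP_ENTERS_ERROR_STATE[ERROR_CLASS[error]]
```

Classes B and D leave the QP operational: Class B only returns a Nak, and Class D silently drops the packet.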


Figure 17-25. Packet Header Validation

