It's All About Message Passing

The underlying concept behind IBA is message passing between CAs. A message is a block of data that is passed from the local memory of one CA to the local memory of another CA.

Specification Usage of “Message” and “Packet”

The reader should note that the specification doesn't always use the terms “message” and “packet” correctly. Sometimes “message” is used where “packet” would be correct and vice versa.

Three Types of Message Transfers

There are three basic message transfer scenarios:

  • Sending a message from the local CA's memory to the destination CA's memory.

  • Reading a message from the destination CA's memory and storing it in the local CA's memory.

  • Performing an atomic RMW (read/modify/write) in the destination CA's memory and storing the returned data in the local CA's memory.

Writing a Message to the Remote CA's Memory

In this case, a CA sends a message from its own local memory to the destination CA's local memory. There are two scenarios:

  • Message Send Operation. In this case, the request doesn't tell the destination CA where to write the data. Rather, it's up to the destination CA to determine where in its local memory the data is placed. In IBA, this is referred to as a Send operation.

  • Remote DMA (RDMA) Write Operation. In this case, the request specifies where the data is to be written in the destination CA's local memory. In addition to the write data contained in the data payload field, the request packet contains:

    - the memory start address,

    - the transfer length, and

    - a special key indicating that it has permission to perform the write.

    In IBA, this is referred to as an RDMA Write operation.
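
The difference between the two write scenarios can be illustrated with a toy model. The following is a minimal sketch rather than an implementation of IBA: `DestinationCA`, `post_receive`, `handle_send`, and `handle_rdma_write` are hypothetical names, the list of pre-posted buffers is an assumed way for the destination to choose where Send data lands, and the special key is reduced to a simple equality check.

```python
class DestinationCA:
    """Toy model of a destination CA's local memory (hypothetical names)."""

    def __init__(self, size, key):
        self.memory = bytearray(size)
        self.key = key                # stand-in for the "special key"
        self.receive_buffers = []     # (offset, length) chosen by the destination

    def post_receive(self, offset, length):
        # Assumption: the destination decides in advance where Send data goes.
        self.receive_buffers.append((offset, length))

    def handle_send(self, payload):
        # Send: the request carries no address; the destination picks the spot.
        offset, length = self.receive_buffers.pop(0)
        if len(payload) > length:
            raise ValueError("message larger than the receive buffer")
        self.memory[offset:offset + len(payload)] = payload
        return offset

    def handle_rdma_write(self, address, length, key, payload):
        # RDMA Write: the request supplies the address, the length, and the key.
        if key != self.key:
            raise PermissionError("key does not grant write permission")
        if length != len(payload) or address + length > len(self.memory):
            raise ValueError("bad length or address range")
        self.memory[address:address + length] = payload


dest = DestinationCA(size=64, key=0xABCD)
dest.post_receive(offset=16, length=8)
where = dest.handle_send(b"hello")    # the destination chose offset 16
dest.handle_rdma_write(address=32, length=5, key=0xABCD, payload=b"world")
```

In the Send case the requester never learns or supplies an address, while in the RDMA Write case the destination does nothing but validate the request and store the data where it was told to.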

Reading a Message From the Remote CA's Memory

In IBA, this is referred to as a Remote DMA Read (RDMA Read) operation. A CA issues a request to another CA to read a block of data from that CA's local memory and return it to the requester in a series of one or more RDMA Read response packets. Upon receipt of the requested read data, the requesting CA stores it in a specified area of its own local memory. The RDMA Read request packet contains:

  • The memory start address.

  • The amount of data to be read.

  • A special key indicating that it has permission to read data from that area of the destination CA's local memory.

Performing an Atomic RMW in the Remote CA's Memory

In this case, a CA wishes to perform an atomic RMW (read/modify/write) in the destination CA's local memory. In IBA, there are two forms of atomic RMW operations:

  1. Atomic Fetch and Add operation. The CA issuing the request supplies the destination CA with:

    - the memory address,

    - an Add value,

    - and a special key indicating that it has permission to access the location in the destination CA's local memory.

    Upon receipt of the request, the destination CA reads from the target location in its local memory, adds the Add value to the value read, and writes the result back into the local memory location. The destination CA returns the initial value read back to the CA that issued the request in an Atomic Response packet. Upon receipt of the response packet, the requesting CA writes the read data into its own local memory.

  2. Atomic Compare and Swap If Equal operation. The CA issuing the request supplies the destination CA with:

    - the memory address,

    - a Compare value,

    - a Swap value,

    - and a special key indicating that it has permission to access the location in the destination CA's local memory.

    Upon receipt of the request, the destination CA reads from the target location in its local memory, compares the data read to the Compare value, and, if they are equal, writes the Swap value into the location. The destination CA returns the initial value read to the CA that issued the request in an Atomic Response packet. Upon receipt of the response packet, the requesting CA writes the read data into its own local memory.
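
The behavior of the two atomic operations at the destination CA can be sketched as follows. This is an illustrative model only: `DestinationMemory`, `fetch_add`, and `cmp_swap` are hypothetical names, memory is modeled as a dictionary of words, operands are assumed to be 64 bits wide, and the key check is a simple comparison.

```python
MASK64 = (1 << 64) - 1  # assumption: operands are 64-bit values


class DestinationMemory:
    """Toy model of the destination CA's local memory (hypothetical)."""

    def __init__(self, key):
        self.key = key       # stand-in for the "special key"
        self.words = {}      # address -> value

    def _check_key(self, key):
        if key != self.key:
            raise PermissionError("key does not grant access")

    def fetch_add(self, addr, add_value, key):
        self._check_key(key)
        original = self.words.get(addr, 0)
        self.words[addr] = (original + add_value) & MASK64
        return original      # value carried back in the atomic response

    def cmp_swap(self, addr, compare_value, swap_value, key):
        self._check_key(key)
        original = self.words.get(addr, 0)
        if original == compare_value:
            self.words[addr] = swap_value & MASK64
        return original      # returned whether or not the swap happened


mem = DestinationMemory(key=0x1234)
mem.words[0x100] = 5
old = mem.fetch_add(0x100, 3, key=0x1234)  # old == 5, location now holds 8
```

Note that both operations return the value that was in the location before the modification, which is what lets the requester tell whether a Compare and Swap actually took effect.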

What's in a Message?

How the recipient of a message interprets the message is device-specific.

Example Disk Read Request
Step One: Disk Read Issued Via a Message Send Operation

For example, a message might be passed to a mass storage controller using a message Send operation. In this example, the message may contain:

- The type of operation to be performed (disk read or write).

- The identity of the target disk drive (if the disk controller controls an array of disk drives).

- The start cylinder number.

- The surface number.

- The start sector number.

- The number of sectors to be read or written.

- If it's a write operation, the data to be written to disk.

- If it's a read operation, where the return data is to be written in the requesting CA's local memory, as well as a special key that will indicate the disk controller has permission to write to that area of the requesting CA's local memory.

Step Two: Data Read From Disk

Continuing with this example, after receiving the message containing the above information (sent via the message Send operation), the disk controller determines that it's a disk read request. It reads the requested data from the target drive and stores it in its local memory.

Step Three: Read Data Sent Back Via RDMA Write

Upon completing the disk read, the disk controller then initiates an RDMA Write operation to write the requested disk data into the area of the requesting CA's local memory that was identified in the original message. In the RDMA Write request packet, it supplies the requested data in the packet's data payload field, and also supplies the memory start address, the transfer length, and the special key that it received in the original message.

Step Four: Upon Receipt of RDMA Write Request

Upon receipt of the RDMA Write request packet, the CA that originated the disk read request checks the special key to ensure the writer has permission to write to the indicated area of its local memory. Assuming that the key is correct, it then writes the data into its local memory. Upon completion of the write, the CA signals completion to software (perhaps via an interrupt).
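
The four steps above can be sketched end to end. Everything in this model is an assumption made for illustration: the names `DiskReadRequest`, `Requester`, and `DiskController`, the 512-byte sector size, the dictionary standing in for the disk, and direct method calls standing in for packets on the fabric.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512  # assumption for illustration only


@dataclass
class DiskReadRequest:
    """Contents of the message passed in the Send operation (step one)."""
    drive: int
    cylinder: int
    surface: int
    start_sector: int
    sector_count: int
    reply_address: int   # where the data goes in the requester's memory
    reply_key: int       # key granting the controller write permission


class Requester:
    def __init__(self, size, key):
        self.memory = bytearray(size)
        self.key = key
        self.completed = False

    def on_rdma_write(self, address, length, key, payload):
        # Step four: check the key, store the data, signal completion.
        if key != self.key:
            raise PermissionError("writer lacks permission")
        self.memory[address:address + length] = payload
        self.completed = True


class DiskController:
    def __init__(self, disk):
        self.disk = disk  # (drive, cylinder, surface, sector) -> bytes

    def on_send(self, req, requester):
        # Step two: read the requested sectors into local memory.
        last = req.start_sector + req.sector_count
        data = b"".join(
            self.disk[(req.drive, req.cylinder, req.surface, s)]
            for s in range(req.start_sector, last)
        )
        # Step three: return the data with an RDMA Write.
        requester.on_rdma_write(req.reply_address, len(data), req.reply_key, data)


disk = {(0, 7, 1, s): bytes([s]) * SECTOR_SIZE for s in range(4)}
host = Requester(size=4096, key=0xBEEF)
controller = DiskController(disk)
request = DiskReadRequest(0, 7, 1, 2, 2, reply_address=1024, reply_key=0xBEEF)
controller.on_send(request, host)
```

The key point the sketch tries to capture is that the address and key used in step three are not invented by the disk controller; they are echoed back from the original message.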

How Big Can a Message Be?

An IBA message transfer can be anywhere from zero to 2GB in size.

What Is the Maximum Size of a Packet's Data Payload Field?

An IBA packet can contain a maximum of 4KB of data. For additional information, refer to “Maximum Data Payload Size” on page 42.

Large Messages Require Multiple Packet Transfers

When a CA wishes to transfer a message that exceeds the size of a packet's data payload field, it must perform a multiple-packet transfer in order to transfer the entire message. The following sections define the characteristics of multiple-packet message transfers for the various message transfer operation types. In some circumstances, a packet's data payload field may be constrained to a size smaller than 4KB. For more information, refer to “Maximum Data Payload Size” on page 42. The following subsections assume that the maximum allowable data payload field size is 4KB.
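
As a quick illustration of the arithmetic, the number of packets needed for a message is the message length divided by the payload size, rounded up, with a minimum of one packet so that a zero-length message is still sent. The function name `packets_needed` is hypothetical, and the 4KB default reflects the assumption stated above.

```python
MAX_PAYLOAD = 4096  # 4KB, the maximum payload size assumed in this section


def packets_needed(message_len, max_payload=MAX_PAYLOAD):
    """Return how many packets carry a message of message_len bytes."""
    if message_len < 0:
        raise ValueError("message length cannot be negative")
    # Ceiling division; a zero-length message still takes one packet.
    return max(1, -(-message_len // max_payload))
```

For example, a 4097-byte message needs two packets, and a 2GB message needs 524,288 packets of 4KB each.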

Each Packet Contains an Opcode Field

Each packet contains an Opcode field that defines the type of request or response packet. The sections that follow provide some detail on the Opcode types.

Some Request Types Require a Response While Others Don't

The destination CA is required to return a response for an RDMA Read request (the requested read data is returned) and for either type of atomic RMW request (the data read from the destination CA's local memory is returned). No response is required for a Send or an RDMA Write operation, although some of the service types require the return of an Acknowledge packet; this is discussed in a later chapter.

Single- and Multi-Packet Send Operations
Single Packet Send

When the message to be sent is no more than 4KB in size, a single request packet with a “Send Only” opcode and a data payload field containing somewhere between zero and 4KB of data is sent to the destination CA.

Multiple-Packet Send

However, when the size of the message to be sent exceeds the size of a packet's data payload field, the Send operation consists of a series of two or more Send request packets:

  • When the message to be sent is more than 4KB but not greater than 8KB in size, two request packets are sent to the destination CA:

    - a request packet with a “Send First” opcode with a data payload field containing 4KB of data,

    - followed by a request packet with a “Send Last” opcode containing somewhere between one byte and 4KB of data.

  • When the message to be sent is more than 8KB in size, three or more request packets are sent to the destination CA:

    - a request packet with a “Send First” opcode with a data payload field containing 4KB of data,

    - followed by one or more request packets, each with a “Send Middle” opcode and 4KB of data,

    - followed by a request packet with a “Send Last” opcode containing somewhere between one byte and 4KB of data.
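
The rules above reduce to a short procedure. The sketch below uses a hypothetical function, `send_opcodes`, that returns the opcode name and payload length of each packet for a given message size, assuming the 4KB maximum payload used throughout this section.

```python
def send_opcodes(message_len, max_payload=4096):
    """Return [(opcode, payload_len), ...] for a Send of message_len bytes."""
    if message_len <= max_payload:
        return [("Send Only", message_len)]

    packets = [("Send First", max_payload)]
    remaining = message_len - max_payload
    # Every packet between the first and the last carries a full payload.
    while remaining > max_payload:
        packets.append(("Send Middle", max_payload))
        remaining -= max_payload
    packets.append(("Send Last", remaining))
    return packets
```

For a 10,000-byte message, this yields a “Send First” packet with 4096 bytes, a “Send Middle” packet with 4096 bytes, and a “Send Last” packet with the remaining 1808 bytes.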

Single- and Multi-Packet RDMA Write Operations
Single Packet RDMA Write

When the size of the message to be written is such that the entire message fits in a single request packet's data payload field, a single request packet is transmitted with an “RDMA Write Only” opcode and a data payload field containing somewhere between zero and 4KB of data.

Multiple Packet RDMA Write

When the size of the message to be written exceeds the size of a packet's data payload field, the RDMA Write operation consists of a series of two or more RDMA Write request packets. There are two possible scenarios:

- A request packet with an “RDMA Write First” opcode with a data payload field containing 4KB of data, followed by a packet with an “RDMA Write Last” opcode containing somewhere between one byte and 4KB of data.

- A request packet with an “RDMA Write First” opcode with a data payload field containing 4KB of data, followed by one or more request packets, each with an “RDMA Write Middle” opcode and 4KB of data, followed by a packet with an “RDMA Write Last” opcode containing somewhere between one byte and 4KB of data.

It should be noted that the first or only request packet of an RDMA Write operation, in addition to a data payload field, also contains the start memory address, the amount of data to be written, and the special key indicating that it has permission to write the data to the destination CA's local memory.
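
The point that only the first (or only) packet carries the address, length, and key can be made concrete with a small sketch. The function `rdma_write_packets` and the dictionary layout of a packet are hypothetical and do not reflect the actual packet format; the 4KB maximum payload is assumed.

```python
def rdma_write_packets(data, address, key, max_payload=4096):
    """Split data into RDMA Write request packets (illustrative layout)."""
    # A zero-length message still produces one packet with an empty payload.
    chunks = [data[i:i + max_payload]
              for i in range(0, len(data), max_payload)] or [b""]
    packets = []
    for i, chunk in enumerate(chunks):
        if len(chunks) == 1:
            opcode = "RDMA Write Only"
        elif i == 0:
            opcode = "RDMA Write First"
        elif i == len(chunks) - 1:
            opcode = "RDMA Write Last"
        else:
            opcode = "RDMA Write Middle"
        packet = {"opcode": opcode, "payload": chunk}
        if i == 0:
            # Only the first or only packet carries the addressing fields.
            packet.update(address=address, length=len(data), key=key)
        packets.append(packet)
    return packets


packets = rdma_write_packets(bytes(9000), address=0x2000, key=0x55)
```

Here the length field in the first packet describes the entire 9000-byte message, not just that packet's payload.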

RDMA Read Operation

It only takes one request packet to issue an RDMA Read request to a destination CA. The destination CA then returns the requested read data to the requesting CA in a series of one or more RDMA Read response packets.

  • If all of the requested read data fits into a single packet, then a single RDMA Read response packet is returned with an opcode of “RDMA Read Response Only.” The packet's data payload contains between zero and 4KB of data.

  • If the requested read data does not fit into a single response packet but will fit into two, then:

    - an “RDMA Read Response First” packet with a data payload containing 4KB of data is returned,

    - followed by an “RDMA Read Response Last” packet containing somewhere between one byte and 4KB of data.

  • If the requested read data requires more than two response packets, then:

    - an “RDMA Read Response First” packet with a data payload containing 4KB of data is returned,

    - followed by one or more “RDMA Read Response Middle” packets, each with a data payload of 4KB,

    - and, finally, an “RDMA Read Response Last” packet containing somewhere between one byte and 4KB of data.
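
Seen from the requesting CA, the response stream must follow one of the three patterns above before the data can be stored. The following sketch checks the opcode sequence and concatenates the payloads. The function name `reassemble_read_responses` and the representation of a packet as an (opcode, payload) pair are assumptions for illustration.

```python
def reassemble_read_responses(packets):
    """packets: list of (opcode, payload_bytes) in arrival order."""
    opcodes = [opcode for opcode, _ in packets]
    if len(opcodes) == 1:
        valid = opcodes == ["RDMA Read Response Only"]
    else:
        valid = (
            len(opcodes) >= 2
            and opcodes[0] == "RDMA Read Response First"
            and opcodes[-1] == "RDMA Read Response Last"
            and all(op == "RDMA Read Response Middle" for op in opcodes[1:-1])
        )
    if not valid:
        raise ValueError("unexpected RDMA Read response sequence")
    # The requested data is the payloads joined in arrival order.
    return b"".join(payload for _, payload in packets)


data = reassemble_read_responses([
    ("RDMA Read Response First", b"a" * 4096),
    ("RDMA Read Response Middle", b"b" * 4096),
    ("RDMA Read Response Last", b"c" * 10),
])
```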

Atomic Operation

An atomic operation takes only one request packet to issue the request and one response packet to return the read data.

  • The request packet contains either a “CmpSwap” or “FetchAdd” opcode, as well as the Compare and Swap data, or the Add data.

  • The single response packet contains an “Atomic Acknowledge” opcode, as well as the data read initially from the targeted memory location in the destination CA's local memory.
