01signal: List of protocols for communication between FPGAs with Multi-Gigabit Transceivers

This page is the second in a series of pages introducing the Multi-Gigabit Transceiver (MGT).

Introduction

Multi-Gigabit transceivers (MGTs) are the basic building block for many well-known protocols: PCIe, SATA, Gigabit Ethernet, SuperSpeed USB, Thunderbolt and DisplayPort. All of these protocols have one thing in common: There is a computer involved. There are also several telecommunications protocols, intended for transporting phone calls over an optical fiber link.

MGTs are also useful for exchanging data between two FPGAs. A few things to be aware of when using an MGT for this purpose are listed on a previous page. Clearly, there is a need for some kind of protocol to ensure that the data is transmitted properly and with an acceptable reliability over the physical channel. The implementation of such a protocol is quite complicated, so the question is whether there are ready-made protocols or other building blocks that can help.

This page is an attempt to summarize the main alternatives. They are listed below according to the complexity level of the application logic, least complex listed first.

All protocols below can work as a bidirectional link (full duplex) as well as a one-way link (half duplex), unless stated otherwise.

Xillyp2p

Xillyp2p is a proprietary protocol that provides a reliable transport of multiple data streams between two FPGAs. The protocol manages the transmission, flow control, scheduling and retransmission of the application data, similar to the way that the TCP/IP protocol transports data over a network: All data is guaranteed to arrive correctly at the other side.

The application logic interacts with the protocol's implementation through standard FIFOs. The protocol creates an illusion of a standard FIFO that spans across the two FPGAs: This virtual FIFO's side for writing is on one FPGA, and the side for reading is on the other FPGA.

In each of the two FPGAs, one side of the FIFO is connected to the application logic and the other side interacts with the protocol's logic. Hence, the application logic on the transmitting side writes data into a FIFO and the application logic on the receiving side reads data from another FIFO. The protocol is responsible for moving the data from the FIFO on the transmitting FPGA to the FIFO on the receiving FPGA. The protocol's flow control ensures that the FIFO on the destination FPGA never gets full.
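The arrangement described above can be modelled in a few lines of Python. This is only a behavioural sketch and not Xillyp2p's actual mechanism: the class name VirtualFifo, the depth parameters and the rule "move a word only when the destination FIFO has room" are assumptions made for illustration.

```python
from collections import deque


class VirtualFifo:
    """Toy model: a TX FIFO on one FPGA, an RX FIFO on the other, a link in between."""

    def __init__(self, tx_depth=4, rx_depth=4):
        self.tx_depth = tx_depth
        self.rx_depth = rx_depth
        self.tx_fifo = deque()
        self.rx_fifo = deque()

    def write(self, word):
        """Application logic on the transmitting FPGA. Returns False if the FIFO is full."""
        if len(self.tx_fifo) >= self.tx_depth:
            return False
        self.tx_fifo.append(word)
        return True

    def link_step(self):
        """Move one word across the link, but only if the RX FIFO has room (assumed rule)."""
        if self.tx_fifo and len(self.rx_fifo) < self.rx_depth:
            self.rx_fifo.append(self.tx_fifo.popleft())
            return True
        return False

    def read(self):
        """Application logic on the receiving FPGA. Returns None if the FIFO is empty."""
        return self.rx_fifo.popleft() if self.rx_fifo else None


def demo(n_words=10):
    vf = VirtualFifo(tx_depth=4, rx_depth=2)
    pending = list(range(n_words))
    received = []
    max_rx = 0
    cycle = 0
    while len(received) < n_words:
        if pending and vf.write(pending[0]):
            pending.pop(0)
        vf.link_step()
        max_rx = max(max_rx, len(vf.rx_fifo))
        if cycle % 3 == 2:  # slow consumer: reads only every third cycle
            word = vf.read()
            if word is not None:
                received.append(word)
        cycle += 1
    return received, max_rx


received, max_rx = demo()
print(received, max_rx)
```

Even though the consumer is slower than the producer, the words arrive in order and the occupancy of the receiving FIFO never exceeds its depth. The back-pressure propagates to the writer, which simply sees a full FIFO.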

The protocol can serve multiple FIFOs in both directions. A fair data transmission scheduler ensures efficient utilization of the MGT's bandwidth. The data in the transmitter side's FIFO is consumed soon enough, so this FIFO won't become full (as long as the bandwidth allows this and the data is consumed from the FIFO on the other side).

The protocol also has a different interface for transmitting packets (with the help of an EOP port).

A half-duplex option is also supported. In this case, the protocol ensures that all data that arrives is correct. If a bit error occurs on the physical channel, the data flow is halted before faulty data reaches the application logic.

Gigabit Ethernet

Even though Gigabit Ethernet is intended for communication between computers, it's possible to use this simple protocol for sending packets between two FPGAs. A suitable IP is usually provided by the FPGA manufacturer. The application logic is hence responsible for creating and receiving Ethernet packets through one of the standard interfaces (GMII, RGMII, XGMII etc.).

As with any Ethernet link, it's the application logic's role to organize the data into packets, as well as handling errors on the link (e.g. with a retransmission).
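As an illustration of what "handling errors with a retransmission" may mean for the application logic, here is a minimal stop-and-wait sketch. It is not part of Ethernet itself. The functions send_reliably and make_lossy_channel, the acknowledgement scheme and the max_tries limit are all hypothetical and chosen for brevity.

```python
def send_reliably(payloads, channel, max_tries=5):
    """Stop-and-wait: resend each packet until the channel reports an acknowledgement.

    channel(seq, payload) is a hypothetical callable that returns True when an
    acknowledgement for this sequence number came back, and False otherwise
    (e.g. the frame was dropped by the receiver because of a bad checksum).
    Returns the total number of transmissions.
    """
    transmissions = 0
    for seq, payload in enumerate(payloads):
        for _ in range(max_tries):
            transmissions += 1
            if channel(seq, payload):
                break
        else:
            raise RuntimeError(f"packet {seq} was not acknowledged")
    return transmissions


def make_lossy_channel(drop_attempts):
    """Test channel that drops the transmission attempts listed (counted from 0)."""
    delivered = []
    attempt = 0

    def channel(seq, payload):
        nonlocal attempt
        dropped = attempt in drop_attempts
        attempt += 1
        if not dropped:
            delivered.append((seq, payload))
        return not dropped

    return channel, delivered


channel, delivered = make_lossy_channel({1})
count = send_reliably([b"a", b"b", b"c"], channel)
print(count, delivered)
```

A real design would also have to cope with a lost acknowledgement, in which case the sequence number allows the receiver to discard the duplicate. This is exactly the kind of logic that the simpler protocols on this page leave to the application.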

Interlaken

Interlaken is an open protocol for communication between chips. It is based upon 64b/67b encoding: The low-level data flow consists of repeatedly transmitting segments of 64 bits. Before each such segment, 3 bits are added in order to distinguish between application data and control words.

The protocol's basic transmission unit is a burst having a variable length. A control word is transmitted immediately before and after each such burst. These two control words contain information that allows an abstraction of packets and channels. This includes, among others:

Channel number: A number between 0 and 255, associating the data burst that follows with a channel.
SOP: A flag indicating that the data that follows is the beginning of a packet.
EOP: A flag indicating that the burst before the control word was the last data of a packet. The EOP also indicates the number of valid bytes in the burst and if the packet contains errors.
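The packet and channel abstraction can be illustrated with a sketch that reassembles packets from a sequence of bursts. The Burst record and its field names are hypothetical and do not reflect the bit layout of Interlaken's control words. In particular, the SOP flag (from the control word before the burst) and the EOP flag (from the control word after it) are attached to the same record here for brevity.

```python
from collections import namedtuple

# Hypothetical record: one burst plus the flags from its surrounding control words.
Burst = namedtuple("Burst", "channel sop eop data")


def reassemble(bursts):
    """Collect bursts into complete packets, separately for each channel."""
    partial = {}   # channel -> bytearray of the packet being built
    packets = []   # completed (channel, payload) pairs, in order of completion
    for b in bursts:
        if not 0 <= b.channel <= 255:
            raise ValueError("channel number out of range")
        if b.sop:
            partial[b.channel] = bytearray()
        if b.channel not in partial:
            continue  # burst without a preceding SOP: ignored in this sketch
        partial[b.channel] += b.data
        if b.eop:
            packets.append((b.channel, bytes(partial.pop(b.channel))))
    return packets


bursts = [
    Burst(3, True, False, b"AB"),   # channel 3: packet begins
    Burst(7, True, True, b"xyz"),   # channel 7: a whole packet in one burst
    Burst(3, False, True, b"CD"),   # channel 3: packet ends
]
print(reassemble(bursts))
```

The example shows bursts of two channels interleaved on the link. The packet on channel 7 is completed first, even though the packet on channel 3 started earlier.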

The content of the control word and the data burst that (possibly) came before it is checked for bit errors with a CRC24.
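To show how a 24-bit CRC detects a bit error, here is a generic bitwise CRC-24 routine. Note the assumption: the polynomial and initial value used as defaults are those of the well-known OpenPGP CRC-24 (RFC 4880), chosen only because their test vector is widely published. They are not Interlaken's parameters, and the exact bits that Interlaken's CRC24 covers are defined in its specification.

```python
def crc24(data, poly=0x864CFB, init=0xB704CE):
    """Bitwise CRC-24, MSB first, no reflection, no final XOR.

    The default poly/init are the OpenPGP values, used here as a stand-in.
    """
    crc = init
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            msb = crc & 0x800000
            crc = (crc << 1) & 0xFFFFFF
            if msb:
                crc ^= poly
    return crc


burst = b"control word + data burst"
sent_crc = crc24(burst)

# Flip a single bit, as a bit error on the physical channel would.
corrupted = bytes([burst[0] ^ 0x01]) + burst[1:]

print(crc24(burst) == sent_crc)      # intact burst passes the check
print(crc24(corrupted) == sent_crc)  # the single bit error is caught
```

The receiver repeats the same calculation on the data that arrived and compares the result with the CRC that the transmitter sent. A mismatch marks this specific burst as faulty.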

The Interlaken protocol also has a CRC32 check for diagnostic purposes (once for each Meta Frame). However, if an error is detected with the help of this test, it is related to a large segment of data and not to a specific burst or packet. In other words, if data passed the CRC24 test despite a bit error, this will be detected only later, and without being able to point at the burst that contains faulty data.

Interlaken also allows sending flow control requests with two mechanisms: In-band flow control and out-of-band flow control (OOBFC). Both mechanisms consist of a means for the receiver to advertise if it's ready to receive data, with an XON/XOFF semantics. This information consists of a single bit for each receiving entity. This bit is '1' when this entity is ready to receive data, and '0' otherwise. The exact meaning of "receiving entity" is application-specific.

The in-band flow control mechanism uses 16 bits in the control word in order to transmit the flow control requests. The out-of-band flow control mechanism requires three additional physical wires (clock, sync and data) for transporting the information.

Interlaken doesn't include any arbitration or transmission scheduling mechanism, and can therefore not enforce flow control. It is up to the application logic to arbitrate the transmission of data from different sources as well as to ensure that the flow control requests are obeyed. When an Interlaken IP core announces that it supports flow control, it often means that it allows sending flow control requests, rather than controlling the flow of data.
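The kind of logic that is left to the application can be sketched as a round-robin arbiter that skips any source whose receiving entity has advertised XOFF. The function next_source and the one-to-one mapping between a source and a receiving entity are assumptions made for this example. As said above, the meaning of "receiving entity" is application-specific.

```python
def next_source(ready_to_send, xon, last):
    """Round-robin choice among sources that have data and are allowed to send.

    ready_to_send[i]: source i has a burst waiting.
    xon[i]: latest flow control bit for receiving entity i (True means XON).
    last: the index that was granted previously, or -1 at start.
    Returns the chosen index, or None if nobody may transmit.
    """
    n = len(ready_to_send)
    for offset in range(1, n + 1):
        i = (last + offset) % n
        if ready_to_send[i] and xon[i]:
            return i
    return None


ready = [True, True, True]
xon = [True, False, True]   # receiving entity 1 has requested XOFF

grants = []
last = -1
for _ in range(4):
    last = next_source(ready, xon, last)
    grants.append(last)
print(grants)
```

Source 1 is never granted as long as its flow control bit is '0', and the two other sources share the link alternately.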

If a burst fails the CRC24 test (i.e. a bit error is detected), a retransmission can be requested as defined in the Interlaken Retransmit Extension Protocol Definition. This retransmission request is sent on the out-of-band interface (OOBFC) mentioned above. In other words, the protocol doesn't define a way to send the retransmission request on the MGT link itself, but rather on the three separate physical out-of-band wires.

Aurora

Aurora is a protocol developed by Xilinx (now AMD) for its own FPGAs. This protocol's basic transmission unit is a single word (having a fixed width depending on the number of MGTs involved). However, the protocol also supports a packet mode, by optionally exposing a "last" signal that is passed through the channel along with the last word of a packet.

Bit errors on the physical channel are not corrected by the protocol.

On the transmitting side, both parties (the protocol's logic and the application logic) may throttle the data flow by virtue of handshake signals. On the receiving side, the application logic must always accept any data word that arrives. However, the protocol supplies two options for preventing the transmitter from sending data:

Native Flow Control (NFC) : The application logic on the receiving side requests the transmitting side to pause the data transmission for a number of data slots that is given in the request. The protocol's logic is responsible for obeying this request by virtue of handshake signals on the transmitting side.
User Flow Control (UFC) : The protocol allows the receiving side to send an arbitrary message through a separate channel. The application logic on the transmitting side is responsible for processing and obeying this request.

As both flow control mechanisms are based upon the data link in the opposite direction, these are available only in full duplex mode.
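The effect of an NFC request on the transmitting side can be sketched as a pause counter. This is a simplified model and not Aurora's implementation: the function transmit_slots is hypothetical, the latency of the request on the link is ignored, and it is assumed that a new request replaces the remaining pause count rather than adding to it.

```python
def transmit_slots(words, nfc_requests, n_slots):
    """Return what is sent in each data slot: a word, or None when nothing is sent.

    nfc_requests maps a slot number to the number of slots to pause, as if the
    request reached the transmitter in that slot.
    """
    queue = list(words)
    pause = 0
    timeline = []
    for slot in range(n_slots):
        if slot in nfc_requests:
            pause = nfc_requests[slot]   # assumption: replaces any remaining pause
        if pause > 0:
            pause -= 1                   # the protocol holds off the application logic
            timeline.append(None)
        elif queue:
            timeline.append(queue.pop(0))
        else:
            timeline.append(None)        # nothing left to send
    return timeline


print(transmit_slots([1, 2, 3, 4], {1: 2}, 7))
```

In this example, a request to pause for two slots arrives at slot 1, so the second word is delayed until slot 3. With UFC, by contrast, this counter (or any other policy) would have to be implemented by the application logic on the transmitting side.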

When the protocol is used to transmit packets, the transmitter optionally appends a CRC to the end of each packet (in Xilinx's implementation of the protocol). The protocol's implementation checks this CRC on the receiving side, and informs the application logic if an error was detected in the packet. If Aurora is used without packets (without a "last" signal), no error detection is performed by the protocol.

Any retransmission mechanism, multiple channel multiplexing and transmission scheduling must be implemented by the application logic.

There are two variants for Aurora: With 8b/10b encoding and with 64b/66b encoding. 64b/66b encoding is more efficient, so this variant should be preferred when possible.
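The difference in efficiency is a matter of simple arithmetic: of every 10 bits on the wire, 8b/10b carries 8 bits of data, while 64b/66b carries 64 bits of data in every 66 bits. The short calculation below makes this concrete. The function payload_rate and the line rate of 10 Gb/s are arbitrary choices for the example, and the result ignores any further overhead that the protocol itself adds. Interlaken's 64b/67b is included for comparison.

```python
def payload_rate(line_rate_gbps, payload_bits, coded_bits):
    """Data rate that remains after the line encoding's overhead."""
    return line_rate_gbps * payload_bits / coded_bits


LINE_RATE = 10.0  # Gb/s, arbitrary example value

for name, payload, coded in [("8b/10b", 8, 10), ("64b/66b", 64, 66), ("64b/67b", 64, 67)]:
    rate = payload_rate(LINE_RATE, payload, coded)
    print(f"{name}: {100 * payload / coded:.1f}% efficient, {rate:.2f} Gb/s of data")
```

At the same line rate, 64b/66b hence delivers roughly 21% more data than 8b/10b.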

Serial Lite

Altera has a series of protocols and IP cores sharing the name Serial Lite:

SerialLite II: Packet-based or non-packet interface for receiving and transmitting data (named "Atlantic Interface"). Supports retransmission in response to bit errors. Applicable to early FPGAs only (ranging from Arria II to series-V FPGAs).
Serial Lite III: Supports Continuous and Burst mode. In Continuous mode, the data stream can flow uninterrupted and without gaps between the transmitter and receiver. This requires that both sides rely on the same reference clock. This protocol is internally based upon Interlaken, but without supporting channels or SOP/EOP. Accordingly, the interface's data width is 64 bits multiplied by the number of MGTs used. Bit errors on the physical link are reported as a diagnostic event with respect to the MGT that caused them, but not in relation to a specific data segment. Applicable to series-V FPGAs and series-10 FPGAs.
Serial Lite IV: Based upon Avalon streaming interface with signals for start and end of packets (MAC). A CRC is optionally inserted at the end of packets by the transmitter and checked by the receiver. There is also a Basic mode, without division into packets. Applicable to Stratix 10 and Agilex E-tile.

The IP cores are not compatible across different members of this series of protocols. Only SerialLite II may initiate retransmissions.

RapidIO

RapidIO is a packet-based protocol that is similar to PCIe in the sense that the packet types it supports correspond to operations required by a CPU, among others:

Write: Writes data to an address. There are several variants to this operation, one of which requires a response indication that the operation has been completed.
Read: A request to read data from an address. A response packet is sent by the target to this request.
Atomic read-modify-write: Request to write data to an address after reading the previous value, as an atomic operation. Among the supported operations: Atomic increment and atomic decrement, atomic swap, compare and swap etc.
Several maintenance requests for discovery, control and status. Similar to the PCI protocol's configuration register space, these maintenance requests access capability registers (CAR) and command and status registers (CSR).

The main functional difference between RapidIO and PCIe is that the PCIe protocol requires that a central unit (a Root Complex, usually a CPU) configures all endpoints in the system. A RapidIO system doesn't require a central unit of this sort.

Another difference with PCIe is that retransmission of packets is optional. A retransmission occurs only for packets sent as "reliable traffic" (RT). Packets can also be sent as "continuous traffic" (CT). Such packets are neither acknowledged nor retransmitted.

The RapidIO protocol defines all aspects of the communication, from the electrical specification to packet formats, retransmission and flow control.

RapidIO may not be an attractive candidate for a simple point-to-point connection between two FPGAs, due to the complexity of this protocol. RapidIO may be more suitable as an interconnect between several FPGAs through a switch, in particular if PCIe is unsuitable because of its need for a CPU in the system.

Summary

Several protocols for transmitting application data were presented. Each protocol presents a different method for implementing the transport of data, controlling its flow and responding to bit errors (if at all). The correct protocol for an application is a balance between the features that the protocol offers and the effort needed for implementing the missing parts in the application logic.

This wraps up the second page in this series about MGTs. The next page introduces some encoding methods often used with MGTs.