This is the third post in a series of four about Partial Reconfiguration, or Dynamic Function eXchange (DFX) with Xilinx' Vivado. While the previous two show how to set up the FPGA design, this one discusses how the static logic may cope with the reconfigurable logic vanishing and reappearing.
Overview
The process of loading a partial bitstream is similar to hot swapping of physical hardware: A certain part is removed abruptly, replaced by another, and powered on. This post discusses the means necessary to ensure that this transition is made smoothly and reliably.
This post assumes that the Partial Reconfiguration is carried out by static logic on the FPGA itself. To ensure smooth operation, it should be carried out with these stages (explanations follow):
- Bring the reconfigurable logic to a safe state, so resetting it doesn't confuse anything affected by its outputs.
- Activate the reset signals for the reconfigurable logic.
- Initiate decoupling.
- Load the partial bitstream into the ICAP.
- Wait for the configuration's STARTUP sequence to finish.
- Disable decoupling.
- Deactivate the reset signals for the reconfigurable logic.
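The stages above map naturally onto a small state machine in the static logic. The following is only a sketch under stated assumptions: every port name here (rm_safe, load_done, eos_sync and so on) is hypothetical, and the logic that feeds the ICAP and the synchronizer for EOS are assumed to exist elsewhere.

```verilog
// Sketch of a sequencer for the reconfiguration stages. All names are
// hypothetical; adapt them to the actual design.
module reconfig_sequencer
  (
   input      clk,
   input      rst,          // Reset of the static logic itself
   input      start,        // Pulse: request a reconfiguration
   input      rm_safe,      // Reconfigurable module reports a safe state
   input      load_done,    // ICAP feeder: last bitstream word written
   input      eos_sync,     // EOS, already synchronized to clk
   output reg rm_quiesce,   // Ask the reconfigurable module to wind down
   output reg rm_reset,     // Reset to the reconfigurable module
   output reg decouple,     // Decouple the module's outputs
   output reg load_start    // Pulse: begin feeding the ICAP
   );

   localparam ST_IDLE     = 3'd0,
              ST_QUIESCE  = 3'd1,
              ST_RESET    = 3'd2,
              ST_DECOUPLE = 3'd3,
              ST_LOAD     = 3'd4,
              ST_WAIT_EOS = 3'd5,
              ST_RECOUPLE = 3'd6;

   reg [2:0] state;

   always @(posedge clk)
     if (rst) begin
        state      <= ST_IDLE;
        rm_quiesce <= 1'b0;
        rm_reset   <= 1'b0;
        decouple   <= 1'b0;
        load_start <= 1'b0;
     end else begin
        load_start <= 1'b0; // Default, so load_start is a one-cycle pulse

        case (state)
          ST_IDLE:     if (start) begin
                          rm_quiesce <= 1'b1;
                          state <= ST_QUIESCE;
                       end
          ST_QUIESCE:  if (rm_safe) begin
                          rm_reset <= 1'b1;  // Activate reset
                          state <= ST_RESET;
                       end
          ST_RESET:    begin
                          decouple <= 1'b1;  // Initiate decoupling
                          state <= ST_DECOUPLE;
                       end
          ST_DECOUPLE: begin
                          load_start <= 1'b1; // Load the partial bitstream
                          state <= ST_LOAD;
                       end
          ST_LOAD:     if (load_done)
                          state <= ST_WAIT_EOS;
          ST_WAIT_EOS: if (eos_sync) begin    // STARTUP has finished
                          decouple <= 1'b0;   // Disable decoupling
                          state <= ST_RECOUPLE;
                       end
          ST_RECOUPLE: begin
                          rm_reset   <= 1'b0; // Deactivate reset
                          rm_quiesce <= 1'b0;
                          state <= ST_IDLE;
                       end
          default:     state <= ST_IDLE;
        endcase
     end
endmodule
```

Note that the reset stays active from before the decoupling until after the recoupling, which follows the order of the list. How long each stage should last in a real design depends on the logic involved.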
The need for decoupling
During the process of loading the partial bitstream, the connections between the static logic and reconfigurable logic are in an unpredictable state. As a result, the reconfigurable module's output ports may generate random patterns or illegal values. This doesn't necessarily occur on every output port all the time, but odds are that some kind of odd behavior will be seen.
As the static logic continues to function regardless of loading the partial bitstream, it's necessary to ignore the unpredictable signals that may arrive from the reconfigurable module, in order to avoid any adverse effects. Xilinx' user guide, UG909, refers to this as decoupling.
How to implement decoupling depends on the nature of the reconfigurable module's output ports: The influence of each of these ports on the static logic should be analyzed, and preventive actions should be taken as necessary. This could for example involve multiplexing of the output ports with neutral values during the reconfiguration, adding clock-enable to logic that should not respond, or holding some parts of the static logic in reset.
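As a minimal sketch of two of these techniques, the module below multiplexes a data/valid pair with neutral values, and uses a clock enable to freeze a register that samples an interrupt-like signal. The port names and the 32-bit width are made up for illustration, and which values count as "neutral" depends entirely on what the static logic does with them.

```verilog
// Decoupling sketch, intended to reside in the static logic.
// All port names and widths are hypothetical.
module decouple_outputs
  (
   input         clk,
   input         decouple,   // High during reconfiguration
   input  [31:0] rm_data,    // Outputs of the reconfigurable module
   input         rm_valid,
   input         rm_irq,
   output [31:0] data,       // What the rest of the static logic sees
   output        valid,
   output reg    irq_seen
   );

   // Multiplexing with neutral values: no data is ever flagged as
   // valid while the module is being replaced.
   assign valid = decouple ? 1'b0  : rm_valid;
   assign data  = decouple ? 32'd0 : rm_data;

   initial irq_seen = 1'b0;

   // Clock enable: this register holds its last value during
   // reconfiguration instead of sampling garbage.
   always @(posedge clk)
     if (!decouple)
       irq_seen <= rm_irq;
endmodule
```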
If reconfigurable logic is connected directly to I/O pads, it might be necessary to put these in high-Z mode or deactivate the I/O logic's clock enable (if that I/O logic includes an output register). Also, if the reconfigurable logic is connected to external components, it may be necessary to bring these components to a safe state before reconfiguration.
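For the high-Z option, a sketch could look like the following. It assumes that the tri-state control is implemented in the static logic and that the pad signal connects directly to a top-level port, so that the tools can infer a tri-state I/O buffer. The signal names are hypothetical.

```verilog
// Sketch: force an I/O pad to high-Z during reconfiguration.
// Assumes `pad` reaches a top-level inout port. Names are made up.
module pad_decouple
  (
   input  decouple,     // High during reconfiguration
   input  rm_pad_out,   // Output value from the reconfigurable module
   input  rm_pad_oe,    // Output enable from the reconfigurable module
   inout  pad
   );

   // The pad is driven only when the module asks for it AND the
   // module is not being reconfigured.
   assign pad = (rm_pad_oe && !decouple) ? rm_pad_out : 1'bz;
endmodule
```

Whether high-Z is actually a safe level depends on what's connected to the pin: An external pull-up or pull-down resistor may be needed to keep the line at a defined level.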
A Partial Reconfiguration Decoupler IP is available in Vivado's IP Catalog for decoupling of AXI connections.
The reconfigurable module's input ports need no such treatment — the static logic doesn't care that the signals it drives are not consumed.
Regarding Ultrascale devices, decoupling is required before loading "clearing bitstreams", since they shut down the reconfigurable logic.
A few words on the STARTUP sequence
One important part in the bitstream (full as well as partial) is the START configuration command, which kicks off the STARTUP sequence. This sequence involves a couple of mechanisms for the purpose of a consistent bring-up of the logic.
The first part is that the synchronous elements are assigned their default values at the end of the configuration. This is always true for Ultrascale FPGAs and later. On 7-series FPGAs, it's true for a full configuration (i.e. with an initial bitstream), and for Partial Reconfiguration if RESET_AFTER_RECONFIG is enabled.
The second part is GWE (Global Write Enable, not to be confused with synchronous elements' enable inputs or write-enable inputs): This signal allows flip-flops and RAMs to change values. GWE is held low on the entire FPGA during full configuration, and changes to high at some stage during the configuration startup sequence. When loading a partial bitstream, only the reconfigured logic is affected.
Naturally, the change of GWE is asynchronous with respect to any clock that is provided by the application logic, so the timing between this change and the first effective clock is unpredictable regarding any synchronous element.
In a Partial Reconfiguration scenario, just like with a full configuration, this means that all synchronous elements will have their initial value immediately after the process has finished (with Ultrascale and later, or if RESET_AFTER_RECONFIG is set). However, there is a random possibility that some synchronous elements will respond to the application logic's first clock cycle, while other elements won't. This random behavior depends on when this first clock edge arrives relative to when GWE changes to high. It's therefore important to apply reset properly to all logic that is sensitive to such uncertainty.
The need to reset reconfigurable logic after loading the partial bitstream is the same as after a full configuration of the FPGA. It is, however, more intuitive that resetting is required in the case of a full configuration, in particular because the reset is often held active until some logic elements have stabilized (e.g. MMCMs or PLLs are locked, external hardware is ready etc.).
To summarize, there is no single answer on whether to reset the reconfigurable logic, and which parts of it. Just like a full configuration, the synchronous elements are given their default values and begin responding to clocks. This is good enough in some situations, and a reset is required in other scenarios.
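Where a reset is indeed required, one common way to make sure that all synchronous elements leave the reset state on the same clock edge is a reset synchronizer: The reset is asserted asynchronously, but released synchronously with the clock of the logic it resets. The following is a generic sketch of this technique, not something specific to DFX. The names are hypothetical, and it's assumed that this module resides in the static logic.

```verilog
// Reset synchronizer sketch: asynchronous assertion, synchronous
// release. Names are hypothetical.
module rm_reset_sync
  (
   input  rm_clk,         // Clock of the reconfigurable logic
   input  rm_reset_req,   // Reset request, from any clock domain
   output rm_reset        // Reset for the reconfigurable logic
   );

   (* ASYNC_REG = "TRUE" *) reg [1:0] sync = 2'b11;

   always @(posedge rm_clk or posedge rm_reset_req)
     if (rm_reset_req)
       sync <= 2'b11;               // Assert immediately
     else
       sync <= { sync[0], 1'b0 };   // Release after two clock edges

   assign rm_reset = sync[1];
endmodule
```

If rm_reset is held active until after GWE has changed to high, the question of which elements responded to the first clock edge becomes irrelevant for the logic that is reset this way.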
Detecting the end of STARTUP
In the stages listed above, the only part that is out of the application logic's control is the STARTUP sequence. It's nevertheless important to know when it's finished.
There's a description of the STARTUP sequence in the Configuration Guide for each FPGA family, but to make a long story short, the time this sequence takes depends very much on the options of the bitstream. For example, the sequence can be configured to wait for MMCMs to lock, or for DCIs to complete their impedance matching.
The FPGA supplies a signal, End Of Startup (EOS), which changes to high at the last stage of the STARTUP sequence (i.e. when this sequence is finished). Relying on EOS is the formally correct way to tell when to bring the reconfigurable logic back, that is, when to recouple the output ports and release the reset.
The EOS signal is available only from inside the logic fabric, with an instantiation of a STARTUPE2 primitive, possibly as follows:
wire eos;

// Only the EOS output is used; all inputs are tied to constants
STARTUPE2 #(.PROG_USR("FALSE")) startup_ins
  (
   .CLK(1'b0),
   .GSR(1'b0),
   .GTS(1'b0),
   .KEYCLEARB(1'b1),
   .PACK(1'b0),
   .USRCCLKO(1'b0),
   .USRCCLKTS(1'b0),
   .USRDONEO(1'b1),
   .USRDONETS(1'b1),
   .CFGCLK(),
   .CFGMCLK(),
   .EOS(eos),       // High when the STARTUP sequence has finished
   .PREQ()
   );
So when the bitstream has finished loading into the ICAP, wait for EOS to become high, and then recouple and release the reset, as in the stages listed above.
Ultrascale FPGAs have a STARTUPE3 primitive instead. However, Vivado accepts STARTUPE2 primitives for these FPGAs too, and translates them correctly into STARTUPE3. So the code example above covers all FPGA families.
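Since EOS is generated by the configuration logic, it should presumably be treated as asynchronous to the application logic's clock. A minimal sketch of bringing it into that clock domain with a plain two-stage synchronizer follows. The module and port names are made up for this example.

```verilog
// Sketch: synchronize EOS to the application logic's clock.
// Names are hypothetical.
module eos_synchronizer
  (
   input  clk,       // Clock of the logic that controls reconfiguration
   input  eos,       // From the STARTUPE2 primitive
   output eos_sync   // Safe to use in clk's domain
   );

   (* ASYNC_REG = "TRUE" *) reg [1:0] sync = 2'b00;

   always @(posedge clk)
     sync <= { sync[0], eos };

   assign eos_sync = sync[1];
endmodule
```

The logic that waits for EOS should look at eos_sync only after the last word of the bitstream has been written to the ICAP, so that a high level left over from before the reconfiguration isn't mistaken for the end of the new STARTUP sequence.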
I made a few anecdotal tests to measure how long it takes for EOS to change to high after the arrival of the bitstream's START command to the ICAP.
With Kintex-7, with default settings of the bitstream, this took 26 clock cycles (at 100 MHz, hence ~260 ns). Since there were additional data in the bitstream, among others NOPs, it's quite possible that EOS changed to high very soon after the last word of the bitstream was fed into the ICAP.
But the same test with a Kintex Ultrascale FPGA yielded completely different results: It randomly took EOS somewhere between 0.8 ms and 4.5 ms to change to high after the START command.
Even though it's quite easy to use the STARTUPE2 primitive as shown above, it's also possible to kick off the recoupling a fixed time after the bitstream has finished loading. For example, it's quite unthinkable that the STARTUP sequence would take as long as 100 ms, and yet it's a practically unnoticed delay for a human. However, considering the two test results given above, it seems like using the STARTUPE2 primitive is the safe way to go.
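For completeness, the fixed delay alternative could be sketched as a simple counter, as follows. The 100 MHz clock frequency and the 100 ms delay are taken from the examples in this post. The module, its parameters and its port names are hypothetical.

```verilog
// Sketch: issue a pulse a fixed time after the bitstream has been
// loaded, instead of waiting for EOS. Names are hypothetical.
module startup_timeout
  #(
    parameter CLK_HZ   = 100_000_000,  // Frequency of clk
    parameter DELAY_MS = 100           // How long to wait
    )
  (
   input      clk,
   input      load_done,      // Pulse: last bitstream word written
   output reg startup_done    // Pulse: the delay has elapsed
   );

   localparam integer DELAY_CYCLES = (CLK_HZ / 1000) * DELAY_MS;

   reg [31:0] count;
   reg        running;

   initial begin
      count        = 0;
      running      = 1'b0;
      startup_done = 1'b0;
   end

   always @(posedge clk) begin
      startup_done <= 1'b0; // Default, so startup_done is a pulse

      if (load_done) begin
         count   <= 0;
         running <= 1'b1;
      end else if (running) begin
         if (count == DELAY_CYCLES - 1) begin
            running      <= 1'b0;
            startup_done <= 1'b1;
         end else
           count <= count + 1;
      end
   end
endmodule
```

With the default parameters, the counter runs for 10,000,000 clock cycles, which is far longer than any of the delays measured above.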
As for Ultrascale FPGAs, EOS' behavior after loading a clearing bitstream is apparently not documented. But in an anecdotal test it remained low after loading the clearing bitstream, and changed to high after the partial bitstream that was loaded afterwards.
The third post ends here. The last post goes behind the scenes of how Vivado handles the relationship between the static logic and reconfigurable logic by virtue of OOCs and DCPs, and how understanding this opens the way to reliably producing partial bitstreams for the Remote Update scenario.