At CadenceLIVE Americas in June, Raghukul presented on Thunder-Bus, a low-latency bridge between Xilinx FPGAs to accelerate system validation. Since Cadence's Protium X1 and X2 are also built using Xilinx FPGAs, Thunder-Bus can be used to connect Protium to standard Xilinx evaluation boards. Raghukul is the architect and designer of Thunder-Bus.
One attractive way to validate IP is to use emulators such as Palladium Z1 or Z2. However, this approach has drawbacks. One, of course, is price. There are also issues with external I/O interfaces, and with benchmarking, since emulators are slower than the actual FPGA fabric. An alternative is to use commercially available Xilinx boards, such as the ZCU102, which offer FPGAs with Arm cores and AXI fabric interfaces. For low and medium gate-count validation, this is probably ideal.
However, for multi-million gate IP prototyping, the design will not fit on a single board. This creates challenges in partitioning the design and in handling clock and debugger synchronization across multiple boards. One contributor to the solution is Thunder-Bus, which provides chip-to-chip connectivity between Xilinx FPGAs. It supports anywhere from 0.5Gb/s to 26Gb/s between the arrays, and also supports Protium (since it is constructed with Xilinx arrays). Thunder-Bus seamlessly connects different types of Xilinx arrays (such as UltraScale and UltraScale+). It is full duplex, so two arrays can send to each other in both directions simultaneously. It is currently in what Raghukul calls "beta+" pre-production.
The above slide shows an example of how Thunder-Bus can be used. The diagram in the middle shows the two IP blocks (in pink). The design is partitioned across two standard FPGA boards, with all the connectivity between them handled by Thunder-Bus.
Here's a more concrete example of a Versal AIE (artificial intelligence engine) connected to a Protium S1 using Thunder-Bus. The S1 contains 8 FPGAs and they are approximately 60% utilized. Thunder-Bus is running with 4 lines at 10Gb/s line rate streaming traffic between cores and processor. This configuration was used for all IP validation and development of the software stack.
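To put the link configuration above in numbers, here is a rough back-of-the-envelope sketch of the aggregate bandwidth of four lines at 10Gb/s. The function name and the `encoding_efficiency` parameter are my own, for illustration only. The presentation did not say what line encoding Thunder-Bus uses, so the 64b/66b figure below is purely an assumption to show how encoding overhead would reduce the usable payload rate.

```python
def aggregate_gbps(lines: int, line_rate_gbps: float,
                   encoding_efficiency: float = 1.0) -> float:
    """Return one-direction throughput in Gb/s for a multi-line serial link.

    encoding_efficiency is the fraction of line bits that carry payload.
    The default of 1.0 gives the raw line-rate total with no overhead.
    """
    if lines <= 0 or line_rate_gbps <= 0:
        raise ValueError("lines and line_rate_gbps must be positive")
    if not 0 < encoding_efficiency <= 1:
        raise ValueError("encoding_efficiency must be in (0, 1]")
    return lines * line_rate_gbps * encoding_efficiency


# Configuration described in the post: 4 lines at 10Gb/s line rate.
raw = aggregate_gbps(4, 10.0)

# ASSUMPTION: 64b/66b encoding, chosen only as a familiar example.
# The actual Thunder-Bus encoding was not stated.
payload = aggregate_gbps(4, 10.0, encoding_efficiency=64 / 66)

print(f"raw aggregate:       {raw:.1f} Gb/s per direction")
print(f"payload (if 64b/66b): {payload:.1f} Gb/s per direction")
```

Since Thunder-Bus is full duplex, these figures apply to each direction independently.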
In the future, Xilinx is looking towards x86-based validation: running QEMU on the x86 host to execute Arm code, and connecting through Thunder-Bus to the Protium system that holds the design in its FPGAs.
Key takeaways: We can use this technology with commercially available prototype boards. The performance using Thunder-Bus is better than existing chip-to-chip solutions. The asynchronous nature of Thunder-Bus means that clocks do not need to be synchronized between different arrays. If necessary, additional custom boards can be added to extend I/O connectivity, which helps with early compliance and interoperability testing. Together, this makes an ideal platform for developing the full software stack.
This work was done using the older Protium S1 platform. The current model of Protium is the Protium X2 (on the left in the picture below, with the X1 on the right). For full details, see the Protium Prototyping page.