Bus bridges and other odds and ends
The bus components and bridges within this repository are unique in that they are all designed for 100% throughput with no throughput overhead. They are also unique in that the vast majority of the cores within have all been formally verified.
Where the protocol allows it, such as with AXI4, AXI-lite, and Wishbone B4 pipelined, multiple transactions may be in flight at a time so that protocol handling doesn't stall the bus.
This is uncommon among AXI4 implementations and almost unheard of in the example AXI-lite implementations I have examined.
Most AXI4 implementations will process a single burst transaction packet at a time and require some overhead to make this happen. Xilinx's AXI-lite implementations, both interconnect and slave implementations, only handle one request at a time. Other buses, such as Wishbone Classic, AHB, or APB, will only ever process one transaction word at a time.
If you are coming from AXI4, AXI-lite, or any one of these other bus implementations to the AXI4 or even AXI-lite components supported here, then you should expect to see a throughput increase by using one (or more) of the cores listed here--given of course that you have a bus master capable of issuing multiple requests at a time.
This performance improvement may be as significant as a 16x speedup when toggling an I/O, a 4x speedup when comparing this slave against Xilinx's block RAM memory controller (when processing single beat transactions), or as insignificant as 2% improvement from using the AXI4 MM to Slave converters (according to Xilinx's data sheets---I haven't yet run the test myself). This increased performance extends to the crossbar implementations contained within this repository as well, and so you may notice the improvement only increases when using these crossbars.
Built out of necessity, this repository was originally built around a Wishbone (WB) to AXI4 bridge, which is designed to provide a conversion from a (simpler) pipelined wishbone bus to an AXI4 bus for the purposes of driving memory transactions through Xilinx's SDRAM controllers. The WB->AXI bridge is designed to connect a wishbone bus to an AXI bus which may be wider--such as from a 32-bit WB bus to a 128-bit AXI bus. Hence, if the Memory Interface Generator DDR3 controller is running at a 4:1 clock rate, memory clocks to AXI system clocks, then this bus translator should be able to accomplish one transaction per clock at a sustained (pipelined) rate (neglecting any stalls due to refresh cycles).
Since the initial build of the core, I've added the WB to AXI lite bridge. This is also a pipelined bridge, and like the original one it has also been formally verified.
While not the original purpose of the project, it now has both AXI-lite to WB and AXI to WB bridges. Each of these bridges comes in two parts, a read and write half. These halves can be used either independently, generating separate inputs to a WB crossbar, or combined through a WB arbiter.
The AXI-lite to WB bridge has been both formally verified and FPGA proven. This includes both the write half as well as the read half. Given the reluctance of the major vendors to support high speed AXI-lite interfaces, you aren't likely to find this kind of performance elsewhere.
The AXI to WB bridge write and read components have only been formally verified through about a dozen steps or so. This proof is deep enough to verify most of the bus interactions, but not nearly deep enough to verify any issues associated with internal FIFO overflows.
There's also a Wishbone (pipelined, master) to Wishbone (classic, slave) bridge, as well as the reverse Wishbone (classic, master) to Wishbone (pipelined, slave) bridge. Both of these have passed their formal tests. They are accompanied by a set of formal properties for Wishbone classic, both for slaves as well as masters.
I'm now in the process of adding AXI3 bridges to this repository. These will be necessary for working with the Zynq chips, and others, that are still using AXI3. While the work is ongoing, I do have an AXI3 to AXI4 bridge available that's undergoing testing. The bridge supports two algorithms for
W*reordering, and should be suitable for most applications.
Currently, the project contains formal specifications for Avalon, Wishbone (classic), Wishbone (pipelined), APB, and AXI-lite buses. There's also a (partial) formal property specification for an AXI (full) bus, but the one in the master branch is incomplete. The complete set of AXI properties are maintained elsewhere. These properties, and the cores they've been used to verify, have all been tested and verified using SymbiYosys.
The formal properties were first tested on a pair of Xilinx AXI demonstration cores. These cores failed formal verification. You can read about them on my blog, at zipcpu.com, here for AXI-lite and here for AXI. You can find the Xilinx cores referenced in those articles here and here for reference, for those who wish to repeat or examine my proofs.
A firewall is a guarantor: given an interface, of which only one side is trusted, the firewall guarantees the other side can trust the interface. More than that, a firewall can be used to trigger an in-circuit logic analyzer: if ever the interface rules are violated, the firewall will set an ouput fault indicator, which can then be used to trigger the logic analyzer. On top of that, the firewalls below are also built with an optional reset, allowing the design to safely return to functionality after triggering. In many cases, this requires resetting the downstream (untrusted) component.
AXILSAFETY also has a mode where, once a fault has been detected, the slave is reset and then allowed to return to the bus infrastructure again until its next fault.
This core has been formally verified.
As with the AXILSAFETY example, the AXISAFETY firewall also has a mode where, once a fault has been detected, the slave is reset and allowed to return to the bus infrastructure until its next fault. Unliike the AXILSAFETY example, this one will only ever process a single AXI4 burst at a time.
This core has been formally verified.
This core has been formally verified.
This core has been formally verified.
This repository has since become a repository for all kinds of bus-based odds and ends in addition to the bus translators mentioned above. Some of these odds and ends include crossbar switches and AXI demonstrator cores. As mentioned above, these cores are unique in their 100% throughput capabilities.
This core has been formally verified and used in several designs.
Nmaster to
Mslave AXI-lite crossbar interconnect. As such, it permits
min(N,M)active channel connections between masters and slaves all at once. This core also has options for low power, whereby unused outputs are forced to zero, and lingering. Since the AXI protocol doesn't specify exactly when to close a channel, there's an
OPT_LINGERallowing you to specify how many cycles the channel should be idle for in order for the channel to be closed. If the channel is not closed, a clock can be spared when reusing it. Otherwise, two clocks will be required to access a given channel.
This core has been formally verified.
While I haven't tested Xilinx's interconnect to know, if the quality of their demonstration AXI-lite slave core is any indication, then this cross-bar should easily outperform anything they have. The key unusual feature? The ability to maintain one transaction per clock over an extended period of time across any channel pair. (Their crossbar artificially limits AXI-lite interfaces to one transaction at a time.)
While this isn't really intended to be a high performance core, it can still handle 100% throughput like most of my IP here. Therefore, anything less than 100% throughput through this core will be a test of and reflection of how the rest of your system works.
This core has been formally verified.
This core has been formally verified.
This core has been formally verified.
This core has been formally verified.
This core has been formally verified.
Unique to this (full) AXI crossbar is the ability to have multiple ongoing transactions on each of the master-to-slave channels. Were Xilinx's crossbar to do this, it would've broken their demonstration AXI-full slave core.
This core has been formally verified and used in several designs.
This core has been formally verified.
Nstream inputs, select from among them to produce a stream output. Guarantees that the switch takes place at packet boundaries. Provides an AXI-lite interface for controlling which AXI stream gets forwarded downstream.
This core has been formally verified.
This core has been formally verified. While not used in any designs per se it has formed the basis for many AXI-lite designs.
Unlike Xilinx's demonstration AXI4 slave core, this one can handle 100% loading on both read and write channels simultaneously. That is, it can handle one read and one write beat per channel per clock with no stalls between bursts if the environment will allow it.
This core has been formally verified and used in several designs.
This core has been formally verified and used in several designs.
This core has been formally verified and checked in simulation.
Both this core and the one above it depend upon all stream words being aligned to the stream.
This core has been both formally verified and checked in simulation.
This core has been formally verified and used in several designs.
This core has not yet been verified in any manner, and is likely to still contain many bugs within it until that time. Use it at your own risk.
This core has been formally verified, and used successfully in a simulation based demonstration.
This particular version can only handle bus aligned transfers.
This core has been formally verified.
You can find a demonstration of this core being used in my VGA simulator--supporting both VGA and HDMI outputs.
AXISINGLE is a (to be written) bus simplifier core along the lines of the AXILSINGLE, AXILDOUBLE and AXIDOUBLE cores, in that it can handle all of the bus logic for multiple AXI slaves while simplifying the bus interactions for each but at no throughput penalty. Once built, this will also be an AutoFPGA companion core. Slave's of type "SINGLE" (one register, one clock to generate a response) can be ganged together using it. This core will then essentially turn an AXI core into an AXI-lite core, with the same interface as AXILSINGLE above. When implemented, it will look very similar to the AXIDOUBLE core mentioned below.
AXIDOUBLE is the second AXI4 (full) companion to AutoFPGA's AXI4 (full) support. It's purpose is to simplify connectivity logic when supporting multiple AXI4 (full) slaves. This core takes a generic AXI4 (full) interface, and simplifies the interface so that peripherals can be connected to it with a minimal amount of logic. These peripherals cores can either be full AXI4 (full) cores in their own respect, subject to simplification rules discussed within, simplified AXI-lite slave as one might use with AXILDOUBLE, or even simpler than that. Key to this simplification is the assumption that the simplified slaves must never stall the bus, and that they must always return responses within one clock cycle. The AXIDOUBLE core handles all backpressure issues, ID logic, burst logic, address selection, invalid address return and exclusive access logic.
This core has been formally verified.
This core has been formally verified.
This core has been formally verified.
AXIXCLK can be used to cross clock domains in an AXI context. As implemented, it is little more than a set of asynchronous FIFOs applied to each of the AXI channels. The asynchronous FIFOs have been formally verified,
AXISRANDOM is a quick AXI stream source generating random numbers via a linear feedback shift register.
There are now two APB cores in this repository:
This core has been formally verified.
This core has been formally verified.
This repository is licensed under the Apache 2 license.
I'd like to thank @wallento for his initial work on a Wishbone to AXI converter, and his encouragement to improve upon it. While this isn't a fork of his work, the initial pipelined wishbone to AXI bridge which formed the core seed for this project took its initial motivation from his work.
Many of the rest of these projects have been motivated by the desire to learn and develop my formal verification skills. For that, I would thank the staff of Symbiotic EDA for their tools and their encouragement.