1 Introduction

The Programmable Logic (PL) of the FPGA provides the flexibility to move data using AXI Masters such as DMA or custom IP. When Xen is running on ZynqMP, the SMMU is used to control data movement from the PL to any guest domains. This page describes the details needed to make an AXI Master in the PL work with a guest running in Xen. The focus is on bare metal guests running on Xen, with testing based on the 2017.1 release.

This information is based on early prototyping, so some questions remain open and changes are likely. This page also assumes the user understands Xen and the process of passing through a device to a guest domain.

2 TBUs

The SMMU implements a TBU for sets of masters based on the PS port. Refer to the Zynq MPSOC TRM for more TBU details. The following table specifies the TBU number associated with each port of the Zynq MPSOC PS.
TBU Number | PS Port

3 Stream IDs

Many of the complexities of a PL Master revolve around stream IDs. The details are explained in the following subsections.

3.1 AXI IDs and Master IDs

The AXI protocol includes two kinds of AXI transaction identifiers, AXI IDs and Master IDs. The Master ID is used to uniquely identify the master that initiated a transaction. AXI IDs are used to identify separate transactions that must be processed in order, for example when having multiple contexts or threads within a single master.

3.2 Derived Master IDs

On the ZynqMP, some masters have fixed Master IDs. Others have their Master IDs derived by combining a fixed part with AXI IDs generated from the master itself. This may for example allow blocks to differentiate DMA read channels from write channels just by looking at the AXI Master ID.

For transactions crossing from the PL into the PS, at each port a Master ID will be derived from a fixed part combined with the AXI ID. So for multiple PL masters, using the same PL to PS AXI port, the AXI ID is the part that uniquely identifies the PL masters.

For Masters in the PS, partially derived Master IDs are created immediately outside of the respective master. For the PL, this derivation happens at the AXI port at the PL to PS boundary. This has some side-effects that need to be considered.

When multiple masters are connected to slaves through an interconnect, the interconnect may compress the AXI IDs to optimize away unneeded signals, depending on its implementation and its ability to handle out of order traffic. As a result, when interconnects are instantiated in the PL, it may become difficult or even impossible to identify individual PL masters once transactions reach the PL to PS port.

When a transaction reaches the SMMU, the SMMU will derive a Stream ID from the AXI transaction in an implementation specific way. On the ZynqMP this is done by combining the AXI Master ID with the TBU number.

3.3 PL Stream IDs

The Master ID for each AXI master and each port of the PS for the PL are listed in a table titled "Master IDs List" in chapter 16 of the Zynq MPSOC Technical Reference Manual. Each PS port to the PL has a fixed master ID that is used together with the TBU number and an AXI ID to create the stream ID.
TBU Number Bits | Master ID Bits

The following example illustrates the stream ID of 0x200 for the first AXI master of the HPC0 port of the PS.

TBU Number Bits | Master ID Bits
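
As a rough sketch of how these fields combine, the Python snippet below packs and unpacks a stream ID. It assumes a 15-bit stream ID with the TBU number in bits [14:10] and the 10-bit Master ID in bits [9:0], and that the AXI ID from the PL occupies the low 6 bits of the Master ID. That layout is consistent with the 0x200 and 0x874 values used on this page, but it is an assumption that should be checked against the TRM. The function names are illustrative only.

```python
# Assumed field layout (verify against the Zynq MPSOC TRM):
#   stream ID [14:10] = TBU number, [9:0] = Master ID
#   Master ID [9:6]   = fixed part for the PS port, [5:0] = AXI ID from the PL
TBU_SHIFT = 10
MASTER_ID_MASK = 0x3FF
AXI_ID_MASK = 0x3F


def make_stream_id(tbu, port_master_id, axi_id=0):
    """Combine a TBU number, a port's fixed Master ID and an AXI ID."""
    if axi_id & ~AXI_ID_MASK:
        raise ValueError("AXI ID does not fit in the assumed 6 bits")
    master_id = (port_master_id & MASTER_ID_MASK & ~AXI_ID_MASK) | axi_id
    return (tbu << TBU_SHIFT) | master_id


def split_stream_id(stream_id):
    """Return (tbu, master_id) for a stream ID under the assumed layout."""
    return stream_id >> TBU_SHIFT, stream_id & MASTER_ID_MASK


# First and second AXI master behind HPC0, taking TBU 0 and a fixed
# Master ID of 0x200 as implied by the 0x200 example above.
print(hex(make_stream_id(0, 0x200, 0)))
print(hex(make_stream_id(0, 0x200, 1)))
```

Under the same assumption, `split_stream_id(0x874)` separates the gem0 stream ID used elsewhere on this page into a TBU number and a Master ID.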

3.4 Stream IDs in the Device Tree

For Xen based systems the stream IDs of PL masters must be added to the Linux Dom0 device tree. Automated device tree generation does not generate the stream IDs for PL masters, so the user must add them to the device tree manually. The device tree is used both by Xen and by Linux running as a Dom0 or DomU guest. Linux 4.9 has moved to newer bindings while Xen, as of 2017.1, has not, which can be confusing.

3.4.1 Xen Specific Device Tree Bindings

Xen uses old bindings which include #stream-id-cells at each AXI master node and mmu-masters at the SMMU node. These are required for any DMA device passed through by Xen, regardless of whether it’s for Linux or bare-metal guests.

Linux 4.9 has deprecated the mmu-masters property in the smmu node, and if it sees one, it will disable the SMMU driver. This is the reason that a native Linux device tree does not include the mmu-masters property in the smmu node.

The SMMU device node mmu-masters property contains the stream IDs for each AXI master. The device tree snippet below contains two CDMA IP cores (PL AXI masters) connected to HPC of the PS.

        &smmu {
            status = "okay";
            mmu-masters = < &gem0 0x874
                &gem0 0x874
                &nand0 0x872
                &axi_cdma_0 0x200
                &axi_cdma_0 0x201

3.4.2 Linux Specific Device Tree Bindings

The iommus property within a node is used by Linux natively (without Xen) as of 4.9. The iommus property is contained in each node that is an AXI master.
Xen does not know about the iommus property and requires the mmu-masters property to use the SMMU.

        gem0: ethernet@ff0b0000 {
            #address-cells = <1>;
            #size-cells = <0>;
            #stream-id-cells = <1>;
            iommus = <&smmu 0x874>;

3.5 Multiple Stream IDs in the PL

When there are multiple masters in the PL, such as two CDMAs connected to an AXI Interconnect then connected to the HPC0 port of the PS, the AXI Interconnect creates the AXI IDs that map into the stream ID. At the current time there is not a clear way to determine the AXI ID of a specific master. It is also not clear yet if the AXI ID of a master might change when system changes are done in Vivado.

3.6 Determining the Stream ID of an AXI Master

The dtdev line of the Xen guest domain configuration file is used to set up the SMMU for the device that is being passed through to a guest. It is only required for AXI masters, not for AXI slaves. When this line is left out of the configuration file for an AXI master, Xen will generate a fault when the AXI master performs a transfer, as illustrated below.
root@plnx_aarch64:/mnt# (XEN) smmu: /amba/smmu@fd800000: Unexpected global fault, this could be serious
(XEN) smmu: /amba/smmu@fd800000: GFSR 0x80000002, GFSYNR0 0x00000000, GFSYNR1 0x00000201, GFSYNR2 0x00000000
The fault indicates an unidentified stream ID has been received by the SMMU as shown in the GFSR register. The GFSYNR1 register contains the stream ID that was unidentified. A stream ID of 0x201 was unidentified in the above fault. The description of the SMMU registers is contained in the ARM System Memory Management Unit Architecture Specification.
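
A small sketch of how such a log line could be decoded is shown below. The bit positions (GFSR bit 1 as the unidentified stream fault flag, bit 31 as the multiple-fault flag) are taken from the SMMUv2 register layout and the 15-bit stream ID mask is an assumption for the ZynqMP; both should be checked against the ARM specification and the TRM. The function name is hypothetical.

```python
import re

# Bit positions assumed from the ARM SMMU architecture specification (SMMUv2):
# GFSR bit 1 = USF (unidentified stream fault), bit 31 = MULTI.
GFSR_USF = 1 << 1
GFSR_MULTI = 1 << 31
STREAM_ID_MASK = 0x7FFF  # 15-bit stream ID assumed for the ZynqMP


def decode_global_fault(log_line):
    """Extract the SMMU global fault registers from a Xen log line."""
    pairs = re.findall(r"(GFSR|GFSYNR\d)\s+0x([0-9a-fA-F]+)", log_line)
    regs = {name: int(value, 16) for name, value in pairs}
    if "GFSR" not in regs or "GFSYNR1" not in regs:
        raise ValueError("no SMMU global fault registers found")
    return {
        "unidentified_stream": bool(regs["GFSR"] & GFSR_USF),
        "multiple_faults": bool(regs["GFSR"] & GFSR_MULTI),
        "stream_id": regs["GFSYNR1"] & STREAM_ID_MASK,
    }


# The second line of the fault message shown above
line = ("(XEN) smmu: /amba/smmu@fd800000: GFSR 0x80000002, "
        "GFSYNR0 0x00000000, GFSYNR1 0x00000201, GFSYNR2 0x00000000")
fault = decode_global_fault(line)
print(fault["unidentified_stream"], hex(fault["stream_id"]))
```

The reported stream ID is the value that would need to appear for the AXI master in the mmu-masters property.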

4 Coherent Transfers

Coherency under Xen differs only in minor details. As with a bare metal application without Xen, the following must be done for coherent transfers with an AXI master in the PL:
  1. The memory, typically DDR, must be outer shareable
  2. Coherent transactions must be generated by the AXI Master by setting the AxCACHE signals to a coherent value, typically 0xF or 0xB
  3. The AXI master must be connected to a coherent interface (port) on MPSOC such as HPC0 or HPC1

The CCI of the ZynqMP requires that snooping be enabled for the APU to support coherency. ATF does this in both a Linux based system and a Xen environment, so the bare metal application should not attempt to alter the CCI register; the register cannot be altered from EL1.

5 Non-Coherent Transfers

There is no difference for non-coherent transfers as the bare metal application is still responsible for cache operations.

6 Central Direct Memory Access (CDMA) Prototype Testing

The CDMA IP core was used extensively for testing with one or two instances. It provides an easy prototyping platform when used without scatter gather for simple transfers. The hardware system was configured for simple transfers such that each CDMA IP core only requires one AXI master. The goals of the prototyping were to show working AXI masters in the PL that are interfacing with baremetal applications running on top of Xen.

Note: When multiple CDMA IP cores are used, each core should also be configured to use the store and forward feature so that the cores can perform transfers simultaneously. Without this feature, transfers hang when both cores transfer at the same time. Larger transfers with CDMA were desired for testing. The CDMA has a maximum transfer length of 2 ^ 23 - 1 bytes.
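
To illustrate the length limit, the sketch below splits a large copy into pieces that each fit within a single simple transfer. This is only an illustration of the arithmetic, not the method used in the prototype; the addresses are made up, the helper name is hypothetical, and alignment requirements of the hardware are not considered.

```python
CDMA_MAX_BYTES = 2**23 - 1  # maximum length of a single simple transfer


def split_transfer(src, dst, length, max_bytes=CDMA_MAX_BYTES):
    """Yield (src, dst, nbytes) tuples that together cover `length` bytes."""
    offset = 0
    while offset < length:
        nbytes = min(max_bytes, length - offset)
        yield src + offset, dst + offset, nbytes
        offset += nbytes


# Hypothetical 16 MB copy between two made-up DDR addresses
chunks = list(split_transfer(0x10000000, 0x20000000, 16 * 1024 * 1024))
for src, dst, nbytes in chunks:
    print(hex(src), hex(dst), nbytes)
```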

6.1 Xilinx SDK Details

The SDK bare metal application built on top of a hypervisor only allocates 1 MB of DDR for the application. This can safely be increased in the linker script for the application, provided the amount of memory does not exceed what is specified in the Xen configuration file for the guest. The drivers in the SDK for the IP core were used along with the example applications.

6.2 Hardware System Details

For more than one master in the PL connected to a single PS port (HP, HPC) an AXI interconnect is required. There are now two AXI interconnect IP blocks in Vivado. A newer AXI interconnect called AXI SmartConnect is available while the older AXI Interconnect is still available. Vivado may use the AXI SmartConnect as the default interconnect with connection automation.

6.3 AXI SmartConnect

Do not use AXI SmartConnect if there is more than one master in the PL connected to a PS port (HP, HPC) while running with the SMMU in a Xen environment. This IP does not support AXI IDs that are needed for SMMU support and Xen. AXI SmartConnect has been tested with one PL master when AXI IDs are not an issue.

6.4 AXI Interconnect

The AXI Interconnect provides the required AXI IDs needed for using the SMMU in a Xen environment when configured properly. This is the preferred solution.

6.4.1 AXI Data Width

The AXI Interconnect incorporates data sizers when the data bus width varies between connected masters and slaves. The data sizers can remove IDs from the AXI interface. The best practice is to configure the master IP and the slave IP to have the same data width such that the AXI Interconnect is configured with only a crossbar connect without any data size changes.

6.5 AXI Transactions for the SMMU

The SMMU is designed to be used in a non-secure system, so AXI transactions generated by a master in the PL must be non-secure. The master IP may not allow the transaction type (secure or non-secure) to be specified, in which case the AXI signals need to be tied off manually. Security is specified in the AxPROT signals, with a value of 0x2 meaning non-privileged, non-secure, data. Some questions remain around privileged vs non-privileged transactions; early prototyping has shown non-privileged to function correctly.
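
As a quick check of the 0x2 value, the following sketch decodes an AxPROT value using the bit meanings from the AXI specification (bit 0 privileged, bit 1 non-secure, bit 2 instruction). The function name is illustrative.

```python
# AxPROT bit meanings per the AXI specification
AXPROT_PRIVILEGED = 1 << 0   # 1 = privileged, 0 = non-privileged
AXPROT_NONSECURE = 1 << 1    # 1 = non-secure, 0 = secure
AXPROT_INSTRUCTION = 1 << 2  # 1 = instruction, 0 = data


def decode_axprot(value):
    """Return the three AxPROT attributes as booleans."""
    return {
        "privileged": bool(value & AXPROT_PRIVILEGED),
        "non_secure": bool(value & AXPROT_NONSECURE),
        "instruction": bool(value & AXPROT_INSTRUCTION),
    }


print(decode_axprot(0x2))
```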

Note: non-secure transactions will not work for bare metal applications that are not running on Xen, because the Xilinx SDK bare metal (standalone) BSP executes at EL3. To support two configurations, one running bare metal as a Xen DomU and the other running bare metal directly on the hardware at EL3 (without Xen), a GPIO can be used to control the AxPROT signals.

7 Checklist for AXI Masters

  1. Is AXI Interconnect used rather than AXI SmartConnect when multiple AXI masters are connected to an interconnect?
  2. Is the AXI master generating non-secure transactions when running on Xen, or secure transactions for bare metal without Xen?
  3. Is the AXI interconnect configured for the same data width for the slaves and masters when multiple masters are connected to a single interconnect?
  4. Is the correct stream ID used in the device tree for Xen?
  5. Is the dtdev line of the Xen configuration correct, referring to the correct node?
  6. Is the iomem line of the Xen configuration correct for the address and range of the device being passed through?
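
For the last item, a small helper such as the one below can reduce arithmetic mistakes when writing the iomem line. It assumes the xl configuration format of a start page frame number and a page count, both in hexadecimal, with 4 KB pages. The device address and size in the example are hypothetical and must be replaced with the values from the Vivado address map.

```python
PAGE_SIZE = 4096  # 4 KB pages assumed


def iomem_entry(base_addr, size_bytes):
    """Format an iomem entry as "START_PFN,NUM_PAGES" in hex page frames."""
    if base_addr % PAGE_SIZE:
        raise ValueError("device base address is not page aligned")
    num_pages = -(-size_bytes // PAGE_SIZE)  # round up to whole pages
    return "0x%x,%x" % (base_addr // PAGE_SIZE, num_pages)


# Hypothetical PL device at 0xA0000000 with a 64 KB register range
print(iomem_entry(0xA0000000, 0x10000))
```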