Overview of Unsupervised AMP configurations for both Zynq-7000 and Zynq UltraScale+ MPSoC.

1. Unsupervised AMP

Unsupervised AMP refers to a concept where multiple operating systems or bare-metal applications run on individual CPU cores within a CPU-core cluster without an underlying hypervisor. An additional description can be found at OpenAMP GIT.

Developers hope that these unsupervised solutions will be easier to implement and will provide higher performance and more flexibility than solutions that rely on hypervisors. However, as we'll discuss below, these perceptions depend closely on the hardware capabilities of the host processor.

2. Unsupervised AMP on Arm Cortex A9

The level of support and documentation for unsupervised AMP by ARM, Xilinx, and other silicon vendors varies as described below.

2.1 Arm (Cortex A9)

As shown by the excerpts below from the ARM Reference Manual for Cortex A9, some capabilities were incorporated within the Cortex A9 IP in anticipation of unsupervised AMP configurations. Although these capabilities are present in the IP, the only Arm posting that we've been able to find on this subject is here: ARM Forum on Cortex A9 AMP




2.2 Xilinx (Cortex A9)

The Xilinx bare-metal application flow is fully supported by Xilinx-provided drivers and libraries as well as by our development tools, and a significant percentage of our customers deploy bare-metal applications on Zynq-7000 based designs. In response to those customers who wanted AMP capabilities, we provided the 2013 application notes linked below:

Since 2016, Xilinx has extended our solutions to use the OpenAMP framework, which leverages existing Linux software and services for both life-cycle management and communication between the two Zynq-7000 cores. Details may be found in UG1186, Zynq OpenAMP Getting Started Guide.

2.3 Other Silicon Vendors (Cortex A9)

In approximately 2014, a company named Embest offered an AMP solution for a subset of commercially available Cortex A9 based SoCs that was very similar to the application notes provided by Xilinx. The engineer who did that AMP work recently told us that he is available to do similar work for Xilinx Zynq-7000 devices from his current company, Sihid Technology (see the page in Chinese or the English translation from Google).

3. Unsupervised AMP on Cortex A53

Summary: Unsupervised AMP on Cortex A53 is neither recommended nor supported.

3.1 Arm (Cortex A53)

Although section 14.1.5 of the ARM Cortex-A Series Programmer’s Guide for ARMv8-A does indicate that unsupervised AMP is a valid system configuration, we are unaware of any other publicly accessible documentation, forums or information which addresses how such an AMP configuration might be designed or implemented. In addition, unlike the ARM Cortex A9 reference manual (excerpted above), the term "AMP" does not appear anywhere within the 6500 pages of the ARM v8-A Reference Manual. Finally, we are not aware of any public forums or posts regarding unsupervised AMP on ARM v8 SoCs.

3.2 Xilinx (Cortex A53)

While it may be technically possible to run AMP on individual cores of the Cortex A53 cluster for some very tightly constrained use cases, such configurations face very difficult and design-specific challenges. Therefore, Xilinx neither recommends nor supports unsupervised AMP configurations across the Zynq UltraScale+ MPSoC APU, based on the following:
  • The R5 cores of the Zynq UltraScale+ MPSoC are designed for independent or lockstep operation
    • The Cortex-A53 cluster does not have such proven capability
  • Xilinx supports the standard ARM v8-A programming model where a fully featured OS such as Linux is expected to run at EL1 on ARM Trusted Firmware (ATF) at EL3.
    • Xilinx has not done any work on unsupervised AMP configurations
    • All bare-metal applications are tested and supported by Xilinx on the APU to run only on a single A53 CPU and directly at EL3 (i.e. without ATF).
    • Xilinx has not tested the execution of EL1 bare-metal applications directly on top of ATF
  • There are many possible pitfalls when running multiple bare-metal applications directly on the A53 cores (cache thrashing, cache operations, starvation, deadlocks, livelocks, etc.)

We believe that ready access to hypervisors (See: 3rd Party Software Ecosystem) and the availability of the ARM v8-A hardware virtualization extensions address these same challenges much more simply and elegantly while bringing additional benefits related to platform maintenance as well as isolation and security.
See UG1228 UltraFast Embedded Design Methodology Guide for additional details.

3.2.1 AMP Considerations (Cortex A53)

Recognizing that some customers may still be interested in developing their own in-house, unsupervised AMP configurations, we have highlighted below some key considerations and questions that Xilinx weighed in reaching the decisions above. These may help developers better understand the challenges and define the shortest path to a viable system architecture.
  • Interrupts
    • What interrupt(s) are used by each application?
      • Interrupt response latency
      • I/O latency
      • Determinism
    • Interrupt Controller management
      • Which application/OS will initialize and manage the interrupt controller?
  • Cache
    • Is cache coherency between applications/OS required?
    • What flush, invalidate, or configuration changes do the individual applications/operating systems make to the cache controller?
    • What interactions between application/OS instances will affect cache?
  • MMU
    • How does each application use the MMU?
    • Does any application change TLB attributes?
  • Power Management
    • Does any application make changes to the power state of the device?
  • Clocking
    • Does any application make changes to the clocking in the device?
  • Peripherals
    • Which peripherals are dedicated to each application/OS?
      • Include PS, PL, and CPU-specific IP such as timers
    • Are there any shared peripherals?
      • Interrupt controller, timers, clocks, MMU, etc.
  • Cross Application/OS Communications
    • Does any application/OS communicate with another application/OS?
    • Device or cache-able memory?
  • DMA
    • Does any application use DMA?
  • System Reset(s)
    • Does any application/OS need to reset anything?
  • Dynamic Operations
    • Does the application/OS need to bring the CPU up and down?
      • Even while another CPU continues to run?
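
As an illustration only, the peripheral and interrupt-controller questions above can be captured in a small design-time ownership table and checked mechanically. The sketch below is not a Xilinx tool or flow; the application names, the resource names (including the "gic_init" marker for whichever application initializes the interrupt controller), and both helper functions are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical partition plan: every name below is illustrative only.
# Each application/OS lists the resources it expects to own.
PLAN = {
    "app_core0": {"uart0", "ttc0", "gic_init", "ocm_bank0"},
    "app_core1": {"uart1", "ttc1", "ocm_bank0"},
}


def shared_resources(plan):
    """Return {resource: sorted owners} for resources claimed by more than one application."""
    owners = defaultdict(list)
    for app, resources in plan.items():
        for res in resources:
            owners[res].append(app)
    return {res: sorted(apps) for res, apps in owners.items() if len(apps) > 1}


def sole_owner(plan, resource):
    """Return the single application claiming `resource`, or None if zero or several do."""
    claimants = [app for app, resources in plan.items() if resource in resources]
    return claimants[0] if len(claimants) == 1 else None


if __name__ == "__main__":
    # Resources that need an explicit sharing protocol between applications.
    print(shared_resources(PLAN))
    # Exactly one application should initialize the interrupt controller.
    print(sole_owner(PLAN, "gic_init"))
```

Any resource reported as shared (here the hypothetical "ocm_bank0") is one for which the cache, MMU, and communication questions above need an explicit answer before the design proceeds.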

4. Related Links