How To:


Follow the steps below to implement subsystem restart for the 2017.4 release.

Note: If none of the peripheral idling, recovery, or escalation features are needed, there is no requirement to define the subsystem in the isolation configuration.
PS-only and system restarts can be triggered from Linux directly. However, APU subsystem restart without peripheral idling is not recommended and can have adverse effects.

An Example


A Petalinux project can be created from the base bsp for a specific board. The 2017.4 Petalinux release includes xilinx-zcu102-warm-restart-2017.4.bsp, which is a subsystem restart example bsp.
Creating a Petalinux project from this bsp results in a design that demonstrates many aspects of subsystem restart.
xilinx-zcu102-warm-restart-2017.4.bsp is available to customers through the lounge upon request.

The Vivado project file xilinx-zcu102-warm-restart-2017.4.xpr is included with the bsp and can be found in the hardware directory after running petalinux-create with xilinx-zcu102-warm-restart-2017.4.bsp.


In addition to the features supported by the Xilinx base bsp, the warm restart bsp includes the following features:
  1. A warm restart hdf which defines three subsystems: APU, RPU0 and RPU1. Details on subsystems can be found here.
  2. Peripheral node idling and reset. Details can be found here.
    • Note: There are no custom hooks for PL peripheral idling in the example.
  3. Recovery with the healthy bit escalation scheme.
    • Recovery and escalation are enabled with build flags in the pmu recipe and the warm restart flag in ATF.
    • Includes an application (wdt-heartbeat) to provide heartbeats to the fpd wdt.
  4. XMPU / XPPU based configuration enabled only for DDR and OCM, with a flag in the FSBL recipe. There is no XMPU / XPPU protection for any other components.
  5. A helper script (installed through mpsrm-init) to facilitate setting / unsetting the healthy bit and starting / stopping the wdt-heartbeat application.
  6. RPU application support. An application recipe copies prebuilt RPU applications for r5-0 and r5-1 into the /lib/firmware directory of the rootfs.
    • Note: The devicetree changes needed for the R5 applications are not needed in meta-user, as the default device tree generated with the rpu subsystem will generate the necessary overlay.

Building Petalinux Project

Here are the minimal steps to simply pick up the example bsp and build it. See MPSoC Petalinux Software Development for further details.
http://www.wiki.xilinx.com/MPSoC+Petalinux+Software+Development
$ petalinux-create -t project -s <path to warm-restart bsp> -n my_project
 
$ cd my_project
 
# Modify the various petalinux components if needed. The design will work without any modifications.
 
$ petalinux-build
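
The steps above can be collected into a small wrapper. The following is a minimal sketch, not part of the bsp: the names build_warm_restart_project, run and DRY_RUN are made up for illustration, and only the petalinux-create and petalinux-build invocations come from the steps above.

```shell
#!/bin/sh
# Sketch: create and build a project from the warm-restart bsp.
# build_warm_restart_project, run and DRY_RUN are illustrative names,
# not Petalinux features.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

build_warm_restart_project() {
    bsp="$1"                  # path to the warm-restart bsp
    name="${2:-my_project}"   # project name, as in the steps above

    if [ ! -f "$bsp" ]; then
        echo "error: bsp not found: $bsp" >&2
        return 1
    fi

    run petalinux-create -t project -s "$bsp" -n "$name" || return 1

    # Optional: modify the project components here before building.

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ (cd $name && petalinux-build)"
    else
        (cd "$name" && petalinux-build)
    fi
}
```

Setting DRY_RUN=1 before calling build_warm_restart_project prints the commands without running them, which is a cheap way to check the bsp path first.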

The following figure lists the structural differences between the meta-user directory in the warm restart bsp and the one in the base bsp. The added files are marked in green and the modified files in blue. The added files implement the subsystem restart features. For details on the actual modifications, refer to Zynq Ultrascale Plus Restart Solution.

Meta User Directory Tree

meta-user
├── recipes-apps
│   ├── gpio-demo
│   │   ├── files
│   │   │   ├── gpio-demo.c
│   │   │   └── Makefile
│   │   └── gpio-demo.bb
│   ├── openamp-fw
│   │   ├── files
│   │   │   ├── data
│   │   │   │   ├── r50_app
│   │   │   │   ├── r50_led
│   │   │   │   ├── r51_app
│   │   │   │   └── r51_led
│   │   │   └── LICENSE
│   │   └── openamp-fw_1.0.bb
│   ├── peekpoke
│   │   ├── files
│   │   │   ├── Makefile
│   │   │   ├── peek.c
│   │   │   └── poke.c
│   │   └── peekpoke.bb
│   └── wdt-heartbeat
│       ├── files
│       │   ├── Makefile
│       │   └── wdt-heartbeat.c
│       └── wdt-heartbeat.bb
├── recipes-bsp
│   ├── arm-trusted-firmware
│   │   └── arm-trusted-firmware_%.bbappend
│   ├── device-tree
│   │   ├── device-tree-generation_%.bbappend
│   │   └── files
│   │       ├── multi-arch
│   │       │   ├── zynqmp-qemu-multiarch-arm.dts
│   │       │   └── zynqmp-qemu-multiarch-pmu.dts
│   │       ├── openamp-overlay.dtsi
│   │       ├── system-user.dtsi
│   │       ├── xen-overlay.dtsi
│   │       ├── xen-qemu-overlay.dtsi
│   │       └── zynqmp-qemu-arm.dts
│   ├── fsbl
│   │   └── fsbl_%.bbappend
│   ├── pmu
│   │   └── pmu-firmware_%.bbappend
│   └── u-boot
│       ├── files
│       │   ├── bsp.cfg
│       │   └── platform-top.h
│       └── u-boot-xlnx_%.bbappend
├── recipes-core
│   ├── images
│   │   └── petalinux-image.bbappend
│   └── mpsrm-init
│       ├── COPYING.MIT
│       ├── files
│       │   └── userhook.sh
│       └── mpsrm-init_1.0.bb
└── recipes-kernel
    └── linux
        ├── linux-xlnx
        │   └── bsp.cfg
        └── linux-xlnx_%.bbappend


  • mpsrm-init is an optional recipe that adds a hook script to the rootfs. The script can start / stop the wdt-heartbeat application and set / unset the healthy bit, depending on its argument.
  • The petalinux-image.bbappend file adds applications like openamp-fw and mpsrm-init to the rootfs.
  • Please refer to UG 1144 Petalinux Reference Guide for more details on the Petalinux project, and on how to build and use the packaged images.

How To Run


Resets Triggered from Linux


The following commands can be executed from the Linux prompt to perform various subsystem resets.

APU only Subsystem reboot:
# set_reboot apu
# reboot
PS Only reboot:
# set_reboot ps
# reboot
System-wide reboot (default behavior without a set_reboot call):
# set_reboot sys
# reboot
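
Because a mistyped scope is only noticed after the reboot has already happened, it can help to validate the argument first. The following is a minimal sketch under that assumption: restart_scope and DRY_RUN are illustrative names, while set_reboot and its apu / ps / sys arguments are the ones listed above.

```shell
#!/bin/sh
# Sketch: validate the restart scope before calling set_reboot and reboot.
# restart_scope and DRY_RUN are illustrative names, not part of the bsp.

restart_scope() {
    scope="$1"
    case "$scope" in
        apu|ps|sys) ;;
        *)
            echo "usage: restart_scope apu|ps|sys" >&2
            return 2
            ;;
    esac

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show what would be run without rebooting.
        echo "set_reboot $scope"
        echo "reboot"
        return 0
    fi

    # Reboot only if set_reboot accepted the scope.
    set_reboot "$scope" && reboot
}
```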
The following commands can be used to load/stop/replace RPU images in /lib/firmware. The images must be present.
To start rpu0 application
# echo <firmware name>  >  /sys/class/remoteproc/remoteproc0/firmware
# echo start > /sys/class/remoteproc/remoteproc0/state
 
To start rpu1 application
# echo <firmware name>  >  /sys/class/remoteproc/remoteproc1/firmware
# echo start > /sys/class/remoteproc/remoteproc1/state
 
To stop rpu0 application
# echo stop > /sys/class/remoteproc/remoteproc0/state
 
To stop rpu1 application
# echo stop > /sys/class/remoteproc/remoteproc1/state
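
The remoteproc commands above can be wrapped so that a missing image or a missing remoteproc node is reported before anything is written. The following is a minimal sketch: the function names and the RPROC_ROOT and FW_DIR variables are assumptions introduced here, and the sysfs paths are the ones used in the commands above.

```shell
#!/bin/sh
# Sketch: load and start, or stop, an RPU firmware image through remoteproc.
# RPROC_ROOT and FW_DIR are overridable only so the functions can be tried
# against a scratch directory; on the target the defaults apply.
RPROC_ROOT="${RPROC_ROOT:-/sys/class/remoteproc}"
FW_DIR="${FW_DIR:-/lib/firmware}"

rpu_start() {
    idx="$1"   # 0 for rpu0, 1 for rpu1
    fw="$2"    # firmware file name, relative to FW_DIR
    node="$RPROC_ROOT/remoteproc$idx"

    [ -d "$node" ] || { echo "error: $node not found" >&2; return 1; }
    [ -f "$FW_DIR/$fw" ] || { echo "error: $FW_DIR/$fw not found" >&2; return 1; }

    # Same two writes as in the commands above.
    echo "$fw" > "$node/firmware" && echo start > "$node/state"
}

rpu_stop() {
    node="$RPROC_ROOT/remoteproc$1"
    [ -d "$node" ] || { echo "error: $node not found" >&2; return 1; }
    echo stop > "$node/state"
}

# Replace the running image: stop, then start with the new firmware.
rpu_replace() {
    rpu_stop "$1" && rpu_start "$1" "$2"
}
```

For example, `rpu_start 0 r50_app` would start rpu0, assuming the openamp-fw recipe installs its image in /lib/firmware under that name.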

Testing Recovery and Escalations:


In many use cases, recovery and escalation are turned on as they are critical for getting back into a healthy state when the system is hung.
These features can be tested by stopping the APU heartbeats, which causes the watchdog to time out, so that the automatic system recovery behavior can be observed.

Normal Healthy System


In a recovery-enabled system, Linux needs to perform the following steps after it boots to mark itself healthy and alive. This function is provided by the mpsrm-init recipe.

1) Mark the healthy bit
# echo "0x20000000 0x20000000" > /sys/devices/platform/firmware/ggs0
2) Start the wdt-heartbeat daemon to provide heartbeats to FPD wdt by writing key (0x1999) to FPD WDT restart offset (0xFD4D0008)
# wdt-heartbeat &
Both of these tasks should be done from a startup application or startup script (e.g. /etc/rcS.d/S*/).
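
As an illustration of such a startup script, the two steps can be combined as follows. This is a minimal sketch and not the script shipped by mpsrm-init: the function names and the GGS_NODE and HEARTBEAT_CMD variables are assumptions, and the values written are copied from step 1 without interpreting them.

```shell
#!/bin/sh
# Sketch of a startup hook: mark the boot healthy, then start heartbeats.
# GGS_NODE and HEARTBEAT_CMD are overridable for experimentation only.
GGS_NODE="${GGS_NODE:-/sys/devices/platform/firmware/ggs0}"
HEARTBEAT_CMD="${HEARTBEAT_CMD:-wdt-heartbeat}"
HEALTHY_BIT=0x20000000

set_healthy() {
    # Write the same two values as in step 1 above.
    echo "$HEALTHY_BIT $HEALTHY_BIT" > "$GGS_NODE"
}

start_heartbeat() {
    # Run the heartbeat daemon in the background, as in step 2 above.
    $HEARTBEAT_CMD &
}

mark_alive() {
    set_healthy || { echo "error: cannot write $GGS_NODE" >&2; return 1; }
    start_heartbeat
}
```

Calling mark_alive once at the end of boot performs both steps. The healthy bit is written first, so heartbeats are not started if that write fails.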

Hang the System


We can emulate a system hang by killing the wdt-heartbeat application, which results in watchdog expiration. That in turn triggers recovery attempts by the PMU firmware.
# killall wdt-heartbeat
Killing the application that provides the heartbeat is just one of many ways to hang the system. Another possibility is to stop the boot in u-boot, which also results in an emulated system hang.

Triggering Escalation


Escalation is triggered when the first recovery attempt fails. The following are two possible methods to trigger escalated recovery.
  • Build ATF without the ZYNQMP_WARM_RESTART flag, or set the value of the flag to 0. At run time, force the system to hang by killing wdt-heartbeat. During the first recovery, PMU firmware will restart the watchdog, then raise an IPI to ATF, which ATF will not handle because the warm restart flag is disabled. The watchdog timer will expire again since no entity is kicking it. PMU firmware will escalate to the system reset on the second watchdog timeout. See the escalation flowchart in the Zynq Ultrascale Plus Restart Solution wiki page.
  • Force the system to hang by killing wdt-heartbeat. This will result in an APU-only subsystem restart. After the APU subsystem recovers back into Linux, do not set the healthy bit; this can be done by writing the healthy bit to 0. Then kill wdt-heartbeat again. This emulates an unsuccessful recovery into Linux. Consequently, on the second watchdog expiration, PMU firmware will escalate to the next level of restart. See the Healthy Scheme Escalation Flowchart in the Zynq Ultrascale Plus Restart Solution wiki page.
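
The second method spans two boots, so it can be easier to follow as two phases. The following is a minimal sketch that assumes the startup helper script is installed at /etc/init.d/userhook.sh and accepts the stop_hb and unset_healthy arguments; escalation_phase and HOOK are illustrative names.

```shell
#!/bin/sh
# Sketch: the second escalation method as two phases, one per boot.
# HOOK is overridable for experimentation; escalation_phase is an
# illustrative name, not part of the bsp.
HOOK="${HOOK:-/etc/init.d/userhook.sh}"

escalation_phase() {
    case "$1" in
        1)  # First boot: stop heartbeats so the FPD wdt expires.
            "$HOOK" stop_hb
            ;;
        2)  # After the APU restart: clear the healthy bit, then hang again.
            "$HOOK" unset_healthy && "$HOOK" stop_hb
            ;;
        *)
            echo "usage: escalation_phase 1|2" >&2
            return 2
            ;;
    esac
}
```

Run `escalation_phase 1`, wait for the APU subsystem to come back into Linux, then run `escalation_phase 2` and observe the escalated restart.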

Startup Helper Script


The warm-restart example bsp also includes a helper script named userhook.sh that is launched at Linux boot as a startup script.
The location of the script is /etc/init.d/userhook.sh.

This script performs the following tasks at startup, after the Linux boot process:
  1. Set the Healthy bit
  2. Start wdt-heartbeat application as daemon.

The script can also be used to change the state of the healthy bit and the wdt-heartbeat application manually.

Download link for userhooks.sh



The following describes the arguments that the userhook.sh script accepts:
# Description:       This script runs the daemon for providing heartbeat
#                    to FPD wdt, failing which PMU will trigger restart
#                    escalation. This script also mark the current boot
#                    as healthy by setting healthy bit.
#             Following arguments can be passed to this script:
#            start --> Set the healthy bit and start wdt heartbeats
#            stop --> Stop wdt heartbeats
#            restart --> stop followed by start
#            set_healthy -> mark healthy status as true
#            unset_healthy -> mark healthy status as false
#            start_hb --> start wdt heartbeats
#            stop_hb    --> stop wdt heartbeats

Here are some examples of running the script:

1) Manually Set healthy bit
# /etc/init.d/userhook.sh set_healthy

2) Manually start wdt heartbeat application
# /etc/init.d/userhook.sh start_hb

3) Stop/kill wdt heartbeat application
# /etc/init.d/userhook.sh stop
    or
# /etc/init.d/userhook.sh stop_hb
 

4) Unset / Clear healthy bit
# /etc/init.d/userhook.sh unset_healthy