This page describes L1/L2 cache error reporting for the ARM Cortex-A53 and Cortex-A57 under Linux using the EDAC framework. The driver can be found in drivers/edac/cortex_arm64_edac.c.

Kernel Configuration Options for Driver

The following config option should be enabled in order to build the ARM Cortex EDAC driver:
config EDAC_CORTEX_ARM64
    tristate "ARM Cortex A57/A53"
    depends on EDAC_MM_EDAC && ARM64
    help
      Support for error detection and correction on the
      ARM Cortex A57 and A53.
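
To confirm that a given kernel build actually has this option enabled, the config file produced by the build can be inspected. The following is a minimal sketch, not part of the driver: the helper name edac_config_state is hypothetical, and the config file path is an assumption that depends on your build setup.

```shell
#!/bin/sh
# Print the state of CONFIG_EDAC_CORTEX_ARM64 in a kernel config file.
# Usage: edac_config_state /path/to/.config
# The helper name is hypothetical; the path is whatever your build produced.
edac_config_state() {
    cfg="$1"
    if [ ! -r "$cfg" ]; then
        echo "cannot read config file: $cfg" >&2
        return 2
    fi
    # Matches "=y" (built in) or "=m" (module); a "# ... is not set"
    # line or a missing entry both count as unset.
    line=$(grep -E '^CONFIG_EDAC_CORTEX_ARM64=(y|m)$' "$cfg")
    if [ -n "$line" ]; then
        echo "${line#*=}"
        return 0
    fi
    echo "unset"
    return 1
}
```

For example, running edac_config_state .config from the kernel build directory prints y, m, or unset. Because the option is a tristate, m means the driver still has to be loaded as a module before the EDAC sysfs entries appear.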

Devicetree

Example:
edac {
    compatible = "arm,cortex-a57-edac";
};

Testing Procedure

Prerequisites
This driver assumes that SErrors and exceptions are routed to the firmware running at EL3.
1. Comment out the line below in the following function:
cm_init_context_common(cpu_context_t *ctx, const entry_point_info_t *ep)
#ifndef HANDLE_EA_EL3_FIRST
/* Explicitly stop to trap aborts from lower exception levels. */
/* scr_el3 &= ~SCR_EA_BIT; */
#endif
With this line commented out, SCR_EL3.EA is no longer cleared, so exceptions generated at lower exception levels are routed to EL3.
Otherwise, we won't get any exception at the time of error injection.
2. Grant read/write access to the CPU Auxiliary Control Register (CPUACTLR_EL1) and the L2 Auxiliary Control Register (L2ACTLR_EL1); both are needed to enable error injection.
Do this in the firmware running at EL3.
In our case this was added to ATF, in bl31/aarch64/bl31_arch_setup.c:
void bl31_arch_setup(void) {
    /* Access-enable bits for the auxiliary control registers. */
    unsigned long val = 0x7f;

    /* Allow access from exception levels below EL3 and below EL2. */
    asm volatile("msr actlr_el3, %0" :: "r" (val));
    asm volatile("msr actlr_el2, %0" :: "r" (val));
}

Error injection on L1 Cache:

After Linux has booted, run the following command:

# echo 1 > /sys/devices/system/edac/cpu/inject_L1_Cache_Error
The injected double-bit error corrupts cache data, causing abort exceptions (SError) that are routed to the firmware running at EL3.
 
Verification:
Use xsdb to stop the processor and get the call trace:
Xsdb#stop
Xsdb#bt
0  0xfffeda00 sync_exception_sp_elx(): bl31/aarch64/runtime_exceptions.S, line 206
1  0xfffeda04 sync_exception_sp_elx()+4: bl31/aarch64/runtime_exceptions.S, line 206
 

Error injection on L2 Cache:

After Linux has booted, run the following command:

# echo 1 > /sys/devices/system/edac/cpu/inject_L2_Cache_Error
The injected double-bit error corrupts cache data, causing abort exceptions (SError) that are routed to the firmware running at EL3.
 
Verification:
Xsdb#stop
Xsdb#bt
0  0xfffeda00 sync_exception_sp_elx(): bl31/aarch64/runtime_exceptions.S, line 206
1  0xfffeda04 sync_exception_sp_elx()+4: bl31/aarch64/runtime_exceptions.S, line 206
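
The L1 and L2 injection commands differ only in the name of the sysfs node, so they can be wrapped in one small helper that checks that the node is present before writing to it. This is only a sketch under stated assumptions: the function name inject_cache_error and its optional directory argument are hypothetical and not part of the driver; only the two node paths come from the commands on this page.

```shell
#!/bin/sh
# Write 1 to an EDAC cache error injection node after checking it is writable.
# Usage: inject_cache_error L1|L2 [sysfs_dir]
# The optional sysfs_dir argument is a hypothetical override for dry runs;
# the default is the directory used by the commands on this page.
inject_cache_error() {
    level="$1"
    dir="${2:-/sys/devices/system/edac/cpu}"
    case "$level" in
        L1|L2) ;;
        *) echo "usage: inject_cache_error L1|L2 [sysfs_dir]" >&2; return 2 ;;
    esac
    node="$dir/inject_${level}_Cache_Error"
    # A missing node usually means the driver is not built in or not loaded.
    if [ ! -w "$node" ]; then
        echo "injection node missing or not writable: $node" >&2
        return 1
    fi
    echo 1 > "$node"
}
```

On the target, inject_cache_error L1 or inject_cache_error L2 is run as root. Since the injection is expected to raise an SError that is routed to EL3, the shell may not return after a successful write; the xsdb backtrace is then used to verify the result.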
 

Known Issues and Limitations
  • NA

Changelog:

2016.3
  • EDAC: Add ARM64 EDAC

Related Links