ECC on TX2

Hi Guys,

I see from the TX2 data sheet that ECC-protected cache is supported on both the ARM A57 and Denver cores. Does the software support this? Is there any ECC error reporting? We are considering a space radiation test for the TX2, so we need to monitor ECC errors during the test.

Regards,
Dan

hello Berkutok,

You can check whether ECC is enabled with the command below.
dmesg | grep -i ecc

Here is the code for your reference, thanks.
$TOP/kernel/t18x/drivers/platform/tegra/mc/mcerr_ecc_t18x.c
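
For monitoring during a long test run, the same check could be repeated from a script. The following is only a minimal sketch in Python, not anything shipped with the TX2 software: the function names are made up for illustration, and it assumes `dmesg` is readable by the user running it (it may need root). It mirrors `dmesg | grep -i ecc` and reports only the lines that have not been seen before.

```python
import subprocess


def ecc_lines(log_text):
    """Return kernel log lines that mention ECC, case-insensitively,
    mirroring `dmesg | grep -i ecc`."""
    return [line for line in log_text.splitlines() if "ecc" in line.lower()]


def new_ecc_lines(log_text, seen):
    """Return ECC lines not reported on an earlier call.

    `seen` is a set owned by the caller; it is updated in place.
    """
    fresh = []
    for line in ecc_lines(log_text):
        if line not in seen:
            seen.add(line)
            fresh.append(line)
    return fresh


def read_dmesg():
    """Read the current kernel log (hypothetical helper; may need root)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `new_ecc_lines(read_dmesg(), seen)` on a timer would then show only newly logged lines. Note that a plain substring match on "ecc" also picks up unrelated lines, for example anything containing "SECCOMP".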

Thank you Jerry,

I will do that

Dan

Hi Jerry,

The output from “dmesg | grep -i ecc” is below.
I see that DRAM ECC is disabled and A57 ECC is enabled (I assume this is the cache ECC).

Can you please advise how to:

  1. Enable DRAM ECC
  2. Enable ECC on the Denver cores

root@tegra-ubuntu:/home/nvidia# dmesg | grep -i ecc
[ 0.447346] ramoops: attached 0x200000@0x100000000, ecc: 0/0
[ 0.897878] dram-ecc: DRAM ECC disabled-MC_ECC_CONTROL:0x0000000c
[ 2.951723] **** A57 ECC: Enabled
[ 12.011804] systemd[1]: systemd 229 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN)
root@tegra-ubuntu:/home/nvidia#
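
For logging during the test, output like the above could be reduced to a small status record. This is a rough sketch in Python; the patterns are taken only from the lines quoted here, so other kernel versions may word the messages differently, and the wording of an "enabled" DRAM line is an assumption. The function name is made up for illustration.

```python
import re

# Patterns copied from the log lines quoted in this post.
# ASSUMPTION: an enabled DRAM line uses the same layout with "enabled".
DRAM_RE = re.compile(
    r"dram-ecc: DRAM ECC (\w+)-MC_ECC_CONTROL:(0x[0-9a-fA-F]+)"
)
CPU_RE = re.compile(r"\*\*\*\* (\w+) ECC: (Enabled|Disabled)")


def parse_ecc_status(log_text):
    """Extract ECC status flags from dmesg text.

    Returns a dict such as
    {"DRAM": False, "MC_ECC_CONTROL": 12, "A57": True}.
    Lines that merely contain "ecc" (e.g. "SECCOMP") are ignored.
    """
    status = {}
    for line in log_text.splitlines():
        match = DRAM_RE.search(line)
        if match:
            status["DRAM"] = match.group(1).lower() == "enabled"
            status["MC_ECC_CONTROL"] = int(match.group(2), 16)
            continue
        match = CPU_RE.search(line)
        if match:
            status[match.group(1)] = match.group(2) == "Enabled"
    return status
```

Run on the output above, this gives DRAM as disabled, the raw MC_ECC_CONTROL value 0xc, and A57 as enabled; the ramoops and systemd lines do not match either pattern and are skipped.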


Thank you and best regards,
Dan

hello Berkutok,

ECC on the Denver cores is enabled by default. Please refer to the code snippet below to check the Denver ECC status. Denver’s L2 control register is the same as the A57’s, and the Denver CPU IDs are core-1 and core-2.
$TOP/kernel/t18x/drivers/platform/tegra/tegra18_a57_serr.c

	cpu = smp_processor_id();
	cpuinfo = &per_cpu(cpu_data, cpu);

	if (MIDR_PARTNUM(cpuinfo->reg_midr) == ARM_CPU_PART_CORTEX_A57) {
		ecc_settings = read_l2ctlr();
		pr_info("**** A57 ECC: %s\n",
			(ecc_settings & A57_L2CTLR_ECC_EN) ? "Enabled" :
			"Disabled");
		core_type = "A57";
	}

	pr_info("%s: on CPU %d a %s Core\n", __func__, cpu, core_type);
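
The decision made by the snippet can be modelled outside the kernel to make the logic easier to follow. This is only an illustrative sketch in Python, not the driver itself. Two things in it are assumptions: the mask value `1 << 21` standing in for `A57_L2CTLR_ECC_EN` (please check the header in the kernel tree for the real definition), and the function names, which are made up. The Cortex-A57 part number 0xD07 in MIDR bits [15:4] comes from ARM's documentation.

```python
A57_PARTNUM = 0xD07       # Cortex-A57 part number, MIDR bits [15:4]
L2CTLR_ECC_EN = 1 << 21   # ASSUMPTION: stand-in for A57_L2CTLR_ECC_EN


def midr_partnum(midr):
    """Extract the part number field (bits [15:4]) from a MIDR value."""
    return (midr >> 4) & 0xFFF


def a57_ecc_state(midr, l2ctlr, ecc_mask=L2CTLR_ECC_EN):
    """Mirror the check in the kernel snippet.

    Returns "Enabled" or "Disabled" for an A57 core, and None for any
    other core type, because the snippet only prints the ECC line
    inside the A57 branch.
    """
    if midr_partnum(midr) != A57_PARTNUM:
        return None
    return "Enabled" if l2ctlr & ecc_mask else "Disabled"
```

As quoted, the snippet prints the "ECC:" line only when the part number matches A57, which would explain why a Denver core produces no such line in dmesg.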

I’m going to check how to enable DRAM ECC for the TX2 and get back to you.
Thanks

Thank you Jerry, this is very helpful.
I am waiting for the DRAM ECC check.

Dan

hello Berkutok,

DRAM ECC is not supported on the TX2.
Are you able to read the Denver ECC status with the above method?
Thanks

No DRAM ECC? So much for the TX2 being a reliable system. :/ Please reconsider for future “dev” board designs.

Yeah, why would the cache support ECC but the DRAM not?
Traditionally, SRAM has an easier time hanging on to its bits than DRAM…
It makes one wonder: Did they add ECC because they were/are hunting some scary bug?