Tegra TK1 chip_personality

Hi all!

The kernel seems to have a flag that controls how the clocks are set during boot. For the Tegra TK1 this flag is called ‘chip_personality’.

In file arch/arm/mach-tegra/board.h

/* Usage Model */
enum chip_personality {
	normal = 0,
	always_on,
};

It appears to be an early_param in arch/arm/mach-tegra/common.c, so I expect that it can be added to the cmd_line as either ‘chip_personality=1’ or ‘chip_personality=0’:

static int __init tegra_chip_personality(char *id)
{
	char *p = id;

	chip_personality = memparse(p, &p);

	return 0;
}
early_param("chip_personality", tegra_chip_personality);
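To make the parsing step concrete, here is a minimal user-space sketch of how such a parameter could be picked out of a command-line string. It is not the kernel's code: parse_chip_personality is a hypothetical helper, strtoul() stands in for the kernel's memparse(), and the plain strstr() match is a simplification.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors the enum quoted from arch/arm/mach-tegra/board.h. */
enum chip_personality { normal = 0, always_on };

/*
 * Hypothetical user-space helper (not kernel code): look for
 * "chip_personality=N" in a command-line string and return N.
 * Returns `normal` when the parameter is absent, which matches
 * the default of 0 observed in the boot log.
 */
int parse_chip_personality(const char *cmdline)
{
	const char *key = "chip_personality=";
	const char *p;

	assert(cmdline != NULL);

	p = strstr(cmdline, key);
	if (p == NULL)
		return normal;

	/* strtoul() is an assumption standing in for memparse(). */
	return (int)strtoul(p + strlen(key), NULL, 0);
}
```

For example, passing "console=ttyS0 chip_personality=1" yields always_on, while a command line without the parameter yields normal.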

What it really does is spread across several files, but the important one is arch/arm/mach-tegra/tegra12_speedo.c:

static void rev_sku_to_speedo_ids(int rev, int sku)
{
	int can_boost = tegra_get_sku_override();
	int chip_personality = tegra_get_chip_personality();

	switch (sku) {
...
	case 0x1F:
	case 0x87:
	case 0x27:
	case 0x24:
		cpu_speedo_id = 5;
		soc_speedo_id = 0;
		gpu_speedo_id = 1;
		threshold_index = 0;
		if (sku == 0x87 && chip_personality == always_on) {
			cpu_speedo_id = 6;
			gpu_speedo_id = 4;
		}
		break;
...
	}
}

The SKU seems to be common to all TK1s: 0x87. You can see that in the boot log:

[    0.000000] Tegra12: CPU Speedo ID 5, Soc Speedo ID 0, Gpu Speedo ID 1
[    0.000000] Tegra12: CPU Process ID 0,Soc Process ID 1,Gpu Process ID 0
[    0.000000] Tegra Revision: A01 SKU: 0x87 CPU Process: 0 Core Process: 1

The same log shows that the CPU speedo ID is 5 and the GPU speedo ID is 1, which means that chip_personality defaults to 0.
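The mapping can be checked in isolation with a small stand-alone sketch of the SKU branch quoted above. The struct speedo_ids type and the speedo_ids_for_sku function are hypothetical names for illustration; threshold_index and can_boost are left out, and only the one case from the excerpt is modeled.

```c
#include <assert.h>
#include <stddef.h>

enum chip_personality { normal = 0, always_on };

/* Hypothetical container for the three IDs printed in the boot log. */
struct speedo_ids {
	int cpu;
	int soc;
	int gpu;
};

/*
 * Stand-alone version of the 0x1F/0x87/0x27/0x24 branch.
 * Returns 0 on success, -1 for SKUs this sketch does not cover.
 */
int speedo_ids_for_sku(int sku, int personality, struct speedo_ids *out)
{
	assert(out != NULL);

	switch (sku) {
	case 0x1F:
	case 0x87:
	case 0x27:
	case 0x24:
		out->cpu = 5;
		out->soc = 0;
		out->gpu = 1;
		/* Only SKU 0x87 reacts to the always_on personality. */
		if (sku == 0x87 && personality == always_on) {
			out->cpu = 6;
			out->gpu = 4;
		}
		return 0;
	default:
		return -1; /* elided in the excerpt, not modeled here */
	}
}
```

With sku = 0x87 and personality = normal this gives CPU 5, SoC 0, GPU 1, the same IDs as in the boot log, which is consistent with a default personality of 0.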

Also, arch/arm/mach-tegra/tegra12_dvfs.c contains several static structs that define default parameters and clocks for a few IDs. There you can see that static struct gpu_cvb_dvfs gpu_cvb_dvfs_table defines the GPU clocks for speedo_id = 3, 4, 5, 6 and -1. Likewise, static struct cpu_cvb_dvfs cpu_cvb_dvfs_table defines the clocks for the CPU.

Now, there are quite a few TK1 chips. The Jetson board has the CD580M chip and everything works just fine. But plenty of chips have the part number CD575M. According to the specs the only difference seems to be the max CPU frequency, but there appears to be more to it. I’ve found that not all chips can run at the max clocks that are set by default for CD580M chips, which means that these chips may crash. Therefore, you need to lower the frequencies for the GPU and CPU.

There’s a way to do that, by limiting the frequencies in /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq and /sys/kernel/debug/clock/override.gbus/rate for CPU and GPU. But that’s not good enough because this can be overridden any time from any script or application. Therefore, the max frequencies must be limited by using the correct clocks and speedo_ids.

I’ve also found that on CD575M, when the chip_personality is set to 1 in the cmd_line then the kernel doesn’t even boot.

Therefore, the questions are: What is chip_personality? What are the correct speedo_ids for each chip? How do you set the correct speedo_ids, so that nothing can override these clocks?

Hi dimtass,

We document chip_personality: it configures UCM1 vs UCM2 operation (4/4/16 vs 24x7).
The max frequencies are based on the chip ID. Your observation is correct that some CD575M chips are not capable of running at the same clocks as CD580M.

So, setting the personality is okay and we document how to do this, but users should not change any other settings in the kernel, or they could jeopardize the stability and reliability of the system.

Thanks

Hi kayccc

We are using a CD575-MI chip (SKU=0x80) and its clock is set to 204MHz during kernel boot instead of 696MHz with a CD575-M (SKU=0x87). How can we set the frequency to 696MHz, and is this frequency compatible with long-term usage?

Hi alejuventino

Customers should not be touching these settings.
This is to ensure that the chip operates correctly. Modifying these settings may cause crashes or incorrect behavior. We don’t even know whether it’s compatible with long-term usage, as it’s not recommended.

Thanks

For CD575M, I had to change the settings in rev_sku_to_speedo_ids from:

	case 0x1F:
	case 0x87:
	case 0x27:
	case 0x24:
		cpu_speedo_id = 5;
		soc_speedo_id = 0;
		gpu_speedo_id = 1;
		threshold_index = 0;
		if (sku == 0x87 && chip_personality == always_on) {
			cpu_speedo_id = 6;
			gpu_speedo_id = 4;
		}
		break;

to:

	case 0x1F:
	case 0x81:
	case 0x87:
	case 0x27:
	case 0x24:
		cpu_speedo_id = 6;
		soc_speedo_id = 0;
		gpu_speedo_id = 5;
		threshold_index = 0;
		if (sku == 0x87 && chip_personality == always_on) {
			cpu_speedo_id = 6;
			gpu_speedo_id = 4;
		}
		break;

In the above case I also added CD580M (SKU: 0x81), so that both the Jetson and the custom board run at the same frequencies.

The reason is that around 20% of the CD575M CPUs simply hung when the GPU frequency was over 756MHz (this limit is governed by gpu_speedo_id). I’ve also limited the CPU speed because, as you can read in ‘Tegra_K1_Datasheet_DS67420001v02.pdf’ for CD575M and CD575MI, if you use the UCM:2 case (which means 1.9GHz for CPU and 800MHz for RAM) then the operating lifetime is 5 years at 100% CPU load for CD575M and 10 years for CD575MI.

This means that if you care about the operating lifetime, then you also need to use the ‘PM375_Hynix_2GB_H5TC4G63AFR_RDA_792MHz.cfg’ configuration for the RAM and not the 924MHz one. Finally, changing the speedos only changes the upper limits. To control the frequencies within the kernel you need to use the following commands in a script or from the command line:

To see the possible rates:

# CPU
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq

# RAM
$ cat /sys/kernel/debug/clock/emc/possible_rates

# GPU
$ cat /sys/kernel/debug/clock/gbus/possible_rates

To see the current rate:

# CPU
$ cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq

# RAM
$ cat /sys/kernel/debug/clock/emc/rate

# GPU
$ cat /sys/kernel/debug/clock/gbus/rate

To set the rates:

# First ensure maximum CPU performance
echo 0 > /sys/devices/system/cpu/cpuquiet/tegra_cpuquiet/enable

# Enable all cpus
echo 1 > /sys/devices/system/cpu/cpu0/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu3/online
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

# GPU
echo 756000000 > /sys/kernel/debug/clock/override.gbus/rate
echo 1 > /sys/kernel/debug/clock/override.gbus/state

# RAM
echo 792000000 > /sys/kernel/debug/clock/override.emc/rate
echo 1 > /sys/kernel/debug/clock/override.emc/state

If for some reason your CPU rates are not set to the max that you’ve set with the speedo, then do:

# Limit CPU frequency to 1.9GHz
echo 1912500 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
echo 1912500 > /sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq
echo 1912500 > /sys/devices/system/cpu/cpu2/cpufreq/scaling_max_freq
echo 1912500 > /sys/devices/system/cpu/cpu3/cpufreq/scaling_max_freq
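The same cap can be applied from a small program instead of four echo lines. The following is only a sketch under assumptions: max_freq_path, write_khz and cap_all_cpus are hypothetical helper names, the board is assumed to have four cores (cpu0 to cpu3), and the writes need root and an online CPU to succeed.

```c
#include <assert.h>
#include <stdio.h>

#define NUM_CPUS 4 /* assumption: cpu0..cpu3, as in the commands above */

/* Build the scaling_max_freq path for one CPU; returns 0 on success. */
int max_freq_path(int cpu, char *buf, size_t len)
{
	int n;

	assert(buf != NULL);

	n = snprintf(buf, len,
		     "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_max_freq",
		     cpu);
	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Write a value in kHz to a sysfs-style file; returns 0 on success. */
int write_khz(const char *path, unsigned long khz)
{
	FILE *f;
	int ok;

	assert(path != NULL);

	f = fopen(path, "w");
	if (f == NULL)
		return -1;

	ok = fprintf(f, "%lu\n", khz) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Apply the cap to every CPU; returns how many CPUs could not be set. */
int cap_all_cpus(unsigned long khz)
{
	char path[128];
	int cpu, failed = 0;

	for (cpu = 0; cpu < NUM_CPUS; cpu++) {
		if (max_freq_path(cpu, path, sizeof(path)) != 0 ||
		    write_khz(path, khz) != 0)
			failed++;
	}
	return failed;
}
```

Calling cap_all_cpus(1912500) corresponds to the four echo commands. A non-zero return value means at least one core was not capped, for example because it was offline.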

Use ‘performance’ in ‘scaling_governor’ only if you need your CPU/GPU to run at the max speedos you’ve set.

Hi,

We are using a CD575M-A1 custom board (SKU = 0x87) with the 924MHz cfg for flashing.

With chip_personality=0 it boots, but with chip_personality=1 it stops booting in the kernel.

We have observed that when chip_personality is 1 (always_on), the core voltage limit is reduced to 1010 mV. In tegra12_dvfs.c, the CORE_DVFS entry for emc in core_dvfs_table first reaches a frequency above 924MHz at index 6 (1200000000), where core_millivolts[6] = 1100. Since the limit was reduced to 1010 mV at startup and this entry requires 1100 mV, the kernel stops booting with the message “tegra_dvfs: voltage 1100 too high for dvfs on emc”. The code flow is explained below.

From file arch/arm/mach-tegra/tegra12_speedo.c

static void rev_sku_to_speedo_ids(int rev, int sku)
{
	int can_boost = tegra_get_sku_override();
	int chip_personality = tegra_get_chip_personality();

	switch (sku) {

	case 0x1F:
	case 0x87:	//SKU value
	case 0x27:
	case 0x24:
		cpu_speedo_id = 5;
		soc_speedo_id = 0;
		gpu_speedo_id = 1;
		threshold_index = 0;
		if (sku == 0x87 && chip_personality == always_on) {
			cpu_speedo_id = 6;
			gpu_speedo_id = 4;
		}
		break;
	}
}

Values obtained:
For chip_personality 0
CPU Speedo ID = 5
Soc Speedo ID = 0
Gpu Speedo ID = 1

For chip_personality 1
CPU Speedo ID = 6
Soc Speedo ID = 0
Gpu Speedo ID = 4

Checking the code, we found the following in arch/arm/mach-tegra/tegra12_speedo.c:

int tegra_core_speedo_mv(void)
{
	int chip_personality = tegra_get_chip_personality();

	switch (soc_speedo_id) {
	case 0:
		if (chip_personality == always_on)
			return 1010;               // mv is reduced to 1010, when chip_personality is 1
		return 1150;
	case 1:
		return 1150;
	case 2:
		return 1110;
	case 3:
		return 1000;
	case 4:
		return 1100;
	default:
		BUG();
	}
}

When chip_personality is 1, this reduces the voltage to 1010 mV.

In file arch/arm/mach-tegra/tegra12_dvfs.c

static struct dvfs core_dvfs_table[] = {
		/* Core voltages (mV):	 800,    850,    900,	 950,    1000,	1050,    1100,	 1110,    1150 */
		/* Clock limits for internal blocks, PLLs */
	
		CORE_DVFS("emc",        -1, -1, 1, KHZ, 264000, 348000, 384000, 384000, 792000, 792000, 1200000, 1200000, 1200000),
		...
	};
static int core_millivolts[MAX_DVFS_FREQS] = {
		800, 850, 900, 950, 1000, 1050, 1100, 1110, 1150};

Also in file arch/arm/mach-tegra/tegra12_dvfs.c

if (soc_speedo_id == 0 && chip_personality == always_on) {
		for (i = 0; i < MAX_DVFS_FREQS; i++) {
			if (core_millivolts[i] == 1000) {
				core_millivolts[i] = 1010;
				break;
			}
		}
	}

This replaces the 1000 mV entry in the core_millivolts array with 1010 mV, the same value that tegra_core_speedo_mv() returns for this case.

While debugging, we found that the emc voltage levels affect booting.

In file arch/arm/mach-tegra/dvfs.c

while (i < d->num_freqs && rate > freqs[i])
	i++;

mv = millivolts[i];
if ((d->max_millivolts) && (mv > d->max_millivolts)) {
	pr_warn("tegra_dvfs: voltage %d too high for dvfs on %s\n",
		mv, d->clk_name);
	return -EINVAL;
}

At i = 6 the while loop exits, millivolts[6] = 1100 exceeds the reduced 1010 mV limit, and booting stops with the message “tegra_dvfs: voltage 1100 too high for dvfs on emc”.

Here,
rate = 924000000

millivolts[] = {800, 850, 900, 950, 1000, 1050, 1100, 1110, 1150};

freqs[] = {264000000, 348000000, 384000000, 384000000, 792000000, 792000000, 1200000000, 1200000000, 1200000000}; 
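The lookup can be reproduced outside the kernel with the values above. This is a sketch, not the kernel code: emc_required_mv is a hypothetical helper modeled on the dvfs.c excerpt, the tables are copied as posted (without the 1000 to 1010 adjustment), and the check for a rate above every table entry is an extra guard added here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NUM_FREQS 9

/* EMC frequencies (Hz) and core voltages (mV) as posted above. */
const unsigned long emc_freqs[NUM_FREQS] = {
	264000000UL, 348000000UL, 384000000UL, 384000000UL, 792000000UL,
	792000000UL, 1200000000UL, 1200000000UL, 1200000000UL
};
const int core_mv[NUM_FREQS] = {
	800, 850, 900, 950, 1000, 1050, 1100, 1110, 1150
};

/*
 * Return the voltage (mV) needed for `rate`, or -EINVAL when that
 * voltage exceeds max_mv (0 means no limit) or the rate is above
 * every entry in the table.
 */
int emc_required_mv(unsigned long rate, int max_mv)
{
	size_t i = 0;

	/* Same search as in the excerpt: first entry that covers the rate. */
	while (i < NUM_FREQS && rate > emc_freqs[i])
		i++;

	if (i == NUM_FREQS)
		return -EINVAL; /* guard added in this sketch */

	if (max_mv && core_mv[i] > max_mv)
		return -EINVAL; /* "voltage ... too high for dvfs" */

	return core_mv[i];
}
```

With a 1010 mV limit, a 924000000 Hz request lands on index 6 and needs 1100 mV, so it is rejected, while a 792000000 Hz request lands on index 4 and needs 1000 mV, which fits. With the normal 1150 mV limit the 924MHz request is accepted at 1100 mV.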

How can we boot a 924MHz SOM with chip_personality=1?

Thanks

Great Analysis!!

As you already know, personality 1 is a 24x7 profile, which means we expect the board to be on 24x7.
In those environments, we cannot boot the board at a 924MHz EMC clock. You need to bring it down to 792MHz.
In the jetson-tk1.conf file, replace the 924MHz file with the 792MHz one and flash. Let me know if you don’t have the file. It should be in the BSP.

EMMC_BCT=PM375_Hynix_2GB_H5TC4G63AFR_H5TC4G63CFR_RDA_792MHz.cfg;

Hi komala.c,

I would suggest that you go with the 792MHz configuration and also reduce the CPU speed to ~1.9GHz. I’ve run several tests, especially with OpenGL on the DSI and HDMI outputs, and there wasn’t any difference in the framerate compared to 2GHz and 924MHz.

Also, the most important thing is that you will extend your product’s life significantly and the processor will run cooler and more stable.

Thank you bbasu and dimtass!

I will try with 792MHz.cfg and will get back to you.