I am running a set of experiments in which I control the different core frequencies on the Tegra K1 (for example the GPU, the CPU complexes and the EMC (RAM)) to model power usage. To do this, I execute the following commands:
After that, I run some benchmarks that quickly make the platform hang completely. It has to be restarted by pushing the reset button manually (there is no way to use it in this state, neither directly with keyboard and display nor over SSH).
The benchmarks are simple CUDA-accelerated C programs that read video files from a RAM filesystem, process them and write the output to /dev/null.
I did some further experiments and this issue seems to occur when the system is operating at lower EMC clock speeds. Starting from 396 MHz, I worked my way down the following frequencies:
At 20.4 MHz the system became unresponsive upon TCP transmission (I transmit some power logs from an external logger machine to the Tegra over TCP when the benchmark is complete).
Could you use the UART console to check whether there are any timeout messages such as:
BUG: soft lockup - CPU#0 stuck for 22s!
This message is printed by the watchdog, which monitors whether the lowest-priority task can be served within 22 seconds.
At low EMC frequencies, it is possible that a memory-intensive process occupies the resource for more than 22 seconds, which may explain why TCP becomes unresponsive.
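
If the UART output is long, the check above can be automated. The following is only a minimal sketch, assuming the console output has already been captured to a text file by your serial terminal program; the function name `find_soft_lockups` and the file name in the usage comment are hypothetical, and the pattern is based solely on the example message quoted above.

```python
import re
import sys

# Matches kernel soft-lockup reports of the form quoted above, e.g.
#   BUG: soft lockup - CPU#0 stuck for 22s!
SOFT_LOCKUP_RE = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s!")


def find_soft_lockups(lines):
    """Return a list of (line_number, cpu, seconds) for each soft-lockup report."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = SOFT_LOCKUP_RE.search(line)
        if match:
            hits.append((number, int(match.group(1)), int(match.group(2))))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_lockups.py uart_capture.log
    # errors="replace" tolerates stray non-UTF-8 bytes from the serial line.
    with open(sys.argv[1], errors="replace") as log:
        for number, cpu, seconds in find_soft_lockups(log):
            print(f"line {number}: CPU#{cpu} stuck for {seconds}s")
```

If this reports hits at the low EMC frequencies but none at 396 MHz, that would be consistent with the memory-starvation explanation; an empty result does not rule out a hang that happens before the watchdog can print anything.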