GPU initialization failed on Jetson Nano powered by a 24V to 5V DC converter

When I run the following code:

import tensorflow as tf
import tensorflow.contrib.tensorrt as trt
tf_config = tf.ConfigProto()
tf_sess = tf.Session(config=tf_config)

I get the following error:

"tensorflow.python.framework.errors_impl.InternalError: CUDA runtime implicit initialization on GPU:0 failed"

log details:

>>> import tensorflow as tf
2020-02-08 19:53:20.459968: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
>>> import tensorflow.contrib.tensorrt as trt
2020-02-08 19:53:31.409071: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
>>> tf_config = tf.ConfigProto()
>>> tf_sess = tf.Session(config=tf_config)
2020-02-08 19:53:42.336160: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency
2020-02-08 19:53:42.336695: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x284fa1a0 executing computations on platform Host. Devices:
2020-02-08 19:53:42.336756: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2020-02-08 19:53:42.344951: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-02-08 19:53:42.418506: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2020-02-08 19:53:42.418851: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2900c0f0 executing computations on platform CUDA. Devices:
2020-02-08 19:53:42.418917: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): NVIDIA Tegra X1, Compute Capability 5.3
2020-02-08 19:53:42.419589: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2020-02-08 19:53:42.419730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
2020-02-08 19:53:42.419847: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-02-08 19:53:42.420010: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-02-08 19:53:42.420123: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-02-08 19:53:42.420229: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-02-08 19:53:42.424350: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-02-08 19:53:42.427404: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-02-08 19:53:42.427645: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-02-08 19:53:42.428008: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2020-02-08 19:53:42.428339: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2020-02-08 19:53:42.428434: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1570, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 693, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: CUDA runtime implicit initialization on GPU:0 failed. Status: unknown error

This code runs when the board is powered from an adapter plugged into an AC outlet, but it fails when the board is powered from an automotive battery.

How can we tell whether it is a power issue or a software issue?

Hi,

The error indicates that you cannot initialize the CUDA toolkit.
A common cause is incompatible CUDA software.

Please note that you need to install a TensorFlow package built for the same JetPack version as your setup.
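
As a starting point for comparing versions, here is a minimal sketch that reads the L4T release string on the board so it can be checked against the JetPack release the TensorFlow package was built for. It assumes the release file is /etc/nv_tegra_release and that its first line looks like "# R32 (release), REVISION: 3.1, ..."; the helper names are hypothetical and the file format may differ on other releases.

```python
import re
from pathlib import Path

# Assumption: L4T records the installed BSP version in this file.
RELEASE_FILE = Path("/etc/nv_tegra_release")


def parse_l4t_release(text):
    """Return the L4T version as 'major.revision', or None if it is not found."""
    match = re.search(r"R(\d+)\s+\(release\),\s+REVISION:\s+([\d.]+)", text)
    if match is None:
        return None
    return "{}.{}".format(match.group(1), match.group(2))


def installed_l4t_release(path=RELEASE_FILE):
    """Read the release file if it exists; return None on machines without it."""
    if not path.is_file():
        return None
    return parse_l4t_release(path.read_text())


if __name__ == "__main__":
    print("L4T release:", installed_l4t_release())
```

Running the same script on both boards should print the same release if they were flashed from the same image.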
Thanks.

@AastaLLL thanks for the reply. Without changing anything on the device, I ran the following code and it works.

import tensorflow as tf
tf_config = tf.ConfigProto()
tf_sess = tf.Session(config=tf_config)

I am running it on TF now. Does this mean I am able to initiate the CUDA toolkit?

Also, I have an exact replica of the device (flashed with the same image) that is plugged into an AC wall socket (220V to 5V 4A), and it does not show the original problem. How do I debug this further? I can provide log files if you need any.

Hi,

If your TensorFlow runs in GPU mode, then yes, you can initialize the CUDA toolkit now.
To check whether TensorFlow runs in GPU mode, you can run this test:

$ python3
>>> import tensorflow
... Successfully opened dynamic library libcudart.so.10.0
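
The library load message only shows that libcudart was found. A slightly stronger check is to ask TensorFlow to actually create the GPU device. The following is a minimal sketch, assuming a TensorFlow 1.x style API where tf.test.is_built_with_cuda() and tf.test.is_gpu_available() exist; gpu_status is a hypothetical helper name, and the exact exception raised on a failed initialization may vary.

```python
def gpu_status():
    """Return (built_with_cuda, gpu_usable, error_text), or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    built_with_cuda = tf.test.is_built_with_cuda()
    try:
        # Asks TensorFlow to enumerate and initialize GPU devices,
        # so a CUDA initialization problem can surface here.
        gpu_usable = bool(tf.test.is_gpu_available())
        error_text = ""
    except Exception as exc:  # for example an InternalError from CUDA initialization
        gpu_usable = False
        error_text = str(exc)
    return built_with_cuda, gpu_usable, error_text


if __name__ == "__main__":
    print(gpu_status())
```

If the first value is True but the second is False, the build supports CUDA and the failure happens at device initialization, which is the case reported in the first post.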

A common cause of CUDA initialization failures is a broken installation or an incompatible software version.
This kind of issue can be fixed by reflashing and installing all the packages from the same JetPack.

Thanks.

@AastaLLL if you look at the output I posted in the first post, it already contains “Successfully opened dynamic library libcudart.so.10.0”:

>>> import tensorflow as tf
2020-02-08 19:53:20.459968: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
>>> import tensorflow.contrib.tensorrt as trt
2020-02-08 19:53:31.409071: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0

Bump

Hi,

Sorry for the delay.

A possible cause is that power starvation prevents the GPU from working.
But that usually causes a system shutdown or reboot, not a CUDA initialization failure.
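
One way to look for evidence of a power problem is to scan the kernel log for power-related warnings right after reproducing the failure. The sketch below is only an illustration: the patterns in POWER_PATTERNS are assumptions about what such messages might contain, not a confirmed list, and the function names are hypothetical.

```python
import re
import subprocess

# Assumed substrings that may hint at power trouble in the kernel log; adjust as needed.
POWER_PATTERNS = (r"over-?current", r"under-?voltage", r"OC ALARM", r"throttl")


def find_power_warnings(log_text, patterns=POWER_PATTERNS):
    """Return the log lines that match any of the given case-insensitive patterns."""
    combined = re.compile("|".join(patterns), re.IGNORECASE)
    return [line for line in log_text.splitlines() if combined.search(line)]


def read_kernel_log():
    """Return the output of `dmesg`, or an empty string if it cannot be run."""
    try:
        result = subprocess.run(["dmesg"], stdout=subprocess.PIPE,
                                stderr=subprocess.DEVNULL, universal_newlines=True)
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    for line in find_power_warnings(read_kernel_log()):
        print(line)
```

Running this on both boards and comparing the output may help: warnings that appear only on the battery-powered board would point toward the power supply, while identical clean logs would make a software cause more likely.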

Have you already checked our power supply recommendations?
https://devtalk.nvidia.com/default/topic/1048640/jetson-nano/power-supply-considerations-for-jetson-nano-developer-kit/

Thanks.