Nvidia-container-runtime on Jetson Nano using Yocto

At this time, there is no Yocto recipe for bitbaking nvidia-container-runtime into a Yocto build. I've tried installing the .deb packages from the NVIDIA SDK Manager, but this is apparently not enough to get GPU access through to the containers: running any CUDA 10 example in a Docker container under nvidia-container-runtime results in an error stating that no CUDA-capable devices were found, while running the same example outside the container works without problems.
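For reference, the comparison is roughly the following; the deviceQuery binary is the prebuilt CUDA 10 sample on the host, and the exact paths are just from my setup:

$ ./deviceQuery
# on the host: Result = PASS (full output further down)

$ docker run -it --rm --runtime nvidia -v "$PWD":/samples:ro \
    nvcr.io/nvidia/l4t-base:r32.3.1 /samples/deviceQuery
# in the container: deviceQuery reports that no CUDA-capable device is detected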

Looking in the rootfs of the default NVIDIA Ubuntu image created by the SDK Manager, I can see there are a lot of libraries and configuration files that are not available in Yocto. Specifically, I've come across the CSV files in /etc/nvidia-container-runtime/host-files-for-container.d/. These look promising, as they list all the host files that get mounted into a container run with nvidia-container-runtime.
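A quick way to see which of the listed entries actually exist on the Yocto rootfs is something like the sketch below (assuming the same "type, path" line format as in the CSV files on the Ubuntu image):

# report CSV entries whose host path does not exist on this rootfs
for csv in /etc/nvidia-container-runtime/host-files-for-container.d/*.csv; do
    echo "== $csv =="
    while IFS=, read -r type path; do
        path=$(echo "$path" | xargs)     # trim the space after the comma
        [ -n "$path" ] || continue       # skip blank lines
        [ -e "$path" ] || echo "not found: $path"
    done < "$csv"
done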

Checking the CSVs against the Yocto rootfs, only the l4t.csv file lists several files that are not found. As many of those as possible were installed through Yocto recipes, but a number of them are still missing and not packaged for Yocto. Instead, the missing files that could not be provided from Yocto were removed from the list, one by one. Eventually, though, the error received when trying to run Docker with nvidia-container-runtime is:

$ docker run -it --rm --runtime nvidia --gpus all nvcr.io/nvidia/l4t-base:r32.3.1
docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"process_linux.go:413: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: src: /usr/lib/libcudnn.so.7, src_lnk: libcudnn.so.7.6.3, dst: /var/lib/docker/overlay2/9b07e768b59dce6c3cd1b3b94a8019978b9bc24d84511bd37d82679efc94b829/merged/usr/lib/libcudnn.so.7, dst_lnk: libcudnn.so.7.6.3\\\\nsrc: /usr/lib/libnvcaffe_parser.so.6, src_lnk: libnvparsers.so.6.0.1, dst: /var/lib/docker/overlay2/9b07e768b59dce6c3cd1b3b94a8019978b9bc24d84511bd37d82679efc94b829/merged/usr/lib/libnvcaffe_parser.so.6, dst_lnk: libnvparsers.so.6.0.1\\\\nsrc: /usr/lib/libnvcaffe_parser.so.6.0.1, src_lnk: libnvparsers.so.6.0.1, dst: /var/lib/docker/overlay2/9b07e768b59dce6c3cd1b3b94a8019978b9bc24d84511bd37d82679efc94b829/merged/usr/lib/libnvcaffe_parser.so.6.0.1, dst_lnk: libnvparsers.so.6.0.1\\\\nsrc: /usr/lib/libnvinfer.so.6, src_lnk: libnvinfer.so.6.0.1, dst: /var/lib/docker/overlay2/9b07e768b59dce6c3cd1b3b94a8019978b9bc24d84511bd37d82679efc94b829/merged/usr/lib/libnvinfer.so.6, dst_lnk: libnvinfer.so.6.0.1\\\\nsrc: /usr/lib/libnvinfer_plugin.so.6, src_lnk: libnvinfer_plugin.so.6.0.1, dst: /var/lib/docker/overlay2/9b07e768b59dce6c3cd1b3b94a8019978b9bc24d84511bd37d82679efc94b829/merged/usr/lib/libnvinfer_plugin.so.6, dst_lnk: libnvinfer_plugin.so.6.0.1\\\\nsrc: /usr/lib/libnvonnxparser.so.6, src_lnk: libnvonnxparser.so.6.0.1, dst: /var/lib/docker/overlay2/9b07e768b59dce6c3cd1b3b94a8019978b9bc24d84511bd37d82679efc94b829/merged/usr/lib/libnvonnxparser.so.6, dst_lnk: libnvonnxparser.so.6.0.1\\\\nsrc: /usr/lib/libnvonnxparser_runtime.so.6, src_lnk: libnvonnxparser_runtime.so.6.0.1, dst: /var/lib/docker/overlay2/9b07e768b59dce6c3cd1b3b94a8019978b9bc24d84511bd37d82679efc94b829/merged/usr/lib/libnvonnxparser_runtime.so.6, dst_lnk: libnvonnxparser_runtime.so.6.0.1\\\\nsrc: /usr/lib/libnvparsers.so.6, src_lnk: libnvparsers.so.6.0.1, dst: /var/lib/docker/overlay2/9b07e768b59dce6c3cd1b3b94a8019978b9bc24d84511bd37d82679efc94b829/merged/usr/lib/libnvparsers.so.6, dst_lnk: libnvparsers.so.6.0.1\\\\n, stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --compat32 --graphics --utility --video --display --pid=6358 /var/lib/docker/overlay2/9b07e768b59dce6c3cd1b3b94a8019978b9bc24d84511bd37d82679efc94b829/merged]\\\\nnvidia-container-cli: mount error: (null)\\\\n\\\"\"": unknown.
ERRO[0003] error waiting for container: context cancelled

“nvidia-container-cli: mount error: (null)” is an error I cannot seem to figure out how to solve.
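Running the same CLI by hand with debug output enabled might narrow down which mount fails. Something along these lines, with the flags taken from the exec command in the error above plus the CLI's --debug option (I have not verified this exact invocation on this image):

$ nvidia-container-cli --load-kmods --debug=/dev/stderr info
$ nvidia-container-cli --load-kmods --debug=/dev/stderr list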

Below is information from the build:

OS INFO:
Linux jetson-nano 4.9.140-l4t-r32.3.1+g47e7e1c #1 SMP PREEMPT Mon Jan 20 08:52:22 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

YOCTO INFO:
Yocto/poky zeus r32.3.1 (JetPack 4.3)

CUDA INFO:

$ dpkg --get-selections | grep cuda
ii  cuda-command-line-tools               10.0.326-1-r0 arm64        cuda-command-line-tools version 10.0.326-1-r0
ii  cuda-command-line-tools-libnvtoolsext 10.0.326-1-r0 arm64        cuda-command-line-tools version 10.0.326-1-r0
ii  cuda-core                             10.0.326-1-r0 arm64        cuda-core version 10.0.326-1-r0
ii  cuda-cublas                           10.0.326-1-r0 arm64        cuda-cublas version 10.0.326-1-r0
ii  cuda-cudart                           10.0.326-1-r0 arm64        cuda-cudart version 10.0.326-1-r0
ii  cuda-cufft                            10.0.326-1-r0 arm64        cuda-cufft version 10.0.326-1-r0
ii  cuda-curand                           10.0.326-1-r0 arm64        cuda-curand version 10.0.326-1-r0
ii  cuda-cusolver                         10.0.326-1-r0 arm64        cuda-cusolver version 10.0.326-1-r0
ii  cuda-cusparse                         10.0.326-1-r0 arm64        cuda-cusparse version 10.0.326-1-r0
ii  cuda-driver                           10.0.326-1-r0 arm64        cuda-driver version 10.0.326-1-r0
ii  cuda-misc-headers                     10.0.326-1-r0 arm64        cuda-misc-headers version 10.0.326-1-r0
ii  cuda-npp                              10.0.326-1-r0 arm64        cuda-npp version 10.0.326-1-r0
ii  cuda-nvrtc                            10.0.326-1-r0 arm64        cuda-nvrtc version 10.0.326-1-r0
ii  cuda-toolkit                          10.0.326-1-r0 arm64        cuda-toolkit version 10.0.326-1-r0
$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA Tegra X1"
  CUDA Driver Version / Runtime Version          10.0 / 10.0
  CUDA Capability Major/Minor version number:    5.3
  Total amount of global memory:                 3964 MBytes (4156665856 bytes)
  ( 1) Multiprocessors, (128) CUDA Cores/MP:     128 CUDA Cores
  GPU Max Clock rate:                            922 MHz (0.92 GHz)
  Memory Clock rate:                             1600 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 262144 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            No
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS

DOCKER INFO:

$ docker info
Client:
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 19.03.2-ce
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: nvidia runc
 Default Runtime: nvidia
 Init Binary: docker-init
 containerd version: fd103cb716352c7e19768e4fed057f71d68902a0.m
 runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f-dirty
 init version: fec3683-dirty (expected: fec3683b971d9)
 Kernel Version: 4.9.140-l4t-r32.3.1+g47e7e1c
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 3.871GiB
 Name: jetson-nano-prod
 ID: 4SRR:RDMN:MNPM:VFZW:EVPV:FIXW:4WNG:HH3D:7RVS:AVLF:JBLW:HRYH
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  localhost:5000
  127.0.0.0/8
 Registry Mirrors:
  http://localhost:5000/
 Live Restore Enabled: false

NVIDIA-CONTAINER-RUNTIME INFO:

ii  libnvidia-container-tools     0.9.0~beta.1 arm64        NVIDIA container runtime library (command-line tools)
ii  libnvidia-container0:arm64    0.9.0~beta.1 arm64        NVIDIA container runtime library
ii  libnvidia-egl-wayland1        1.1.3-r0     arm64        egl-wayland version 1.1.3-r0
ii  nvidia-container-runtime      3.1.0-1      arm64        NVIDIA container runtime
un  nvidia-container-runtime-hook <none>       <none>       (no description available)
ii  nvidia-container-toolkit      1.0.1-1      arm64        NVIDIA container runtime hook
ii  xserver-xorg-video-nvidia     32.3.1-r0    arm64        xserver-xorg-video-nvidia version 32.3.1-r0

I am stuck getting this to work, as I have tried almost everything I can think of. Does anyone know anything about getting nvidia-container-runtime to work with Yocto?

Hi,
[s]For running Yocto on Jetson Nano, we would suggest using the DeepStream SDK. You can install the package through SDK Manager. The latest release is DS 4.0.2:
https://devtalk.nvidia.com/default/topic/1068639/deepstream-sdk/announcing-deepstream-sdk-4-0-2/

After installation, you will see the sample in

deepstream_sdk_v4.0.2_jetson\sources\objectDetector_Yolo

Please follow the README to give it a run.[/s]
Sorry, I mixed up Yolo with Yocto. By default we integrate with the Ubuntu OS and do not have experience using Yocto. Other users would need to share their experience.

Hi,

Based on this, it looks like running NGC Docker containers on Yocto is not yet supported:
https://forums.balena.io/t/deploying-nvidia-ngc-docker-containers/23044/10

Sorry that we don't have much experience with Yocto.
Here is another topic from someone who is also working with a Yocto system.
Maybe you can get some information from them.
https://devtalk.nvidia.com/default/topic/1056507/jetson-tx2/comparing-filesystem-with-different-packages-filesystem-without-tensorrt-/

Thanks.

Hi,
Thanks for trying to help.

The maintainer of the meta-tegra layer (the BSP layer for Jetson boards), an MSc student in Computer Engineering, and I came up with some Yocto recipes for nvidia-container-runtime, based on the versions and contents of certain Debian packages that come with downloading JetPack 4.3 via the NVIDIA SDK Manager.
This has been tested successfully on both the Jetson Nano DevKit and the Jetson Nano production module, but it should work just fine on the other Jetson boards that the meta-tegra layer supports. It is being merged into the meta-tegra layer on GitHub. Using these recipes gives access to CUDA and GPU acceleration inside Docker containers on the Jetson boards.

The layer can be found [here][GitHub - OE4T/meta-tegra: BSP layer for NVIDIA Jetson platforms, based on L4T], and the issue tracking the creation of the recipes can be found in this [issue][https://github.com/madisongh/meta-tegra/issues/230].
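For anyone who wants to try it before the merge lands, the build configuration we tested with looked roughly like the sketch below. Treat the package and variable names as an example of my local setup rather than the final merged layout:

# local.conf additions (zeus) - a rough sketch, not the authoritative list
IMAGE_INSTALL_append = " nvidia-docker nvidia-container-runtime"

# Docker itself comes from meta-virtualization; add that layer and meta-tegra
# to bblayers.conf
DISTRO_FEATURES_append = " virtualization"

# The CUDA/cuDNN/TensorRT recipes fetch the .deb packages that SDK Manager
# downloads for JetPack 4.3; point them at that download directory
NVIDIA_DEVNET_MIRROR = "file:///path/to/sdkm_downloads"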