Hi guys,
Greetings!
On my Arch Linux server, the CUDA version is:
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
And the gcc version is:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/8.2.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared --enable-threads=posix --enable-libmpx --with-system-zlib --with-isl --enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object --enable-linker-build-id --enable-lto --enable-plugin --enable-install-libiberty --with-linker-hash-style=gnu --enable-gnu-indirect-function --enable-multilib --disable-werror --enable-checking=release --enable-default-pie --enable-default-ssp --enable-cet=auto
Thread model: posix
gcc version 8.2.1 20180831 (GCC)
I have two projects that use CMake to drive the build. The first project produces dynamic libraries that are fed into the second project. For now I just copy the dynamic libraries and header files from the first project into the second, then build the second project. This works fine.
On my DGX-1 server, the CUDA version is:
$ /usr/local/cuda-9.0/bin/nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
And the gcc version is:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 6.4.0-17ubuntu1~16.04' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --with-as=/usr/bin/x86_64-linux-gnu-as --with-ld=/usr/bin/x86_64-linux-gnu-ld --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.4.0 20180424 (Ubuntu 6.4.0-17ubuntu1~16.04)
On this server, copying the libraries from the first project into the second doesn’t work. The program crashes in cudart::globalState::registerEntryFunction:
(gdb) bt
#0 0x00007ffff73fc559 in cudart::globalState::registerEntryFunction(void**, char const*, char*, char const*, int, uint3*, uint3*, dim3*, dim3*, int*) () from /home/xiaonan/dl2-he/3rdparty/libDSI_FV.so
#1 0x00007ffff73decbc in __cudaRegisterFunction () from /home/xiaonan/dl2-he/3rdparty/libDSI_FV.so
#2 0x00007ffff73d9098 in __nv_cudaEntityRegisterCallback(void**) () from /home/xiaonan/dl2-he/3rdparty/libDSI_FV.so
#3 0x00000000004283d6 in __cudaRegisterLinkedBinary(__fatBinC_Wrapper_t const*, void (*)(void**), void*) ()
#4 0x00000000004282e5 in __cudaRegisterLinkedBinary_66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37 ()
#5 0x00007ffff7de76ba in ?? () from /lib64/ld-linux-x86-64.so.2
#6 0x00007ffff7de77cb in ?? () from /lib64/ld-linux-x86-64.so.2
#7 0x00007ffff7dd7c6a in ?? () from /lib64/ld-linux-x86-64.so.2
#8 0x0000000000000001 in ?? ()
#9 0x00007fffffffe7e3 in ?? ()
#10 0x0000000000000000 in ?? ()
I checked the disassembly and the registers:
(gdb) disassemble
Dump of assembler code for function _ZN6cudart11globalState21registerEntryFunctionEPPvPKcPcS4_iP5uint3S7_P4dim3S9_Pi:
0x00007ffff73fc520 <+0>: mov %rbp,-0x20(%rsp)
0x00007ffff73fc525 <+5>: mov %r12,-0x18(%rsp)
0x00007ffff73fc52a <+10>: xor %eax,%eax
0x00007ffff73fc52c <+12>: mov %r13,-0x10(%rsp)
0x00007ffff73fc531 <+17>: mov %r14,-0x8(%rsp)
0x00007ffff73fc536 <+22>: mov %rcx,%r13
0x00007ffff73fc539 <+25>: mov %rbx,-0x28(%rsp)
0x00007ffff73fc53e <+30>: sub $0x38,%rsp
0x00007ffff73fc542 <+34>: mov (%rdi),%ecx
0x00007ffff73fc544 <+36>: mov %rdx,%r14
0x00007ffff73fc547 <+39>: mov %r8,%r12
0x00007ffff73fc54a <+42>: mov %r9d,%ebp
0x00007ffff73fc54d <+45>: mov 0x10(%rdi),%rdi
0x00007ffff73fc551 <+49>: test %ecx,%ecx
0x00007ffff73fc553 <+51>: jne 0x7ffff73fc5f0 <_ZN6cudart11globalState21registerEntryFunctionEPPvPKcPcS4_iP5uint3S7_P4dim3S9_Pi+208>
=> 0x00007ffff73fc559 <+57>: mov 0x10(%rax),%rbx
(gdb) i registers
rax 0x0 0
rbx 0xb734366d 3073652333
rcx 0x11 17
rdx 0x0 0
rsi 0x753170 7680368
rdi 0x7529a0 7678368
rbp 0xffffffff 0xffffffff
rsp 0x7fffffffe420 0x7fffffffe420
r8 0x0 0
r9 0x867de7ff 2256398335
r10 0x0 0
r11 0xa3f3365 171914085
r12 0x7ffff7435790 140737341773712
r13 0x7ffff7435790 140737341773712
r14 0x7ffff73d9da0 140737341398432
r15 0x7ffff73d9da0 140737341398432
rip 0x7ffff73fc559 0x7ffff73fc559 <cudart::globalState::registerEntryFunction(void**, char const*, char*, char const*, int, uint3*, uint3*, dim3*, dim3*, int*)+57>
eflags 0x10246 [ PF ZF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
The immediate cause looks like a null pointer dereference: the faulting instruction at +57 reads 0x10(%rax), and rax is 0. If I instead add the second project as a sub-directory of the first project, the program runs fine. So I can’t figure out why copying the dynamic libraries doesn’t work on the DGX-1. Is it caused by CUDA, gcc, or something else?
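One more thing I noticed in the backtrace: frames #0 to #2 resolve to libDSI_FV.so, while frames #3 and #4 show no library and so presumably belong to the main executable. To see which of my binaries define CUDA runtime symbols themselves, I could use a small script like the one below. It is only a sketch: the file name check_cudart.py and the two name fragments used for matching are my own choices (taken from the symbol names in the backtrace), and it assumes GNU nm is installed. I don't know yet whether this has anything to do with the crash.

```python
import subprocess
import sys

# Name fragments copied from the backtrace; using them as a filter is my assumption.
CUDART_MARKERS = ("__cudaRegister", "_ZN6cudart")


def cudart_symbols(nm_output):
    """Return the defined symbols in `nm` output whose names contain a marker."""
    found = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # A defined symbol prints as "<address> <type> <name>"; skip other lines.
        if len(parts) != 3:
            continue
        name = parts[2]
        if any(marker in name for marker in CUDART_MARKERS):
            found.add(name)
    return found


def defined_cudart_symbols(path):
    """Scan both the regular and the dynamic symbol table of one binary."""
    found = set()
    for extra in ([], ["-D"]):
        cmd = ["nm", "--defined-only"] + extra + [path]
        result = subprocess.run(cmd, capture_output=True, text=True)
        found |= cudart_symbols(result.stdout)
    return found


if __name__ == "__main__":
    # Usage: python3 check_cudart.py <binary> [<binary> ...]
    for path in sys.argv[1:]:
        names = defined_cudart_symbols(path)
        print(f"{path}: {len(names)} matching symbols")
        for name in sorted(names):
            print("  " + name)
```

I would run it on both servers, passing the second project's executable and the copied libDSI_FV.so, and compare the output.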
Could someone give me a clue? Thanks very much in advance!
Best Regards
Nan Xiao