When I run the deviceQuery program, I get an error:
./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 10
-> invalid device ordinal
Result = FAIL
The nvidia-smi program fails as well:
nvidia-smi
Unable to determine the device handle for GPU 0000:00:08.0: Unknown Error
I see this in syslog:
Feb 16 10:56:27 wstest-gpu3 systemd[1]: Starting NVIDIA Persistence Daemon...
Feb 16 10:56:27 wstest-gpu3 systemd[1]: Started NVIDIA Persistence Daemon.
Feb 16 10:56:27 wstest-gpu3 nvidia-persistenced: Verbose syslog connection opened
Feb 16 10:56:27 wstest-gpu3 nvidia-persistenced: Now running with user ID 116 and group ID 126
Feb 16 10:56:27 wstest-gpu3 nvidia-persistenced: Started (6115)
Feb 16 10:56:27 wstest-gpu3 nvidia-persistenced: device 0000:00:08.0 - registered
Feb 16 10:56:27 wstest-gpu3 nvidia-persistenced: Local RPC service initialized
Feb 16 10:56:27 wstest-gpu3 systemd[1]: Stopping NVIDIA Persistence Daemon...
Feb 16 10:56:27 wstest-gpu3 nvidia-persistenced: Received signal 15
Feb 16 10:56:27 wstest-gpu3 nvidia-persistenced: Socket closed.
Feb 16 10:56:27 wstest-gpu3 nvidia-persistenced: PID file unlocked.
Feb 16 10:56:27 wstest-gpu3 nvidia-persistenced: PID file closed.
Feb 16 10:56:27 wstest-gpu3 nvidia-persistenced: The daemon no longer has permission to remove its runtime data directory /var/run/nvidia-persistenced
Feb 16 10:56:27 wstest-gpu3 nvidia-persistenced: Shutdown (6115)
Feb 16 10:56:27 wstest-gpu3 systemd[1]: Stopped NVIDIA Persistence Daemon.
Feb 16 10:56:27 wstest-gpu3 systemd[1]: Starting NVIDIA Persistence Daemon...
Feb 16 10:56:27 wstest-gpu3 systemd[1]: Stopped NVIDIA Persistence Daemon.
Can you help me resolve this problem?
I did the same thing on another machine with a Tesla M40 without any problem.
nvidia-smi
Fri Feb 17 08:06:18 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    Off  | 0000:00:08.0     Off |                  N/A |
|  0%   27C    P0    51W / 250W |      0MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+