root@bakunin /home/eyalroz # cat /etc/apt/sources.list.d/graphics-drivers-ppa-xenial.list
deb http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu xenial main
deb-src http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu xenial main
I only use the Intel on-board graphics for driving my display. Now, I can run CUDA code just fine, but if I try to debug anything (using Nsight), I get the CUDBG_ERROR_ALL_DEVICES_WATCHDOGGED error. The contents of ‘xorg.conf’ are:
… but if I try to replace “intel” with “nvidia” as the Inactive screen, bad things happen (= Cinnamon starts in fallback mode). If I remove the nVIDIA entries altogether, the file gets magically rewritten when I log out and log in again.
Why is this happening? And what can I do to be able to debug in peace?
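(For reference, a minimal xorg.conf Device section that pins the display to the Intel adapter might look like the sketch below. This is only an illustration, not my actual file; the Driver and BusID values are assumptions and must match your own lspci output.)

```
Section "Device"
    Identifier "intel"
    Driver     "modesetting"
    BusID      "PCI:0:2:0"    # assumption: verify with lspci
EndSection
```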
Can you tell me which CUDA toolkit you are using?
Can you also build the SDK samples under /usr/local/cuda/samples/1_Utilities/deviceQuery, then run deviceQuery and paste the output?
Thanks!
P.S.: Does the app you want to debug have a GUI display?
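(For reference, building and running that sample typically looks like the sketch below, assuming a default CUDA install under /usr/local/cuda; the samples tree may require root, or you can copy it somewhere writable first.)

```shell
# Build the deviceQuery sample in place (may need root for /usr/local):
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make

# Run it and inspect the reported device properties:
./deviceQuery
```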
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 650 Ti BOOST"
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 1999 MBytes (2095775744 bytes)
( 4) Multiprocessors, (192) CUDA Cores/MP: 768 CUDA Cores
GPU Clock rate: 1058 MHz (1.06 GHz)
Memory Clock rate: 3004 Mhz
Memory Bus Width: 192-bit
L2 Cache Size: 393216 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 2 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 650 Ti BOOST
Result = PASS
And - the app I was debugging does not involve any GUI.
Thanks for the info!
As far as I know, the CUDBG_ERROR_ALL_DEVICES_WATCHDOGGED error code is reported when the GPU is also used for display.
I can reproduce this if I connect a display to my local GK106 and then try to debug on it.
Based on the info you gave below, this seems to explain it:
Display Server: X.Org 1.18.4 driver: nvidia Resolution: 1920x1080@60.00hz
GLX Renderer: GeForce GTX 650 Ti BOOST/PCIe/SSE2 GLX Version: 4.5.0 NVIDIA 375.26
Also, since GK106 does not support the software preemption feature, you cannot do software debugging on it.
In conclusion, if you want to debug on your system, you could kill the X server and then debug from the cuda-gdb command line.
You can run nvidia-smi to check whether an X server is running on the NVIDIA GPU.
If so, that would explain your problem.
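(The check and workaround above can be sketched as follows; "lightdm" and "./my_app" are placeholders for your display manager and your own binary.)

```shell
# 1. Look for Xorg among the processes nvidia-smi reports on the GPU:
nvidia-smi | grep -i xorg

# 2. If it shows up, switch to a text console (e.g. Ctrl+Alt+F1) and stop
#    the display manager so the GPU is freed from the watchdog:
sudo service lightdm stop

# 3. Debug from the command line:
cuda-gdb ./my_app
```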
Actually, using the Intel and NVIDIA GPUs together involves many problems, and you must configure things correctly to make it work. There should be plenty of material you can find by searching Google; I have never tried this myself, so I have little to say about it.
In our environment, we usually disable the Intel graphics in the BIOS and just use the NVIDIA GPU.
veraj: How do I invoke nvidia-smi to make that check? Also, how can an X server run on a GPU if the physical monitor is not connected to that GPU?
Now, while it’s true that I have Intel graphics and an nVIDIA GPU together on the same system, I am not actually using them “together” - I’ve done nothing to link them in any way. And, after all, every PC system with an Intel rather than an AMD CPU now has some kind of Intel graphics controller, so it’s not clear to me how using an nVIDIA GPU in what is perhaps the most common configuration should involve many problems…
Double-checked. My monitor is connected to the Motherboard’s DVI port, not the nVIDIA card’s. You can have a peek at my Xorg.0.log though. It’s full of copies of the following:
I used to be able to debug apps with this exact same setup with my previous Linux distribution installation (I was using Kubuntu 16.04 with lightdm and lxdm).
Re nvidia-smi: It works, but it’s not clear what you suggest I do with it. Just running it yields: