CUDA_VISIBLE_DEVICES being ignored

I am using CUDA_VISIBLE_DEVICES=0, yet the process ends up using GPU 2 instead of GPU 0. So now I have two processes on GPU 2, as shown below. Note that this only seems to happen sometimes, apparently at random. Any clues or hints would be appreciated.

+------------------------------------------------------+
| NVIDIA-SMI 346.72     Driver Version: 346.72          |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...  Off  | 0000:01:00.0      On |                  N/A |
| 22%   58C    P8    18W / 250W |    696MiB / 12287MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TIT...  Off  | 0000:02:00.0     Off |                  N/A |
| 47%   82C    P2   128W / 250W |   4374MiB / 12287MiB |     55%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX TIT...  Off  | 0000:03:00.0     Off |                  N/A |
| 30%   70C    P2   113W / 250W |   3448MiB / 12287MiB |     30%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1597    G   /usr/bin/X                                     497MiB |
|    0      3012    G   compiz                                         167MiB |
|    0      3269    G   /usr/lib/firefox/firefox                         3MiB |
|    1      7799    C   python                                        4347MiB |
|    2      5718    C   python                                        2101MiB |
|    2     12223    C   python                                        1318MiB |
+-----------------------------------------------------------------------------+

Does this behaviour still happen with the newer drivers from the R352 or R361 family?

Furthermore, does CUDA_VISIBLE_DEVICES work with the CUDA Samples, e.g. ~/NVIDIA_CUDA-7.5_Samples/1_Utilities/deviceQuery?

Note that the order 0,1,2 for CUDA_VISIBLE_DEVICES does not necessarily correspond to the order 0,1,2 reported by nvidia-smi.

It’s quite possible that the device enumerated as “2” by nvidia-smi is actually the device enumerated as “0” for CUDA_VISIBLE_DEVICES.
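
If it helps, a quick way to see which physical board each CUDA index refers to is to have the CUDA runtime print the PCI location of every device it enumerates, and match those against the Bus-Id column of nvidia-smi. A minimal sketch along those lines (runtime API only; the file name and output format are just for illustration), built with something like nvcc list_cuda_devices.cu -o list_cuda_devices and run without CUDA_VISIBLE_DEVICES set:

// list_cuda_devices.cu -- print each CUDA-enumerated device with its PCI location,
// so the CUDA indices can be matched against the Bus-Id column of nvidia-smi.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaError_t err = cudaGetDeviceCount(&n);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < n; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // pciDomainID:pciBusID:pciDeviceID is the same location nvidia-smi shows as Bus-Id
        printf("CUDA device %d: %s  PCI %04x:%02x:%02x.0\n",
               i, prop.name, prop.pciDomainID, prop.pciBusID, prop.pciDeviceID);
    }
    return 0;
}

The index printed on each line is the one CUDA_VISIBLE_DEVICES selects; the PCI location tells you which nvidia-smi row that actually is.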

I am using CUDA 7.0, and I have observed that if I set CUDA_VISIBLE_DEVICES to only “0”, then I cannot start the sample programs, even though #0 is there according to nvidia-smi and is running my X windows.
I am resisting upgrading to 7.5 unless there is some certainty that it would solve the problem.

But I have seen posts that say they should correspond one to one. Even if they don’t, shouldn’t they at least be consistent? Otherwise, what is the point of CUDA_VISIBLE_DEVICES if it is not honoured? And how can one figure out the mapping between what nvidia-smi reports and CUDA_VISIBLE_DEVICES?

I have not tried drivers R352 or R361. I thought 346.72 was the one recommended for the Titan X and the most stable on Ubuntu 14.04.

Yes, there should be a one-to-one correspondence or mapping (assuming you don’t make a system configuration change).

You haven’t provided the sequence of commands you are issuing, or a great many other details, so I was just pointing this out in case you didn’t already know it and were expecting that a process launched with

CUDA_VISIBLE_DEVICES="0" ./my_task

would always end up on the device enumerated as zero by nvidia-smi.

That is not guaranteed to be the case. But if you launch such a process and it ends up on device 2 (as reported by nvidia-smi), then future commands of the form:

CUDA_VISIBLE_DEVICES="0" ./my_other_task

should also end up on (nvidia-smi) device 2.
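
One way to confirm where a given launch actually landed (just a sketch, not part of any sample) is to have the task itself report the PCI bus ID of the device it selected, and compare that string against the Bus-Id column of the nvidia-smi row the process appears on:

// Hypothetical startup check for a task launched with CUDA_VISIBLE_DEVICES="0":
// report which physical board the runtime actually handed us.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int dev = -1;
    cudaGetDevice(&dev);                     // with CUDA_VISIBLE_DEVICES="0" this reports 0
    char busId[32] = {0};
    cudaDeviceGetPCIBusId(busId, sizeof(busId), dev);
    printf("Running on CUDA device %d, PCI bus id %s\n", dev, busId);
    // ... the real work of the task would follow here ...
    return 0;
}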

Right, what I am doing is:

setenv CUDA_VISIBLE_DEVICES "0"

and then launching my program.

I just noticed that if I do the setenv above, nvidia-smi shows the process running on GPU #2, whereas if I do:

setenv CUDA_VISIBLE_DEVICES "2"

and launch the program, it ends up on GPU #0.

If the mapping is simply reversed, it is easy to work around, but if it is random (and I cannot test for a few more hours as my programs are running), that’s a headache…

It’s not random.
Nor is it always guaranteed to be reversed. It is SYSTEM SPECIFIC.

In a given system, if you don’t make any configuration changes (changing the motherboard, changing the BIOS, changing slots that cards are installed in, changing the OS, adding other PCIE devices, etc.) then there will be a fixed mapping from CUDA device enumeration to nvidia-smi device enumeration. But this is not guaranteed to be the mapping:

0:0
1:1
2:2

It might be:

2:0
1:1
0:2

It might also be:

1:0
2:1
0:2

Or any other arrangement that involves a 1:1 mapping.
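
If it is useful, the arrangement on a particular system can also be worked out programmatically: nvidia-smi normally lists GPUs in ascending PCI bus order, so sorting the CUDA-enumerated devices by their PCI location recovers the likely nvidia-smi index for each CUDA index. A sketch under that assumption (run without CUDA_VISIBLE_DEVICES set; the file and variable names are illustrative):

// map_cuda_to_smi.cu -- derive a probable CUDA-index <-> nvidia-smi-index mapping
// by sorting the CUDA-enumerated devices into PCI bus order (nvidia-smi's usual order).
#include <algorithm>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);

    // Collect (PCI location, CUDA index) for every device the runtime sees.
    std::vector<std::pair<int, int> > devs;
    for (int i = 0; i < n; ++i) {
        cudaDeviceProp p;
        cudaGetDeviceProperties(&p, i);
        int pciKey = (p.pciDomainID << 16) | (p.pciBusID << 8) | p.pciDeviceID;
        devs.push_back(std::make_pair(pciKey, i));
    }

    // nvidia-smi's default enumeration is ascending PCI bus order.
    std::sort(devs.begin(), devs.end());

    for (int smiIdx = 0; smiIdx < (int)devs.size(); ++smiIdx)
        printf("CUDA device %d  <->  nvidia-smi device %d\n", devs[smiIdx].second, smiIdx);
    return 0;
}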

The mapping is indeed reversed on one machine and not on the other. I have set up both the same way, though they are not identical machines… Thanks!