Getting VK_ERROR_INCOMPATIBLE_DRIVER

Hi,

I’m trying to render offscreen on a headless server (no X installed) with 4x Titan X, but I can’t get past vkCreateInstance. Is there anything I could do to debug this?

Details:

eric@quark:~$ echo $LD_LIBRARY_PATH
/home/eric/VulkanSDK/1.0.30.0/x86_64/lib:/usr/lib/nvidia-370:
eric@quark:~$ nvidia-smi
Fri Nov 4 16:59:40 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 370.28                 Driver Version: 370.28                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    On   | 0000:02:00.0     Off |                  N/A |
| 23%   32C    P8    16W / 250W |      0MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN X (Pascal)    On   | 0000:03:00.0     Off |                  N/A |
| 23%   36C    P8    15W / 250W |      0MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  TITAN X (Pascal)    On   | 0000:82:00.0     Off |                  N/A |
| 23%   36C    P8    17W / 250W |      0MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  TITAN X (Pascal)    On   | 0000:83:00.0     Off |                  N/A |
| 23%   35C    P8    15W / 250W |      0MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
eric@quark:~$ ~/VulkanSDK/1.0.30.0/x86_64/bin/vulkaninfo

VULKAN INFO

Vulkan API Version: 1.0.30

INFO: [loader] Code 0 : Found manifest file /home/eric/VulkanSDK/1.0.30.0/x86_64/etc/explicit_layer.d/VkLayer_vktrace_layer.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /home/eric/VulkanSDK/1.0.30.0/x86_64/etc/explicit_layer.d/VkLayer_threading.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /home/eric/VulkanSDK/1.0.30.0/x86_64/etc/explicit_layer.d/VkLayer_unique_objects.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /home/eric/VulkanSDK/1.0.30.0/x86_64/etc/explicit_layer.d/VkLayer_core_validation.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /home/eric/VulkanSDK/1.0.30.0/x86_64/etc/explicit_layer.d/VkLayer_screenshot.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /home/eric/VulkanSDK/1.0.30.0/x86_64/etc/explicit_layer.d/VkLayer_swapchain.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /home/eric/VulkanSDK/1.0.30.0/x86_64/etc/explicit_layer.d/VkLayer_object_tracker.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /home/eric/VulkanSDK/1.0.30.0/x86_64/etc/explicit_layer.d/VkLayer_api_dump.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /home/eric/VulkanSDK/1.0.30.0/x86_64/etc/explicit_layer.d/VkLayer_image.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /home/eric/VulkanSDK/1.0.30.0/x86_64/etc/explicit_layer.d/VkLayer_parameter_validation.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /usr/share/vulkan/icd.d/nvidia_icd.json, version "1.0.0"
ERROR: [loader] Code 0 : Couldn't get vkCreateInstance via vk_icdGetInstanceProcAddr for ICD libGLX_nvidia.so.0
Cannot create Vulkan instance.
/var/lib/jenkins/workspace/Create-Linux-VulkanSDK/Vulkan-LoaderAndValidationLayers/demos/vulkaninfo.c:680: failed with VK_ERROR_INCOMPATIBLE_DRIVER
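
One way to narrow this down is to check which library the ICD manifest tells the loader to open. The following is only a minimal Python sketch: it assumes the manifest uses the usual loader layout (a top-level "ICD" object with "library_path" and "api_version"), and the helper name describe_icd and the sample values are hypothetical, not copied from this machine.

```python
import json

def describe_icd(manifest_text):
    """Return (library_path, api_version) from an ICD manifest's JSON text."""
    manifest = json.loads(manifest_text)
    icd = manifest.get("ICD", {})
    return icd.get("library_path"), icd.get("api_version")

# Illustrative manifest contents (assumed values). On a real system, read
# the nvidia_icd.json file whose path the loader prints in its INFO lines.
sample = '''
{
    "file_format_version": "1.0.0",
    "ICD": {
        "library_path": "libGLX_nvidia.so.0",
        "api_version": "1.0.24"
    }
}
'''

if __name__ == "__main__":
    library, api = describe_icd(sample)
    print("manifest points at:", library, "with ICD API version", api)
```

If the manifest points at libGLX_nvidia.so.0, as the error above suggests, the ICD is being reached through the GLX library, which hints that the failure may lie in the driver's dependency on X rather than in the SDK install.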

Same here. I’m trying to run Vulkan on an NVIDIA GRID K520, which is listed as conformant:

$ nvidia-smi -q | head

==============NVSMI LOG==============

Timestamp                           : Mon Nov  7 19:06:18 2016
Driver Version                      : 370.28

Attached GPUs                       : 1
GPU 0000:00:03.0
    Product Name                    : GRID K520
    Product Brand                   : Grid
$ vulkaninfo
===========
VULKAN INFO
===========

Vulkan API Version: 1.0.32

INFO: [loader] Code 0 : Found manifest file /usr/local/etc/vulkan/explicit_layer.d/VkLayer_swapchain.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /usr/local/etc/vulkan/explicit_layer.d/VkLayer_parameter_validation.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /usr/local/etc/vulkan/explicit_layer.d/VkLayer_image.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /usr/local/etc/vulkan/explicit_layer.d/VkLayer_threading.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /usr/local/etc/vulkan/explicit_layer.d/VkLayer_object_tracker.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /usr/local/etc/vulkan/explicit_layer.d/VkLayer_unique_objects.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /usr/local/etc/vulkan/explicit_layer.d/VkLayer_core_validation.json, version "1.0.0"
INFO: [loader] Code 0 : Found manifest file /etc/vulkan/icd.d/nvidia_icd.json, version "1.0.0"
ERROR: [loader] Code 0 : Couldn't get vkCreateInstance via vk_icdGetInstanceProcAddr for ICD libGLX_nvidia.so.0
Cannot create Vulkan instance.
/home/ec2-user/Vulkan-LoaderAndValidationLayers/demos/vulkaninfo.c:681: failed with VK_ERROR_INCOMPATIBLE_DRIVER

The same for Tesla K80:

$ nvidia-smi -q | head


==============NVSMI LOG==============

Timestamp                           : Mon Nov  7 23:12:12 2016
Driver Version                      : 367.55

Attached GPUs                       : 1
GPU 0000:00:1E.0
    Product Name                    : Tesla K80
    Product Brand                   : Tesla
...
/home/ec2-user/Vulkan-LoaderAndValidationLayers/demos/vulkaninfo.c:681: failed with VK_ERROR_INCOMPATIBLE_DRIVER

Figured it out: you actually need to run Xorg or another suitable renderer backend. It cannot create a Vulkan instance because it fails to connect to X. You also need to export DISPLAY=:0. -_____-

$ DISPLAY=:0 ./vulkaninfo
===========
VULKAN INFO
===========

Vulkan API Version: 1.0.24

INFO: [loader] Code 0 : Found manifest file /etc/vulkan/icd.d/nvidia_icd.json, version "1.0.0"

Instance Extensions:
====================
Instance Extensions	count = 5
	VK_KHR_surface                      : extension revision 25
	VK_KHR_xcb_surface                  : extension revision  6
	VK_KHR_xlib_surface                 : extension revision  6
	VK_EXT_debug_report                 : extension revision  4
	VK_NV_external_memory_capabilities  : extension revision  1
Layers: count = 0
=======
Presentable Surface formats:
============================
None found
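
For scripted runs, the DISPLAY workaround above can be wrapped in a small launcher. This is a hedged Python sketch, not an official recipe: env_with_display and run_vulkaninfo are hypothetical helper names, ":0" is simply the display used in this post, and an X server is assumed to be already running there.

```python
import os
import subprocess

def env_with_display(base_env, display=":0"):
    """Copy base_env, adding DISPLAY only if the key is not already present."""
    env = dict(base_env)
    env.setdefault("DISPLAY", display)
    return env

def run_vulkaninfo(binary="vulkaninfo"):
    """Run vulkaninfo with DISPLAY set and return its exit code."""
    result = subprocess.run([binary], env=env_with_display(os.environ))
    return result.returncode
```

Calling run_vulkaninfo("./vulkaninfo") mirrors the command line shown above. An existing DISPLAY value is left untouched, so a session that already points at another X server is not overridden.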

OK got it. It’s a bit unfortunate that you have to install X on a headless server when you want to do offline rendering :(

Anyway, thanks a lot for the follow-up!

Right, so we’ve established that you need X11 with the current implementation, but is that something they are going to fix?

Do we need to file a bug report?

How do you file a bug report with NVIDIA?

I went here: http://www.nvidia.com/object/driverqualityassurance.html and clicked Provide Driver Feedback and got Fatal Error Access Denied Reason: Client address is not authorized. (x.x.x.x). Lol.

I don’t know any official channel. It would be nice if someone official could come and help.

I also asked this here: https://www.reddit.com/r/vulkan/comments/5c5bm9/nvidia_vulkan_without_x11/

Hi,

Sorry for the late reply. Our next driver release will support direct to display which, even if you are not interested in presenting your frames, will allow you to use Vulkan without X running.

We tried to use the latest driver on Amazon EC2 GPU instances, which have GRID K520, and got the following:

WARNING: The NVIDIA GRID K520 GPU installed in this system is supported
through the NVIDIA 367.xx legacy Linux graphics drivers.  Please
visit http://www.nvidia.com/object/unix.html for more information.
The 375.26 NVIDIA Linux graphics driver will ignore this GPU.

So are we stuck using X11 with this hardware? Disappointing.

I tried the latest driver (375.26) on my desktop, which has a GTX 770, and it still fails unless run within X11.

Hi,

You need drivers from the 378 series, available in beta version here:

https://devtalk.nvidia.com/default/topic/988593/unix-graphics-announcements-and-news/linux-solaris-and-freebsd-driver-378-09-beta-/
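
To check a machine against that requirement, the driver version can be read from nvidia-smi output and compared with the 378 series. A minimal Python sketch follows; the 378.09 threshold is an assumption taken from this reply rather than from an official support table, and the function names are hypothetical.

```python
import re

# Assumption based on this thread: 378.09 (beta) is the first release
# reported to create a Vulkan instance without X running.
MIN_HEADLESS_DRIVER = (378, 9)

def parse_driver_version(smi_output):
    """Extract the driver version from nvidia-smi text as a tuple of ints."""
    match = re.search(r"Driver Version\s*:\s*(\d+(?:\.\d+)*)", smi_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))

def supports_headless_vulkan(smi_output):
    """True if the reported driver is at least MIN_HEADLESS_DRIVER."""
    version = parse_driver_version(smi_output)
    return version is not None and version >= MIN_HEADLESS_DRIVER
```

The pattern accepts both the table header form ("Driver Version: 370.28") and the "nvidia-smi -q" form with padding before the colon. Note that a new enough version number alone does not help if the driver ignores the GPU, as reported above for the GRID K520.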

Which driver should we use for a P100 on a headless RHEL7 x64 server? The most recent I can find is 375.66.

Thanks!

Is this a Tesla board or a GRID board? Where did you get 375.66 from?

Sorry for the late reply. Here is some more info (pretty sure these are Teslas?).

$ lspci -k | grep -A2 P100
03:00.0 3D controller: NVIDIA Corporation GP100GL (rev a1)
	Subsystem: NVIDIA Corporation Device 118f
	Kernel driver in use: nvidia
82:00.0 3D controller: NVIDIA Corporation GP100GL (rev a1)
	Subsystem: NVIDIA Corporation Device 118f
	Kernel driver in use: nvidia
$ nvidia-smi
Tue Jul 11 19:53:24 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 0000:03:00.0     Off |                    0 |
| N/A   34C    P0    38W / 250W |    305MiB / 16276MiB |     64%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  Off  | 0000:82:00.0     Off |                    0 |
| N/A   33C    P0    40W / 250W |    305MiB / 16276MiB |     88%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0    177361    C   ./isng.e                                       305MiB |
|    1    177372    C   ./isng.e                                       305MiB |
+-----------------------------------------------------------------------------+

It seems I mixed up the driver versions (this makes more sense): I also use a PC at home with a GTX 980 Ti, which I believe is on XXX.66.

Then you should be fine with our latest driver package.

381.22 available here: Unix Drivers | NVIDIA

Although we do recommend installing our drivers through your Linux distribution.

The supported hardware list doesn’t include the P100, or any Tesla products. Will these work? Thanks!

For 381.22 here: Linux x64 (AMD64/EM64T) Display Driver | 381.22 | Linux 64-bit | NVIDIA

Click on “Additional Information”, then “README”, then Appendix A; the board is listed there.