DirectShow not working consistently (VirtualDub, GraphStudioNext) after upgrade to 416.34 drivers

NOTE: I originally thought that the problem was with CUDA 9.1. Important new observations start as of #4, below.
Hello.

I am currently developing a CUDA 9.1 application on my Windows 10 (64-bit) ASUS ROG Strix notebook with a GTX 1060. This application works in tandem with a VR rig (Oculus Rift). Yesterday, the Oculus software required me to upgrade my GPU driver, so I went to the NVIDIA site and downloaded the latest driver (416.34) for my card, the GTX 1060 (Notebook). After installing the driver and rebooting, CUDA 9.1 has completely stopped working: the HWMonitor application reports 0% GPU utilization, and my program no longer outputs anything.

We have tried this on other machines. The driver update worked on an MSI laptop with a 1070, and on a generic desktop with a 1080. Thus I suspect the problem is related to my specific hardware combination.

Reinstalling CUDA 9.1 makes everything work again, but only because its installer rolls the driver back.
So I’m stuck: either I can’t use the Oculus software, or I can’t develop my own application.

I’m pretty new to all of this – I don’t know where to begin figuring out what the problem is. Any suggestions on where to start?
Does anyone else out there have a system similar to mine that works?

Thanks!

More information:

  • cudaSetDevice() returns success, as do all other calls.
  • cudaGetDeviceProperties() returns success, and it does succeed in getting the GPU's name (GeForce GTX 1060).

Nevertheless, the GPU activity is 0%. Investigation continues…

Results of deviceQuery below.
Is it a problem that the Driver and Runtime versions don’t match?

Device 0: "GeForce GTX 1060"
  CUDA Driver Version / Runtime Version          10.0 / 9.1
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 6144 MBytes (6442450944 bytes)
  (10) Multiprocessors, (128) CUDA Cores/MP:     1280 CUDA Cores
  GPU Max Clock rate:                            1671 MHz (1.67 GHz)
  Memory Clock rate:                             4004 Mhz
  Memory Bus Width:                              192-bit
  L2 Cache Size:                                 1572864 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 9.1, NumDevs = 1, Device0 = GeForce GTX 1060
Result = PASS

It’s not a problem that the driver and runtime versions do not match, and CUDA 9.1 should work with 416.34.
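
As a rough illustration of why the mismatch is harmless: CUDA reports both versions as a single integer, 1000 × major + 10 × minor (so 10.0 is 10000 and 9.1 is 9010), and what matters is that the driver's number is at least as high as the runtime's. The helper names below are made up for this sketch and are not part of any CUDA API.

```python
def decode_cuda_version(v):
    """Split CUDA's integer version encoding (1000*major + 10*minor)."""
    return v // 1000, (v % 1000) // 10


def driver_supports_runtime(driver_version, runtime_version):
    """The driver must be at least as new as the runtime; newer is fine."""
    return driver_version >= runtime_version


# Values corresponding to the deviceQuery output above (10.0 / 9.1).
driver, runtime = 10000, 9010
print(decode_cuda_version(driver), decode_cuda_version(runtime))
print(driver_supports_runtime(driver, runtime))
```

The reverse case (a runtime newer than what the driver supports) is the one that would normally produce an error from the CUDA calls themselves.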

The fact that the deviceQuery app works, along with the other successful return codes you mention, suggests to me that CUDA (9.1) is actually working on your setup.

The problem appears to lie somewhere else.

After running some of the demo applications that came with CUDA, I started suspecting that CUDA was working fine. Normally, our program acts as a pseudo DirectShow video source, so I changed it to output to a window using OpenGL and discovered that the images were being generated correctly.

  • The 0% GPU utilization seems to be an incompatibility between the HWMonitor utility I am using and the latest driver.
  • The reason why I was not seeing images was that I was transmitting them to DirectShow (our application acts as a source), and trying to view them in VirtualDub64.
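
For anyone who wants to cross-check a suspicious utilization reading without a third-party monitor, here is a minimal sketch that asks `nvidia-smi` (installed with the driver) instead. It assumes `nvidia-smi` is on the PATH, which is not always the case on Windows, and that the card reports `utilization.gpu` at all; some GeForce notebook parts may return `[N/A]`. The function names are my own.

```python
import shutil
import subprocess


def parse_utilization(output):
    """Parse nvidia-smi CSV output (one line per GPU) into a list.

    Numeric fields become ints; anything else (e.g. "[N/A]") becomes None.
    """
    values = []
    for line in output.strip().splitlines():
        field = line.strip()
        values.append(int(field) if field.isdigit() else None)
    return values


def query_gpu_utilization():
    """Return per-GPU utilization in percent, or None if the query fails."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # nvidia-smi not found on PATH
    result = subprocess.run(
        [exe, "--query-gpu=utilization.gpu", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None
    return parse_utilization(result.stdout)


if __name__ == "__main__":
    print(query_gpu_utilization())
```

Running this while the CUDA program is active gives a second opinion on whether the GPU is really idle.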

The real problem seems to lie with certain specific programs that use DirectShow Video Sources.

  • VirtualDub64 can no longer display from a Video Source. I just get a grey screen. I tried our program and various cameras connected to the computer -- none of the sources are working.
  • The same problem occurs with GraphStudioNext, which is a DirectShow graph editor.
  • WebRTC on Chrome seems to work: https://webrtc.github.io/samples/src/content/devices/input-output/
  • VLC seems to work, albeit slowly (but that's normal).
  • A custom visualization program that we created, which draws images from a DirectShow video source, works correctly.

Any idea what might be wrong here?

I’m starting to wonder if I should move this thread to another forum…

FYI - The issue with some DirectShow applications was investigated here:
https://sourceforge.net/p/vdfiltermod/tickets/186/

Looks like the applications that stopped working depend on some older technology that appears to be deprecated.

If your application appears under the following registry key, it may be explicitly disabled. I managed to get these applications to limp along by removing them from under this key, but it’s more of a hack than anything else:
HKEY_CURRENT_USER\Software\Microsoft\Direct3D\Shims\EnableOverlays
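
To see whether a given application is listed there before touching anything, a read-only sketch along these lines may help. It assumes the entries are stored as values whose names contain the executable's path or name (I have not verified the exact layout), and the function names are my own. It only lists matches; it does not delete anything.

```python
try:
    import winreg  # Windows-only standard library module
except ImportError:
    winreg = None

SHIM_KEY = r"Software\Microsoft\Direct3D\Shims\EnableOverlays"


def matching_entries(value_names, exe_name):
    """Return the value names that mention exe_name (case-insensitive)."""
    needle = exe_name.lower()
    return [name for name in value_names if needle in name.lower()]


def list_shim_value_names():
    """List value names under the HKCU shim key; empty if unavailable."""
    if winreg is None:
        return []  # not running on Windows
    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, SHIM_KEY) as key:
            count = winreg.QueryInfoKey(key)[1]  # number of values in the key
            return [winreg.EnumValue(key, i)[0] for i in range(count)]
    except FileNotFoundError:
        return []  # key does not exist for this user


if __name__ == "__main__":
    # "VirtualDub" is just an example search term.
    for name in matching_entries(list_shim_value_names(), "VirtualDub"):
        print(name)
```

If something shows up, exporting the key in regedit first makes the removal easy to undo.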