[SOLVED] NSight 5.0 Graphics Debugger GL

SOLUTION!!!
OK, I solved it. Nothing was displayed because I had switched my app to a 64-bit build a while back. NOWHERE in the Nsight requirements is it mentioned that only 32-bit apps are supported. And since 5.0 is not supported on 32-bit operating systems, I automatically presumed that it supports 64-bit apps (maybe even only those). To the Nsight team: if this is true, please update the documentation accordingly.

I did a quick recompile, and a very simple 64-bit WinAPI app does not show the in-window Nsight HUD. A 32-bit WinAPI app works OK. So the problem is not specific to SDL; it appears that 64-bit builds in general are a problem for Nsight.

So, switching back to 32-bit solved it for WinAPI as well as SDL; everything works OK now.

Original question:
Hi, I am pretty desperate: I cannot get the graphics debugger to start at all. I’ve tried everything apart from downgrading to Windows 8.

I have triple-checked that I meet all requirements for Nsight. Here’s my configuration:
Intel Core i7-3770K, 16 GB RAM, ASUS P8Z77 Pro
Gigabyte GeForce GTX 680, 2GB RAM, driver version 358.50 (newest)
Win 10 Pro, fully updated, Visual Studio 2015 Community, Nsight 5.0.0.15294 (newest)

The requirements page is confusing: the table says that the Graphics Shader Debugger is available on a single-GPU system, but the bullet list below it suggests that on a single GPU, only CUDA debugging is supported. Please explain which is actually correct. I am referring to this link: Nsight Visual Studio Edition Requirements | NVIDIA Developer

Should I connect to the machine remotely, since it only has one GPU? Or should I enable the processor’s built-in Intel GPU (and connect my monitor to it) and use the NVIDIA GPU as a compute-only GPU? Does that even work with graphics debugging, which requires the app to output to a window?

On the same machine, with an older Nsight (I do not remember the version) and an older system (Win 8), I was debugging graphics just fine (that was over a year ago, maybe more). It was after seeing this video:

So I guess Visual Studio 2012 + Nsight 3.2 + Win 8 worked flawlessly. The HUD showed up after app start, and the Frame Debugger and Shader Debugger worked.

I am using SDL + OpenGL + GLSL + GLEW. There are a few GLSL compute shaders in my project; could these be causing the problem? I remember that, way back, Nsight showed a warning when I used a feature that was not supported. Nothing like this happens now. Previously I was using WinAPI instead of SDL for window creation. Could this be the problem?

Starting Graphics Debugging shows no error; it only displays the usual warning about the TDR delay being only 2 seconds. It does not complain about being unable to connect, so the Nsight Monitor is definitely getting the data.

Even an OpenGL app that just clears the framebuffer does not work.

Please help me. Did I miss some configuration? I left all settings at their defaults. My work goes much slower when I cannot do shader debugging, and getting this working would save me countless hours. Thank you.

Here’s the output of System info generated by Nsight from Visual Studio:

Report Information
        UnixTime Generated                                                      1446559935
    OS Information
        Computer Name                                                           Corsair
        NetBIOS Name                                                            CORSAIR
        OS Name                                                                 Windows 10 Pro
        GetVersionEx
            dwMajorVersion                                                      10
            dwMinorVersion                                                      0
            dwBuildNumber                                                       10240
            dwPlatformId                                                        2
            wServicePackMajor                                                   0
            wServicePackMinor                                                   0
            wSuiteMask                                                          256
            wProductType                                                        Workstation
        GetProductInfo                                                          48
        GetNativeSystemInfo
            wProcessorArchitecture                                              x64
            dwPageSize                                                          4096
            lpMinimumApplicationAddress                                         65536
            lpMaximumApplicationAddress                                         140737488289791
            dwActiveProcessorMask                                               255
            dwNumberOfProcessors                                                8
            dwAllocationGranularity                                             65536
            wProcessorLevel                                                     6
            wProcessorRevision                                                  14857
        EnumDisplayDevices
            Display Device
                DeviceName                                                      \\.\DISPLAY1
                DeviceString                                                    NVIDIA GeForce GTX 680
                StateFlags                                                      524293
                DeviceID                                                        PCI\VEN_10DE&DEV_1180&SUBSYS_353C1458&REV_A1
                DeviceKey                                                       \Registry\Machine\System\CurrentControlSet\Control\Video\{B2084E8B-2720-449A-AED9-E38B350A4ECC}\0000
                Monitor
                    DeviceName                                                  \\.\DISPLAY1\Monitor0
                    DeviceString                                                SyncMaster 2243NW/2243NWX
                    StateFlags                                                  3
                    DeviceID                                                    MONITOR\SAM03C2\{4d36e96e-e325-11ce-bfc1-08002be10318}\0002
                    DeviceKey                                                   \Registry\Machine\System\CurrentControlSet\Control\Class\{4d36e96e-e325-11ce-bfc1-08002be10318}\0002
            Display Device
                DeviceName                                                      \\.\DISPLAY2
                DeviceString                                                    NVIDIA GeForce GTX 680
                StateFlags                                                      524288
                DeviceID                                                        PCI\VEN_10DE&DEV_1180&SUBSYS_353C1458&REV_A1
                DeviceKey                                                       \Registry\Machine\System\CurrentControlSet\Control\Video\{B2084E8B-2720-449A-AED9-E38B350A4ECC}\0001
            Display Device
                DeviceName                                                      \\.\DISPLAY3
                DeviceString                                                    NVIDIA GeForce GTX 680
                StateFlags                                                      524288
                DeviceID                                                        PCI\VEN_10DE&DEV_1180&SUBSYS_353C1458&REV_A1
                DeviceKey                                                       \Registry\Machine\System\CurrentControlSet\Control\Video\{B2084E8B-2720-449A-AED9-E38B350A4ECC}\0002
            Display Device
                DeviceName                                                      \\.\DISPLAY4
                DeviceString                                                    NVIDIA GeForce GTX 680
                StateFlags                                                      524288
                DeviceID                                                        PCI\VEN_10DE&DEV_1180&SUBSYS_353C1458&REV_A1
                DeviceKey                                                       \Registry\Machine\System\CurrentControlSet\Control\Video\{B2084E8B-2720-449A-AED9-E38B350A4ECC}\0003
        GlobalMemoryStatusEx
            dwMemoryLoad                                                        20
            ullTotalPhys                                                        17120661504
            ullAvailPhys                                                        13528412160
            ullTotalPageFile                                                    20266389504
            ullAvailPageFile                                                    16434880512
            ullTotalVirtual                                                     140737488224256
            ullAvailVirtual                                                     140737435721728
        Processor Information
            0
                Name                                                            Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
                Clock speed (MHz)                                               3510
            1
                Name                                                            Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
                Clock speed (MHz)                                               3510
            2
                Name                                                            Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
                Clock speed (MHz)                                               3510
            3
                Name                                                            Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
                Clock speed (MHz)                                               3510
            4
                Name                                                            Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
                Clock speed (MHz)                                               3510
            5
                Name                                                            Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
                Clock speed (MHz)                                               3510
            6
                Name                                                            Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
                Clock speed (MHz)                                               3510
            7
                Name                                                            Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
                Clock speed (MHz)                                               3510
    NvAPI
        DisplayDriverVersion
            Driver Version                                                      35850
            Changelist                                                          0
            BuildBranchString                                                   r358_00
            Default AdapterString                                               GeForce GTX 680
        DisplayDriverCompileType                                                Release
    NvDebugApi
        WDDM Devices
            GPU
                Name                                                            GeForce GTX 680
                Architecture                                                    Kepler
                Architecture Number                                             224
                Architecture Implementation                                     4
                Architecture Revision                                           162
                Number of GPCs                                                  4
                Number of TPCs                                                  8
                Number of SMs                                                   8
                Warps per SM                                                    64
                Lanes per warp                                                  32
                Register file size                                              65536
                Max CTAs per SM                                                 16
                Max size of shared memory per CTA (bytes)                       49152
                SM Revision                                                     196608
                Number of FB PAs                                                6
                Number of LTs per LTC                                           4
                RmGpuId                                                         256
                Monitor
                    Windows Name                                                SyncMaster 2243NW/2243NWX
                    EDID Manufacturer                                           Samsung
                    EDID Model                                                  SyncMaster
        RM Devices
    CUDA
        CUDA Device
            Name                                                                GeForce GTX 680
            Driver                                                              WDDM
            DeviceIndex                                                         0
            GPU Family                                                          GK104
            RmGpuId                                                             256
            Compute Major                                                       3
            Compute Minor                                                       0
            MAX_THREADS_PER_BLOCK                                               1024
            MAX_BLOCK_DIM_X                                                     1024
            MAX_BLOCK_DIM_Y                                                     1024
            MAX_BLOCK_DIM_Z                                                     64
            MAX_GRID_DIM_X                                                      2147483647
            MAX_GRID_DIM_Y                                                      65535
            MAX_GRID_DIM_Z                                                      65535
            MAX_SHARED_MEMORY_PER_BLOCK                                         49152
            TOTAL_CONSTANT_MEMORY                                               65536
            WARP_SIZE                                                           32
            MAX_PITCH                                                           2147483647
            MAX_REGISTERS_PER_BLOCK                                             65536
            CLOCK_RATE                                                          1137000
            TEXTURE_ALIGNMENT                                                   512
            GPU_OVERLAP                                                         1
            MULTIPROCESSOR_COUNT                                                8
            KERNEL_EXEC_TIMEOUT                                                 1
            INTEGRATED                                                          0
            CAN_MAP_HOST_MEMORY                                                 1
            COMPUTE_MODE                                                        0
            MAXIMUM_TEXTURE1D_WIDTH                                             65536
            MAXIMUM_TEXTURE2D_WIDTH                                             65536
            MAXIMUM_TEXTURE2D_HEIGHT                                            65536
            MAXIMUM_TEXTURE3D_WIDTH                                             4096
            MAXIMUM_TEXTURE3D_HEIGHT                                            4096
            MAXIMUM_TEXTURE3D_DEPTH                                             4096
            MAXIMUM_TEXTURE2D_LAYERED_WIDTH                                     16384
            MAXIMUM_TEXTURE2D_LAYERED_HEIGHT                                    16384
            MAXIMUM_TEXTURE2D_LAYERED_LAYERS                                    2048
            SURFACE_ALIGNMENT                                                   512
            CONCURRENT_KERNELS                                                  1
            ECC_ENABLED                                                         0
            PCI_BUS_ID                                                          1
            PCI_DEVICE_ID                                                       0
            TCC_DRIVER                                                          0
            MEMORY_CLOCK_RATE                                                   3004000
            GLOBAL_MEMORY_BUS_WIDTH                                             256
            L2_CACHE_SIZE                                                       524288
            MAX_THREADS_PER_MULTIPROCESSOR                                      2048
            ASYNC_ENGINE_COUNT                                                  1
            UNIFIED_ADDRESSING                                                  1
            MAXIMUM_TEXTURE1D_LAYERED_WIDTH                                     16384
            MAXIMUM_TEXTURE1D_LAYERED_LAYERS                                    2048
            CAN_TEX2D_GATHER                                                    1
            MAXIMUM_TEXTURE2D_GATHER_WIDTH                                      16384
            MAXIMUM_TEXTURE2D_GATHER_HEIGHT                                     16384
            MAXIMUM_TEXTURE3D_WIDTH_ALTERNATE                                   2048
            MAXIMUM_TEXTURE3D_HEIGHT_ALTERNATE                                  2048
            MAXIMUM_TEXTURE3D_DEPTH_ALTERNATE                                   16384
            PCI_DOMAIN_ID                                                       0
            TEXTURE_PITCH_ALIGNMENT                                             32
            MAXIMUM_TEXTURECUBEMAP_WIDTH                                        16384
            MAXIMUM_TEXTURECUBEMAP_LAYERED_WIDTH                                16384
            MAXIMUM_TEXTURECUBEMAP_LAYERED_LAYERS                               2046
            MAXIMUM_SURFACE1D_WIDTH                                             65536
            MAXIMUM_SURFACE2D_WIDTH                                             65536
            MAXIMUM_SURFACE2D_HEIGHT                                            32768
            MAXIMUM_SURFACE3D_WIDTH                                             65536
            MAXIMUM_SURFACE3D_HEIGHT                                            32768
            MAXIMUM_SURFACE3D_DEPTH                                             2048
            MAXIMUM_SURFACE1D_LAYERED_WIDTH                                     65536
            MAXIMUM_SURFACE1D_LAYERED_LAYERS                                    2048
            MAXIMUM_SURFACE2D_LAYERED_WIDTH                                     65536
            MAXIMUM_SURFACE2D_LAYERED_HEIGHT                                    32768
            MAXIMUM_SURFACE2D_LAYERED_LAYERS                                    2048
            MAXIMUM_SURFACECUBEMAP_WIDTH                                        32768
            MAXIMUM_SURFACECUBEMAP_LAYERED_WIDTH                                32768
            MAXIMUM_SURFACECUBEMAP_LAYERED_LAYERS                               2046
            MAXIMUM_TEXTURE1D_LINEAR_WIDTH                                      134217728
            MAXIMUM_TEXTURE2D_LINEAR_WIDTH                                      65000
            MAXIMUM_TEXTURE2D_LINEAR_HEIGHT                                     65000
            MAXIMUM_TEXTURE2D_LINEAR_PITCH                                      1048544
            MAXIMUM_TEXTURE2D_MIPMAPPED_WIDTH                                   16384
            MAXIMUM_TEXTURE2D_MIPMAPPED_HEIGHT                                  16384
            MAXIMUM_TEXTURE1D_MIPMAPPED_WIDTH                                   16384
            STREAM_PRIORITIES_SUPPORTED                                         0
            GLOBAL_L1_CACHE_SUPPORTED                                           0
            LOCAL_L1_CACHE_SUPPORTED                                            1
            MAX_SHARED_MEMORY_PER_MULTIPROCESSOR                                49152
            MAX_REGISTERS_PER_MULTIPROCESSOR                                    65536
            MANAGED_MEMORY                                                      1
            MULTI_GPU_BOARD                                                     0
            MULTI_GPU_BOARD_GROUP_ID                                            0
            DISPLAY_NAME                                                        GeForce GTX 680
            COMPUTE_CAPABILITY_MAJOR                                            3
            COMPUTE_CAPABILITY_MINOR                                            0
            TOTAL_MEMORY                                                        2147483648
            RAM_TYPE                                                            8
            RAM_LOCATION                                                        1
            GPU_PCI_DEVICE_ID                                                   293605598
            GPU_PCI_SUB_SYSTEM_ID                                               893129816
            GPU_PCI_REVISION_ID                                                 161
            GPU_PCI_EXT_DEVICE_ID                                               4480
            GPU_PCI_EXT_GEN                                                     2
            GPU_PCI_EXT_GPU_GEN                                                 2
            GPU_PCI_EXT_GPU_LINK_RATE                                           8000
            GPU_PCI_EXT_GPU_LINK_WIDTH                                          16
            GPU_PCI_EXT_DOWNSTREAM_LINK_RATE                                    8000
            GPU_PCI_EXT_DOWNSTREAM_LINK_WIDTH                                   16
    OpenCL
        Platform
            Profile                                                             FULL_PROFILE
            Version                                                             OpenCL 1.2 CUDA 7.5.0
            Name                                                                NVIDIA CUDA
            Vendor                                                              NVIDIA Corporation
            Extensions                                                          cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts
            Device
                CL_DEVICE_ADDRESS_BITS                                          64
                CL_DEVICE_AVAILABLE                                             True
                CL_DEVICE_COMPILER_AVAILABLE                                    True
                CL_DEVICE_DOUBLE_FP_CONFIG
                    CL_FP_DENORM                                                True
                    CL_FP_INF_NAN                                               True
                    CL_FP_DENORM                                                True
                    CL_FP_ROUND_TO_NEAREST                                      True
                    CL_FP_ROUND_TO_ZERO                                         True
                    CL_FP_FMA                                                   True
                CL_DEVICE_ENDIAN_LITTLE                                         True
                CL_DEVICE_ERROR_CORRECTION_SUPPORT                              False
                CL_DEVICE_EXECUTION_CAPABILITIES
                    CL_EXEC_KERNEL                                              True
                    CL_EXEC_NATIVE_KERNEL                                       False
                CL_DEVICE_EXTENSIONS                                            cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts
                CL_DEVICE_GLOBAL_MEM_CACHE_SIZE                                 131072
                CL_DEVICE_GLOBAL_MEM_CACHE_TYPE                                 RW
                CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE                             128
                CL_DEVICE_GLOBAL_MEM_SIZE                                       2147483648
                CL_DEVICE_HALF_FP_CONFIG                                        Information not available
                CL_DEVICE_HOST_UNIFIED_MEMORY                                   False
                CL_DEVICE_IMAGE_SUPPORT                                         True
                CL_DEVICE_IMAGE2D_MAX_HEIGHT                                    16384
                CL_DEVICE_IMAGE2D_MAX_WIDTH                                     16384
                CL_DEVICE_IMAGE3D_MAX_DEPTH                                     4096
                CL_DEVICE_IMAGE3D_MAX_HEIGHT                                    4096
                CL_DEVICE_IMAGE3D_MAX_WIDTH                                     4096
                CL_DEVICE_LOCAL_MEM_SIZE                                        49152
                CL_DEVICE_LOCAL_MEM_TYPE                                        Local
                CL_DEVICE_MAX_CLOCK_FREQUENCY                                   1137
                CL_DEVICE_MAX_COMPUTE_UNITS                                     8
                CL_DEVICE_MAX_CONSTANT_ARGS                                     9
                CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE                              65536
                CL_DEVICE_MAX_MEM_ALLOC_SIZE                                    536870912
                CL_DEVICE_MAX_PARAMETER_SIZE                                    4352
                CL_DEVICE_MAX_READ_IMAGE_ARGS                                   256
                CL_DEVICE_MAX_SAMPLERS                                          32
                CL_DEVICE_MAX_WORK_GROUP_SIZE                                   1024
                CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS                              3
                CL_DEVICE_MAX_WORK_ITEM_SIZES
                    0                                                           1024
                    1                                                           1024
                    2                                                           64
                CL_DEVICE_MAX_WRITE_IMAGE_ARGS                                  16
                CL_DEVICE_MEM_BASE_ADDR_ALIGN                                   4096
                CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE                              128
                CL_DEVICE_NAME                                                  GeForce GTX 680
                CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR                              1
                CL_DEVICE_NATIVE_VECTOR_WIDTH_SHORT                             1
                CL_DEVICE_NATIVE_VECTOR_WIDTH_INT                               1
                CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG                              1
                CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT                             1
                CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE                            1
                CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF                              0
                CL_DEVICE_OPENCL_C_VERSION                                      OpenCL C 1.2 
                CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR                           1
                CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT                          1
                CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT                            1
                CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG                           1
                CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT                          1
                CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE                         1
                CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF                           0
                CL_DEVICE_PROFILE                                               FULL_PROFILE
                CL_DEVICE_PROFILING_TIMER_RESOLUTION                            1000
                CL_DEVICE_QUEUE_PROPERTIES
                    CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE                      True
                    CL_QUEUE_PROFILING_ENABLE                                   True
                CL_DEVICE_SINGLE_FP_CONFIG
                    CL_FP_DENORM                                                True
                    CL_FP_INF_NAN                                               True
                    CL_FP_DENORM                                                True
                    CL_FP_ROUND_TO_NEAREST                                      True
                    CL_FP_ROUND_TO_ZERO                                         True
                    CL_FP_FMA                                                   True
                CL_DEVICE_TYPE
                    CL_DEVICE_TYPE_DEFAULT                                      False
                    CL_DEVICE_TYPE_CPU                                          False
                    CL_DEVICE_TYPE_GPU                                          True
                    CL_DEVICE_TYPE_ACCELERATOR                                  False
                CL_DEVICE_VENDOR                                                NVIDIA Corporation
                CL_DEVICE_VENDOR_ID                                             4318
                CL_DEVICE_VERSION                                               OpenCL 1.2 CUDA
                CL_DRIVER_VERSION                                               358.50
                Profile                                                         FULL_PROFILE
                Version                                                         OpenCL 1.2 CUDA 7.5.0
                Name                                                            NVIDIA CUDA
                Vendor                                                          NVIDIA Corporation
                Extensions                                                      cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts
    Tools
        Host Version                                                            5.0.0.15294
        Host Application Version                                                14.0

OK, I solved it. Nothing was displayed because I had switched my app to a 64-bit build a while back. NOWHERE in the Nsight requirements is it mentioned that only 32-bit apps are supported. And since 5.0 is not supported on 32-bit operating systems, I automatically presumed that it supports 64-bit apps (maybe even only those). To the Nsight team: if this is true, please update the documentation accordingly.

The other possibility is that this problem occurs only when using SDL. I will try to find out.

So, switching back to 32-bit solved it; everything works OK now.

I did a quick recompile, and a very simple 64-bit WinAPI app does not show the in-window Nsight HUD. A 32-bit WinAPI app works OK. So the problem is not specific to SDL; it appears that 64-bit builds in general are a problem for Nsight.

Hi michalferko,

Starting with Nsight 5.0, 32-bit operating systems are no longer supported, but Nsight does support both 32-bit apps and 64-bit apps running on a 64-bit OS.

I am 100% sure that Nsight supports 64-bit apps. Can you try building this sample: https://github.com/Microsoft/DirectX-Graphics-Samples/tree/master/Samples/D3D12HelloWorld ? It is a Win10 DX12 sample, and it targets x64 by default. It works fine with Nsight on my 64-bit OS.

I think something else might be causing the issue for you. Could you confirm:
- Is your sample a native C++ project, and not a .NET managed project?
- Did you compile all related and external libraries for x64?
- Does your x64 sample run smoothly without Nsight?

Thanks
An
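
For the second question in the checklist above (whether every library was really built for x64), one quick way to check is to read the Machine field from each binary's PE/COFF header. The following is only a minimal Python sketch, not something from the Nsight tooling: the function names `pe_machine` and `pe_machine_of_file` are made up for illustration, and it assumes ordinary Windows .exe/.dll files.

```python
import struct
import sys

# Machine field values defined by the PE/COFF file header format.
MACHINE_NAMES = {
    0x014C: "x86 (32-bit)",
    0x8664: "x64 (64-bit)",
    0xAA64: "ARM64",
}


def pe_machine(data: bytes) -> str:
    """Return the target architecture recorded in a PE image's COFF header."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ header")
    # The 4-byte field at offset 0x3C (e_lfanew) points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < pe_offset + 6 or data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image: missing PE signature")
    # The 2-byte Machine field follows the signature directly.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, f"unknown (0x{machine:04X})")


def pe_machine_of_file(path: str) -> str:
    """Hypothetical convenience wrapper: read a file and classify it."""
    with open(path, "rb") as f:
        return pe_machine(f.read())


if __name__ == "__main__":
    # Usage: python check_arch.py MyApp.exe SDL2.dll glew32.dll
    for path in sys.argv[1:]:
        print(f"{path}: {pe_machine_of_file(path)}")
```

Running it over the application executable and every DLL it loads (the file names in the usage comment are only examples) should report the same architecture for all of them; a 32-bit DLL next to a 64-bit executable would point to a mixed build rather than to Nsight.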