SOLUTION!!!
OK, I solved it. Nothing was displayed because I had switched my app to a 64-bit build a while back. NOWHERE in the Nsight requirements is it mentioned that only 32-bit apps are supported. And since Nsight 5.0 is not supported on 32-bit operating systems, I automatically presumed that it supports 64-bit apps (maybe even exclusively). To the Nsight team: if this limitation is real, please update the requirements accordingly.
I did a quick recompile: a very simple 64-bit WinAPI app does not show the in-window Nsight HUD, while the same app built as 32-bit works OK. So the problem is not specific to SDL; it appears that 64-bit apps in general are a problem for Nsight.
So switching back to 32-bit solved it for WinAPI as well as SDL, and everything works OK now.
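Since the whole problem came down to which architecture the app was built for, here is a minimal sketch of how a build can report its own bitness at startup. The helper names `pointer_bits` and `build_architecture` are made up for illustration and are not part of Nsight, SDL, or WinAPI; the sketch only assumes that pointer width reflects the target architecture, which holds for x86 and x64 Windows builds.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Width of a pointer, in bits, for the build that contains this code:
// 32 for an x86 (Win32) build, 64 for an x64 build.
constexpr std::size_t pointer_bits() {
    return sizeof(void*) * CHAR_BIT;
}

// Human-readable label, e.g. to log once at startup or append to the
// window title. (Hypothetical helper, for illustration only.)
inline std::string build_architecture() {
    return pointer_bits() == 64 ? "64-bit" : "32-bit";
}
```

Logging `build_architecture()` once when the app starts makes it obvious which configuration is actually being launched under the debugger, which is easy to lose track of after switching the Visual Studio platform setting.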
Original question:
Hi, I am pretty desperate: I cannot get the graphics debugger to start at all. I’ve tried everything apart from downgrading to Windows 8.
I have triple-checked that I meet all the requirements for Nsight. Here’s my configuration:
Intel Core i7-3770K, 16 GB RAM, ASUS P8Z77 Pro
Gigabyte GeForce GTX 680, 2GB RAM, driver version 358.50 (newest)
Win 10 Pro, fully updated, Visual Studio 2015 Community, Nsight 5.0.0.15294 (newest)
The requirements page is confusing: the table says that the Graphics Shader Debugger is available on a single-GPU system, but the bullet list below it suggests that on a single GPU only CUDA debugging is supported. Please explain which is actually correct. I am referring to this link: Nsight Visual Studio Edition Requirements | NVIDIA Developer
Should I connect to the machine remotely, since it only has one GPU? Or should I enable the processor's built-in Intel GPU (and connect my monitor to it) and use the NVIDIA GPU as a compute-only GPU? Does that even work with graphics debugging, where the app has to output to a window?
On the same machine, with an older Nsight (I do not remember the version) and an older system (Win 8), I was debugging graphics just fine (that was over a year ago, maybe more). It was after seeing this video:
So I guess Visual Studio 2012 + Nsight 3.2 + Win 8 worked flawlessly. The HUD showed up after app start, and the Frame Debugger and Shader Debugger worked.
I am using SDL + OpenGL + GLSL + GLEW. There are a few GLSL compute shaders in my project, could these be causing the problem? I remember way back that Nsight showed a warning when I was using a feature that was not supported. Nothing like this happens now. I was using WinAPI before for window creation instead of SDL. Could this be the problem?
Starting Graphics Debugging shows no error; it only displays the usual warning about the TDR delay being only 2 seconds. It does not complain about being unable to connect, so the Nsight Monitor is definitely getting the data.
Starting an OpenGL app that just clears the framebuffer does not work either.
Please help me. Did I miss some configuration? I left all settings on default. My work goes much slower when I cannot do shader debugging, and getting this working would save me countless hours. Thank you.
Here’s the output of System info generated by Nsight from Visual Studio:
Report Information
UnixTime Generated 1446559935
OS Information
Computer Name Corsair
NetBIOS Name CORSAIR
OS Name Windows 10 Pro
GetVersionEx
dwMajorVersion 10
dwMinorVersion 0
dwBuildNumber 10240
dwPlatformId 2
wServicePackMajor 0
wServicePackMinor 0
wSuiteMask 256
wProductType Workstation
GetProductInfo 48
GetNativeSystemInfo
wProcessorArchitecture x64
dwPageSize 4096
lpMinimumApplicationAddress 65536
lpMaximumApplicationAddress 140737488289791
dwActiveProcessorMask 255
dwNumberOfProcessors 8
dwAllocationGranularity 65536
wProcessorLevel 6
wProcessorRevision 14857
EnumDisplayDevices
Display Device
DeviceName \\.\DISPLAY1
DeviceString NVIDIA GeForce GTX 680
StateFlags 524293
DeviceID PCI\VEN_10DE&DEV_1180&SUBSYS_353C1458&REV_A1
DeviceKey \Registry\Machine\System\CurrentControlSet\Control\Video\{B2084E8B-2720-449A-AED9-E38B350A4ECC}\0000
Monitor
DeviceName \\.\DISPLAY1\Monitor0
DeviceString SyncMaster 2243NW/2243NWX
StateFlags 3
DeviceID MONITOR\SAM03C2\{4d36e96e-e325-11ce-bfc1-08002be10318}\0002
DeviceKey \Registry\Machine\System\CurrentControlSet\Control\Class\{4d36e96e-e325-11ce-bfc1-08002be10318}\0002
Display Device
DeviceName \\.\DISPLAY2
DeviceString NVIDIA GeForce GTX 680
StateFlags 524288
DeviceID PCI\VEN_10DE&DEV_1180&SUBSYS_353C1458&REV_A1
DeviceKey \Registry\Machine\System\CurrentControlSet\Control\Video\{B2084E8B-2720-449A-AED9-E38B350A4ECC}\0001
Display Device
DeviceName \\.\DISPLAY3
DeviceString NVIDIA GeForce GTX 680
StateFlags 524288
DeviceID PCI\VEN_10DE&DEV_1180&SUBSYS_353C1458&REV_A1
DeviceKey \Registry\Machine\System\CurrentControlSet\Control\Video\{B2084E8B-2720-449A-AED9-E38B350A4ECC}\0002
Display Device
DeviceName \\.\DISPLAY4
DeviceString NVIDIA GeForce GTX 680
StateFlags 524288
DeviceID PCI\VEN_10DE&DEV_1180&SUBSYS_353C1458&REV_A1
DeviceKey \Registry\Machine\System\CurrentControlSet\Control\Video\{B2084E8B-2720-449A-AED9-E38B350A4ECC}\0003
GlobalMemoryStatusEx
dwMemoryLoad 20
ullTotalPhys 17120661504
ullAvailPhys 13528412160
ullTotalPageFile 20266389504
ullAvailPageFile 16434880512
ullTotalVirtual 140737488224256
ullAvailVirtual 140737435721728
Processor Information
0
Name Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
Clock speed (MHz) 3510
1
Name Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
Clock speed (MHz) 3510
2
Name Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
Clock speed (MHz) 3510
3
Name Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
Clock speed (MHz) 3510
4
Name Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
Clock speed (MHz) 3510
5
Name Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
Clock speed (MHz) 3510
6
Name Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
Clock speed (MHz) 3510
7
Name Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
Clock speed (MHz) 3510
NvAPI
DisplayDriverVersion
Driver Version 35850
Changelist 0
BuildBranchString r358_00
Default AdapterString GeForce GTX 680
DisplayDriverCompileType Release
NvDebugApi
WDDM Devices
GPU
Name GeForce GTX 680
Architecture Kepler
Architecture Number 224
Architecture Implementation 4
Architecture Revision 162
Number of GPCs 4
Number of TPCs 8
Number of SMs 8
Warps per SM 64
Lanes per warp 32
Register file size 65536
Max CTAs per SM 16
Max size of shared memory per CTA (bytes) 49152
SM Revision 196608
Number of FB PAs 6
Number of LTs per LTC 4
RmGpuId 256
Monitor
Windows Name SyncMaster 2243NW/2243NWX
EDID Manufacturer Samsung
EDID Model SyncMaster
RM Devices
CUDA
CUDA Device
Name GeForce GTX 680
Driver WDDM
DeviceIndex 0
GPU Family GK104
RmGpuId 256
Compute Major 3
Compute Minor 0
MAX_THREADS_PER_BLOCK 1024
MAX_BLOCK_DIM_X 1024
MAX_BLOCK_DIM_Y 1024
MAX_BLOCK_DIM_Z 64
MAX_GRID_DIM_X 2147483647
MAX_GRID_DIM_Y 65535
MAX_GRID_DIM_Z 65535
MAX_SHARED_MEMORY_PER_BLOCK 49152
TOTAL_CONSTANT_MEMORY 65536
WARP_SIZE 32
MAX_PITCH 2147483647
MAX_REGISTERS_PER_BLOCK 65536
CLOCK_RATE 1137000
TEXTURE_ALIGNMENT 512
GPU_OVERLAP 1
MULTIPROCESSOR_COUNT 8
KERNEL_EXEC_TIMEOUT 1
INTEGRATED 0
CAN_MAP_HOST_MEMORY 1
COMPUTE_MODE 0
MAXIMUM_TEXTURE1D_WIDTH 65536
MAXIMUM_TEXTURE2D_WIDTH 65536
MAXIMUM_TEXTURE2D_HEIGHT 65536
MAXIMUM_TEXTURE3D_WIDTH 4096
MAXIMUM_TEXTURE3D_HEIGHT 4096
MAXIMUM_TEXTURE3D_DEPTH 4096
MAXIMUM_TEXTURE2D_LAYERED_WIDTH 16384
MAXIMUM_TEXTURE2D_LAYERED_HEIGHT 16384
MAXIMUM_TEXTURE2D_LAYERED_LAYERS 2048
SURFACE_ALIGNMENT 512
CONCURRENT_KERNELS 1
ECC_ENABLED 0
PCI_BUS_ID 1
PCI_DEVICE_ID 0
TCC_DRIVER 0
MEMORY_CLOCK_RATE 3004000
GLOBAL_MEMORY_BUS_WIDTH 256
L2_CACHE_SIZE 524288
MAX_THREADS_PER_MULTIPROCESSOR 2048
ASYNC_ENGINE_COUNT 1
UNIFIED_ADDRESSING 1
MAXIMUM_TEXTURE1D_LAYERED_WIDTH 16384
MAXIMUM_TEXTURE1D_LAYERED_LAYERS 2048
CAN_TEX2D_GATHER 1
MAXIMUM_TEXTURE2D_GATHER_WIDTH 16384
MAXIMUM_TEXTURE2D_GATHER_HEIGHT 16384
MAXIMUM_TEXTURE3D_WIDTH_ALTERNATE 2048
MAXIMUM_TEXTURE3D_HEIGHT_ALTERNATE 2048
MAXIMUM_TEXTURE3D_DEPTH_ALTERNATE 16384
PCI_DOMAIN_ID 0
TEXTURE_PITCH_ALIGNMENT 32
MAXIMUM_TEXTURECUBEMAP_WIDTH 16384
MAXIMUM_TEXTURECUBEMAP_LAYERED_WIDTH 16384
MAXIMUM_TEXTURECUBEMAP_LAYERED_LAYERS 2046
MAXIMUM_SURFACE1D_WIDTH 65536
MAXIMUM_SURFACE2D_WIDTH 65536
MAXIMUM_SURFACE2D_HEIGHT 32768
MAXIMUM_SURFACE3D_WIDTH 65536
MAXIMUM_SURFACE3D_HEIGHT 32768
MAXIMUM_SURFACE3D_DEPTH 2048
MAXIMUM_SURFACE1D_LAYERED_WIDTH 65536
MAXIMUM_SURFACE1D_LAYERED_LAYERS 2048
MAXIMUM_SURFACE2D_LAYERED_WIDTH 65536
MAXIMUM_SURFACE2D_LAYERED_HEIGHT 32768
MAXIMUM_SURFACE2D_LAYERED_LAYERS 2048
MAXIMUM_SURFACECUBEMAP_WIDTH 32768
MAXIMUM_SURFACECUBEMAP_LAYERED_WIDTH 32768
MAXIMUM_SURFACECUBEMAP_LAYERED_LAYERS 2046
MAXIMUM_TEXTURE1D_LINEAR_WIDTH 134217728
MAXIMUM_TEXTURE2D_LINEAR_WIDTH 65000
MAXIMUM_TEXTURE2D_LINEAR_HEIGHT 65000
MAXIMUM_TEXTURE2D_LINEAR_PITCH 1048544
MAXIMUM_TEXTURE2D_MIPMAPPED_WIDTH 16384
MAXIMUM_TEXTURE2D_MIPMAPPED_HEIGHT 16384
MAXIMUM_TEXTURE1D_MIPMAPPED_WIDTH 16384
STREAM_PRIORITIES_SUPPORTED 0
GLOBAL_L1_CACHE_SUPPORTED 0
LOCAL_L1_CACHE_SUPPORTED 1
MAX_SHARED_MEMORY_PER_MULTIPROCESSOR 49152
MAX_REGISTERS_PER_MULTIPROCESSOR 65536
MANAGED_MEMORY 1
MULTI_GPU_BOARD 0
MULTI_GPU_BOARD_GROUP_ID 0
DISPLAY_NAME GeForce GTX 680
COMPUTE_CAPABILITY_MAJOR 3
COMPUTE_CAPABILITY_MINOR 0
TOTAL_MEMORY 2147483648
RAM_TYPE 8
RAM_LOCATION 1
GPU_PCI_DEVICE_ID 293605598
GPU_PCI_SUB_SYSTEM_ID 893129816
GPU_PCI_REVISION_ID 161
GPU_PCI_EXT_DEVICE_ID 4480
GPU_PCI_EXT_GEN 2
GPU_PCI_EXT_GPU_GEN 2
GPU_PCI_EXT_GPU_LINK_RATE 8000
GPU_PCI_EXT_GPU_LINK_WIDTH 16
GPU_PCI_EXT_DOWNSTREAM_LINK_RATE 8000
GPU_PCI_EXT_DOWNSTREAM_LINK_WIDTH 16
OpenCL
Platform
Profile FULL_PROFILE
Version OpenCL 1.2 CUDA 7.5.0
Name NVIDIA CUDA
Vendor NVIDIA Corporation
Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts
Device
CL_DEVICE_ADDRESS_BITS 64
CL_DEVICE_AVAILABLE True
CL_DEVICE_COMPILER_AVAILABLE True
CL_DEVICE_DOUBLE_FP_CONFIG
CL_FP_DENORM True
CL_FP_INF_NAN True
CL_FP_DENORM True
CL_FP_ROUND_TO_NEAREST True
CL_FP_ROUND_TO_ZERO True
CL_FP_FMA True
CL_DEVICE_ENDIAN_LITTLE True
CL_DEVICE_ERROR_CORRECTION_SUPPORT False
CL_DEVICE_EXECUTION_CAPABILITIES
CL_EXEC_KERNEL True
CL_EXEC_NATIVE_KERNEL False
CL_DEVICE_EXTENSIONS cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts
CL_DEVICE_GLOBAL_MEM_CACHE_SIZE 131072
CL_DEVICE_GLOBAL_MEM_CACHE_TYPE RW
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE 128
CL_DEVICE_GLOBAL_MEM_SIZE 2147483648
CL_DEVICE_HALF_FP_CONFIG Information not available
CL_DEVICE_HOST_UNIFIED_MEMORY False
CL_DEVICE_IMAGE_SUPPORT True
CL_DEVICE_IMAGE2D_MAX_HEIGHT 16384
CL_DEVICE_IMAGE2D_MAX_WIDTH 16384
CL_DEVICE_IMAGE3D_MAX_DEPTH 4096
CL_DEVICE_IMAGE3D_MAX_HEIGHT 4096
CL_DEVICE_IMAGE3D_MAX_WIDTH 4096
CL_DEVICE_LOCAL_MEM_SIZE 49152
CL_DEVICE_LOCAL_MEM_TYPE Local
CL_DEVICE_MAX_CLOCK_FREQUENCY 1137
CL_DEVICE_MAX_COMPUTE_UNITS 8
CL_DEVICE_MAX_CONSTANT_ARGS 9
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE 65536
CL_DEVICE_MAX_MEM_ALLOC_SIZE 536870912
CL_DEVICE_MAX_PARAMETER_SIZE 4352
CL_DEVICE_MAX_READ_IMAGE_ARGS 256
CL_DEVICE_MAX_SAMPLERS 32
CL_DEVICE_MAX_WORK_GROUP_SIZE 1024
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS 3
CL_DEVICE_MAX_WORK_ITEM_SIZES
0 1024
1 1024
2 64
CL_DEVICE_MAX_WRITE_IMAGE_ARGS 16
CL_DEVICE_MEM_BASE_ADDR_ALIGN 4096
CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE 128
CL_DEVICE_NAME GeForce GTX 680
CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR 1
CL_DEVICE_NATIVE_VECTOR_WIDTH_SHORT 1
CL_DEVICE_NATIVE_VECTOR_WIDTH_INT 1
CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG 1
CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT 1
CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE 1
CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF 0
CL_DEVICE_OPENCL_C_VERSION OpenCL C 1.2
CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR 1
CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT 1
CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT 1
CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG 1
CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT 1
CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE 1
CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF 0
CL_DEVICE_PROFILE FULL_PROFILE
CL_DEVICE_PROFILING_TIMER_RESOLUTION 1000
CL_DEVICE_QUEUE_PROPERTIES
CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE True
CL_QUEUE_PROFILING_ENABLE True
CL_DEVICE_SINGLE_FP_CONFIG
CL_FP_DENORM True
CL_FP_INF_NAN True
CL_FP_DENORM True
CL_FP_ROUND_TO_NEAREST True
CL_FP_ROUND_TO_ZERO True
CL_FP_FMA True
CL_DEVICE_TYPE
CL_DEVICE_TYPE_DEFAULT False
CL_DEVICE_TYPE_CPU False
CL_DEVICE_TYPE_GPU True
CL_DEVICE_TYPE_ACCELERATOR False
CL_DEVICE_VENDOR NVIDIA Corporation
CL_DEVICE_VENDOR_ID 4318
CL_DEVICE_VERSION OpenCL 1.2 CUDA
CL_DRIVER_VERSION 358.50
Profile FULL_PROFILE
Version OpenCL 1.2 CUDA 7.5.0
Name NVIDIA CUDA
Vendor NVIDIA Corporation
Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts
Tools
Host Version 5.0.0.15294
Host Application Version 14.0
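For anyone reading the report above, several of the raw integers are packed values. The sketch below decodes them; it is only an illustration, and the layout it uses (vendor in the low 16 bits, device in the high 16 bits, and the driver version as the number times 100) is an assumption inferred from this dump, where 293605598 lines up with VEN_10DE&DEV_1180 and 35850 lines up with driver 358.50. The helper names are hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Two 16-bit halves of a packed 32-bit PCI identifier.
struct PciId {
    std::uint16_t low;   // vendor ID (or subsystem vendor ID)
    std::uint16_t high;  // device ID (or subsystem ID)
};

// Assumed layout, inferred from the report: (high << 16) | low.
inline PciId split_pci_id(std::uint32_t packed) {
    return PciId{static_cast<std::uint16_t>(packed & 0xFFFFu),
                 static_cast<std::uint16_t>(packed >> 16)};
}

// Format a 16-bit ID as four upper-case hex digits, as in "VEN_10DE".
inline std::string to_hex4(std::uint16_t value) {
    char buf[5];
    std::snprintf(buf, sizeof buf, "%04X", static_cast<unsigned>(value));
    return buf;
}

// Turn a packed driver number such as 35850 into "358.50".
inline std::string format_driver_version(unsigned packed) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "%u.%02u", packed / 100, packed % 100);
    return buf;
}
```

With the values from this report, `split_pci_id(293605598u)` gives vendor 10DE and device 1180, `split_pci_id(893129816u)` gives 1458 and 353C (matching SUBSYS_353C1458), and `format_driver_version(35850)` gives 358.50.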