Some kernels do not seem to be profiled properly when profiling with Nsight. They do not show up in the kernel launches, source view, etc. when running "Profile CUDA Application", even though the output console does mention that it is profiling these kernels. These kernels do show up in the kernel launches of "Trace Application". Using a filter (Kernels to Profile) does not make a difference. To me this seems like a bug or a CUDA compatibility issue. Has anyone else experienced this?
Environment:
Windows 7 Enterprise, 64-bit
Visual Studio 2013 (64-bit C++ project)
CUDA 7.5
Tesla K40