Description:
When the `elapsed_cycles_sm` event is sampled in continuous sampling mode through the CUPTI library, the `nvidia-smi` command-line tool shows 0% GPU-Util. The problem does not occur when sampling the `inst_executed` event.
How to reproduce:
Modify the official example file `event_sampling/event_sampling.cu` by changing the definition of the `EVENT_NAME` macro from `inst_executed` to `elapsed_cycles_sm`. Build the example and run it concurrently with `nvidia-smi`. The GPU-Util value shown by `nvidia-smi` doesn't change at all.
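For reference, the edit amounts to a one-line change. This is a minimal sketch that assumes the example defines `EVENT_NAME` as a string literal macro; the surrounding code in `event_sampling.cu` is not shown and may differ between CUDA toolkit versions.

```c
/* Original definition in the example (assumed form): */
/* #define EVENT_NAME "inst_executed" */

/* Changed definition used to reproduce the issue: */
#define EVENT_NAME "elapsed_cycles_sm"
```

With this change the example samples `elapsed_cycles_sm` instead of `inst_executed`; nothing else in the file is modified.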
Tested platform:
GPU: K40m, P100
CUDA driver:
NVRM version: NVIDIA UNIX x86_64 Kernel Module 396.26 Mon Apr 30 18:01:39 PDT 2018
GCC version: gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC)
nvidia-smi version:
NVIDIA-SMI 396.26 Driver Version: 396.26