Hello,
I have had the same problem since November of last year. I reported it to NVIDIA as a bug on January 25 and have added several updates since, but NVIDIA has not reacted. I also tried to post it here twice. Possibly nobody could help me because my posts were confusing, since I am not a native speaker.
In my case, my software (three programs) became much slower after switching from Maxwell to Pascal GPUs. I have a complex memory structure (accesses to one array give me the addresses of other arrays), so the behavior described in this forum thread hits me very hard. In Nsight, a performance analysis on Pascal cards shows the same behavior as zero-copy memory; on Maxwell cards everything works well. But I really need the new memory paging feature of Pascal GPUs.
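A structure of this kind can be sketched as follows. This is only an illustration under assumptions, not the original programs: the names sumRows, table, lengths and runDemo are hypothetical, and the sizes are tiny. It shows a managed array whose entries are pointers to other managed arrays, so every kernel access needs two dependent loads from unified memory.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread follows table[r] to a second managed array and sums its entries.
__global__ void sumRows(float** table, const int* lengths, float* out, int rows) {
    int r = blockIdx.x * blockDim.x + threadIdx.x;
    if (r >= rows) return;
    float s = 0.0f;
    for (int i = 0; i < lengths[r]; ++i) s += table[r][i];
    out[r] = s;
}

// Builds the nested structure, runs the kernel, and checks the sums.
// Early returns skip cleanup to keep the sketch short.
bool runDemo() {
    const int rows = 4, len = 8;  // illustrative sizes only
    float** table = nullptr;
    int* lengths = nullptr;
    float* out = nullptr;

    if (cudaMallocManaged(&table, rows * sizeof(float*)) != cudaSuccess) return false;
    if (cudaMallocManaged(&lengths, rows * sizeof(int)) != cudaSuccess) return false;
    if (cudaMallocManaged(&out, rows * sizeof(float)) != cudaSuccess) return false;

    for (int r = 0; r < rows; ++r) {
        // Each row is its own managed allocation; table[r] stores its address.
        if (cudaMallocManaged(&table[r], len * sizeof(float)) != cudaSuccess) return false;
        lengths[r] = len;
        for (int i = 0; i < len; ++i) table[r][i] = float(r + 1);
    }

    sumRows<<<1, 32>>>(table, lengths, out, rows);
    // The host must not touch managed memory before the kernel has finished
    // on devices that report no concurrent managed access.
    if (cudaDeviceSynchronize() != cudaSuccess) return false;

    bool ok = true;
    for (int r = 0; r < rows; ++r) {
        printf("row %d sum = %.1f\n", r, out[r]);
        if (out[r] != float(len * (r + 1))) ok = false;  // len copies of r + 1
    }

    for (int r = 0; r < rows; ++r) cudaFree(table[r]);
    cudaFree(table);
    cudaFree(lengths);
    cudaFree(out);
    return ok;
}

int main() {
    return runDemo() ? 0 : 1;
}
```

With many small managed allocations reached through pointers like this, the cost of each access depends heavily on whether the driver migrates pages to the GPU or serves them over the bus, which would be consistent with the slowdown described above.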
I can confirm this behavior in different combinations: on two systems with Win 7, VS 2013 or VS 2015, CUDA 8 and a GTX 1080 (previously a GTX 980 Ti), and on one system with Win 10, VS 2013 or VS 2015, CUDA 8 and a GTX 1060 (previously a GTX 960). I tested all driver and CUDA versions released since November 2016. All the switches between Maxwell and Pascal were done by changing only the graphics card and nothing else.
Nsight System Info also reports:
Concurrent Managed Access FALSE
Managed Memory TRUE
Pageable Memory Access FALSE
Also, no memory paging is reported during an Nsight performance analysis.
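The same three flags can also be read directly from the CUDA runtime, independent of Nsight. The following is a minimal sketch, assuming CUDA 8 or later; the struct UmCaps and the function queryUmCaps are hypothetical helper names, while the attribute enums are the ones from the runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Holds the three unified-memory capability flags for one device.
struct UmCaps {
    int managedMemory;
    int concurrentManagedAccess;
    int pageableMemoryAccess;
};

// Reads the flags through cudaDeviceGetAttribute; returns the first error hit.
cudaError_t queryUmCaps(int device, UmCaps* caps) {
    cudaError_t err = cudaDeviceGetAttribute(
        &caps->managedMemory, cudaDevAttrManagedMemory, device);
    if (err != cudaSuccess) return err;
    err = cudaDeviceGetAttribute(
        &caps->concurrentManagedAccess, cudaDevAttrConcurrentManagedAccess, device);
    if (err != cudaSuccess) return err;
    return cudaDeviceGetAttribute(
        &caps->pageableMemoryAccess, cudaDevAttrPageableMemoryAccess, device);
}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int d = 0; d < count; ++d) {
        UmCaps caps = {0, 0, 0};
        err = queryUmCaps(d, &caps);
        if (err != cudaSuccess) {
            fprintf(stderr, "device %d: %s\n", d, cudaGetErrorString(err));
            return 1;
        }
        printf("device %d\n", d);
        printf("  Concurrent Managed Access %s\n", caps.concurrentManagedAccess ? "TRUE" : "FALSE");
        printf("  Managed Memory            %s\n", caps.managedMemory ? "TRUE" : "FALSE");
        printf("  Pageable Memory Access    %s\n", caps.pageableMemoryAccess ? "TRUE" : "FALSE");
    }
    return 0;
}
```

Running the same binary on Windows and on Linux with the same card would show whether the difference in Concurrent Managed Access comes from the platform rather than from the hardware.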
On all systems, the programs with the problem run perfectly under Linux (Ubuntu), and there Nsight correctly reports the use of memory paging for CPU and GPU.
In my case the NVIDIA Visual Profiler also reports: "==290352== Warning: Unified Memory Profiling is not supported on the current configuration because a pair of devices without peer-to-peer support is detected on this multi-GPU setup." But I only have single-GPU systems.
Performance examples:
Noise reduction with a nonlinear mean filter over thousands of spectra: GTX 980 Ti on Windows 15 minutes, GTX 1080 on Windows 90 minutes, GTX 1080 on Linux nearly 8 minutes.
Interactive multispectral visualization of confocal Raman microscopy: GTX 1080 on Windows 0.25 frames per second, GTX 1080 on Linux above 100 frames per second (due to the complex data structure).
Interestingly, my cards are also all from Gigabyte: a GTX 1080 Xtreme Gaming and a GTX 1060 Xtreme Gaming. So it really could be a Gigabyte problem.
I hope my description helps a bit. Please let me know when you find something out.
I am now doing my work on Linux.