cudaMallocHost crash since update from CUDA 7.0.28 to 8.0.44

Hi

I have a fairly modern workstation with 128 GB RAM (8 x 16 GB DDR4), two Intel Xeon E5-2620 v3 CPUs and 8 Titan X GPUs with 12 GB DDR5 each, running Windows 10 x64. I recently updated my software from CUDA 7.0.28 to 8.0.44. All my GPUs are in TCC mode.

Additional info about my software: my code needs a lot of memory (about 30 GB of the 128 GB RAM). It iteratively calls cudaHostAlloc with 256 * 256 * 256 * 8 bytes (128 MiB) per call, usually with the cudaHostAllocPortable flag (but it also crashes without that flag when using cudaMallocHost; I tested it).

Before the update my code worked without problems up to those 30 GB on that machine. Since I updated to 8.0.44 (including the driver from the installer) I get an “Out of Memory” error returned from that allocation function after about 50-60 calls, i.e. at less than 10 GB in total and far below 128 GB…

After updating my driver I also tested my old build, which links against CUDA 7, and there I now get the same out-of-memory crash. So it seems the cause is in the driver and not in my binaries, since my old code worked with the old driver…

Any suggestions?

Hmm, this seems to be the same problem as here:
https://devtalk.nvidia.com/default/topic/970188/cuda-programming-and-performance/large-memory-allocation-with-cudahostalloc-fails-with-cuda-8-0-release-build/

This seems reproducible by a lot of people (including me), so it would really be good if someone from NVIDIA could comment on this. Thanks!

Latest driver 376.84 still does not fix it. It is now 4 months since the issue was reported (see link posted by GeraldLodron).

Latest driver 377.35 still does not fix it.

Latest driver 385.08 seems to fix it.