[RESOLVED] Profiling error 4168:999

I was able to work around the problem by running nvprof as root via sudo. nvprof may require more permissions than membership in the nvidia-persistenced group provides. I’m using Ubuntu 16.04.3 and CUDA 9.1 on a GeForce GTX Titan (Kepler).

Dear NVIDIA community, this is my first post here, so let me say hello to all of you!

I have the same issue: Internal profiling error 4168:999
Windows 10, driver 388.19, CUDA Toolkit 9.1.85.

However, profiling of my app was running fine (same system, driver and toolkit) until I started using cudaMemcpy2DToArrayAsync (the sync or async version doesn’t matter). Just commenting out this one function brings profiling back to normal. Adding or removing it doesn’t affect the correctness of the app when it runs standalone. Any suggestions from your side?

Hi, stanislaw.warych

Please run cuda-memcheck XXX.exe to check whether your app has a memory issue.

Thanks!

If possible, can you share a minimal program that reproduces the problem?

Hi,

Thanks for your answer! I ran cuda-memcheck; it found nothing except a cudaFree / cudaFreeHost mismatch at application exit.

A truly minimal program is hard to produce, because the problem involves CUDA <-> D3D interoperability and I only see it with that functionality. However, I wrote sample code that creates a D3D texture, maps it to a CUDA array, and copies into it the contents of some GPU memory filled with memset. Memcheck is clean, but the profiler returns: “Internal profiling error 4047:999”. The code is different, but again, if I comment out cudaMemcpy2DToArray it runs well. A quick Google search didn’t tell me anything about this error code. Shall I post the sample here (almost 150 lines)?
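For readers without access to the D3D sample, the copy pattern described above can be sketched roughly as follows. This is not the poster’s code: it assumes a plain cudaMallocArray allocation standing in for the mapped D3D texture, illustrative dimensions and fill value, and a hypothetical CHECK macro, so it may well not trigger the profiler error on its own.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the thread): print the error and
// return from main on any CUDA runtime failure.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            return 1;                                                 \
        }                                                             \
    } while (0)

int main() {
    // Illustrative size; the original sample's dimensions are unknown.
    const size_t width = 256, height = 256;
    const size_t rowBytes = width * sizeof(uchar4);

    // Source: pitched device memory filled with memset.
    void* src = nullptr;
    size_t pitch = 0;
    CHECK(cudaMallocPitch(&src, &pitch, rowBytes, height));
    CHECK(cudaMemset2D(src, pitch, 0x7f, rowBytes, height));

    // Destination: a plain CUDA array. ASSUMPTION: this stands in for
    // the array obtained by mapping a D3D texture in the real sample.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<uchar4>();
    cudaArray_t dst = nullptr;
    CHECK(cudaMallocArray(&dst, &desc, width, height));

    // The call the thread associates with the profiler error.
    // The width argument is given in bytes, not elements.
    CHECK(cudaMemcpy2DToArray(dst, 0, 0, src, pitch, rowBytes, height,
                              cudaMemcpyDeviceToDevice));
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaFreeArray(dst));
    CHECK(cudaFree(src));
    std::puts("copy completed");
    return 0;
}
```

Building this with nvcc and running it once standalone and once under nvprof would show whether the copy alone is enough to provoke the error, or whether the interop mapping is required.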

Hi,

That would be great if you can provide the sample code.

I will send you a private message about how to upload it.

Thanks!

Hi, stanislaw.warych

I can reproduce your issue and will submit an internal bug to the dev team.

Thanks for your help.

I will get back to you as soon as I have any response.

Hi,

Is there anything new on this issue (internal profiling error 4168:999)?

I still run into this problem.

bernhard

Hi, bernhardh

Are you also hitting this error because cudaMemcpy2DToArray is used in your code?

We have reproduced this issue, and the dev team will try to fix it in a later release.

Hi,

Yes, I am using cudaMemcpy2DToArray in my code.

Is there a workaround besides dropping back to an older version?

It seems profiling still works with 8.0.

I cannot drop cudaMemcpy2DToArray, as I need interop between CUDA and OpenGL.

Hi,

We have verified that this issue is already fixed in the latest internal version.
But I’m afraid you will still need to wait some time before it is released.

Has this been fixed in the 9.2 release? If not, is there any ETA on the fix?

thanks

Hi, roastam

I’m afraid the issue still reproduces in 9.2.

It does not reproduce in the next release, but I am not sure of the release date.

I am still seeing this.

02:00.0 VGA compatible controller: NVIDIA Corporation GM200 [GeForce GTX 980 Ti] (rev a1)

Linux 3.10.0-862.2.3.el7.x86_64

nvprof: NVIDIA (R) Cuda command line profiler
Copyright (c) 2012 - 2018 NVIDIA Corporation
Release version 10.0.130 (21)

==12451== Some kernel(s) will be replayed on device 0 in order to collect all events/metrics.
Replaying kernel “xxxxxxx(unsigned int*, unsigned int*, unsigned int*, unsigned int, unsigned int*, unsigned int*, int, unsigned int*, unsigned int*)” (19 of 52)…
2 internal events
==12451== Error: Internal profiling error 4183:999.
Kernel launch failed: unknown error
Total runtime seconds : 17.698425


George

This helps. Thanks. I’m running Windows 10, driver 425.25, CUDA 10.1. I have to run as administrator to get a timeline, even for a CUDA sample.

This solution works for my setup too.
Version: 11.0 - Quadro M1200 - Windows 10