I used to run “Quantify” on non-CUDA C++ code. It reports detailed line-by-line run times, which is quite helpful.
Has anyone tried it on code that uses CUDA? I tried, but it does not run properly.