Visual Profiler CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT Cupti error when running kernel memory ana

My current system is a GTX480 installed as a secondary, non-graphical card with CUDA 4.1 on Linux. Several of my applications have worked with the Visual Profiler, but for one particular application I want to profile, I cannot get the Kernel Memory Analysis to run to completion.

After 11 runs of my program, I get the following error:
“Unable to collect metric and event values.
CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT”

I’d like to get the dram_reads and dram_writes information so I can figure out where the bottlenecks in my code lie. Is there a way to get around this error, which shows up in the Analysis view as “Insufficient Global Memory Load Data”? Also, is the command-line profiler still available in CUDA 4.1, and would it help?

Visual Profiler runs the application multiple times to collect all the counters required for the analysis. This error typically means that the same data was not collected on every run of the application.

Profiler data gets flushed to a file only at synchronization calls like cudaDeviceSynchronize() and cudaStreamSynchronize() or when the profiler buffer gets full. If an application terminates without these sync calls then profiler data may be lost and this can result in the error you are getting.
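As a rough illustration of the point above, here is a minimal sketch of a program that synchronizes before it exits. The kernel `scale` and the buffer size are hypothetical stand-ins for your application's real work; the only part that matters is the cudaDeviceSynchronize() call placed after the last kernel launch and before the program terminates.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the application's real work.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main()
{
    const int n = 1 << 20;  // illustrative size, not taken from the original post
    float *d_data = NULL;

    cudaError_t err = cudaMalloc((void **)&d_data, n * sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    // Kernel launches are asynchronous: control returns to the host immediately.
    scale<<<(n + 255) / 256, 256>>>(d_data, n, 2.0f);

    // Wait for all queued GPU work to finish before exiting. This is one of
    // the synchronization points at which profiler data is written out.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_data);
        return 1;
    }

    cudaFree(d_data);
    return 0;
}
```

If your application has several exit paths (error handlers, early returns), each one that follows a kernel launch would need the same synchronization for the profiler to see a complete record on every run.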

The command-line profiler is still available in CUDA 4.1. Refer to the command-line profiler user guide for details.