Slower kernel launch from MATLAB

Hi everyone!

I am currently developing for CUDA; my device is a GeForce GTX 480 and I am working
in a Linux environment.

The function I wrote is launched repeatedly (in an infinite while loop).
Since I need a graphical interface, I thought of using MATLAB to launch the kernel.

The line I use to launch the kernel is this:
[gpu_dvec_ping,gpu_lr_a,gpu_lr_b] = feval(control_handle,gpu_dvec_ping,gpu_lr_a,gpu_lr_b,gpu_dvec_pong,gpu_q,Ns);
All the variables are of type gpuArray (meaning they are transferred to device memory before launching).

When I measured the time, I noticed a huge difference in the running time of the kernel.
When I launch it from MATLAB, it takes about 10 times longer than when I launch it from a regular C program.

This makes no sense! I am sure I am doing something wrong.
What could possibly be delaying the kernel? Are there any data transfers between the device and the host
for some reason?
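
One way to see where the time goes is to time only the launches, with explicit synchronization before and after, so that host-side setup and any earlier transfers are excluded. The following is a minimal sketch, not a definitive benchmark. It assumes control_handle and the gpuArray inputs from the call above already exist in the workspace, and that wait(gpuDevice) is available in the installed Parallel Computing Toolbox release. nIter is a hypothetical iteration count chosen only for illustration.

```matlab
% Sketch only: assumes control_handle and the gpuArray inputs from the
% call above already exist in the workspace.
gpu   = gpuDevice;   % handle to the currently selected GPU
nIter = 100;         % hypothetical number of launches to average over

% Warm-up launch, since the first call may include one-time setup cost.
[gpu_dvec_ping, gpu_lr_a, gpu_lr_b] = feval(control_handle, ...
    gpu_dvec_ping, gpu_lr_a, gpu_lr_b, gpu_dvec_pong, gpu_q, Ns);

wait(gpu);           % make sure all earlier GPU work has finished
t = tic;
for k = 1:nIter
    [gpu_dvec_ping, gpu_lr_a, gpu_lr_b] = feval(control_handle, ...
        gpu_dvec_ping, gpu_lr_a, gpu_lr_b, gpu_dvec_pong, gpu_q, Ns);
end
wait(gpu);           % block until every queued launch has completed
elapsed = toc(t);

fprintf('Mean time per launch: %.3f ms\n', 1e3 * elapsed / nIter);
```

If the mean per-launch time measured this way is close to the C figure, the extra time is probably being spent outside the kernel itself (for example in per-call overhead or in timing without synchronization) rather than in device-host transfers.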

Thanks in advance

Ariel

It does make sense when you realize that gpuArrays are really bad :)

Learn more about this on the compare page.

Of course, my disclaimer is that I’m biased by my belief that GPU computing in MATLAB(R) should be fast and I work on Jacket. Send me your code and we’ll show you what good GPU software can do!