Execution time is not proportional to the time steps
Hi, all

I am running my program on a GPU, and I found something odd: the computation time of my GPU code is not proportional to the number of time steps.

For example, if one time step takes 1 unit of time, then 5 time steps take 5 units on the CPU, but almost 9 units on the GPU.

Since the whole execution time consists of data access and computation, I expected the required time to be proportional to the number of time steps, which conflicts with this result.

Please help me figure out what causes this problem. Thanks a lot.

#1
Posted 05/02/2012 09:35 AM   
Maybe you have an error in how you measure the time it takes to complete on the GPU. Unless you post some code here, it is difficult to say for sure.

#2
Posted 05/02/2012 11:01 AM   
[quote name='pasoleatis' date='02 May 2012 - 01:01 PM' timestamp='1335956507' post='1403352']
Maybe you have an error in how you measure the time it takes to complete on the GPU. Unless you post some code here, it is difficult to say for sure.
[/quote]

yeah, like a forgotten cudaThreadSynchronize() before the beginning and before the end of the measurement interval.
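
A minimal sketch of what that suggestion could look like. The kernel `step`, the problem size and the step count are placeholders, not taken from the original post. The sketch uses `cudaDeviceSynchronize()`, which newer CUDA toolkits provide as the replacement for the deprecated `cudaThreadSynchronize()`, and a wall-clock timer from `std::chrono` instead of `clock()`.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel: one "time step" that updates every element.
__global__ void step(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] = data[i] * 0.5f + 1.0f;
}

int main()
{
    const int n = 1 << 20;   // assumed problem size
    const int maxtime = 5;   // assumed number of time steps

    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    const int block = 256;
    const int grid = (n + block - 1) / block;

    // Make sure all earlier GPU work is done before starting the timer.
    cudaDeviceSynchronize();
    auto t0 = std::chrono::steady_clock::now();

    for (int t = 0; t < maxtime; t++)
        step<<<grid, block>>>(d, n);   // asynchronous: returns right after the launch

    // Wait for every launched kernel to finish before stopping the timer.
    cudaDeviceSynchronize();
    auto t1 = std::chrono::steady_clock::now();

    double seconds = std::chrono::duration<double>(t1 - t0).count();
    printf("%d steps took %g s\n", maxtime, seconds);

    cudaFree(d);
    return 0;
}
```

Without the second synchronization, the timer would stop while some of the kernels may still be queued or running, so the measured interval would not cover the actual computation.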

#3
Posted 05/02/2012 02:20 PM   
Hi,

My time measurement looks like this:

#include <time.h>

clock_t start, end;

cudaMemcpy( d, h, size, cudaMemcpyHostToDevice );

start = clock();

for ( time = 0 ; time < maxtime ; time++ )
{
    kernel<<< grid, block >>>(...);
}

end = clock();

cudaMemcpy( h, d, size, cudaMemcpyDeviceToHost );

Then I just use computeTime = end - start as my computation time, and I didn't call the cudaThreadSynchronize() function.

Does this function call affect my time measurement? Thanks a lot.

#4
Posted 05/05/2012 04:44 AM   
[quote name='elguepardo' date='05 May 2012 - 05:44 AM' timestamp='1336193073' post='1404495']
Hi,

My time measurement looks like this:

#include <time.h>

clock_t start, end;

cudaMemcpy( d, h, size, cudaMemcpyHostToDevice );

start = clock();

for ( time = 0 ; time < maxtime ; time++ )
{
    kernel<<< grid, block >>>(...);
}

end = clock();

cudaMemcpy( h, d, size, cudaMemcpyDeviceToHost );

Then I just use computeTime = end - start as my computation time, and I didn't call the cudaThreadSynchronize() function.

Does this function call affect my time measurement? Thanks a lot.
[/quote]
Hello,

These measurements are not correct, because kernel launches are asynchronous: control returns to the host as soon as the kernel is launched, without waiting for it to finish.

Use this code instead:

[code]
float gputime;   // elapsed time in milliseconds
cudaEvent_t start, stop;
cudaEventCreate(&start);
cudaEventCreate(&stop);

// ....

cudaEventRecord(start, 0);

// stuff to measure execution time

cudaEventRecord(stop, 0);
cudaEventSynchronize(stop);   // block the host until the stop event has been reached
cudaEventElapsedTime(&gputime, start, stop);

cudaEventDestroy(start);
cudaEventDestroy(stop);

// cudaEventElapsedTime reports milliseconds; print seconds
printf("\nTime = %g s\n\n", gputime / 1000.0f);
[/code]
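
To check whether the measured time scales with the number of steps, the event-based timing could be wrapped in a helper and called with several step counts. This is only a sketch: the kernel `step`, the helper `time_steps`, the problem size and the step counts are all made up for illustration and are not from the original program.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel standing in for one time step.
__global__ void step(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] = data[i] * 0.5f + 1.0f;
}

// Hypothetical helper: GPU time in milliseconds for `steps` kernel launches.
static float time_steps(float *d, int n, int steps)
{
    const int block = 256;
    const int grid = (n + block - 1) / block;

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    for (int t = 0; t < steps; t++)
        step<<<grid, block>>>(d, n);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);   // wait until all launches above have completed

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main()
{
    const int n = 1 << 20;   // assumed problem size

    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    time_steps(d, n, 1);     // warm-up run, result discarded

    const int counts[] = {1, 5, 10};   // assumed step counts to compare
    for (int steps : counts)
        printf("%2d steps: %g ms\n", steps, time_steps(d, n, steps));

    cudaFree(d);
    return 0;
}
```

The warm-up call is there because the first launch of a kernel can include one-time setup cost, which would otherwise distort the comparison between step counts.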

#5
Posted 05/05/2012 08:03 AM   
Hi, pasoleatis

I used your method to measure my computation time, and now the result is proportional to the number of time steps. Thanks a lot for your kind help.

#6
Posted 05/06/2012 04:06 AM   