cudaMemcpyFromSymbol not getting the correct value back
Hi guys. I've been stuck on this error for a couple of days. I have a Mac with a GeForce 320M, and I'm running the following code:
[code]
__device__ float d_endAgents;
float h_endAgents;

void callingKernel(dim3 dimBlock, dim3 dimThread){
    h_endAgents = -1.0;
    cudaMemset(&d_endAgents, -1.0, sizeof(float));

    kernel<<< dimBlock, dimThread >>>();

    cudaThreadSynchronize();

    cudaMemcpyFromSymbol(&h_endAgents, "d_endAgents", sizeof(float), 0, cudaMemcpyDeviceToHost);
    printf("\n[%f]", h_endAgents);
}

__global__ void kernel(){
    d_endAgents = -100.0;
}
[/code]

This code reports that the value of d_endAgents is 0. I've tried many different ways to implement it, but they all show the same anomaly. What can I do to get the correct value back?
Sorry about any grammar mistakes.

#1
Posted 01/04/2012 07:29 PM   
You can't use the [font="Courier New"]&[/font] operator in host code to obtain device addresses. You need to call [url="http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/html/group__CUDART__MEMORY_g2db2376f8fb4203df2fa9e104e16978e.html#g2db2376f8fb4203df2fa9e104e16978e"][font="Courier New"]cudaGetSymbolAddress()[/font][/url] instead to obtain the address.

For the same reason, you cannot write
[code]cudaMemcpy(&h_endAgents, &d_endAgents, sizeof(float), cudaMemcpyDeviceToHost);[/code]
but need to use
[code]cudaMemcpyFromSymbol(&h_endAgents, "d_endAgents", sizeof(float), 0, cudaMemcpyDeviceToHost);[/code]
instead.
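
To make the advice above concrete, here is a minimal sketch of initializing and reading back a [font="Courier New"]__device__[/font] variable without ever applying [font="Courier New"]&[/font] to it in host code. It rests on a few assumptions that are not in the original post: a CUDA 5.0 or later toolkit (where the symbol itself is passed instead of its name as a string), a fixed launch configuration of 1 block with 16 threads, and [font="Courier New"]cudaDeviceSynchronize()[/font] in place of the deprecated [font="Courier New"]cudaThreadSynchronize()[/font]. It also uses [font="Courier New"]cudaMemcpyToSymbol()[/font] for the initialization rather than [font="Courier New"]cudaMemset()[/font], because [font="Courier New"]cudaMemset()[/font] fills memory byte by byte and cannot store a float value such as -1.0.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ float d_endAgents;   // lives in device global memory

__global__ void kernel() {
    d_endAgents = -100.0f;
}

int main() {
    float h_endAgents = -1.0f;

    // Initialize the device variable by copying the host value to the symbol.
    cudaMemcpyToSymbol(d_endAgents, &h_endAgents, sizeof(float));

    // Ask the runtime for the device address of the symbol.
    // This pointer can then be used with ordinary cudaMemcpy/cudaMemset calls.
    float *d_ptr = nullptr;
    cudaGetSymbolAddress((void **)&d_ptr, d_endAgents);

    kernel<<<1, 16>>>();        // assumed launch configuration
    cudaDeviceSynchronize();    // wait for the kernel to finish

    // Read the value back through the address obtained above.
    cudaMemcpy(&h_endAgents, d_ptr, sizeof(float), cudaMemcpyDeviceToHost);
    printf("[%f]\n", h_endAgents);
    return 0;
}
```

The final copy could equally be written as [font="Courier New"]cudaMemcpyFromSymbol(&h_endAgents, d_endAgents, sizeof(float))[/font]; the pointer form is shown only to illustrate what [font="Courier New"]cudaGetSymbolAddress()[/font] returns.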

Always check return codes of CUDA calls for errors. Do not use __syncthreads() in conditional code unless the condition is guaranteed to evaluate identically for all threads of each block. Run your program under cuda-memcheck to detect stray memory accesses. If your kernel dies for larger problem sizes, it might exceed the runtime limit and trigger the watchdog timer.

#2
Posted 01/04/2012 07:49 PM   
[quote name='tera' date='04 January 2012 - 04:49 PM' timestamp='1325706598' post='1351551']
You can't use the [font="Courier New"]&[/font] operator in host code to obtain device addresses. You need to call [url="http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/html/group__CUDART__MEMORY_g2db2376f8fb4203df2fa9e104e16978e.html#g2db2376f8fb4203df2fa9e104e16978e"][font="Courier New"]cudaGetSymbolAddress()[/font][/url] instead to obtain the address.

For the same reason, you cannot write
[code]cudaMemcpy(&h_endAgents, &d_endAgents, sizeof(float), cudaMemcpyDeviceToHost);[/code]
but need to use
[code]cudaMemcpyFromSymbol(&h_endAgents, "d_endAgents", sizeof(float), 0, cudaMemcpyDeviceToHost);[/code]
instead.
[/quote]

After making some modifications since yesterday, I've found the problem. When I used dim3 variables to specify the number of blocks and threads, the kernel wasn't launching properly for some reason. I changed the launch configuration to <<< 1, 16 >>> and it worked. I'm still investigating why. Thanks, tera.
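
A kernel that fails to launch does so silently unless the error state is queried, which would explain reading back an unchanged value. The following is a rough sketch of how such a failure could be made visible, assuming a CUDA 5.0 or later toolkit. The [font="Courier New"]check()[/font] helper and the [font="Courier New"]dimGrid[/font] name are hypothetical and not part of the original code; the values 1 and 16 are taken from the working launch mentioned above. One possible cause of a failed launch, not confirmed in this thread, is a configuration the device does not support, such as too many threads per block or an uninitialized dim3.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ float d_endAgents;

__global__ void kernel() {
    d_endAgents = -100.0f;
}

// Hypothetical helper: report a failed runtime call and return false.
static bool check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    dim3 dimGrid(1);     // number of blocks (first launch parameter)
    dim3 dimBlock(16);   // threads per block (second launch parameter)

    kernel<<<dimGrid, dimBlock>>>();

    // An invalid launch configuration is reported by cudaGetLastError().
    if (!check(cudaGetLastError(), "kernel launch")) return 1;

    // Errors raised while the kernel runs surface at the next synchronization.
    if (!check(cudaDeviceSynchronize(), "kernel execution")) return 1;

    float h_endAgents = 0.0f;
    if (!check(cudaMemcpyFromSymbol(&h_endAgents, d_endAgents, sizeof(float)),
               "cudaMemcpyFromSymbol")) return 1;

    printf("[%f]\n", h_endAgents);
    return 0;
}
```

Note that in the launch syntax the first parameter is the grid size (blocks) and the second is the block size (threads per block), so the variable names in the original post are easy to mix up.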

#3
Posted 01/05/2012 03:30 PM   