Problem with parameter

Hello,

I have been trying to write an application with CUDA, and I have hit a brick wall. I am trying to make an array of ints of size 709 5129, which works correctly on the host. Then I use cudaMemcpy to get access to it on the device, which returns with no error. But when I debug, the array on the device is not fully filled, even though it was fully filled on the host. What am I missing? The file I am talking about is kernel.cu and the array is ccopy, at the following link: Magic-squares/kernel.cu at master · stefanberg96/Magic-squares · GitHub
Any suggestions are appreciated.

I think your claim (that the device array does not match the host array at the beginning of the kernel) is probably not correct. If you still want to maintain that, substitute your kernel with one that simply checks the array values (or prints them out, if you wish) and point out the discrepancy. Alternatively, before launching your kernel, copy the dev_c array back to another host array (e.g. temp_c) and check it value-by-value against ccopy. I did that, and of course there is no discrepancy.
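For reference, the copy-back check can be sketched roughly like this. It is only a minimal sketch, not the code from the repository: the helper name countMismatches is made up for illustration, and it assumes dev_c was already allocated with cudaMalloc and filled with cudaMemcpy from ccopy, with n being the element count.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the original program): copies the device
// array dev_c back into a temporary host buffer and compares it element by
// element against the host array ccopy.
// Returns the number of mismatching elements, or -1 if the copy fails.
long long countMismatches(const int *dev_c, const int *ccopy, size_t n)
{
    std::vector<int> temp_c(n);  // temporary host array for the round trip
    cudaError_t err = cudaMemcpy(temp_c.data(), dev_c, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(err));
        return -1;
    }

    long long mismatches = 0;
    for (size_t i = 0; i < n; ++i) {
        if (temp_c[i] != ccopy[i]) {
            if (mismatches < 10)  // print only the first few differences
                printf("mismatch at %zu: host %d, device %d\n",
                       i, ccopy[i], temp_c[i]);
            ++mismatches;
        }
    }
    return mismatches;
}
```

Calling something like this right after the host-to-device cudaMemcpy and before the kernel launch should return 0 if the copy itself is fine, which would point the investigation at the kernel rather than at the transfer.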

Your kernel has numerous indexing problems (out-of-bounds memory accesses) which you will discover if you run your code with cuda-memcheck. I would start by eliminating those. You can use a method such as the one described here:

cuda - Unspecified launch failure on Memcpy - Stack Overflow

to localize your out-of-bounds accesses to individual lines of device code, and then if necessary you can add selective printf statements to your kernel/device code to get more information.
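As an illustration of the selective printf approach, here is a small sketch. The kernel shiftKernel and the wrapper countOutOfBounds are hypothetical and unrelated to the kernel in the repository; the sketch assumes n > 0 and a computed index (i + offset) that may leave the array. Compiling with nvcc -lineinfo and running the program under cuda-memcheck is the usual way to get the offending source line first; a guard like the one below then shows which threads computed the bad index.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: out[i] = in[i + offset].
// The computed index is checked before use, and the first few offending
// threads report themselves with printf instead of reading out of bounds.
__global__ void shiftKernel(const int *in, int *out, int n, int offset,
                            int *oobCount)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;              // surplus threads in the last block

    int src = i + offset;            // computed index that may be invalid
    if (src < 0 || src >= n) {
        if (atomicAdd(oobCount, 1) < 4)   // limit the amount of output
            printf("block %d thread %d: index %d out of range [0, %d)\n",
                   blockIdx.x, threadIdx.x, src, n);
        out[i] = 0;
        return;
    }
    out[i] = in[src];
}

// Launches the kernel and returns how many threads computed an
// out-of-range index, or -1 if any CUDA call fails. Assumes n > 0.
int countOutOfBounds(int n, int offset)
{
    int *in = nullptr, *out = nullptr, *oobCount = nullptr;
    int result = -1;
    size_t bytes = (size_t)n * sizeof(int);

    if (cudaMalloc(&in, bytes) == cudaSuccess &&
        cudaMalloc(&out, bytes) == cudaSuccess &&
        cudaMalloc(&oobCount, sizeof(int)) == cudaSuccess) {
        cudaMemset(in, 0, bytes);
        cudaMemset(oobCount, 0, sizeof(int));

        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        shiftKernel<<<blocks, threads>>>(in, out, n, offset, oobCount);

        cudaError_t err = cudaGetLastError();                   // launch errors
        if (err == cudaSuccess) err = cudaDeviceSynchronize();  // execution errors
        if (err == cudaSuccess)
            err = cudaMemcpy(&result, oobCount, sizeof(int),
                             cudaMemcpyDeviceToHost);
        if (err != cudaSuccess) {
            fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
            result = -1;
        }
    }
    cudaFree(in);        // cudaFree on a null pointer is a no-op
    cudaFree(out);
    cudaFree(oobCount);
    return result;
}
```

Checking cudaGetLastError() right after the launch and the return value of cudaDeviceSynchronize() is worth keeping even after the printf statements are removed, since kernel failures are otherwise only reported by some later, unrelated CUDA call.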

Thanks for the quick response. I will try it as soon as I can.