Simple OpenCL program compiles and runs, gives incorrect output

I wrote a simple OpenCL program based on the SDK samples. It compiles and runs, but the output is wrong. Is there something I’m doing wrong?

Any suggestions for learning to debug C and OpenCL are much appreciated. I’m quite new to the platform.

Code is attached.

Thanks.
TEST_OPENCL.zip (2.16 KB)

Hi,

The best thing to do is to read back all of the GPU buffers with ReadBuffer (clEnqueueReadBuffer) and check the data.

If you have a very complicated kernel, you can also implement it on the CPU and compare the results to see where the problem is.
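As a rough illustration of the CPU check suggested above, here is a minimal sketch. It assumes the kernel is meant to do an element-wise integer add; the function names add_array_cpu and first_mismatch are made up for this example and are not part of the SDK.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for the element-wise add the kernel is meant to perform. */
void add_array_cpu(const int *a, const int *b, int *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

/* Returns the index of the first element where the device result differs
 * from the CPU reference, or -1 if all n elements agree. */
long first_mismatch(const int *gpu_result, const int *reference, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (gpu_result[i] != reference[i])
            return (long)i;
    return -1;
}
```

After reading vol_c back into a host array, you would call add_array_cpu on the same inputs and pass both arrays to first_mismatch; a non-negative return value tells you where the device output first goes wrong.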

	// create buffers on device
	cl_mem vol_a = clCreateBuffer(gpu_context, CL_MEM_WRITE_ONLY, mem_size, &a, &err);
	shrCheckError(err, CL_SUCCESS);

	cl_mem vol_b = clCreateBuffer(gpu_context, CL_MEM_WRITE_ONLY, mem_size, &b, &err);
	shrCheckError(err, CL_SUCCESS);

	cl_mem vol_c = clCreateBuffer(gpu_context, CL_MEM_READ_ONLY, mem_size, &c, &err);
	shrCheckError(err, CL_SUCCESS);

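The problem is in these declarations.

Your kernel computes c = a + b, so it only reads a and b and only writes c.

So c should be CL_MEM_WRITE_ONLY, and a and b should be CL_MEM_READ_ONLY, not the other way around ;)

Also, instead of calling WriteBuffer afterwards, it is simpler to copy the host data when you create the buffer: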

	cl_mem vol_a = clCreateBuffer(gpu_context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, &a, &err);
	shrCheckError(err, CL_SUCCESS);

	cl_mem vol_b = clCreateBuffer(gpu_context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, &b, &err);
	shrCheckError(err, CL_SUCCESS);

	cl_mem vol_c = clCreateBuffer(gpu_context, CL_MEM_WRITE_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, &c, &err);
	shrCheckError(err, CL_SUCCESS);

One last thing: why do you pass the addresses of a, b and c (&a, &b and &c) instead of a, b and c themselves?

This code runs:

	cl_mem vol_a = clCreateBuffer(gpu_context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, a, &err);
	shrCheckError(err, CL_SUCCESS);

	cl_mem vol_b = clCreateBuffer(gpu_context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, b, &err);
	shrCheckError(err, CL_SUCCESS);

	cl_mem vol_c = clCreateBuffer(gpu_context, CL_MEM_WRITE_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, c, &err);
	shrCheckError(err, CL_SUCCESS);

The same goes for the ReadBuffer call: pass the pointer itself, not its address.
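To show why &a versus a can matter, here is a small sketch in plain C. It assumes the host arrays are allocated with malloc (I have not checked how the attached code declares them); the function names are made up for this example. For a true array, &a and a point at the same bytes, so the mistake may go unnoticed. For a pointer to heap memory, &a is the address of the pointer variable, so an API that copies mem_size bytes from it would read the wrong memory.

```c
#include <stdlib.h>

/* Case 1: a is a true array. &a and a have different types but refer to
 * the same address, so passing &a happens to work. Returns 1. */
int array_case_same_address(void)
{
    int a[4] = {1, 2, 3, 4};
    return (const void *)&a == (const void *)a;
}

/* Case 2: a is a pointer to heap memory. &a is where the pointer variable
 * itself lives (on the stack), not where the data lives. Returns 0, or -1
 * if the allocation fails. */
int pointer_case_same_address(void)
{
    int *a = malloc(4 * sizeof *a);
    if (a == NULL)
        return -1;
    int same = (const void *)&a == (const void *)a;
    free(a);
    return same;
}
```

Passing a (or c for the read-back) is correct in both cases, which is why it is the safer habit.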

Thanks

J

I made the suggested changes. After copying arrays a and b to the device, running the kernel, and copying them back into new arrays d and e, I was able to confirm that the memory was copied correctly. However, array c is still all 0. I’m thinking the kernel did not run for some reason, or if it did, not correctly. Any suggestions?

Code is attached.

And thanks for the corrections and pointers.
TEST_OPENCL.zip (2.17 KB)

Change

	// create buffers on device
	cl_mem vol_a = clCreateBuffer(gpu_context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, a, &err);
	shrCheckError(err, CL_SUCCESS);

	cl_mem vol_b = clCreateBuffer(gpu_context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, b, &err);
	shrCheckError(err, CL_SUCCESS);

	cl_mem vol_c = clCreateBuffer(gpu_context, CL_MEM_WRITE_ONLY, mem_size, c, &err);
	shrCheckError(err, CL_SUCCESS);

	// copy data from host to device
	err = clEnqueueWriteBuffer(cmd_queue, vol_a, CL_TRUE, 0, mem_size, a, 0, NULL, NULL);
	err |= clEnqueueWriteBuffer(cmd_queue, vol_b, CL_TRUE, 0, mem_size, b, 0, NULL, NULL);
	shrCheckError(err, CL_SUCCESS);

to

	// create buffers on device
	cl_mem vol_a = clCreateBuffer(gpu_context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, a, &err);
	shrCheckError(err, CL_SUCCESS);

	cl_mem vol_b = clCreateBuffer(gpu_context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, b, &err);
	shrCheckError(err, CL_SUCCESS);

	cl_mem vol_c = clCreateBuffer(gpu_context, CL_MEM_WRITE_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, NULL, &err);
	shrCheckError(err, CL_SUCCESS);

Also change your kernel to

	__kernel void add_array(__global int *a, __global int *b, __global int *c)
	{
		int xid = get_global_id(0);
		c[xid] = a[xid] + b[xid];
	}

because your host arrays hold int, not float, and the kernel's pointer types have to match the data in the buffers.
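To illustrate what goes wrong when a kernel declared with float pointers runs over buffers that hold ints, here is a sketch in plain C on the host. It assumes 32-bit IEEE-754 floats; add_as_float_bits is a made-up name. The function reads the int bits as floats, adds them as floats, and returns the bits of the float result, which is what the host would then read back as an int.

```c
#include <stdint.h>
#include <string.h>

/* Mimics a float kernel running over int data: reinterpret the bits of a
 * and b as float, add as float, and return the result's bits as an int. */
int32_t add_as_float_bits(int32_t a, int32_t b)
{
    float fa, fb, fc;
    int32_t c;
    memcpy(&fa, &a, sizeof fa);   /* bit-for-bit copy, no conversion */
    memcpy(&fb, &b, sizeof fb);
    fc = fa + fb;
    memcpy(&c, &fc, sizeof c);
    return c;
}
```

For example, 0x3f800000 is the bit pattern of 1.0f, so add_as_float_bits(0x3f800000, 0x3f800000) gives 0x40000000 (the bits of 2.0f), not the integer sum 0x7f000000. Small ints such as 1 or 2 map to denormal floats, and a GPU that flushes denormals to zero could turn those into 0. That is one plausible explanation for the all-zero c array, but I have not verified it on this hardware.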

I’m an idiot. I had been staring at that code for so long that I didn’t realize the float/int issue.

Although it compiled with “NULL” instead of “c”, the program errored out. Changing it to “c” made it work.

	cl_mem vol_c = clCreateBuffer(gpu_context, CL_MEM_WRITE_ONLY | CL_MEM_COPY_HOST_PTR, mem_size, c, &err);
	shrCheckError(err, CL_SUCCESS);

Thanks for all the help!

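It should be:

	cl_mem vol_c = clCreateBuffer(gpu_context, CL_MEM_WRITE_ONLY, mem_size, NULL, &err);

Don’t forget to remove CL_MEM_COPY_HOST_PTR when you pass a NULL host pointer. With CL_MEM_COPY_HOST_PTR set and host_ptr NULL, clCreateBuffer fails with CL_INVALID_HOST_PTR, which would explain the error you got.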
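The pairing rule between the flags and the host pointer can be sketched as a small check. This is based on my reading of the clCreateBuffer documentation, and implementations may differ in how strictly they enforce it. The DEMO_* constants are hypothetical stand-ins so the sketch compiles without the OpenCL headers; real code would use the CL_MEM_* flags. host_ptr_args_ok is a made-up helper, not an OpenCL function.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the CL_MEM_* flag bits (assumption: only used
 * so this sketch builds without <CL/cl.h>). */
enum {
    DEMO_MEM_WRITE_ONLY    = 1 << 1,
    DEMO_MEM_READ_ONLY     = 1 << 2,
    DEMO_MEM_USE_HOST_PTR  = 1 << 3,
    DEMO_MEM_COPY_HOST_PTR = 1 << 5
};

/* Returns 1 when flags and host_ptr are consistent, 0 when clCreateBuffer
 * would be expected to report CL_INVALID_HOST_PTR: a host pointer must be
 * given exactly when USE_HOST_PTR or COPY_HOST_PTR is requested. */
int host_ptr_args_ok(unsigned flags, const void *host_ptr)
{
    int wants_host_ptr =
        (flags & (DEMO_MEM_USE_HOST_PTR | DEMO_MEM_COPY_HOST_PTR)) != 0;
    return wants_host_ptr == (host_ptr != NULL);
}
```

By this rule, both CL_MEM_WRITE_ONLY with NULL and CL_MEM_WRITE_ONLY | CL_MEM_COPY_HOST_PTR with c are consistent, while CL_MEM_COPY_HOST_PTR with NULL is not.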

I think you should read the OpenCL guide, because you don’t seem to understand what this code is doing.