Hi,
So I wanted to try something and pavanky from the arrayfire team posted a solution for me here:
Now, he is probably quite busy, but I’m getting this error based on line 20:
extern __shared__ char *s_mem;
T *s_res = (T *)s_mem;
However, I searched online and could not find a good reason why I would get such an error. Most of the time you get it when you don’t have the ‘extern’ keyword.
Any ideas?
Try changing that line to:
extern __shared__ char s_mem[];
as that is the expected syntax for dynamically allocated shared memory usage in a kernel:
http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#shared
and be sure to allocate the amount of shared memory in bytes that the kernel will use in the kernel launch syntax as the third parameter:
kernel_name<<<blocks, threads_per_block, shared_mem_size_in_bytes...>>>(...);
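To put the two pieces together, here is a minimal sketch of a templated kernel that uses dynamically allocated shared memory. The kernel name block_sum, the sizes, and the reduction itself are made up for illustration and are not taken from the code in the question; only the s_mem declaration, the cast to T *, and the third launch parameter are the parts that matter. It assumes the block size is a power of two and leaves out error checking for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each block sums its slice of `in` into out[blockIdx.x].
template <typename T>
__global__ void block_sum(const T *in, T *out, int n)
{
    // Unsized extern array: its size comes from the third launch parameter.
    extern __shared__ char s_mem[];
    // Reinterpret the raw bytes as T so the same declaration works for any T.
    T *s_res = (T *)s_mem;

    int tid = threadIdx.x;
    int gid = blockIdx.x * blockDim.x + tid;
    s_res[tid] = (gid < n) ? in[gid] : T(0);
    __syncthreads();

    // Tree reduction in shared memory; assumes blockDim.x is a power of two.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            s_res[tid] += s_res[tid + stride];
        }
        __syncthreads();
    }

    if (tid == 0) {
        out[blockIdx.x] = s_res[0];
    }
}

int main()
{
    const int n = 1024;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float h_in[n];
    float h_out[blocks];
    for (int i = 0; i < n; ++i) {
        h_in[i] = 1.0f;
    }

    float *d_in = nullptr;
    float *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, blocks * sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    // One T-sized slot per thread, expressed in bytes.
    size_t shared_bytes = threads * sizeof(float);
    block_sum<float><<<blocks, threads, shared_bytes>>>(d_in, d_out, n);

    // This copy also waits for the kernel to finish.
    cudaMemcpy(h_out, d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost);

    for (int b = 0; b < blocks; ++b) {
        printf("block %d: %f\n", b, h_out[b]);
    }

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Declaring the buffer as char and casting it is the usual way to handle templated kernels, since declaring the extern array directly with type T tends to produce conflicting declarations once the kernel is instantiated for more than one type. If the third launch parameter is left out, the kernel gets zero bytes of dynamic shared memory and the writes to s_res go out of bounds.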