Always check the return codes of CUDA calls for errors. Do not use __syncthreads() in conditional code unless the condition is guaranteed to evaluate identically for all threads of each block. Run your program under cuda-memcheck (or compute-sanitizer, which replaces it in newer CUDA toolkits) to detect stray memory accesses. If your kernel dies only for larger problem sizes, it might be exceeding the runtime limit and triggering the watchdog timer, which applies when the GPU also drives a display.
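The first two points above can be sketched roughly as follows. This is only an illustration, not code from the post: CUDA_CHECK, blockSum, sumOnes and the block size of 256 are made-up names and values. The macro wraps every runtime call, and the kernel keeps __syncthreads() outside the branches so all threads of a block reach it.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original post): abort with file and line
// information if a CUDA runtime call does not return cudaSuccess.
#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            std::fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__,   \
                         __LINE__, #call, cudaGetErrorString(err_));   \
            std::exit(EXIT_FAILURE);                                   \
        }                                                              \
    } while (0)

constexpr int kBlock = 256;  // assumed block size; must be a power of two

// Per-block sum. Out-of-range threads load 0 instead of returning early,
// so every thread of the block reaches every __syncthreads().
__global__ void blockSum(const float *in, float *out, int n)
{
    __shared__ float tile[kBlock];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            tile[threadIdx.x] += tile[threadIdx.x + s];
        __syncthreads();  // outside the if: executed by all threads
    }
    if (threadIdx.x == 0)
        out[blockIdx.x] = tile[0];
}

// Sums n ones on the GPU and returns the total, checking every call.
float sumOnes(int n)
{
    const int blocks = (n + kBlock - 1) / kBlock;
    std::vector<float> h_in(n, 1.0f), h_out(blocks);
    float *d_in = nullptr, *d_out = nullptr;
    CUDA_CHECK(cudaMalloc(&d_in, n * sizeof(float)));
    CUDA_CHECK(cudaMalloc(&d_out, blocks * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(d_in, h_in.data(), n * sizeof(float),
                          cudaMemcpyHostToDevice));

    blockSum<<<blocks, kBlock>>>(d_in, d_out, n);
    CUDA_CHECK(cudaGetLastError());       // errors from the launch itself
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

    CUDA_CHECK(cudaMemcpy(h_out.data(), d_out, blocks * sizeof(float),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_out));

    float total = 0.0f;
    for (float v : h_out) total += v;
    return total;
}

int main()
{
    std::printf("sum = %.0f\n", sumOnes(1000));
    return 0;
}
```

Kernel launches return no error code themselves, which is why the sketch queries cudaGetLastError() right after the launch and then checks the result of cudaDeviceSynchronize().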
%.o: %.cu
	$(NVCC) $(CUFLAGS) -c -o $@ $<