Hello, in my code I am using Thrust and cuBLAS functions to compute an inclusive_scan and a matrix transpose.
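For context, the calls look roughly like the sketch below. This is a simplified stand-in and not my actual code: the helper names `scan_on_device` and `transpose_on_device`, the tiny input sizes, and the use of `cublasSgeam` for the transpose are all assumptions made for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/scan.h>

// Hypothetical helper: inclusive prefix sum on the GPU with Thrust.
std::vector<float> scan_on_device(const std::vector<float>& in) {
    thrust::device_vector<float> d(in.begin(), in.end());
    thrust::inclusive_scan(d.begin(), d.end(), d.begin());  // in-place scan
    std::vector<float> out(in.size());
    thrust::copy(d.begin(), d.end(), out.begin());          // device -> host
    return out;
}

// Hypothetical helper: transpose a column-major rows x cols matrix
// with cublasSgeam, computing C = 1 * A^T + 0 * B.
std::vector<float> transpose_on_device(const std::vector<float>& a,
                                       int rows, int cols) {
    thrust::device_vector<float> d_a(a.begin(), a.end());
    thrust::device_vector<float> d_t(a.size());
    const float* A = thrust::raw_pointer_cast(d_a.data());
    float* C = thrust::raw_pointer_cast(d_t.data());

    cublasHandle_t handle;
    if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS) {
        std::fprintf(stderr, "cublasCreate failed\n");
        std::exit(EXIT_FAILURE);
    }
    const float one = 1.0f, zero = 0.0f;
    // The result C is cols x rows, so m = cols and n = rows.
    // A is rows x cols (lda = rows) and is read transposed.
    // B is passed as C itself; with beta = 0 its contents are not used.
    cublasStatus_t st = cublasSgeam(handle, CUBLAS_OP_T, CUBLAS_OP_N,
                                    cols, rows,
                                    &one, A, rows,
                                    &zero, C, cols,
                                    C, cols);
    if (st != CUBLAS_STATUS_SUCCESS) {
        std::fprintf(stderr, "cublasSgeam failed\n");
        std::exit(EXIT_FAILURE);
    }
    cudaDeviceSynchronize();
    cublasDestroy(handle);

    std::vector<float> out(a.size());
    thrust::copy(d_t.begin(), d_t.end(), out.begin());
    return out;
}

int main() {
    std::vector<float> s = scan_on_device({1.0f, 2.0f, 3.0f, 4.0f});
    for (float v : s) std::printf("%g ", v);
    std::printf("\n");

    // 2 x 3 matrix stored column-major: columns (1,2), (3,4), (5,6).
    std::vector<float> t = transpose_on_device({1, 2, 3, 4, 5, 6}, 2, 3);
    for (float v : t) std::printf("%g ", v);
    std::printf("\n");
    return 0;
}
```

Something like this should build with `nvcc` when linked against cuBLAS (`-lcublas`), and the resulting binary can then be run under the racecheck tool in the same way as my real program.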
When I check for race conditions with
cuda-memcheck --tool racecheck --racecheck-report analysis ./mycode
I get a lot of reports like these:
Race reported between Read access at 0x0000e568 in /usr/local/cuda/bin/..//include/thrust/system/cuda/detail/detail/fast_scan.inl:349:void thrust::system::cuda::detail::detail::launch_closure_by_value<thrust::system::cuda::detail::detail::fast_scan::fast_scan_detail::downsweep_intervals_closure<bool=1, float*, float*, float, thrust::plus<float>, thrust::system::detail::internal::uniform_decomposition<unsigned int>, thrust::system::cuda::detail::detail::statically_blocked_thread_array<unsigned int=224>>>(bool=1)
========= and Write access at 0x0000ebb8 in /usr/local/cuda/bin/..//include/thrust/system/cuda/detail/detail/fast_scan.inl:361:void thrust::system::cuda::detail::detail::launch_closure_by_value<thrust::system::cuda::detail::detail::fast_scan::fast_scan_detail::downsweep_intervals_closure<bool=1, float*, float*, float, thrust::plus<float>, thrust::system::detail::internal::uniform_decomposition<unsigned int>, thrust::system::cuda::detail::detail::statically_blocked_thread_array<unsigned int=224>>>(bool=1) [892 hazards]
......
Race reported between Read access at 0x0000ed50 in /usr/local/cuda/bin/..//include/thrust/system/cuda/detail/detail/fast_scan.inl:349:void thrust::system::cuda::detail::detail::launch_closure_by_value<thrust::system::cuda::detail::detail::fast_scan::fast_scan_detail::downsweep_intervals_closure<bool=1, thrust::device_ptr<float>, thrust::device_ptr<float>, float, thrust::plus<float>, thrust::system::detail::internal::uniform_decomposition<unsigned int>, thrust::system::cuda::detail::detail::statically_blocked_thread_array<unsigned int=224>>>(bool=1)
========= and Write access at 0x0000f3a0 in /usr/local/cuda/bin/..//include/thrust/system/cuda/detail/detail/fast_scan.inl:361:void thrust::system::cuda::detail::detail::launch_closure_by_value<thrust::system::cuda::detail::detail::fast_scan::fast_scan_detail::downsweep_intervals_closure<bool=1, thrust::device_ptr<float>, thrust::device_ptr<float>, float, thrust::plus<float>, thrust::system::detail::internal::uniform_decomposition<unsigned int>, thrust::system::cuda::detail::detail::statically_blocked_thread_array<unsigned int=224>>>(bool=1) [4460 hazards]
.....
......
Race reported between Write access at 0x000003f8 in void transpose_readWrite_alignment_kernel<float, int=1, bool=0, int=6, int=5, int=3>(cublasTransposeParams<float>, float const *, float*, float const *)
========= and Write access at 0x00000320 in void transpose_readWrite_alignment_kernel<float, int=1, bool=0, int=6, int=5, int=3>(cublasTransposeParams<float>, float const *, float*, float const *) [384 hazards]
========= and Write access at 0x00000620 in void transpose_readWrite_alignment_kernel<float, int=1, bool=0, int=6, int=5, int=3>(cublasTransposeParams<float>, float const *, float*, float const *) [47 hazards]
========= and Write access at 0x000003f8 in void transpose_readWrite_alignment_kernel<float, int=1, bool=0, int=6, int=5, int=3>(cublasTransposeParams<float>, float const *, float*, float const *) [2952 hazards]
========= and Write access at 0x000004a0 in void transpose_readWrite_alignment_kernel<float, int=1, bool=0, int=6, int=5, int=3>(cublasTransposeParams<float>, float const *, float*, float const *) [1729 hazards]
......
......
Race reported between Write access at 0x00000320 in void transpose_readWrite_alignment_kernel<float, int=1, bool=0, int=6, int=5, int=3>(cublasTransposeParams<float>, float const *, float*, float const *)
========= and Write access at 0x000003f8 in void transpose_readWrite_alignment_kernel<float, int=1, bool=0, int=6, int=5, int=3>(cublasTransposeParams<float>, float const *, float*, float const *) [384 hazards]
=========
.....
.....
Do I need to worry about these reports, given that they come from inside Thrust and cuBLAS rather than from my own kernels?
Also, for the Thrust reports, the hazards point to these two lines:
include/thrust/system/cuda/detail/detail/fast_scan.inl:349
include/thrust/system/cuda/detail/detail/fast_scan.inl:361
I found these copies of that source file:
https://github.com/cbuchner1/ccminer/blob/master/compat/thrust/system/cuda/detail/detail/fast_scan.inl
http://code.google.com/p/thrust/source/browse/trunk/thrust/detail/device/cuda/detail/fast_scan.inl?repo=wiki&r=cc9f8a792432fe53e994524ebc9b46554d4da176
but I am not sure what to do with them.
Thanks