Hello,
I am trying to run the same CUDA code on Linux and on Windows. Here is the kernel:
template <typename Tin, typename TOut>
__global__ void SepColkernelRGB(Tin *ptImIn, TOut *ptImROut, TOut *ptImGOut, TOut *ptImBOut,
                                int w, int h, double *ptKernel, int ikernelSizeX)
{
    extern __shared__ double LocalS[];
}
The kernel is launched like this:
dim3 blocks(ceil((float)iwidth_ / (BLOCK_SIZE_X)), ceil((float)iheight_ / BLOCK_SIZE_Y));
dim3 threads(BLOCK_SIZE_X, BLOCK_SIZE_Y);
int iKernelSize = KGaussianX_G1->iwidth_;
int isharedMemSize = ((BLOCK_SIZE_X + iKernelSize) * BLOCK_SIZE_Y * 3 + iKernelSize) * sizeof(double);
std::cout << "iKernelSize " << iKernelSize << " isharedMemSize " << isharedMemSize << std::endl;
std::cout << "blocks " << blocks.x << " " << blocks.y << " threads " << threads.x << " " << threads.y << std::endl;
SepColkernelRGB<<<blocks, threads, isharedMemSize>>>(ptSrc_Device, f_ptImageTmp1_Device_, f_ptImageTmp2_Device_, f_ptImageTmp3_Device_,
                                                     iwidth_, iheight_, KGaussianX_G1->fKernelTab_Device_, iKernelSize);
On Linux the program works properly. On Windows, as soon as iKernelSize is greater than 31, the kernel launch fails.
When I run the program normally, I get:
CUDA Runtime API error 11: invalid argument.
When I run it under cuda-memcheck, I get:
CUDA Runtime API error 9: invalid configuration argument.
========= Program hit cudaErrorInvalidConfiguration (error 9) due to "invalid configuration argument" on CUDA API call to cudaLaunch.
I think the problem comes from the shared memory size. Is more shared memory available on Linux than on Windows?
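To make that suspicion concrete, the isharedMemSize formula from the launch code can be evaluated on the host for a few kernel sizes. This is only a rough sketch under stated assumptions: BLOCK_SIZE_X and BLOCK_SIZE_Y are not shown in this post, so the value 32 for both is a guess, and the 48 KB figure is the usual default per-block limit rather than a value read from this device (deviceQuery prints the real one as "Total amount of shared memory per block"). The names sharedBytes, fitsInLimit and kAssumedLimitBytes are made up for illustration.

```cpp
#include <cstddef>

// Same formula as isharedMemSize in the launch code, computed in size_t
// so the result cannot overflow an int for large images or kernels.
constexpr std::size_t sharedBytes(std::size_t blockX, std::size_t blockY,
                                  std::size_t kernelSize)
{
    return ((blockX + kernelSize) * blockY * 3 + kernelSize) * sizeof(double);
}

// ASSUMPTION: 48 KB per block, the common default limit for dynamic shared
// memory. Replace with the value deviceQuery reports for the actual device.
constexpr std::size_t kAssumedLimitBytes = 48 * 1024;

// True if the requested dynamic shared memory stays within the assumed limit.
constexpr bool fitsInLimit(std::size_t blockX, std::size_t blockY,
                           std::size_t kernelSize)
{
    return sharedBytes(blockX, blockY, kernelSize) <= kAssumedLimitBytes;
}

// ASSUMPTION: BLOCK_SIZE_X = BLOCK_SIZE_Y = 32 (hypothetical, not from the post).
static_assert(sharedBytes(32, 32, 31) == 48632, "iKernelSize 31: 48632 bytes");
static_assert(sharedBytes(32, 32, 32) == 49408, "iKernelSize 32: 49408 bytes");
static_assert(fitsInLimit(32, 32, 31), "31 stays under 49152 bytes");
static_assert(!fitsInLimit(32, 32, 32), "32 exceeds 49152 bytes");
```

Under those assumptions the request is 48632 bytes for iKernelSize = 31 and 49408 bytes for iKernelSize = 32, which crosses 49152 bytes at the same kernel size where the launch starts failing on Windows. If the real block size is different, the numbers will not line up this way, and the printed isharedMemSize value should be compared with the deviceQuery limit instead.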
I also compared the deviceQuery output on both systems. It is identical except for the ECC line:
Device has ECC support: Disabled (Windows) / Yes (Linux)
Can someone help me with that?