Unspecified launch failure

Hi all,

I get the error “unspecified launch failure” in my molecular simulation program when the size of the simulated molecular system increases.

The point in the program where the error appears changes, but it always occurs after the same kernel. This kernel is invoked several times, and each time I run the program with a given number of molecules as input, the “unspecified launch failure” appears after a different invocation of that kernel.

Where can I find a good reference that might help me solve this kind of problem?

Thank you in advance,

Ardita

What are your gridDim and blockDim?
Maybe you are exceeding the maximum grid dimensions, the maximum block dimensions, or the maximum number of threads per block.
Check and let us know!
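
One way to check this is to compare the launch configuration against the limits the device reports. The following is only a minimal sketch, assuming the CUDA runtime API; fitsLimits is a hypothetical helper name, not something CUDA provides.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: returns true if a launch with the given grid and block
// dimensions stays within the limits stored in prop.
bool fitsLimits(const cudaDeviceProp &prop, dim3 grid, dim3 block)
{
    // Total threads per block must be non-zero and within the device limit.
    const unsigned long long threads = 1ULL * block.x * block.y * block.z;
    if (threads == 0 || threads > (unsigned long long)prop.maxThreadsPerBlock) {
        return false;
    }

    const unsigned g[3] = {grid.x, grid.y, grid.z};
    const unsigned b[3] = {block.x, block.y, block.z};
    for (int i = 0; i < 3; ++i) {
        // Each grid dimension must be at least 1 and within the device limit.
        if (g[i] == 0 || g[i] > (unsigned)prop.maxGridSize[i]) return false;
        // Each block dimension must be within the per-dimension limit.
        if (b[i] > (unsigned)prop.maxThreadsDim[i]) return false;
    }
    return true;
}
```

On a real device the limits can be filled in with cudaGetDeviceProperties(&prop, 0) before calling fitsLimits(prop, grid, block), where grid and block are the values passed to the kernel launch.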

In my case gridDim is 30 x 1 and blockDim is 80 x 1 x 1.

Why do you have 80 threads per block?
Threads are scheduled in groups of 32 (warps), so 80 threads make 2 full warps plus half a warp. I think (though I am no expert) that this is less efficient than using a multiple of 32.
Anyway, can you show the kernel code?
Perhaps the problem is related to a memory access.

Regards!
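
The warp arithmetic above can be made concrete with a small sketch. It assumes a warp size of 32, and the names kWarpSize, warpsPerBlock and roundUpToWarp are hypothetical. For 80 threads it gives 3 warps, the last of which is only half full, and 96 as the next multiple of 32.

```cuda
#include <cuda_runtime.h>

// Assumed warp size of 32 (hypothetical constant name).
constexpr int kWarpSize = 32;

// Number of warps needed to hold `threads` threads of one block.
constexpr int warpsPerBlock(int threads)
{
    return (threads + kWarpSize - 1) / kWarpSize;
}

// Smallest multiple of the warp size that is >= threads.
constexpr int roundUpToWarp(int threads)
{
    return warpsPerBlock(threads) * kWarpSize;
}
```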

In my experience, an unspecified launch failure is the equivalent of a segfault: basically, you are indexing outside one of your arrays. The kernel dimensions are not the problem; those would give an ‘invalid configuration argument’ error. However, insmvb00 is right that kernels should be launched with a thread count that is a multiple of the warp size (32), with the extra threads simply doing nothing. E.g., in the one-dimensional case: const int id = blockDim.x * blockIdx.x + threadIdx.x; if (id < 80) { do stuff }.
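
A fuller version of that guard might look like the sketch below. The kernel scaleKernel, the helpers blocksFor and launchScale, and the choice of 96 threads per block are all assumptions made for illustration; the original kernel was not posted.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies each of the n elements of data by factor.
__global__ void scaleKernel(float *data, int n, float factor)
{
    const int id = blockDim.x * blockIdx.x + threadIdx.x;
    if (id < n) {              // padding threads skip the access entirely
        data[id] *= factor;
    }
}

// Number of blocks needed to cover n elements (rounds up).
inline int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Launches scaleKernel with a block size that is a multiple of 32.
cudaError_t launchScale(float *d_data, int n, float factor)
{
    if (n <= 0) return cudaSuccess;      // nothing to do; avoids a zero-sized grid
    const int threadsPerBlock = 96;      // 80 rounded up to a multiple of 32
    scaleKernel<<<blocksFor(n, threadsPerBlock), threadsPerBlock>>>(d_data, n, factor);
    return cudaGetLastError();           // reports launch-time errors only
}
```

The grid is rounded up so that every element is covered, and the id < n test keeps the surplus threads of the last block from touching memory past the end of the array.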

To solve your problem, you could try checking that you are using valid indices for your memory lookups, and simply skip the lookup in case you are out of bounds. That way you should be able to track down the culprit :)
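
When tracking this down, it can also help to check for errors directly after each kernel launch. Kernel launches are asynchronous, so without a synchronization point an out-of-bounds access may only be reported by some later CUDA call, which could be why the failure seems to move around. A minimal sketch, assuming the CUDA runtime API; cudaOk and checkKernel are hypothetical helper names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints a labelled message if err signals a failure
// and returns whether the call succeeded.
inline bool cudaOk(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Hypothetical helper: call right after a kernel launch while debugging.
inline bool checkKernel(const char *name)
{
    // Launch-time errors (such as an invalid configuration) show up here.
    if (!cudaOk(cudaGetLastError(), name)) return false;
    // Errors raised while the kernel runs show up once it has finished.
    return cudaOk(cudaDeviceSynchronize(), name);
}
```

Calling checkKernel("myKernel") after each launch of a kernel (myKernel is a placeholder name) should make the error surface at the launch that caused it. The synchronization slows the program down, so this is meant for debugging runs only.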
