cuda-memcheck errors yesterday no, today yes
Hi guys, I have a strange case to propose to you... yesterday my program, run under cuda-memcheck, gave me no errors, but today the same program, merely recompiled without any modifications and again run under cuda-memcheck, raises an unspecified launch failure... if I run the program without cuda-memcheck it reports no problem and works correctly... why is this happening???
Thanks all, A.

#1
Posted 04/20/2012 07:33 AM   
Was there any driver update done? In some cases the computer needs to be restarted after updates.

#2
Posted 04/20/2012 10:39 AM   
[quote name='pasoleatis' date='20 April 2012 - 12:39 PM' timestamp='1334918349' post='1398657']
Was there any driver update done? In some cases the computer needs to be restarted after updates.
[/quote]
no, I didn't do anything between the two runs... and no one else has told me they made any changes...

#3
Posted 04/26/2012 07:15 AM   
Can you post the application where you see this? Also, what version of the toolkit and driver are you using? Finally, what GPU is this running on?

#4
Posted 04/26/2012 04:46 PM   
[quote name='vyas' date='26 April 2012 - 06:46 PM' timestamp='1335458774' post='1401264']
Can you post the application where you see this? Also, what version of the toolkit and driver are you using? Finally, what GPU is this running on?
[/quote]
the GPU is a 550Ti and the toolkit version was 4.1, but I have now upgraded to 4.2... indeed, there was a bad memory access, and now that I've fixed it the error is no longer displayed... I'm sorry, but I don't remember precisely what I changed to fix the error, so the most I can do is post the corrected code, but I don't think that will help...
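For reference, the kind of bad memory access that cuda-memcheck reports as an unspecified launch failure (or as invalid reads/writes) is very often a missing bounds check on the computed thread index. A minimal sketch of the pattern and its fix — the kernel name `scale` and the sizes here are illustrative, not from the original poster's code:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Buggy version: when n is not a multiple of blockDim.x, the surplus
// threads in the last block read and write past the end of the
// allocation. The program may still appear to "work correctly" without
// cuda-memcheck if those out-of-bounds addresses happen to be mapped.
__global__ void scale_buggy(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    data[i] *= 2.0f;               // no guard: i may be >= n
}

// Fixed version: guard every global-memory access with the element count.
__global__ void scale_fixed(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= 2.0f;
}

int main() {
    const int n = 1000;            // deliberately not a multiple of 256
    float *d_data;
    cudaMalloc(&d_data, n * sizeof(float));
    int blocks = (n + 255) / 256;  // 4 blocks x 256 threads = 1024 > n
    scale_fixed<<<blocks, 256>>>(d_data, n);
    printf("kernel status: %s\n", cudaGetErrorString(cudaDeviceSynchronize()));
    cudaFree(d_data);
    return 0;
}
```

Run under `cuda-memcheck ./a.out`: swapping `scale_fixed` for `scale_buggy` should produce "Invalid __global__ write" reports, while the guarded version runs clean.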

#5
Posted 04/27/2012 07:08 AM   
This kind of thing happens to me sometimes too. I think some change, not in the kernel, forces a recompilation of the kernel, and perhaps the optimizer decides to lay things out differently, which forces some hidden bugs to appear. This isn't bad, as it forces the code to be bug-free, but it is quite frustrating to have running code and then suddenly have things break.

I was just in such a situation: I was doing some memory accesses outside the allocated region, yet it went unnoticed for all my inputs. Today, at one point, when I recompiled, things stopped working for some (larger) inputs. I can't really explain what the problem was, but fixing that access issue made things roll again.

I have also noticed register usage change, by as many as 3-4 registers! I do not know how the CUDA compiler works, but if it has some randomized optimizer underneath, that would explain the recompile -> different generated code phenomenon.
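The "unnoticed for small inputs" failure mode described above is typical: the launch configuration happens to cover small sizes exactly, and only larger inputs push thread indices past the buffer. A grid-stride loop is one defensive sketch that stays correct for any input size, because the loop condition itself is the bounds check (the kernel name and parameters here are illustrative):

```cuda
// Grid-stride loop: correct for any n, regardless of how many blocks
// are launched, since no thread ever touches an index >= n.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    for (int i = blockIdx.x * blockDim.x + threadIdx.x;
         i < n;
         i += blockDim.x * gridDim.x) {
        y[i] = a * x[i] + y[i];
    }
}
```

As for the register-count differences: compiling with `nvcc --ptxas-options=-v` prints per-kernel register and shared-memory usage, which makes it easy to spot when a recompile has changed the generated code.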

#6
Posted 04/27/2012 05:15 PM   
[quote name='Gorune' date='27 April 2012 - 07:15 PM' timestamp='1335546931' post='1401646']
This kind of thing happens to me sometimes too. I think some change, not in the kernel, forces a recompilation of the kernel, and perhaps the optimizer decides to lay things out differently, which forces some hidden bugs to appear. This isn't bad, as it forces the code to be bug-free, but it is quite frustrating to have running code and then suddenly have things break.

I was just in such a situation: I was doing some memory accesses outside the allocated region, yet it went unnoticed for all my inputs. Today, at one point, when I recompiled, things stopped working for some (larger) inputs. I can't really explain what the problem was, but fixing that access issue made things roll again.

I have also noticed register usage change, by as many as 3-4 registers! I do not know how the CUDA compiler works, but if it has some randomized optimizer underneath, that would explain the recompile -> different generated code phenomenon.
[/quote]
oh, well, at least I'm not the only person these errors happen to XD now that I think about it, I see different register usage too... I hope the NVIDIA staff know about this problem and will fix it, if it isn't already fixed in the 4.2 release... it's not so bad, but as you say it is quite frustrating...

#7
Posted 05/02/2012 10:09 AM   