NVIDIA Driver Crashes
Nvidia driver crashes when I execute CUDA program
Hi,

I'm a new CUDA developer and I'm a little bit lost. My Nvidia driver crashes when I try to execute my CUDA program. I have two graphics cards (GeForce GTX 560 Ti and GeForce 7300 GT). Only the GTX card is able to run CUDA; the other card is only there to drive the Windows display. Both cards use the same driver. The problem is probably just a bad memory access in my code, but I can't find it because the driver crashes and I can't print any CUDA error. Windows stops the execution and doesn't say anything about the error.
Can you help me? Is it possible to install two different drivers? Can I solve this using Nsight?

My Computer:
Intel Core i7 3.4 GHz
Windows 7 64-bit
Nvidia GeForce GTX 560 Ti
Nvidia GeForce 7300 GT
Visual Studio 2010
CUDA 4.1
--------------

Thanks!! :D

#1
Posted 04/23/2012 08:47 AM   
If your problem is due to a bad memory access on the device, it is easily found by running your program under cuda-memcheck.

What does "your driver crashes" mean? I'm not convinced yet there is a problem with the driver. It could be that your host code is trying to dereference a device pointer. In this case, you should be able to find the problem with the debugger you normally use for host code.

Always check return codes of CUDA calls for errors. Do not use __syncthreads() in conditional code unless the condition is guaranteed to evaluate identically for all threads of each block. Run your program under cuda-memcheck to detect stray memory accesses. If your kernel dies for larger problem sizes, it might exceed the runtime limit and trigger the watchdog timer.
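The advice about checking return codes can be sketched roughly as follows. This is only an illustration: the helper `cudaOk`, the macro `CUDA_OK`, the kernel `scale`, and the wrapper `runScale` are made-up names, not anything from the original poster's program. The point is that a kernel launch returns nothing itself, so the launch is followed by `cudaGetLastError()` and `cudaDeviceSynchronize()`, which is where a bad access or a watchdog kill would normally be reported.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: reports a failed CUDA call and returns false.
inline bool cudaOk(cudaError_t err, const char* what, const char* file, int line)
{
    if (err == cudaSuccess) return true;
    std::fprintf(stderr, "%s:%d: %s failed: %s\n",
                 file, line, what, cudaGetErrorString(err));
    return false;
}
#define CUDA_OK(call) cudaOk((call), #call, __FILE__, __LINE__)

// Placeholder kernel standing in for the real one.
__global__ void scale(float* data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;   // bounds check avoids stray writes
}

// Launches the kernel and checks each step; returns false on the first failure.
bool runScale(float* d_data, int n, float factor)
{
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(d_data, n, factor);

    // Errors in the launch itself (e.g. invalid configuration) show up here.
    if (!CUDA_OK(cudaGetLastError())) return false;

    // Errors during execution are reported by the next synchronizing call.
    return CUDA_OK(cudaDeviceSynchronize());
}
```

Wrapping `cudaMalloc`, `cudaMemcpy`, and similar calls in the same macro gives a file and line number for the first call that fails, assuming the process survives long enough to print it.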

#2
Posted 04/23/2012 09:38 AM   
What driver version are you running? I am not very knowledgeable about Windows drivers, but it seems like the following is what you would want:

http://www.nvidia.com/object/win7-winvista-64bit-296.10-whql-driver.html

What are the symptoms of the crash? Does the screen freeze indefinitely? Is there screen corruption? Long-running CUDA kernels can freeze the screen for several seconds, as graphics cannot be updated while CUDA is running. If a kernel is killed by the watchdog timer because it runs too long, there is sometimes an additional delay of several seconds before the driver recovers (at least I have seen this on some Windows systems in the past). Do all CUDA programs crash, or are you able to run simple SDK sample apps successfully?
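One way to see whether the watchdog applies to a given card is to ask the runtime. The sketch below is an assumption about how one might do that, with `listCudaDevices` as a made-up helper name. It reads the `kernelExecTimeoutEnabled` field of `cudaDeviceProp`, which is what toolkits of the CUDA 4.x era provide; newer toolkits expose the same flag through `cudaDeviceGetAttribute` with `cudaDevAttrKernelExecTimeout`, and the struct field may not be available there.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints every CUDA device and whether kernels on it
// are subject to a run time limit (the watchdog).
// Returns the number of devices, or -1 if the device count query fails.
int listCudaDevices()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return -1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "device %d: query failed: %s\n",
                         dev, cudaGetErrorString(err));
            continue;
        }
        std::printf("device %d: %s, compute %d.%d, run time limit on kernels: %s\n",
                    dev, prop.name, prop.major, prop.minor,
                    prop.kernelExecTimeoutEnabled ? "yes" : "no");
    }
    return count;
}
```

The device index printed here is the one to pass to `cudaSetDevice()` when a program should be pinned to a particular card.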

#3
Posted 04/23/2012 09:48 AM   
Hi!

Yesterday I installed Nsight to try to find the problem. Obviously I'm a rookie, but I think I installed it successfully. One of the steps in installing Nsight is to disable Windows TDR (Timeout Detection and Recovery). The description of TDR matches my problem exactly: if the operating system does not receive a response from a graphics card within a certain amount of time (default is 2 seconds), the operating system resets the graphics card. If TDR is enabled and you see the TDR error message "Display driver stopped responding and has recovered," this means that the Windows operating system reset the display driver.

With TDR disabled, my program runs perfectly. :D

My driver is now the version that Nsight requires (following the instructions on Nsight's download page): 286.16 for 64-bit. Would you recommend the other driver (296.10)?

Thanks for everything! :D
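For anyone who would rather leave TDR enabled, a common alternative is to split the work into several shorter launches so that no single kernel runs past the time limit. The following is only a sketch under that assumption: `processChunk`, `chunkCount`, and `processInChunks` are made-up names, the per-element work is a placeholder, and a suitable chunk size has to be found by experiment on the actual hardware.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel: processes elements [offset, offset + count).
__global__ void processChunk(float* data, int offset, int count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) data[offset + i] = data[offset + i] * 2.0f + 1.0f;
}

// Number of launches needed to cover n elements in chunks of chunkSize.
inline int chunkCount(int n, int chunkSize)
{
    return (n + chunkSize - 1) / chunkSize;
}

// Runs the kernel over n elements, one chunk per launch.
// Returns false on invalid arguments or on the first CUDA error.
bool processInChunks(float* d_data, int n, int chunkSize)
{
    if (n < 0 || chunkSize <= 0) return false;

    const int threads = 256;
    for (int offset = 0; offset < n; offset += chunkSize) {
        // The last chunk may be shorter than chunkSize.
        int count = (n - offset < chunkSize) ? (n - offset) : chunkSize;
        int blocks = (count + threads - 1) / threads;

        processChunk<<<blocks, threads>>>(d_data, offset, count);

        // Synchronizing after each launch ends the kernel before the next one
        // starts and surfaces launch or execution errors for this chunk.
        cudaError_t err = cudaDeviceSynchronize();
        if (err != cudaSuccess) {
            std::fprintf(stderr, "chunk at offset %d failed: %s\n",
                         offset, cudaGetErrorString(err));
            return false;
        }
    }
    return true;
}
```

The chunk size also has to respect the grid size limit of the card, since each launch here uses a one-dimensional grid of `blocks` blocks.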

#4
Posted 04/24/2012 08:16 AM   