Help required with "invalid device ordinal"

Hello,

I have installed CUDA 4.0 on my system and tried to run the deviceQuery program from the CUDA SDK. I get the following error:

[deviceQuery] starting…
./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 10
→ invalid device ordinal
[deviceQuery] test results…
FAILED

Press ENTER to exit…
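For reference, the failing call boils down to something like this stripped-down sketch (my own reconstruction, assuming the CUDA 4.0 runtime headers are on the include path and it is built with nvcc):

#include <cstdio>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    // This is the first call deviceQuery makes; error code 10 is cudaErrorInvalidDevice.
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount returned %d -> %s\n", (int)err, cudaGetErrorString(err));
        return 1;
    }
    printf("Found %d CUDA-capable device(s)\n", count);
    return 0;
}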

Can someone help me with this?

Try running the executable with superuser privileges. Not sudo $CUDA_SDK_PATH/(…)/executable, but entering sudo -i first and then running $CUDA_SDK_PATH/(…)/executable.

Is the error still there?

Don’t forget to export the proper LD_LIBRARY_PATH for the root user.

Hello,

Thanks for the reply. I still get the error when I run it with superuser privileges. Another issue is that my X server does not start at all, so I tried downgrading my Ubuntu to 10.04 and installed CUDA 3.2. I'm still not able to run deviceQuery.

I get this output …

./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount FAILED CUDA Driver and Runtime version may be mismatched.

FAILED

Press <Enter> to Quit…


Can you please help me with this?

I think you should download the new version of the SDK (4.0) here and replace your old SDK 3.x: http://developer.nvidia.com/cuda-toolkit-40
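Before replacing anything, it may also help to see which versions actually disagree. A minimal sketch along these lines (assuming the runtime headers are still installed; build it with nvcc) prints both numbers:

#include <cstdio>
#include <cuda_runtime.h>

int main(void)
{
    int driverVersion = 0, runtimeVersion = 0;
    // Version supported by the installed kernel driver (stays 0 if no driver is found).
    cudaDriverGetVersion(&driverVersion);
    // Version of the CUDA runtime this program was built against.
    cudaRuntimeGetVersion(&runtimeVersion);
    printf("CUDA driver version:  %d\n", driverVersion);
    printf("CUDA runtime version: %d\n", runtimeVersion);
    return 0;
}

If the runtime number is higher than the driver number (e.g. 4000 against a 3.2-era driver), the "Driver and Runtime version may be mismatched" message is expected.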

I ran into the same problem lately (installed the latest CUDA on a Fedora 13 64-bit installation), and after a lot of research on various forums I can only surmise that it is caused by the native Nouveau drivers on Fedora being loaded at boot (instead of the NVIDIA drivers, even though they installed properly). I got my CUDA installation to work by blacklisting the Nouveau drivers before installing the NVIDIA dev drivers.

The first post in this thread HERE gives detailed step-by-step instructions for blacklisting Nouveau and manually installing the NVIDIA drivers; in my case the driver I downloaded at step 1 was the CUDA dev driver from the CUDA release page. Once the NVIDIA drivers are loaded, install the CUDA toolkit and GPU Computing SDK, compile the examples, and you should be good to go.

Hope this helps anyone running into a similar problem.

We ran into this problem just now. Our problem turned out to be that we had installed our Tesla using one 6-pin power cable, instead of two 6-pin cables or one 8-pin cable. Once we used the correct power connector, it worked fine.

Just guessing, but perhaps this error code means that the driver found the GPU in the system, but cannot properly communicate with it?
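One way to test that guess is to ask the low-level driver API directly, bypassing the runtime library. A rough sketch (just an illustration, not the SDK tool; it assumes cuda.h from the toolkit and linking with -lcuda):

#include <cstdio>
#include <cuda.h>

int main(void)
{
    // cuInit talks straight to the kernel driver, without the runtime layer.
    CUresult res = cuInit(0);
    if (res != CUDA_SUCCESS) {
        printf("cuInit failed with error %d\n", (int)res);
        return 1;
    }
    int count = 0;
    cuDeviceGetCount(&count);
    printf("Driver API sees %d device(s)\n", count);
    for (int i = 0; i < count; ++i) {
        CUdevice dev;
        char name[256];
        cuDeviceGet(&dev, i);
        cuDeviceGetName(name, (int)sizeof(name), dev);
        printf("  device %d: %s\n", i, name);
    }
    return 0;
}

If even cuInit fails, the problem sits below CUDA (driver, device nodes, permissions, or hardware); if the driver API lists the card while the runtime does not, the driver/runtime combination is the suspect.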

Hello.

We recently ran into a similar problem.

The box in question is a Dell dual-socket LGA1366 machine with 98 GB RAM, a Quadro 600 graphics card, and a Tesla C2075. The operating system is SUSE Linux 12.1, and the NVIDIA driver is 295.20 (64-bit). The box had been running pretty well, but after a couple of small OS updates and several minor installations of X11 software (the box is "headless" and does not run an X11 display system!), we rebooted the workstation and no longer have access to the Tesla.

Well, here is what I did to find the problem. The open-source Nouveau driver/module isn't active, and the nvidia kernel module is present. lspci lists these lines regarding NVIDIA hardware:

03:00.0 VGA compatible controller: NVIDIA Corporation GF108 [Quadro 600] (rev a1)

03:00.1 Audio device: NVIDIA Corporation GF108 High Definition Audio Controller (rev a1)

04:00.0 VGA compatible controller: NVIDIA Corporation Tesla C2075 (rev a1)

This means the workstation's BIOS and the SUSE Linux 12.1 kernel obviously "see" something like a Tesla unit.

As I reported, the driver for the Tesla C2075 and Quadro 600 is the most recent I could grab:

nVidia 295.20

Installation went smoothly. The kernel module is also available, as modprobe -l | grep nvidia reports:

kernel/drivers/video/nvidia/nvidiafb.ko

kernel/drivers/net/ethernet/nvidia/forcedeth.ko

kernel/drivers/video/nvidia.ko

Checking for CUDA-capable devices is done with NVIDIA's deviceQuery tool:

root: /usr/local/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQuery

/usr/local/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 10

→ invalid device ordinal

[deviceQuery] test results…

FAILED

exiting in 3 seconds: 3…2…1…done!

root:

There is only one(!) device node created in /dev: /dev/nvidia0. There should be two of them!
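Once the runtime is reachable again, a small enumeration sketch like the one below (assuming a CUDA 4.x toolkit, whose cudaDeviceProp carries the PCI IDs) could map the CUDA ordinals back to the lspci slots 03:00.0 and 04:00.0 listed above:

#include <cstdio>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // pciBusID/pciDeviceID should line up with the lspci output, e.g. 04:00.0 for the Tesla.
        printf("device %d: %s (PCI %02x:%02x.0)\n", i, prop.name, prop.pciBusID, prop.pciDeviceID);
    }
    return 0;
}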

What causes this failure?