Problems with CUDA 9.1 in Ubuntu 16.04

Dear all

I am trying to install CUDA 9.1 on Ubuntu 16.04. The 387 driver does not install, so I installed the latest driver available for Ubuntu, 384.111 (perhaps this is the error). Should I try to install 387 in text mode?

I used a workspace from another computer with a different compute capability (cc). When I configure the workspace on the new computer, it imposes cc 3.0 (which the computer does not have) in addition to the native one (cc 6.1). Even when I try to remove cc 3.0, I do not succeed.

In Tools Settings, NVCC Linker, All Options

--cudart static -L/usr/local/cuda-9.1/lib64 -L/usr/lib/gcc/x86_64-linux-gnu/5/ -L/usr/lib --relocatable-device-code=true -gencode arch=compute_30,code=compute_30 -gencode arch=compute_61,code=compute_61 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_61,code=sm_61 -m64


When I run the program compiled for both cc values, it gives the error “cudaSetDevice failed! Do you have a CUDA-capable GPU installed?”
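For reference, the -gencode part of the option string above follows a regular pattern: one sm_XX (machine code) entry and one compute_XX (PTX) entry per compute capability. The sketch below only rebuilds that pattern for a chosen list of capabilities; the function name gencode_flags is made up for illustration and is not part of nvcc or Nsight.

```shell
#!/bin/sh
# gencode_flags CC...
# Prints one SASS (sm_XX) and one PTX (compute_XX) -gencode pair for each
# compute capability, given without the dot (61 means cc 6.1).
# Hypothetical helper, written only to illustrate the flag pattern.
gencode_flags() {
    flags=""
    for cc in "$@"; do
        flags="$flags -gencode arch=compute_$cc,code=sm_$cc"
        flags="$flags -gencode arch=compute_$cc,code=compute_$cc"
    done
    # Drop the leading space left by the first append.
    echo "${flags# }"
}

# Flags for a build that targets only cc 6.1.
gencode_flags 61
```

Calling gencode_flags 30 61 would reproduce all four -gencode entries from the original option string, in a different order.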

When I try to create a new project, it gives the error shown in the figure:

[url]http://luisgo.dyndns.org/nv/cuda.png[/url]

It continues creating the project, but with the same consequences as before.

Can someone give some help?

Thanks

Luis Gonçalves

A 384.xx driver won’t work with CUDA 9.1.

You must install a 387.xx driver.
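A quick way to apply that rule is to compare the major part of the driver version against 387. The following is a minimal sketch; the threshold comes from the statement above, and the function name driver_supports_cuda91 is made up for illustration.

```shell
#!/bin/sh
# Minimum driver branch for CUDA 9.1, as stated in this reply (387.xx).
MIN_DRIVER_MAJOR=387

# driver_supports_cuda91 VERSION
# Succeeds when the major part of VERSION (for example "387.34") is at
# least MIN_DRIVER_MAJOR. Hypothetical helper, not an NVIDIA tool.
driver_supports_cuda91() {
    major=${1%%.*}
    [ "$major" -ge "$MIN_DRIVER_MAJOR" ]
}

# Check two driver versions mentioned in this thread.
for v in 384.111 387.34; do
    if driver_supports_cuda91 "$v"; then
        echo "$v: OK for CUDA 9.1"
    else
        echo "$v: too old for CUDA 9.1"
    fi
done
```

On a machine where a driver is already loaded, the installed version can be read with nvidia-smi --query-gpu=driver_version --format=csv,noheader or from /proc/driver/nvidia/version and passed to the function.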

To do a proper CUDA install, get your installers from here: [url]http://www.nvidia.com/getcuda[/url] (or for older versions, google “cuda toolkit archive”)

and follow the instructions in the appropriate install guide.

Hello…

I have spent the past four days virtually full time trying to get my new NVIDIA GEFORCE GTX 1080 ti to work with CUDA 9.1.85 on Ubuntu 16.04. I can get the Ubuntu GUI to work with 384.111 but CUDA does not work, as has been pointed out.

If I switch to text mode and go through the entire official installation process given on the nvidia site, I can get 387.26 installed but then it is impossible to get back into the GUI mode: the login screen goes into a loop wherein it keeps asking for a password. I have repeated this several times using the Deb Package method. The runfile method tells me that “the driver was not installed” so that is equally useless.

Any suggestions would be appreciated. I need to be able to run in GPU mode for a machine learning course I am taking, and I’ve drained Google of all useful information.

On your system, what is the output of:

lspci |grep -i VGA

?

For your GUI display, is the monitor hooked up to the GTX 1080 ti or is it hooked to some other port?

Thanks for the quick reply, txbob…

I just managed to get it working by getting into text mode, doing <sudo apt-get purge nvidia-*> and <sudo apt-get install nvidia-387>. Rebooted and … voila!

The previous time(s) I did this the driver that was installed was 387.26 — the working one that installed just now is 387.34. I don’t know how significant that is but it works now. And the deviceQuery output looks OK.

A moot point now but lspci | grep -i VGA output is now:

0.1:00 VGA compatible controller: NVIDIA Corp. GP102 [GeForce GTX 1080 Ti] (rev a1)

I have two displays connected to the 1080 Ti outputs.

Thanks a heap for your quick response!!!

Rick

I’m having the same issue and googled thoroughly but have not found solutions aside from this page.

I installed Nvidia Cuda Toolkit v9.1 following the instructions on the Nvidia site, onto my machine (ubuntu 16.04 with GeForce GTX 1080 Ti - I have no other display ports on my motherboard), then on reboot I get stuck in this loop where it accepts my password and then crashes. Then reloads the password screen.

Rick can you help clarify what you did exactly to get things working?

(1) Boot into text mode (I’m assuming this is a command in Grub?)

(2)
run:
sudo service lightdm stop
sudo apt-get purge nvidia-*
sudo apt-get install nvidia-387

Is that it? Many Thanks :)

Cheers,
Patrick

Rick I made a video: Cuda Toolkit 9.1 login issue Ubuntu 16.04 - YouTube

Is this the same behavior you witnessed?

Hi Patrick,

Yes, the same as shown in your video…

Out of desperation, I made the mistake of trying the 340 driver, thinking that perhaps that would work. Don’t even think of doing that! It made the machine impossible to boot up, providing a plethora of cryptic “software lockup” indications. So I took the 1080 Ti card out and replaced it with my previous card, an NVIDIA Twin FrozrIII. That at least made it possible to log in.

Once logged in, I logged out and did a Ctrl-Alt-F1 to get into terminal mode. Then the system asked me for my username and password and I found myself in command-line mode.

To stop the GUI, which I think keeps running in the background, I ran sudo service lightdm stop

Then sudo apt-get purge nvidia-* presumably got rid of all the nvidia driver supporting files (there’s a ton of them!)

The proper driver was installed with sudo apt-get install nvidia-387
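Put together, the steps above can be sketched as a small script. This is only an outline of what is described in this post, not a tested installer: the run wrapper and the DRY_RUN variable are made up for illustration, and by default the script prints the commands instead of executing them.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints each command.
# Set DRY_RUN=0 to really execute them, from a text console (Ctrl-Alt-F1).
DRY_RUN=${DRY_RUN:-1}

# run CMD ARGS...
# Hypothetical wrapper: prints the command in dry-run mode, otherwise runs it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reinstall_nvidia_387() {
    run sudo service lightdm stop          # stop the display manager
    run sudo apt-get purge 'nvidia-*'      # remove the installed nvidia packages
    run sudo apt-get install nvidia-387    # install the 387 branch from the repo
    run sudo reboot                        # reboot, as described earlier in the thread
}

reinstall_nvidia_387
```

Which 387.xx build apt-get actually installs depends on what the configured repositories offer at the time, which is why the same commands gave 387.26 on some attempts and 387.34 on another.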

For whatever reason, the driver that was being installed in my many previous attempts was 387.26 and that one did not work. Somehow, the one installed in the last attempt was 387.34. Its exact nomenclature in Ubuntu 16.04 LTS System Settings → Software & Updates → Additional Drivers is “Using NVIDIA binary driver - version 387.34 from nvidia-387 (open source)”

Luckily it worked.

Best of luck!

Rick

interesting - when I tried installing with apt-get it still chooses driver 387.26 and not 387.34. This is before I rebooted. After I rebooted, same issue with the unending login screen and I can’t get into terminal mode. Also I don’t have an older GPU that I could use :(.

Going to do a clean install of ubuntu and this time choose Cuda Toolkit 9.0 instead of 9.1.

Something is wrong with 9.1…

** Update ** I actually got into terminal mode (tried one more time). And then I followed Rick’s steps. Unfortunately it landed me with 387.26 again and not the higher version. So I have the same issue. Now will start over and try 9.0

Patrick,

You might try getting 387.34 from here

and installing it before going through all the pain of re-installing Ubuntu.

Rick

Thanks Rick. I already reinstalled Ubuntu (I just built this computer so it wasn’t that painful to reinstall, didn’t really lose anything).

Interestingly, I searched on nvidia’s driver page for 1080ti drivers, for linux 64 operating systems, and 387.12, 387.22, and 387.34 are the only recommended/certified v387 drivers.

In other words, the v387.26 which is distributed with Cuda Toolkit 9.1 is not listed as recommended/certified… Interesting.

Will try 9.1 again with the manually downloaded 387.34 driver. Thanks Rick for the tip.

-Patrick

Ok so I downloaded 387.34 onto my fresh Ubuntu install. Then I logged out and went into terminal mode with Ctrl-Alt-F1.

Then I killed my x server with sudo service lightdm stop.

I made the run file executable and then ran it. It made me accept some terms, and after I accepted them it showed this message: “The distribution-provided pre-install script failed! Are you sure you want to continue?”

I decided to abort after reading a comment left by a NVIDIA forum moderator who said not to ignore this type of message. (The distribution-provided pre-install script failed! Are you sure you want to continue? - Linux - NVIDIA Developer Forums)

So now I think I will try with Nvidia Cuda Toolkit 9.0 and see if I have any issues.

If I am still having problems I think I will go down to 8.0.

I kept on running into problems trying to use the .run file. I ended up using the local install .deb version.

Gotcha - when I went to download the 387.34 driver from the Nvidia website it would only give me the .run file (Geforce Driver Results | NVIDIA)

When I use the link you provided (Titan-V Driver for Ubuntu 16.04 | 387.34 | Linux 64-bit Ubuntu 16.04 | NVIDIA) it gives me a .deb file, but I was hesitant to use it because on that site, even though it says 387.34, it says it’s for the Titan, not GeForce. Do you think it is the same thing?

I don’t know. That’s something that NVIDIA tech support should answer.

Having said that, if you’re already in the habit of re-installing Ubuntu and have tried all kinds of things, it’s probably worth trying it. It’s hard to fathom that they would have two different drivers with the same version number.

Drivers can be packaged in a variety of ways, one of the basic distinctions being the type of install method, whether via a package manager method (varies by linux distro) or a standalone “runfile” installer, which can be used on more or less any distro.

Many of the (linux) drivers (but not all) offered via our download site (Official Drivers | NVIDIA) use the runfile installer method.

But for various reasons we do sometimes post a driver packaged for a particular packaging system (e.g. .deb), and of course if you pull a driver from a repo it will be of the package type appropriate for that repo. On the driver download site, look at the operating system drop-down box. If you choose “Linux 64-bit”, you’re going to get the runfile installer (since no distro is specified). If a choice such as “Linux 64-bit Ubuntu 17.04” is available, that indicates the driver is offered as a .deb package suitable for installation using the package manager system (e.g. apt-get) on Ubuntu 17.04.

But a linux 384.111 driver (just to pick an example), once installed, should behave the same way, regardless of how you installed it.

Thanks txbob - we’ve been having issues with the .run files but the .deb files seem to work.

My question now is - can I use this driver (Titan-V Driver for Ubuntu 16.04 | 387.34 | Linux 64-bit Ubuntu 16.04 | NVIDIA) for the 1080 Ti? It’s the .deb for 387.34 but it says it’s for Titan, not GeForce.

When I go to the GeForce page for the 387.34 driver, I can only get the .run file because I can’t specify Ubuntu, I can only specify Linux 64-bit.

Another question - I was looking at the Cuda installation guide: Installation Guide Linux :: CUDA Toolkit Documentation

It says the supported Linux kernel is 4.4

If I download Ubuntu 16.04 it comes with a higher linux kernel (4.10)… Does that mean 9.1 is going to be incompatible?

16.04 originally shipped with 4.4 but that was a while ago, now when you download Ubuntu 16.04 it comes with 4.10. If you run a full upgrade then it bumps up to 4.13…

Officially CUDA 9.1 was tested with the 4.4 kernel as specified in the install guide. Anything beyond that is at your own risk.

Unofficially, I know of no issues up through linux kernel 4.10 (if you scan that column in the installation guide, you’ll see that on some other distros, a kernel as high as 4.9 was tested, albeit not on the particular 16.04 distro). There are potentially issues with linux kernel 4.13 - you can find reports of those here on this forum.

For most kernel bumps, the driver should still install the same way and CUDA should still work. However there were significant changes in the 4.13 kernel and we’re still working through those.
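To see which side of that 4.13 line a system is on, the kernel release string can be compared numerically. The sketch below assumes the usual MAJOR.MINOR.PATCH-... layout printed by uname -r; the function name kernel_at_least and the sample release strings are made up for illustration.

```shell
#!/bin/sh
# kernel_at_least RELEASE MAJOR MINOR
# Succeeds when RELEASE (for example "4.10.0-42-generic") is at least
# MAJOR.MINOR. Hypothetical helper; assumes RELEASE starts with "N.N".
kernel_at_least() {
    rel_major=${1%%.*}              # text before the first dot
    rest=${1#*.}                    # text after the first dot
    rel_minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$rel_major" -gt "$2" ] ||
        { [ "$rel_major" -eq "$2" ] && [ "$rel_minor" -ge "$3" ]; }
}

# Illustrative release strings, not taken from any particular machine.
for rel in 4.4.0-104-generic 4.10.0-42-generic 4.13.0-26-generic; do
    if kernel_at_least "$rel" 4 13; then
        echo "$rel: 4.13 or newer, check the forum for reported issues"
    else
        echo "$rel: older than 4.13"
    fi
done
```

On a live system the first argument would be "$(uname -r)".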

For the previous question, you should be able to use the 387.34 Titan .deb on GeForce as well.

One thing to be aware of is that if you’ve ever previously installed a driver using the runfile method, you can’t later safely/reliably use a package manager install, and vice versa. This is covered in the linux CUDA install guide (although it is really a driver issue) in the section entitled “handling conflicting install methods”. So if you say:

“we’ve been having issues with the .run files but the .deb files seem to work.”

one possible explanation for that might be if you previously used a .deb installer on that system (without doing a proper cleanup sometime later).
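One rough way to check for a leftover runfile install is to look for the uninstaller that the runfile driver installer places at /usr/bin/nvidia-uninstall. The sketch below does only that; the function name driver_install_method and its optional ROOT argument are made up for illustration, and a missing uninstaller does not prove that the system is clean.

```shell
#!/bin/sh
# driver_install_method [ROOT]
# Prints "runfile" when ROOT/usr/bin/nvidia-uninstall exists and is
# executable, otherwise "package-or-none". ROOT defaults to the real
# filesystem root and exists only so the check can be tried on a scratch
# directory. Hypothetical helper, not an NVIDIA tool.
driver_install_method() {
    root=${1:-}
    if [ -x "$root/usr/bin/nvidia-uninstall" ]; then
        echo runfile
    else
        echo package-or-none
    fi
}

driver_install_method
```

If it prints runfile, the install guide's cleanup for that case is to run sudo /usr/bin/nvidia-uninstall before switching to a .deb install. Packages installed through apt can be listed with dpkg -l | grep -i nvidia.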

Beyond that, there are other differences, such as the need to remove/blacklist nouveau when using the runfile installer. The linux CUDA install guide has lots of useful information in it.
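For the nouveau part, the install guide's approach on Ubuntu is a two-line modprobe configuration file. The sketch below writes those two lines to a file; the function name write_nouveau_blacklist is made up for illustration, and the demonstration writes to a temporary file so that nothing on the system changes.

```shell
#!/bin/sh
# write_nouveau_blacklist [FILE]
# Writes the two modprobe directives that keep the nouveau driver from
# loading. FILE defaults to the usual location, which needs root to write.
# Hypothetical helper, shown only to make the file contents concrete.
write_nouveau_blacklist() {
    target=${1:-/etc/modprobe.d/blacklist-nouveau.conf}
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}

# Demonstration against a temporary file.
tmp=$(mktemp)
write_nouveau_blacklist "$tmp"
cat "$tmp"
rm -f "$tmp"
```

After the real file is in place, the guide has you regenerate the initramfs with sudo update-initramfs -u and reboot before running the runfile installer.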

awesome thank you txbob! I am going through a fresh ubuntu install, and keeping with kernel 4.10.

Will update here if I have any issues.