[Solved] CUDA 8.0 on Ubuntu 16.04 - GPU not available

I’m trying to get CUDA 8.0 running on a new Ubuntu 16.04 box to be used with TensorFlow and Theano. For the most part I’ve been able to navigate through the install by sticking to NVIDIA’s recommended setup - installing their device drivers, removing those from the Ubuntu package, etc. I now have CUDA and cuDNN installed, but both Theano and TensorFlow complain about no GPU being available.

When I run nvidia-smi, though, the device clearly exists. Any thoughts on this?
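For anyone hitting the same symptom, here is a minimal sketch of a first check, assuming a Linux box with Python 3. It only reports whether the nvidia-smi binary and a few shared libraries can be located from the current process. The library names in the default argument (cuda, cudart, cudnn) are my assumption about what the frameworks look for, and the function name is made up for this example. It covers the library-lookup side only, so a driver version mismatch would not show up here.

```python
import ctypes.util
import shutil


def cuda_visibility_report(library_names=("cuda", "cudart", "cudnn")):
    """Report which GPU-related pieces this process can locate."""
    report = {"nvidia-smi": shutil.which("nvidia-smi") is not None}
    for name in library_names:
        # find_library returns None when it cannot locate lib<name>
        report["lib" + name] = ctypes.util.find_library(name) is not None
    return report


if __name__ == "__main__":
    for item, found in cuda_visibility_report().items():
        print(f"{item}: {'found' if found else 'MISSING'}")
```

If nvidia-smi is found but the libraries are missing, the library search path is a likely place to look next.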

I’m happy to post details on my configuration. This is with a new GTX 1080 Founders Edition, but I might try my old 750 if I continue to have issues. I’ll also post any updates.

As a follow-up, it looks like there was a device driver incompatibility. Unfortunately, the 16.04 Debian package on the NVIDIA website depends on the 361 driver, and I want to use 367 (367.18). I’m going to try the run file instead. Stay tuned.

And once I installed with the run file (and didn’t install the drivers this way) everything is working! I now have CUDA 8.0 running under TensorFlow and Theano.

That’s great!

Are you using CuDNN v5? I’m about to try to reproduce your build.

Also, are you using the run files for both the 367 driver and CUDA 8?

Never mind, resolved.

orionr> And once I installed with the run file (and didn’t install the drivers this way)

So you installed CUDA 8 using the run file, correct?

How did you install the 367 driver?

Any help is appreciated.

Here’s what I’ve done, and I’m pretty sure everything is working. (Recently I’ve been getting artifacts on my screen, but I’m not 100% sure where they’re coming from. My guess is the 367 beta drivers, or maybe something I’m doing with TensorFlow, as they only show up once I start using the GPUs.)

Current Setup:
2x GTX 1080, Ubuntu 16.04, 367.18 drivers, CUDA 8, cuDNN v5, TensorFlow r0.8

Steps:

  1. Download cuda-8.run file
  2. Download Nvidia-367.18-driver.run file
  3. Remove old drivers (sudo apt remove --purge 'nvidia*'; the quotes keep the shell from expanding the glob)
  4. Remove old nvidia.deb packages (just in case that's necessary?)
  5. Remove my cuda folders (/usr/local/cuda*) (again, just in case)
  6. reboot
  7. Graphics probably won't work now (they didn't for me), so use a tty (Ctrl+Alt+F1) or ssh (you might want to set that up before doing all this)
  8. (I might have run the Nvidia-367.18.run file here, followed with a reboot, but I can't remember)
  9. Run the cuda-8.run file. Decline the offer to install the old drivers (362?). (I did this a few times, so I think that's what I did. :)
  10. reboot
  11. run the Nvidia-367.18.run file
  12. reboot
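Before running the .run installers, it may be worth confirming that the cleanup in steps 3 to 5 actually took. A minimal sketch, assuming Python 3 and the default /usr/local install prefix; the function name is made up for this example, and it only looks for leftover directories, not leftover packages.

```python
import glob
import os


def find_cuda_dirs(prefix="/usr/local"):
    """Return CUDA directories (cuda, cuda-8.0, ...) still present under prefix."""
    pattern = os.path.join(prefix, "cuda*")
    # isdir follows symlinks, so a /usr/local/cuda symlink is reported too
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


if __name__ == "__main__":
    leftovers = find_cuda_dirs()
    if leftovers:
        print("Still present:", ", ".join(leftovers))
    else:
        print("No /usr/local/cuda* directories found")
```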

Note: I didn’t use any .debs

At this point, the X session looked fine, and I was able to run normally / install the other software I wanted (cuDNN and TF).

Hopefully that helps!!! Let me know how it goes.

Excellent! Thank you very much for the details.

Will try this and report back (probably have to wait for weekend).

Thanks again for taking the time to spell out the steps.

Nope…

Every time I install the 367.18 driver it crashes X11.

Fastest way to crash:

  1. Fresh install of Ubuntu 16.04
  2. blacklist Nouveau
  3. boot to terminal
  4. install 367.18 (you can run nvidia-xconfig or not, doesn’t matter)
  5. reboot

Try to log in, and it keeps returning to the login prompt.
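One thing that can be ruled out from a tty is whether the Nouveau blacklist took effect. This is a minimal sketch, assuming Python 3 on Linux, that reads /proc/modules, where the first field of each line is the module name. The function names are made up for this example, and a clean result does not explain the login loop by itself.

```python
import os


def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def nouveau_loaded(modules):
    """True when nouveau is loaded, i.e. the blacklist has not taken effect."""
    return "nouveau" in modules


if __name__ == "__main__":
    if os.path.exists("/proc/modules"):
        with open("/proc/modules") as handle:
            modules = loaded_modules(handle.read())
        print("nouveau loaded:", nouveau_loaded(modules))
        print("nvidia loaded:", "nvidia" in modules)
    else:
        print("/proc/modules not available on this system")
```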

BUT, I installed Linux Mint (based on Ubuntu 14.04) and it all seemed to work more smoothly. I’m up and running, installing libraries, fixing paths and links, etc.

Thank you for your information, it was helpful.

(P.S. You got TWO 1080’s?? Dang.)

I have an ASUS K501U with a GeForce GTX 950M. It’s one of those computers with the crazy Intel/NVIDIA graphics hybrids (displays through Intel, computes through NVIDIA or something like that). I am running Ubuntu Mate 16.04. jholloway’s method seemed to work for my case. I can verify this is the process.

Steps:
  1. Download cuda-8.run file
  2. Download Nvidia-367.18-driver.run file
  3. Remove old drivers (sudo apt remove --purge 'nvidia*')
  4. Remove old cuda files (e.g. sudo apt remove --purge 'libcuda*')
  5. Remove old nvidia.deb packages (just in case that's necessary?)
  6. Remove my cuda folders (/usr/local/cuda*) (again, just in case)
  7. reboot
  8. Run the Nvidia-367.18.run file
  9. Run the cuda-8.run file. Decline the offer to install the old drivers (361?)
  10. reboot

After testing with the examples, 0_Simple/clock works fine, but I got an error with 5_Simulations/fluidsGL:

./fluidsGL: error while loading shared libraries: libcufft.so.8.0: cannot open shared object file: No such file or directory

Oh, right. Update your LD_LIBRARY_PATH, guys. :D

Also, if you need to, add the CUDA library path to /etc/ld.so.conf.d/cuda.conf (“/usr/local/cuda/lib”, or “/usr/local/cuda/lib64” on 64-bit installs) and run ‘sudo ldconfig’.

After that, the CUDA part runs fine.
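To confirm the fix for the libcufft error, here is a minimal sketch, assuming Python 3. It checks whether a directory is on LD_LIBRARY_PATH and whether the dynamic linker can load the library named in the error message. The directory /usr/local/cuda/lib64 is an assumed default location and the function names are made up for this example, so adjust both to your install.

```python
import ctypes
import os


def dir_on_library_path(directory, ld_library_path):
    """True if directory is one of the colon-separated LD_LIBRARY_PATH entries."""
    wanted = os.path.normpath(directory)
    entries = [entry for entry in ld_library_path.split(":") if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


def can_load(library_name):
    """Try to load a shared library through the dynamic linker."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    cuda_lib_dir = "/usr/local/cuda/lib64"  # assumed default location
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("on LD_LIBRARY_PATH:", dir_on_library_path(cuda_lib_dir, current))
    print("libcufft.so.8.0 loadable:", can_load("libcufft.so.8.0"))
```

The load can succeed even when the directory is not on LD_LIBRARY_PATH, for example when it was registered through ld.so.conf.d and ldconfig instead.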

(However, because of my funky dual graphics thing, I can’t display the fluidsGL stuff, but that’s a different problem entirely.)

Thanks for all the followups! For the record jholloway’s steps are exactly the same as those I went through in the end. Kudos for the details!

Hi - quick question about TF installation with CUDA 8… Did you need to do a custom build, since there don’t seem to be binaries for CUDA 8?

I attempted a pip wheel custom build - everything compiled smoothly, but got an error once I tried installing: “tensorflow-0.8.0-py3-none-any.whl is not a supported wheel on this platform.” Could this be because I’m still on Ubuntu 14.04?
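A note on reading that error: the last three dash-separated fields of a wheel filename are the python, ABI, and platform tags. In the filename above, the platform tag is "any", so the name itself does not restrict the OS release. One possible cause, which is my assumption and not confirmed in this thread, is that the pip being used belongs to a Python 2 interpreter while the wheel is tagged py3. A minimal sketch of that check, with made-up function names:

```python
import sys


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("too few fields in wheel filename: " + filename)
    # The last three dash-separated fields are the compatibility tags.
    return tuple(parts[-3:])


def python_major_matches(filename, major=None):
    """Rough check: does the python tag name this interpreter's major version?"""
    major = sys.version_info[0] if major is None else major
    prefixes = ("py%d" % major, "cp%d" % major)
    python_tag = wheel_tags(filename)[0]
    # A python tag may be compound, e.g. "py2.py3".
    return any(tag.startswith(prefixes) for tag in python_tag.split("."))


if __name__ == "__main__":
    name = "tensorflow-0.8.0-py3-none-any.whl"
    print(wheel_tags(name), "matches this interpreter:", python_major_matches(name))
```

Run it with the same interpreter that pip belongs to, otherwise the comparison is meaningless.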

@kbl424, did you try release 0.9? And were you able to install TensorFlow?

I have CUDA 8 and TensorFlow 0.9 running on Ubuntu 16.04 with the NVIDIA Driver v367.27.

You’ll have to build TF from source to support CUDA 8, but other than that it should be pretty straightforward.

Note: If you’ve never built from source, it may take a while. I’m running a 12-core CPU at 3.5GHz, and it takes roughly 15 mins, so you can estimate from there. (One nice thing about bazel is that it caches the build along the way, so if you do hit an error, you don’t have to start from scratch. :)

Hello everyone,

By following the above instructions I have been able to successfully install Nvidia CUDA drivers, detect my GT720M card and use it for computation.

The problem now is that after I log in, my desktop environment (Cinnamon) crashes and a basic GUI replaces lovely Cinnamon.

I think the problem is that my NVIDIA card is not being used for GUI services, because when I run nvidia-smi, the memory usage is 0MB until I run a computation, which causes it to increase.
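The same reading can be taken from a script using nvidia-smi's query mode. A minimal sketch, assuming Python 3; the function names are made up for this example. Some cards report a non-numeric placeholder for this field, which is mapped to None here. A reading of 0 would be consistent with the desktop not running on the NVIDIA card, but it does not prove it.

```python
import shutil
import subprocess


def parse_memory_used(csv_text):
    """Parse memory.used values (MiB, one GPU per line); non-numeric becomes None."""
    values = []
    for line in csv_text.splitlines():
        field = line.strip()
        if field:
            values.append(int(field) if field.isdigit() else None)
    return values


def query_memory_used():
    """Ask nvidia-smi for per-GPU memory use; None if it is absent or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        output = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
            universal_newlines=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None
    return parse_memory_used(output)


if __name__ == "__main__":
    print("memory used per GPU (MiB):", query_memory_used())
```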

Is there some way that I can use the built-in Intel GPU for the base OS and my NVIDIA card only for computation?

My Specs:
Linux Mint 18, (based on Ubuntu 16.04)
CUDA 8.0
GPUs- In-built Intel 4400 and Nvidia 720M

I’m also trying to build TensorFlow 0.9 with CUDA 8.0 and NVIDIA driver 361.78 on Ubuntu 16.04 ppc64le, and I’m getting the error below:

external/eigen_archive/eigen-eigen-d02e6a705c30/unsupported/Eigen/CXX11/src/Tensor/TensorBroadcasting.h(271): internal error: assertion failed at: “/dvs/p4/build/sw/rel/gpu_drv/r361/r361_00/drivers/compiler/edg/EDG_4.10/src/folding.c”, line 9819

1 catastrophic error detected in the compilation of “/tmp/tmpxft_000030c2_00000000-9_batch_norm_op_gpu.cu.compute_52.cpp1.ii”.
Compilation aborted.
Aborted
ERROR: /home/nishidha/pkgbuild/tensorflow/tensorflow/tensorflow/core/kernels/BUILD:1223:1: output ‘tensorflow/core/kernels/_objs/batch_norm_op_gpu/tensorflow/core/kernels/batch_norm_op_gpu.cu.pic.o’ was not created.
ERROR: /home/nishidha/pkgbuild/tensorflow/tensorflow/tensorflow/core/kernels/BUILD:1223:1: not all outputs were created.
Target //tensorflow/tools/pip_package:build_pip_package failed to build

It looks like a driver issue, but I’m not sure. So, I wanted to know if TensorFlow 0.9 with CUDA 8.0 needs a specific version of the NVIDIA driver. And how can I get NVIDIA driver version 367.27? I tried to find it for the Tesla K80 for power8, but could not find a 367 version.
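When comparing driver versions like the ones mentioned in this thread, it helps to compare them as numbers and not as strings. A minimal sketch, assuming Python 3; the function name is made up for this example. Reading /proc/driver/nvidia/version and expecting an "NVRM version" line there is my assumption about where the loaded kernel module reports its version, and the script says nothing about which version CUDA 8 actually requires.

```python
import os
import re


def parse_driver_version(text):
    """Extract the first dotted version in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError("no dotted version found in: " + text)
    return tuple(int(part) for part in match.group(0).split("."))


if __name__ == "__main__":
    version_file = "/proc/driver/nvidia/version"  # assumed location
    if os.path.exists(version_file):
        with open(version_file) as handle:
            for line in handle:
                if "NVRM version" in line:
                    print("loaded driver:", parse_driver_version(line))
                    break
    else:
        print("no NVIDIA kernel module version file found")
```

Tuples compare element by element, so (367, 9) sorts before (367, 18), which a plain string comparison would get wrong.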

361.78 should be OK for CUDA 8. This doesn’t look like a driver issue to me.

Can you indicate the instructions you followed to build tensorflow?

Sure, here are the steps I followed to build tensorflow -

  1. Built Bazel 0.2.0
  2. Cloned tensorflow 0.9
  3. ./configure
    GPU Support - Yes
    Google Cloud Support - Yes
    Remaining options - default
  4. bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package (for GPU Version)

Also, one update: I was able to build TensorFlow 0.9 with CUDA 8.0.27 and driver 361.62 on Ubuntu 16.04 ppc64le. CUDA 8.0.27 has an issue with gcc 5.4, but with a workaround for that problem, TensorFlow did build with this setup.

Curious about how well this works in 14.04.