Ubuntu 16.04 GUI login Loop after installing Nvidia Driver

Hi,
I have a Dell PowerEdge R740 server with an Nvidia Tesla P100 GPU card. I installed Ubuntu 16.04.4 Desktop on that server and everything was working fine.

I need to install CUDA and the Nvidia driver on it for research purposes.

After I install the Nvidia driver, the GUI goes into a login loop.

I have tried many approaches but am still not able to solve the issue.

Please run nvidia-bug-report.sh as root and attach the resulting .gz file to your post. Hovering the mouse over an existing post of yours will reveal a paperclip icon.
https://devtalk.nvidia.com/default/topic/1043347/announcements/attaching-files-to-forum-topics-posts/

I am Suresh’s colleague.
Here is the link to the nvidia bug report:

http://paste.ubuntu.com/p/zwVzWwBpCF

I have turned off secure boot. nvidia-smi displays correctly in terminal mode.
The issue that remains is the login loop.
Thanks for any suggestions you may have.
Best wishes,

Your graphics card is a Matrox mga200, so by installing the Nvidia driver you broke your graphics stack. This has to be fixed first:

  • uninstall the nvidia .run driver using the --uninstall option
  • reinstall the xserver base package: sudo apt-get install --reinstall xserver-xorg-core
  • reboot, the graphics stack should be functional again
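The three steps above can be sketched as a small script. This is only an illustration: the names run, restore_graphics_stack and DRY_RUN are made up for this sketch, and the installer path is whatever .run file was used for the original installation. By default it only prints the commands; nothing is executed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch of the recovery steps. Prints each command by default;
# set DRY_RUN=0 to actually execute them (requires sudo).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restore_graphics_stack() {
    # Assumption: $1 is the path of the .run installer used originally.
    installer="$1"
    run sudo sh "$installer" --uninstall
    run sudo apt-get install --reinstall xserver-xorg-core
    run sudo reboot
}
```

Calling restore_graphics_stack with the path of the installer prints the three commands in order, so they can be reviewed before running them for real.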

Installing the Nvidia driver:

Installing cuda:

  • download the cuda .deb, not the .run installer
  • add it to your system as displayed on the download page
  • don’t install the cuda package
  • instead, run sudo apt install cuda-toolkit-10-0 (for cuda 9.2, use cuda-toolkit-9-2)
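The package name follows the CUDA version with the dot replaced by a dash. A minimal sketch of that mapping, where cuda_toolkit_pkg is a hypothetical helper name and the script only prints the install command rather than running it:

```shell
# Map a CUDA version such as "10.0" to the toolkit package name.
cuda_toolkit_pkg() {
    # Replace the dot with a dash: 10.0 -> 10-0
    echo "cuda-toolkit-$(echo "$1" | tr '.' '-')"
}

pkg="$(cuda_toolkit_pkg 10.0)"
# Print the command to run; it is not executed here.
echo "sudo apt install $pkg"
```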

If this fails at any stage, please create a new nvidia-bug-report.log.

  • uninstall the nvidia .run driver using the --uninstall option
  • reinstall the xserver base package: sudo apt-get install --reinstall xserver-xorg-core
  • reboot, the graphics stack should be functional again

When I did that, I had to drop to a terminal (Ctrl + Alt + F6) since the system booted into a blank screen with a blinking cursor at the top left.
At the terminal, I downloaded and installed the latest version of Nvidia driver (410.93), and the error I got can be seen here:
http://paste.ubuntu.com/p/pDVvRdrqcW/
Some posts suggested it may be a kernel vs. driver mismatch, so I tried various other driver versions (up to 384.98).
Please let me know if you could help further.
Thanks.

That’s a security measure of Ubuntu; you have to continue the installation.

I think I made a mistake here.
I did continue. nvidia-smi gave the expected output.
So I ran nvidia-xconfig and rebooted.
Now I can’t drop to a terminal; I just get a black screen with a blinking cursor.

You can’t use nvidia-xconfig, you have a Tesla without outputs. Please delete /etc/X11/xorg.conf and reboot. If graphics doesn’t work then, create a new nvidia-bug-report.log
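A slightly more cautious variant is to move the file aside rather than delete it, so it can still be inspected later. A minimal sketch, assuming a root shell (for example after sudo -i); disable_xorg_conf is a hypothetical helper name, and the renamed file is no longer picked up as xorg.conf:

```shell
# Move an X config file out of the way instead of deleting it outright.
# Prints the backup path, or nothing if the file does not exist.
disable_xorg_conf() {
    conf="${1:-/etc/X11/xorg.conf}"
    if [ -f "$conf" ]; then
        backup="$conf.disabled.$(date +%Y%m%d%H%M%S)"
        mv "$conf" "$backup" && echo "$backup"
    fi
}
```

Running disable_xorg_conf without arguments acts on /etc/X11/xorg.conf; reboot afterwards as described above.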

OK - thank you. The link is:

http://paste.ubuntu.com/p/m4mZS3yP6x

Please upgrade your HWE stack to get the correct kernel driver for your system, using

sudo apt-get install --install-recommends linux-generic-hwe-16.04 xserver-xorg-hwe-16.04

See https://wiki.ubuntu.com/Kernel/LTSEnablementStack for details. Afterwards reboot and create a new nvidia-bug-report.log
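After the reboot, one quick way to see whether a newer kernel is actually running is to look at the kernel series reported by uname -r. A rough sketch, where kernel_series is a hypothetical helper name; the original 16.04 kernel series is 4.4, so a higher number suggests an HWE kernel was booted:

```shell
# Extract "major.minor" from a kernel release string such as the output of `uname -r`.
kernel_series() {
    echo "$1" | cut -d. -f1,2
}

running="$(kernel_series "$(uname -r)")"
echo "Running kernel series: $running"
```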

Thanks. The new link is:

http://paste.ubuntu.com/p/WvFMBY3xrP

PS: Graphics is finally working now (no Login loop either). Able to work within Ubuntu UI.

Everything looks correct now. Be sure to follow my instructions on how to install cuda (installing cuda-toolkit, not cuda), otherwise you would break your system again.

I did not understand what you meant by:

  • add it to your system as displayed on the download page

Do the first three steps that are displayed there, but not the fourth, i.e.

sudo dpkg -i cuda-repo-ubuntu1604-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-0-local/7fa2af80.pub
sudo apt-get update

Instead of the fourth step, run

sudo apt-get install cuda-toolkit-10-0

This has to be done because the package ‘cuda’ is a meta package consisting of cuda-toolkit (‘the’ CUDA) and the Nvidia driver, which you have already installed correctly.

Great. That’s done.
So, the path for nvcc etc. can now be added to complete the process.
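For reference, a sketch of what adding the paths might look like, assuming the toolkit landed in the default prefix /usr/local/cuda-10.0 (CUDA_HOME is just a variable name chosen for this sketch):

```shell
# Assumption: the toolkit was installed to the default prefix /usr/local/cuda-10.0.
CUDA_HOME="/usr/local/cuda-10.0"
export PATH="$CUDA_HOME/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# Confirm the compiler is found, if it is installed.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not found under $CUDA_HOME/bin"
fi
```

Putting the two export lines into ~/.bashrc makes the setting persist across logins.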

Done running the tests under Samples/
Thanks very much for the support!

Hello again:

I have nvidia-settings saying:
ERROR: Unable to load info from any available system

[sudo prime-select nvidia?]

No, that should not be installed. Please run nvidia-bug-report.sh as root and attach the resulting .gz file to your post.