NVIDIA Drivers on Ubuntu 14.04

I finished flashing an NVIDIA Jetson TK1 from a host PC, and now I’m in the process of installing an NVIDIA driver, namely a PCIe driver, on the host PC to enable GPU-accelerated training.

The host PC runs Ubuntu 14.04. I installed the nvidia-375 driver according to the instructions and rebooted afterwards. But after rebooting I get stuck in a login loop, so I have to remove the nvidia-375 driver in order to log in.

Is there a reason why this is happening? Is it because my version of Ubuntu is too old for the driver, or are there other drivers I need to download first?

Thanks

I use Fedora as host, so I’m not sure which other packages Ubuntu might have. In many cases the login loop you describe is because of the Nouveau driver being loaded…even if it doesn’t seem like it from some initial configuration, an initrd ramdisk might be putting this in place. In theory installing this through the package manager would take care of this, but some methods of packaging forget this. You may need to add something like “rd.driver.blacklist=nouveau” to the kernel command line in addition to other methods of banning Nouveau. What kind of package or install method are you using?

You might want to drop into the grub command line before it boots (I think ‘e’ when selecting the boot entry you want to edit, then exit with control-x) and go to text mode only via appending “systemd.unit=multi-user.target” to the Linux line (usually it starts with something like “linux16” and is a very long line which will line wrap, but keep it as a single line and just append to the end)…this should get you to text mode so you can manually try “systemctl isolate graphical.target”, and if this fails, you’ll go back to text mode. “lsmod” and “cat /proc/cmdline” will probably give some information. Also “dmesg | grep -Ei ‘nvidia|nouveau’”.
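The checks mentioned above can be bundled into a small helper. This is only a sketch: the function name gpu_modules is made up for illustration, and the list of module names it looks for (nvidia, nouveau, radeon, amdgpu) is an assumption about which GPU drivers matter here.

```shell
# gpu_modules: read lsmod-style text on stdin and print the name of every
# GPU kernel module found. The module list is an assumption, not exhaustive.
gpu_modules() {
    awk '$1 == "nvidia" || $1 == "nouveau" || $1 == "radeon" || $1 == "amdgpu" { print $1 }'
}

# Typical use from a text console:
#   lsmod | gpu_modules
#   cat /proc/cmdline
#   dmesg | grep -Ei 'nvidia|nouveau'
```

If “lsmod | gpu_modules” prints nouveau while the NVIDIA package is installed, that would point at the conflict described above.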

I’m using NVIDIA’s Two Days to Demo Guide to set up deep learning on my NVIDIA Jetson TK1, and I’m following the directions under System Setup. I first installed the latest version of JetPack, which contains tools such as the CUDA Toolkit, and then flashed all of this to the Jetson TK1. Now I’m in the process of installing PCIe drivers on the host PC. The instructions say to do this with:

$ sudo apt-get install nvidia-375
$ sudo reboot

After I reboot I would check that the NVIDIA drivers are there using the command:
$ lsmod | grep nvidia
But this won’t work because installing the designated NVIDIA driver (nvidia-375) causes my Ubuntu OS to stay in a login loop. (The module also doesn’t show up if I run the check after installing nvidia-375 but before rebooting.)

From what you’re saying, the Nouveau driver is causing the loop and I need to disable it? Or should I boot to the grub command line and manually try banning Nouveau?

I haven’t seen the demo, and packages for NVIDIA video drivers are better supported under x86_64 Ubuntu versus Fedora, so some of what I know will differ slightly.

Yes, presence of Nouveau in any way will lead to the NVIDIA driver not loading. This results in the GUI starting to come up, and then crashing. Disabling Nouveau may differ some on Ubuntu, but on every desktop platform it might be a bit more involved because desktop installations tend to use the initrd “initial ramdisk”…even if you blacklist Nouveau from “/etc” (and the NVIDIA package itself will do this) it may be necessary to further blacklist Nouveau due to the initrd being an essentially independent initial operating system.

In order to fix it you have to be able to log in. One way is to have good timing and hit the ctrl-alt-F1 or similar hot key to get to the console (no GUI, thus no crash…then ALT-F7 to get back to GUI for testing). Ubuntu 14.04 seems to use older init scripts instead of systemd, so here is how to get text mode boot temporarily from the grub prompt in 14.04:

When you start up and grub briefly shows up highlight the entry for your Ubuntu boot, and hit the ‘e’ key. This gives you a primitive editor (it’s annoying that automatic key repeat doesn’t work…there’s a lot of tapping keys to move around). There will be a line which looks something like this:

linux   /vmlinuz-3.13.0-135-generic root=/dev/mapper/vg_sda5-ubuntu quiet splash ro

…edits here are temporary until the next boot.

I tend to remove the “quiet” and “splash” part…I want to see boot text. I also append to the end (space separator):

rd.driver.blacklist=nouveau

…this might get graphics up if everything else is correct.
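To make the edit concrete, here is a small sketch that applies the same changes to a copy of the line as text. The function name edit_cmdline is made up for illustration, and the option in the usage line is the “rd.driver.blacklist=nouveau” form used elsewhere in this thread; whether Ubuntu’s initial ramdisk honors that option is not confirmed here.

```shell
# edit_cmdline LINE [OPTION...]
# Print LINE with the words "quiet" and "splash" removed and the given
# options appended at the end, separated by single spaces.
edit_cmdline() {
    line=$1
    shift
    printf '%s\n' "$line" | awk -v extra="$*" '{
        out = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "quiet" || $i == "splash") continue   # these hide boot text
            out = (out == "" ? $i : out " " $i)
        }
        if (extra != "") out = out " " extra
        print out
    }'
}

# Usage:
#   edit_cmdline 'linux /vmlinuz-3.13.0-135-generic root=/dev/mapper/vg_sda5-ubuntu quiet splash ro' rd.driver.blacklist=nouveau
```

This only shows what the line should look like afterwards; the real edit still has to be typed into the grub editor by hand.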

If your system still doesn’t boot correctly you may want to go to text mode by adding “text” to the end of that same line in grub. You’d hit the ‘e’ key, go to that line I mentioned, remove splash and quiet, then append “text” to the end. Probably you want to have the “rd.driver.blacklist=nouveau” in the line as well.

Don’t use the enter key…you’d just be introducing a line break. Use ctrl-x to execute with that edited entry. The edits go away upon next boot. If you must go to text mode you may want to make copies of dmesg output and lsmod output. scp can be used to copy to another computer (or SD card, or whatever you want).

If all works well then you can probably make some edits and update grub and you’ll be set.
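For the permanent version of the edit, here is a minimal sketch, assuming a stock Ubuntu layout where /etc/default/grub holds a GRUB_CMDLINE_LINUX_DEFAULT="..." line and “sudo update-grub” regenerates the boot menu. The helper name add_default_karg is made up for illustration; it does not handle options containing “|” or “&”.

```shell
# add_default_karg FILE OPTION
# Append OPTION inside the quotes of the GRUB_CMDLINE_LINUX_DEFAULT="..."
# line in FILE. The original file is kept as FILE.bak.
add_default_karg() {
    sed -i.bak "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\".*\)\"|\1 $2\"|" "$1"
}

# Usage (try it on a copy first, then apply for real):
#   cp /etc/default/grub /tmp/grub.test
#   add_default_karg /tmp/grub.test rd.driver.blacklist=nouveau
#   cat /tmp/grub.test
#   sudo update-grub    # only after editing the real /etc/default/grub
```

Only make the change permanent once the temporary edit from the grub prompt has been shown to help.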

I’m not quite following how to log in and turn the GUI on and off.

When exactly do I press Ctrl + Alt + F1? After installing the nvidia-375 driver and rebooting, my desktop takes me to the login screen, and despite entering the right username and password it won’t let me log in and keeps going in a loop. Am I supposed to do it before the login screen shows up, or while it is loading?

Also, what do you mean by “the entry for your Ubuntu boot” and the “e-key”? Sorry, but I’m not familiar with Ubuntu or Linux and have only been using it for less than a year.

The GUI libraries and files are not needed for login if you don’t use the GUI. In your case you cannot use the GUI because it crashes.

Your GUI can be switched to the first text-based console if in the GUI via “CTRL-ALT-F1” key combination, or from the GUI into the second text-based console via “CTRL-ALT-F2” key combination. If you are in text mode console without GUI then “ALT-F3” goes to the third text-mode console, “ALT-F1” to the first text-mode console. When you are at the password screen literally hold down the control and alt keys, and tap the F1 key (or F2, or F3, so on).

The 7th console is the GUI…from in a text-mode console “ALT-F7” goes to the GUI. Note that these are just key bindings…the GUI already binds ALT-F1, so they had to add an extra key to make it unique…CTRL-ALT-F1 if in a graphical menu is the same as just ALT-F1 if already in a text-only console.

If you are in a console you can edit and do everything you need to do…just not with graphical programs. The nano editor is an example of this…a way to edit files without the GUI.

During the start of boot, right after your BIOS is done, you should briefly see a boot menu. This is the GRUB bootloader. If you hit the “e” key (literally tap the letter ‘e’ on the keyboard…it means “edit”), then you drop into a command line editor of the text that booting with that entry would have forwarded. With this you can pass arguments to the kernel before it ever starts. You can test things without much worry because if the edit is wrong, then the next reboot will no longer have the edits.

If you are editing and you want to boot with your edits, then hit “control-x” key combination…not the enter key. Literally, hold down the control key, tap the ‘x’ key, and it will boot with those arguments. You are safe to just cycle the power here if you don’t like things, or control-c to abort and go back to the unedited menu.

The first reason to do this is if you can’t get to a console. The line of text in GRUB which launches Linux can have “text” appended to it (with a space between) and the GUI will no longer attempt to run.

The second reason is that the basis of the crash is probably because Nouveau is trying to load from initial ramdisk. You can ban Nouveau by adding “rd.driver.blacklist=nouveau” to the end of the linux line. This will stop even the initial ramdisk from loading this pesky competing driver.

All of this is so you can log in and test if things work better when Nouveau is disabled via the kernel’s command line.
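Once logged in, it is easy to confirm whether the edited command line actually took effect. A minimal sketch follows; the function name has_karg is made up for illustration, and the optional second argument exists only so the check can be tried on a saved copy of the file.

```shell
# has_karg OPTION [FILE]
# Succeed if OPTION appears as a whole word in FILE
# (default: /proc/cmdline, the arguments the running kernel was booted with).
has_karg() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -Fxq -- "$1"
}

# Usage:
#   if has_karg rd.driver.blacklist=nouveau; then
#       echo "blacklist option is on the kernel command line"
#   else
#       echo "option missing, the grub edit did not take effect"
#   fi
```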

Thanks for the explanation. I’ve followed everything up to booting with the arguments, but I haven’t tried it since I currently don’t have access to my host PC. I would like to clarify ahead of time what changes I will have to make in the primitive editor: I would have to remove “quiet” and “splash” from the line

linux /vmlinuz-3.13.0-135-generic root=/dev/mapper/vg_sda5-ubuntu quiet splash ro

And add rd.driver.blacklist=nouveau after the ro?

Another thing I would like to know is whether to do all of this before or after I install the NVIDIA drivers on my host PC.

Removing “quiet” and “splash”, though not mandatory, will much improve what you can see going on during boot. Boot text won’t show up if these options are present. You should probably remove them when debugging this issue, at least long enough to see the boot text once. Remember that when you use the grub command line to edit an entry this is not permanent…it is for testing or emergency use since the actual permanently stored command is elsewhere.

The “rd.driver.blacklist=nouveau” needs to come after the kernel image path (the first argument following the “linux” word at the start of that line) and have a space between it and other entries; beyond that its position probably doesn’t matter, and appending it to the end of the line is the simplest. A space at the end of the line has no purpose, but it also is not a problem. The “linux” word at the start of the line is a GRUB command, the first argument is the kernel image to load, and everything after that is an argument to the Linux kernel. FYI, you wouldn’t put the actual quotes in, just the text.

The thing to be careful about on grub command line is to note it gives you a hotkey reference at the bottom. If you look it’ll remind you that the letter ‘e’ gets you into an editor for the currently selected boot entry…once in it’ll remind you that “control-c” exits without change, and that “control-x” executes that entry as edited…this latter is what you want if you’ve completed the edit you want…to execute or test the entry and begin boot…it’s the “go” button. Don’t use the “enter” key to tell it to go…this will just split a line which was intended to be a single long line (but which line wraps so it looks like multiple lines).

Because you can’t move around by holding an arrow key down (you have to tap it once for each character or line to move the cursor around) I suggest starting with a large coffee :P

FYI, the NVIDIA video driver is loaded as a module. This means the code isn’t present until that module loads. This also means that if some other driver has locked onto that function, then the NVIDIA driver can’t load even if it is available. An initial ramdisk is used which loads certain modules prior to having access to the boot directory, and this ramdisk is not edited when the NVIDIA installers edit the normal boot configuration. So anything in the initial ramdisk which competes with the NVIDIA module will break the loading of the NVIDIA module (the “Nouveau” driver in this case). The “rd.driver.blacklist=nouveau” passed on the kernel’s command line is visible before the initial ramdisk loads its modules…this says to not load this driver even at that early stage. If Nouveau is getting in the way (and it always gets in the way on Fedora, where Fedora is rather standard), then the GUI working after this change should answer the question.
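Whether the initial ramdisk actually carries a Nouveau module can be checked from a text console. This is a sketch under assumptions: it relies on Ubuntu’s “lsinitramfs” tool to list the ramdisk content, the ramdisk path /boot/initrd.img-$(uname -r) is the usual Ubuntu location but is assumed here, and the function name initrd_has_module is made up for illustration.

```shell
# initrd_has_module NAME
# Read a file listing on stdin (one path per line) and succeed if it
# contains a kernel module file NAME.ko, optionally with a compression
# suffix such as .xz or .gz.
initrd_has_module() {
    grep -Eq "/$1\.ko(\.[a-z0-9]+)?\$"
}

# Usage:
#   if lsinitramfs "/boot/initrd.img-$(uname -r)" | initrd_has_module nouveau; then
#       echo "nouveau is inside the initial ramdisk"
#   fi
```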

Note that if you add “text” to the kernel command line it won’t boot to GUI…this was just in case you need to access the environment and do things like use the package manager. I think on the older distributions you can still run “sudo init 5” to get to the GUI after you’ve made a change and are still in purely text mode. You can probably try the grub edit with addition of the “rd.driver.blacklist=nouveau” as your first test. Removing “quiet” and “splash” will aid seeing problems, but it otherwise has no functional effect.

I added rd.driver.blacklist=nouveau after the word linux, removed “quiet” and “splash”, and booted with these settings using Ctrl-x. When I do so, it freezes until the computer goes into sleep mode, and even when I power it on or restart it the login loop still happens.

My linux line looks like this:

linux /boot/vmlinuz-4.4.0-109-generic root=UUID=24c92eca-0bb0-4588-9ff8-de967cbe5ee6 ro quiet splash $vt_handoff

I edited it to look like:

linux rd.driver.blacklist=nouveau /boot/vmlinuz-4.4.0-109-generic root=UUID=24c92eca-0bb0-4588-9ff8-de967cbe5ee6 ro $vt_handoff

What do you think causes it to freeze when I boot with these changes? Also, I’m doing this NVIDIA-based project for school and am using a computer that belongs to the school, if this makes a difference.

There’s a lot which needs to be examined to know what might be missing or incorrect. First, do you have a copy of “/var/log/nvidia-installer.log”? This might not be there for a “.deb” package version, not sure…if it is there it would be very useful.

What do you get from “cat /proc/cmdline” (preferably after adding the rd.driver.blacklist=nouveau edit)?

How did you originally install…was this with a command from “apt-get”, via a downloaded “.deb”, or with a “.run” version?

What files are listed in “/etc/modprobe.d/”? Use “ls /etc/modprobe.d/*”. I’m hoping one has “nouveau” in its name…this would show a level of success in installing the NVIDIA driver (not complete, but it would show an attempt). If this file exists, what is its content? You can use “cat blacklist-nouveau.conf”, or just copy it and rename it as a “.txt” file and attach it to your post (you can attach “.txt” files to a post in the forum after the post exists…hover the mouse over the quote icon in the upper right and a paper clip icon will show up). Even with graphics failing you can use the “scp” command to copy a file over the network to another computer if it is more convenient.

What do you see from “sudo lshw -c video”?

And finally, perhaps the most important part, can you save a copy of your “/var/log/Xorg.0.log” and rename it to something ending “.txt”, then attach it here?

FYI, I doubt a school computer is any different than any other computer relative to video drivers.
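One of the checks above, whether anything under “/etc/modprobe.d/” blacklists Nouveau, can be sketched as a small helper. The function name blacklists_module is made up for illustration, and it only looks at files ending in “.conf”, which is an assumption about how the blacklist file was named.

```shell
# blacklists_module DIR MODULE
# Succeed if any *.conf file in DIR contains a "blacklist MODULE" line.
blacklists_module() {
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+$2([[:space:]]|\$)" "$1"/*.conf
}

# Usage:
#   if blacklists_module /etc/modprobe.d nouveau; then
#       echo "nouveau is blacklisted in /etc/modprobe.d"
#   else
#       echo "no blacklist entry for nouveau found"
#   fi
```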

Are these commands, such as “/var/log/nvidia-installer.log” and “cat/proc/cmdline”, something that you enter into Terminal in the GUI, or do you find them a different way? When I call them through Terminal, most say there is no such file or directory.

I installed through “sudo apt-get install nvidia-375”, as mentioned in a previous post.

From inputting “sudo lshw -c video” into Terminal, this is what shows:

*-display
description: VGA compatible controller
product: RV620 LE [Radeon HD 3450]
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0
bus info: pci@0000:01:00.0
version: 00
width: 64 bits
clock: 33MHz
capabilities: pm pciexpress msi vga_controller bus_master cap_list rom
configuration: driver=radeon latency=0
resources: irq:28 memory:e0000000-efffffff memory:f7df0000-f7dfffff iopor

These are typed into any terminal (a.k.a., “command line”…GUI is not required…just a place to type in commands).

Does this computer have both an integrated graphics card and an add-on? I ask because it shows as an AMD Radeon graphics card…which is not capable of using CUDA since it does not have the required hardware. In the case of a laptop it isn’t uncommon to place an option in the BIOS to use just integrated even if there is a higher performance video card…it is a way to consume less power.

If there is such an option, then you need to enable the add-on video card in the BIOS before boot. If you have only the AMD graphics, then this is why the driver isn’t working…the hardware does not use the NVIDIA driver.

I have a tiny laptop with integrated Intel graphics…it too cannot use the NVIDIA drivers. It isn’t a problem when just installing to the Jetson since the laptop itself does not need to run CUDA. In the JetPack3.2 pre-release though it seems you have to tell it to install NVIDIA video drivers or else it won’t allow moving on to just install to the Jetson (I think this is a bug in JetPack3.2 pre-release…hopefully the final release won’t mandate installing CUDA and NVIDIA graphics to the Ubuntu host PC…3.1 already does this correctly).

Note that “/var/log/nvidia-installer.log” is a file, and not a command. If you “cat /var/log/nvidia-installer.log”, this is a command to echo the content to the terminal. The “cat /proc/cmdline” is a command to echo the content of a file.

To know whether my computer has an add-on, I would have to look at the back of the tower (I’m using a desktop) to check if it has additional connectors. The tower I’m using is a Dell Optiplex 980 and from the looks of it there don’t seem to be any additional connectors or add-ons. Since you said the lone AMD Radeon graphics card is not strong enough to handle CUDA, is that why my computer crashes after installing and rebooting?

Also, can you elaborate on what you meant by “placing an option in the BIOS”?

It isn’t about strength of the GPU (though NVIDIA GPUs are the high end), it’s about the driver required to access it. The NVIDIA driver cannot be used on non-NVIDIA GPU hardware…the two architectures are completely alien at the hardware level (the drivers are what adapt the two hardware architectures to use the same X11 server interface). So it is going to graphical mode and needs to load a driver, but it can’t…no matter how many ways you tell the system to use the NVIDIA driver it will always fail on the AMD GPU. You’ll need to use the AMD driver to get the GUI to not crash (you can still enter commands, such as for installing that driver, if you go to text mode…this is the part where I suggested editing the GRUB command line and adding the option “text” at the end, or else CTRL-ALT-F1).
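The driver currently bound to the GPU is visible in the “configuration:” line of the lshw output posted earlier. As a sketch, with the function name gpu_driver made up for illustration:

```shell
# gpu_driver: read "lshw -c video" output on stdin and print the value of
# driver=... from each "configuration:" line.
gpu_driver() {
    sed -n 's/^[[:space:]]*configuration:.*driver=\([^[:space:]]*\).*/\1/p'
}

# Usage:
#   sudo lshw -c video | gpu_driver
```

On the machine in this thread that line reads “configuration: driver=radeon latency=0”, so the helper would print radeon, which is the AMD driver and not the NVIDIA one.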

FYI, CUDA is native to NVIDIA GPUs…if you say CUDA, then you are implying NVIDIA hardware.

So no matter how I try to install the NVIDIA driver, it will not work because the hardware that I’m using is non-NVIDIA GPU hardware? You said I can still enter commands by going into text mode or CTRL-ALT-F1 but I won’t be able to interface with the GUI (like seeing the desktop screen) because of the crash from installing NVidia drivers?

In this case I would need to use a different computer that’s capable of using NVIDIA GPU hardware?

Until you get your old drivers back in place you can’t use the GUI (which the text mode can help you fix without reinstalling since it doesn’t use those drivers). You need the AMD drivers since the GPU is AMD…then GUI would work again.

Plus you must use an NVIDIA GPU with the NVIDIA driver before you can use CUDA on the PC host…this is all host side (Jetsons all have an integrated NVIDIA GPU). You can flash a Jetson using any 64bit Linux PC if you use command line flash tools, but you can’t install any of the CUDA libraries or CUDA applications on the PC without the NVIDIA drivers. JetPack could be used on this PC with the AMD GPU to install everything to a Jetson if the PC is the correct Ubuntu version…but JetPack would need to not install any CUDA related packages to the PC since anything CUDA would need to install the NVIDIA GPU driver.

One choice is to not run CUDA apps on the PC (you can still run CUDA on the Jetson); another choice is to install an NVIDIA video card on the PC; the other choice is to find another computer which has an NVIDIA video card.

I appreciate the help and responses. I think I now know what the problem is, but I’d like to ask how I would install an NVIDIA video card on my host PC, or how I can find a computer that has an NVIDIA video card.

Also, are NVIDIA video cards the same as NVIDIA graphics cards?

Yes, video, graphics, and GPU are basically just alternate ways of saying the same thing…though one might imply a particular use more than the other.

Any system with a PCIe x16 slot and sufficient power supply will do the job. Typically you would want a card with at least 6GB of video RAM. Probably one of the best choices if you don’t need extreme performance would be the GeForce GTX 1060 with 6GB of RAM (there is a 3GB model…avoid that). This card has sufficient RAM, is power efficient and does not take a huge power supply, and has good all around performance (despite being at the lower end of cost it is well above average performance and does well even in some more demanding games)…and is less expensive than many of the other video cards in that performance range.

If you run the “sudo lshw -c video” command and see a vendor of NVIDIA, then you have an NVIDIA video card/GPU…there are different versions, but that’s the basics. Some newer versions of CUDA do not run on older hardware, so you’d have to know more about exactly what you want to run to know for sure if an older system can do what you want.
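The vendor check described above can be scripted. A minimal sketch, with the function name has_nvidia_gpu made up for illustration; it only looks at the “vendor:” lines of the lshw output.

```shell
# has_nvidia_gpu: read "lshw -c video" output on stdin and succeed if any
# "vendor:" line mentions NVIDIA (case-insensitive).
has_nvidia_gpu() {
    grep -i '^[[:space:]]*vendor:' | grep -qi 'nvidia'
}

# Usage:
#   if sudo lshw -c video | has_nvidia_gpu; then
#       echo "NVIDIA GPU found"
#   else
#       echo "no NVIDIA GPU, the NVIDIA driver and CUDA will not work on this host"
#   fi
```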

Do keep in mind that if you are not installing or running any of the CUDA programming on the host itself that you won’t need this. Usually you’ll do training on the host, and execution of the net on the Jetson. If you are not doing any training, then the host is not necessarily needed so far as GPU requirements go.