Unable to load the 'nvidia-drm' kernel module on Ubuntu 16.04
I get the error "Unable to load the 'nvidia-drm' kernel module" when installing the driver from 'NVIDIA-Linux-x86_64-387.34.run'. (I have also tried versions 375 and 384, with the same error.)
My machine is a ThinkPad S5 (with a GTX 1050 Ti) running Ubuntu 16.04. I have disabled Secure Boot in the BIOS.

#1
Posted 01/07/2018 06:23 AM   
When I run NVIDIA-Linux-x86_64-387.34.run with --no-drm, the installation completes, but nvidia-smi reports:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

lspci | grep 'VGA\|3D'
00:02.0 VGA compatible controller: Intel Corporation Device 591b (rev 04)
02:00.0 3D controller: NVIDIA Corporation Device 1c8c (rev a1)
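
One way to narrow down this nvidia-smi message is to check whether any nvidia kernel module is loaded at all. The following is a minimal sketch, assuming a POSIX shell; the helper name `loaded_nvidia_modules` is made up for illustration and only filters `lsmod`-style text read from standard input.

```shell
# Hypothetical helper: print the names of loaded modules that start with "nvidia".
# Reads lsmod-style text on stdin, so it also works on saved output.
loaded_nvidia_modules() {
    awk '$1 ~ /^nvidia/ { print $1 }'
}

# Typical use on the affected machine:
#   lsmod | loaded_nvidia_modules
# No output means no nvidia module is loaded, which would be consistent
# with nvidia-smi being unable to communicate with the driver.
```

If nothing is printed, the kernel log (sudo dmesg) is usually the next place to look for the reason the module did not load.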

#2
Posted 01/07/2018 07:21 AM   
The driver is complaining about resource conflicts. Upgrade your BIOS; it's quite outdated.

#3
Posted 01/07/2018 03:10 PM   
said: The driver is complaining about resource conflicts, upgrade your bios, it's quite outdated.

I have updated the BIOS to the latest version, but that did not solve the problem.

-> Driver file installation is complete.
-> Installing DKMS kernel module:
-> done.
ERROR: Unable to load the 'nvidia-drm' kernel module.

#4
Posted 01/08/2018 02:27 AM   
OK, the resource conflicts are still there, so you will have to upgrade your kernel to 4.13/4.14. If this is a fresh install, maybe try Ubuntu 17.10 first.

#5
Posted 01/08/2018 10:41 AM   
said: Ok, resource conflicts still there, then you will have to upgrade your kernel to 4.13/4.14. If this is a fresh install, maybe try with ubuntu 17.10 first.

Thanks.

I installed Ubuntu 17.10 as you suggested. There are some changes in the 'lspci' output:
lspci | grep "VGA\|3D"
00:02.0 VGA compatible controller: Intel Corporation Device 591b (rev 04)
02:00.0 3D controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] (rev ff)

But the problem remains:
-> Driver file installation is complete.
-> Installing DKMS kernel module:
-> done.
ERROR: Unable to load the 'nvidia-drm' kernel module.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.


What can I do next? Is this a bug in the kernel or in the driver? The device runs properly under Windows 10 (NVIDIA driver version 382.05).

#6
Posted 01/09/2018 02:23 AM   
OK. Kernel 4.13 gets along better with your hardware, but the resource conflict is still there.
[    0.597420] pci 0000:02:00.0: BAR 6: no space for [mem size 0x00080000 pref]
[ 0.597423] pci 0000:02:00.0: BAR 6: failed to assign [mem size 0x00080000 pref]
...
[ 183.714840] NVRM: This is a 64-bit BAR mapped above 4GB by the system
NVRM: BIOS or the Linux kernel, but the PCI bridge
NVRM: immediately upstream of this GPU does not define
NVRM: a matching prefetchable memory window.

The NVIDIA card doesn't get the memory region that the BIOS tells the kernel it wants; the BAR gets remapped, and then the driver fails to load.
This is an incompatibility between the kernel and the BIOS and should be reported to the Ubuntu/kernel Bugzilla. As a workaround, please try
pci=nocrs
as a kernel parameter.
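
On Ubuntu, a kernel parameter is usually made persistent by adding it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub. The following is a minimal sketch of that edit, assuming a POSIX shell and the stock file layout; the helper name `add_kernel_param` is hypothetical, and it only handles simple parameters without spaces, slashes or quotes (so pci=nocrs, but not a quoted value).

```shell
# Hypothetical helper: append one kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT.
# $1 = parameter (no spaces, slashes or quotes), $2 = grub defaults file.
add_kernel_param() {
    param="$1"
    file="${2:-/etc/default/grub}"
    # Leave the file alone if the parameter is already on that line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    tmp_out=$(mktemp) || return 1
    # Insert " <param>" just before the closing double quote of the value.
    sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${param}\"/" "$file" > "$tmp_out" &&
        cat "$tmp_out" > "$file"
    status=$?
    rm -f "$tmp_out"
    return "$status"
}

# Try it on a copy first:
#   cp /etc/default/grub /tmp/grub.test && add_kernel_param pci=nocrs /tmp/grub.test
```

After changing the real file as root, sudo update-grub followed by a reboot applies the parameter. For a one-off test, the parameter can instead be typed into the kernel line from the GRUB boot menu.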

#7
Posted 01/09/2018 08:36 AM   
said: The nvidia card doesn't get the memory the bios tells the kernel it wants, it gets remapped and then the drivers fails.
This is an incompatibility between kernel and bios. Should be reported to ubuntu/kernel bugzilla. Please try
pci=nocrs
as kernel parameter for a workaround.

I have added pci=nocrs to the boot parameters, but the problem remains. Do you have any other suggestions for solving this problem?
Thank you.

#8
Posted 01/09/2018 11:06 AM   
Please run sudo dmesg > dmesg.txt and attach the file so I can take a look at it.
The only thing left to try is a newer kernel, to see whether this is fixed there. Download the kernel image and headers from Ubuntu and install them manually. You don't need to install the NVIDIA drivers; switch to Intel (sudo prime-select intel) before installing the kernels, then simply check whether
cat /proc/version
returns the correct kernel version and whether
sudo dmesg |grep "BAR 6"
still contains the 'failed' message.
Test twice, with pci=nocrs set and unset.
Holding 'Shift' on reboot gets you to the GRUB boot menu, where you can boot your old kernels in case of boot failure.
Start with kernel 4.14.12:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14.12/linux-image-4.14.12-041412-generic_4.14.12-041412.201801051649_amd64.deb
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14.12/linux-headers-4.14.12-041412-generic_4.14.12-041412.201801051649_amd64.deb

If that doesn't work, advance to kernel 4.15-rc7:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.15-rc7/linux-image-4.15.0-041500rc7-generic_4.15.0-041500rc7.201801072330_amd64.deb
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.15-rc7/linux-headers-4.15.0-041500rc7-generic_4.15.0-041500rc7.201801072330_amd64.deb
If that doesn't work either, the only thing left is to file a bug report at the kernel Bugzilla.
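
The check described above can be sketched as a small filter, assuming a POSIX shell. The helper name `bar6_status` is made up for illustration; it only looks for the "failed to assign" line quoted earlier in this thread.

```shell
# Hypothetical helper: report whether dmesg-style text on stdin still contains
# a failed BAR 6 assignment, as in the log lines quoted earlier in the thread.
bar6_status() {
    if grep -q 'BAR 6: failed to assign'; then
        echo "BAR 6 failure still present"
    else
        echo "no BAR 6 failure found"
    fi
}

# Typical use after booting the new kernel:
#   cat /proc/version          # confirm the running kernel version
#   sudo dmesg | bar6_status   # repeat once with pci=nocrs and once without
```

Downloaded .deb packages can normally be installed with sudo dpkg -i followed by the file names.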

#9
Posted 01/09/2018 11:31 AM   
Sorry to say, but after further research I can say that the BAR 6 issue is a red herring, unrelated to the NVIDIA GPU not working. BAR 6 is just the option ROM, which is not needed anyway and fails to get mapped on most systems. I even think the NVRM message about the BAR is a red herring. So erase and rewind; the only error left is:
nvidia 0000:02:00.0: Refused to change power state, currently in D3

Which points to an ACPI problem. Please run
sudo acpidump > acpidump.txt
and attach the file.
Then try using the kernel parameters
acpi_osi=! acpi_osi="Windows 2009"
Reboot and attach the dmesg output.
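
To see quickly whether the power state error is still logged after a reboot, the kernel log can be filtered for it. This is a minimal sketch, assuming a POSIX shell with a grep that supports -o and -E; the helper name `devices_stuck_in_low_power` is hypothetical.

```shell
# Hypothetical helper: print the PCI addresses of devices for which the kernel
# log (read from stdin) reports "Refused to change power state".
devices_stuck_in_low_power() {
    grep 'Refused to change power state' |
        grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' |
        sort -u
}

# Typical use:
#   sudo dmesg | devices_stuck_in_low_power
# An empty result after adding the acpi_osi parameters would suggest that
# the GPU is no longer refusing to leave D3.
```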

#10
Posted 01/09/2018 01:19 PM   
said: Which points to an acpi problem. Please run
sudo acpidump > acpidump.txt and attach.
Then try using kernel parameter
acpi_osi=! acpi_osi="Windows 2009"
Reboot and attach dmesg output.


Wow, this method really works! After adding 'acpi_osi=! acpi_osi="Windows 2009"' to the kernel parameters, the nvidia driver loaded successfully. Why does that work? Is this approach safe and stable?

The attachment 'acpidump.txt' was generated before adding the kernel parameters, and 'dmesg.log' after adding them.
Attachments

acpidump.txt

dmesg.log

#11
Posted 01/09/2018 04:58 PM   
Basically, the parameter instructs the kernel to tell the BIOS that it is Windows 7 instead of Windows 10. This changes the settings and methods used for power management and so on. It can have adverse effects such as backlight control not working, the touchpad not working, or slightly higher power draw on battery. If you see none of these, it's fine.
I've taken a look at the acpidump, and it looks like a variant of this bug:
https://bugs.acpica.org/show_bug.cgi?id=1333#c32
Unfortunately, the first fix (incorporated in kernel 4.13) didn't fix it. I hope the next attempt will land in 4.17 or so. Until then, use the workaround.
Side note: your Bluetooth is missing the firmware it needs to work (rtl_bt/rtl8822b_fw.bin), and your webcam throws an error; I don't know whether that affects it.
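
Since the workaround only helps while the parameter is actually passed to the kernel, it can be worth confirming that after each kernel update. The following is a minimal sketch, assuming a POSIX shell; the helper name `has_acpi_osi_override` is made up for illustration, and it only checks for the presence of an acpi_osi= setting, not its value.

```shell
# Hypothetical helper: succeed if the given kernel command line contains an
# acpi_osi= override. Defaults to the running kernel's /proc/cmdline.
has_acpi_osi_override() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    case "$cmdline" in
        *acpi_osi=*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use on the running system:
#   has_acpi_osi_override && echo "acpi_osi override active" || echo "no override"
```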

#12
Posted 01/09/2018 05:50 PM   
Answer Accepted by Original Poster
Another observation from the acpidump: I think that instead of acpi_osi=! acpi_osi="Windows 2009", using just acpi_osi=Linux would also work.

#13
Posted 01/09/2018 06:00 PM   
said: another observation from the acpidump: I think instead of acpi_osi=! acpi_osi="Windows 2009" using just acpi_osi=Linux would also work.

You are right, 'acpi_osi=Linux' also works. I have reported this bug against the Ubuntu kernel.

Thank you very much!

#14
Posted 01/10/2018 01:28 AM   