GTX1650 (notebook) not working on Ubuntu16.04 (black screen & login loop)

My laptop is a Dell Inspiron 7590 with hybrid graphics (Intel and NVIDIA GTX 1650). The system is Ubuntu 16.04 with kernel 4.15.10.

I’ve tried both the ppa and runfile methods to install the driver, and they fail in different ways:

  1. ppa method: I installed driver version 418.87 with apt-get install nvidia-418. After rebooting, the login screen never appears (a black screen with nothing on it), although the boot sound plays. Switching to tty1, I found that nvidia-smi works and lists processes correctly, so I guess the NVIDIA card itself is fine and the problem is with the X server. The /etc/X11/xorg.conf file is shown below:
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 418.87.00

Section "ServerLayout"
    Identifier     "layout"
    Screen      0  "nvidia" 0 0
    Inactive       "intel"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "intel"
    Driver         "modesetting"
    Option         "AccelMethod" "None"
    BusID          "PCI:0@0:2:0"
EndSection

Section "Device"
    Identifier     "nvidia"
    Driver         "nvidia"
    BusID          "PCI:1@0:0:0"
EndSection

Section "Screen"
    Identifier     "intel"
    Device         "intel"
    Monitor        "Monitor0"
EndSection

Section "Screen"
    Identifier     "nvidia"
    Device         "nvidia"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "on"
    Option         "IgnoreDisplayDevices" "CRT"
    Option         "ConstrainCursor" "off"
    SubSection     "Display"
        Depth       24
        Modes      "nvidia-auto-select"
    EndSubSection
EndSection
  2. runfile method: I downloaded the matching runfile from the official website (I tried both 418.74 and 418.88; as far as I know 430 needs at least kernel 5.1, so I gave up on it). After running the runfile with the --no-opengl-files option (to prevent a login loop) and rebooting, the resolution (sometimes) becomes very low and the NVIDIA card does not seem to be in use. The evidence is that nvidia-smi works but shows no running processes, and nvidia-settings says “Unable to load info from any available system”, as shown below:
shawn@shawn-Inspiron-7590:~$ nvidia-smi 
Wed Aug 21 20:01:52 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.88       Driver Version: 418.88       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1650    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   59C    P0     3W /  N/A |      0MiB /  3884MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
shawn@shawn-Inspiron-7590:~$ nvidia-settings

ERROR: Unable to load info from any available system

I’ve been struggling with this for 3 days. I’ve also tried upgrading and downgrading the kernel, installing the driver through cuda, and prime-select, and none of them has worked. I am literally being driven crazy by it. I’ll REALLY APPRECIATE it if anyone can help me and save my life!

UPDATE 1: Two bug report files have been uploaded. This time the resolution was normal after the runfile install, but the NVIDIA card still doesn’t work.

UPDATE 2: Actually, the driver did work about a month and a half ago, though it took a lot of effort then too. I first installed 418 and 430 from the ppa; in both cases the driver worked but I got a login loop. I purged them and tried the 430 runfile, which failed (I forget the details). Finally I tried the 418 runfile with the --no-opengl-files option: the X server worked, but nvidia-smi showed no processes and nvidia-settings couldn’t load info from any available system. Then I installed cuda through the ppa, which succeeded, and everything came alive. A few days ago my kernel upgraded itself automatically and the NVIDIA driver broke. I was naive enough to immediately uninstall almost everything (NVIDIA driver, cuda). Now I suspect that the uninstallation was incomplete, and that this is causing the current problems.

UPDATE 3: [problem solved]
According to the official reply, the cause of the problem on my laptop seems to be a bug in the install package. For those who run into a similar problem, here is a summary of what I did, since the full conversation is too long:

  1. make sure that the broken driver is removed completely
  2. install nvidia-430 through ppa source ppa:graphics-drivers/ppa (see post #9)
  3. fix the package bug manually (see post #19)

This worked for my NVIDIA & Intel hybrid-graphics laptop.
Good luck!
nvidia-bug-report-ppa.log.gz (1.11 MB)
nvidia-bug-report-runfile.log.gz (1.04 MB)

Please run nvidia-bug-report.sh as root and attach the resulting .gz file to your post. Hovering the mouse over an existing post of yours will reveal a paperclip icon.
https://devtalk.nvidia.com/default/topic/1043347/announcements/attaching-files-to-forum-topics-posts/

Thanks for your reply. I’ve already uploaded the bug report, please check it.

Please stick to the ppa method. Since this is an Optimus notebook, PRIME has to be used, and the PRIME method in Ubuntu 16.04 is very sensitive to files being properly registered and in the expected place.
Are you using plain 16.04, i.e. lightdm/Unity, or did you change the DE?

Thanks for your reply. It’s just the plain 16.04 downloaded from the official website; the display manager is lightdm. The ppa source is ppa:graphics-drivers/ppa. Do you need me to provide any further files?

I presume you can switch to VT and log in there? If so, please run
ps a | grep X
this should give you a line containing something similar to
/usr/libexec/Xorg vt7 -displayfd 3 -auth /run/user/111/gdm/Xauthority -background none -noreset -keeptty -verbose 3
This is from my system, which runs gdm instead of lightdm. The important part is the -auth argument; it will be different for you since you’re using lightdm.
Change to a root prompt
sudo -s
then try to run things on the lightdm display:
DISPLAY=:0 XAUTHORITY=<what comes after -auth> xrandr --listproviders
please post the output of that.
DISPLAY=:0 XAUTHORITY=<what comes after -auth> xrandr --setprovideroutputsource modesetting NVIDIA-0
does the lightdm session on vt7 (switch back to it) come alive after that?
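
The value after -auth can also be pulled out of the ps output rather than copied by hand. This is only a minimal sketch, assuming a POSIX shell with sed; the function name xauth_from_ps is made up for illustration and is not part of any NVIDIA or Xorg tooling.

```shell
# Hypothetical helper: print whatever follows "-auth" on an Xorg command line.
# Reads ps output on stdin; prints nothing if no -auth argument is present.
xauth_from_ps() {
    sed -n 's/.*[[:space:]]-auth[[:space:]]\{1,\}\([^[:space:]]\{1,\}\).*/\1/p' | head -n 1
}

# Example using the gdm command line quoted above:
echo '/usr/libexec/Xorg vt7 -displayfd 3 -auth /run/user/111/gdm/Xauthority -background none -noreset -keeptty -verbose 3' | xauth_from_ps
```

On a live system the input would come from something like ps a | grep '[X]org', and the printed path is what goes into XAUTHORITY= in the xrandr commands.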

I’m not quite sure what you mean by “run things on the lightdm display”, and I don’t know what VT is either (sorry, I’m not an Ubuntu expert). Since there’s only a black screen after the ppa install, I ran these commands on tty1 (Ctrl+Alt+F1).
Unfortunately my laptop didn’t come alive after I ran the commands you provided, and the last command produced some errors.
Because I can’t use the clipboard on tty1, I took a picture of the inputs and outputs and typed them by hand as follows. In case I made typos, the picture is attached too.

shawn@shawn-Inspiron-7590:~$ ps a | grep X
1158 tty7     Ssl+   0:00 /usr/lib/xorg/Xorg -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
1918 tty1     S+     0:00 grep X
shawn@shawn-Inspiron-7590:~$ sudo -s
[sudo] password for shawn:
root@shawn-Inspiron-7590:~# DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0 xrandr --listproviders
Providers: number : 2
Provider 0: id: 0x23c cap: 0x0 crtcs: 0 outputs: 0 associated providers: 0 name:NVIDIA-0
Provider 1: id: 0x46 cap: 0x2, Sink Output crtcs: 3 outputs: 4 associated providers: 0 name:modesetting
root@shawn-Inspiron-7590:~# DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0 xrandr --setprovideroutputsource modesetting NVIDIA-0
X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  140 (RANDR)
  Minor opcode of failed request:  35 (RRSetProviderOutputSource)
  Value in failed request:  0x23c
  Serial number of failed request:  16
  Current serial number in output stream:  17

What’s worse, after running the commands you offered, I can no longer get back to the earlier graphical console on the Intel card by uninstalling the NVIDIA driver with apt-get remove --purge nvidia*.
I tried entering recovery mode and turning on the failgraphics mode; the graphical console does show up, but I end up in a login loop.
So could you please tell me how to get back to the normal graphical console? There are still some important files and applications on this Ubuntu install.

Ok, there’s something in dmesg I overlooked before:

[    4.891935] nvidia-modeset: Version mismatch: nvidia.ko(418.87.00) nvidia-modeset.ko(418.67)

which makes it clear why this fails.
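
For readers checking their own logs, the same mismatch line can be picked out of the kernel log automatically. This is a rough sketch that assumes the message has exactly the format shown above; nvidia_mismatch is a hypothetical helper name.

```shell
# Hypothetical helper: scan kernel log text on stdin for the
# "nvidia-modeset: Version mismatch" line and print both module versions.
# Prints nothing when no such line is present.
nvidia_mismatch() {
    sed -n 's/.*nvidia-modeset: Version mismatch: nvidia\.ko(\([^)]*\)) nvidia-modeset\.ko(\([^)]*\)).*/nvidia.ko=\1 nvidia-modeset.ko=\2/p'
}

# Example using the dmesg line quoted above:
echo '[    4.891935] nvidia-modeset: Version mismatch: nvidia.ko(418.87.00) nvidia-modeset.ko(418.67)' | nvidia_mismatch
```

On a live system the input would be the output of dmesg. Two different version numbers mean kernel modules from two driver installs are mixed.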
This raises a different question, though:
You installed 418.88 from the .run installer. You also installed 418.67 and 418.87, which are Tesla drivers. Where did you get them from? Did you also install cuda?
The Ubuntu graphics drivers ppa provides 418.56 and 430.40 for 16.04:
https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa
So you will have to get rid of the installed drivers first, e.g.

sudo apt purge nvidia-*

then add the ppa:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update

and install the driver

sudo apt install nvidia-430
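
Between the purge and the reinstall it may be worth confirming that nothing is left behind. The filter below is only a sketch: it assumes the usual dpkg -l column layout (status, name, version), and both the function name leftover_nvidia_pkgs and the name prefixes it matches are illustrative guesses, not an exhaustive list.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and list packages that
# are installed (ii) or removed with config files remaining (rc) and whose
# names start with nvidia, libcuda or cuda. Prints "name version" per line.
leftover_nvidia_pkgs() {
    awk '$1 ~ /^(ii|rc)$/ && $2 ~ /^(nvidia|libcuda|cuda)/ { print $2, $3 }'
}

# Intended use on a real system (prints nothing when the purge was complete):
#   dpkg -l | leftover_nvidia_pkgs
```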

Thanks for your reply. 418.87 was installed through the ppa. I don’t know where 418.67 came from; maybe from when I tried to install cuda to fix this problem. The problem now is that I can’t uninstall the driver after running the commands you provided earlier. I guess that’s because I changed the X server configuration? Could you please tell me how to get past this? The error output is shown below:

E: Unable to locate package nvidia-bug-report-ppa.log.gz
E: Couldn't find any package by glob 'nvidia-bug-report-ppa.log.gz'
E: Couldn't find any package by regex 'nvidia-bug-report-ppa.log.gz'
E: Unable to locate package nvidia-bug-report-runfile.log.gz
E: Couldn't find any package by glob 'nvidia-bug-report-runfile.log.gz'
E: Couldn't find any package by regex 'nvidia-bug-report-runfile.log.gz'

BTW, I remember seeing a post saying 430 is for kernels above 5.1.0, but my kernel is 4.15.10 since I have to use Ubuntu 16.04, so I doubt whether installing nvidia-430 will work for me.
Besides, I suspect there will be a login loop even if I succeed in installing the driver through the ppa, since that has already occurred in failgraphics mode and with some older kernels.

430 will work for all kernels, 418 only for kernels <5.0.
The error messages you’re seeing are because you have files beginning with “nvidia-” in the directory where you run that command, so the shell expands nvidia-* to those file names before apt ever sees it. Change to an empty directory, then run the command again. Or just run rm nvidia-* beforehand.
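
The expansion is easy to reproduce in a scratch directory. This sketch uses the two attachment names from this thread as stand-in files and echo in place of apt, so nothing is installed or removed.

```shell
# Reproduce the glob expansion in a throwaway directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch nvidia-bug-report-ppa.log.gz nvidia-bug-report-runfile.log.gz

# Unquoted: the shell replaces the pattern with the matching file names,
# which is what apt received in the failing command.
unquoted=$(echo nvidia-*)

# Quoted: the pattern is passed through untouched.
quoted=$(echo 'nvidia-*')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Quoting the pattern, as in sudo apt purge 'nvidia-*', should therefore avoid the problem regardless of the current directory, since apt then does the pattern matching against package names itself.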

Thanks for your reply. I updated the ppa and ran apt-get install nvidia-430, but I got 430.26 rather than 430.40. The problem is that after rebooting I got a black screen with a dialog saying “The system is running in low-graphics mode” and “Your screen, graphics card, and input device settings could not be detected correctly. You will need to configure these yourself”.
Besides, I switched to tty1 and ran nvidia-smi; it worked and showed a table, but said “No running processes found”. So is this a successful installation?

Not necessarily. Previously only the part of the driver responsible for display was broken, so the rest was working. It might be that this is still broken. Please create a new nvidia-bug-report.log.
If you have an internet connection, you can use pastebinit to upload it from the console.

  • install pastebinit (sudo apt install pastebinit)
  • unzip logfile (gunzip nvidia-bug-report.log.gz)
  • upload logfile (pastebinit -i nvidia-bug-report.log)
  • note down and post the url you’re given

Thanks for your reply. I’ve done as you suggest. The url is:
http://paste.ubuntu.com/p/Y526bSZGtn/

Ok, the driver now seems to be installed correctly and consistently, but the Xserver is crashing:

[     5.428] (EE) Backtrace:
[     5.430] (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e) [0x5630c760fe0e]
[     5.430] (EE) 1: /usr/lib/xorg/Xorg (0x5630c745e000+0x1b5b79) [0x5630c7613b79]
[     5.430] (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f97c161e000+0x11390) [0x7f97c162f390]
[     5.430] (EE) 3: /lib/x86_64-linux-gnu/libc.so.6 (0x7f97c1254000+0x14dafa) [0x7f97c13a1afa]
[     5.430] (EE) 4: /usr/lib/nvidia-430/libnvidia-glcore.so.430.26 (0x7f97bd857000+0x118b139) [0x7f97be9e2139]
[     5.430] (EE) 5: /usr/lib/nvidia-430/libnvidia-glcore.so.430.26 (0x7f97bd857000+0x118b29d) [0x7f97be9e229d]
[     5.430] (EE) 6: /usr/lib/nvidia-430/libnvidia-glcore.so.430.26 (0x7f97bd857000+0xe74848) [0x7f97be6cb848]
[     5.430] (EE) 7: /usr/lib/x86_64-linux-gnu/xorg/extra-modules/libglxserver_nvidia.so (0x7f97bb36a000+0x8784a2) [0x7f97bbbe24a2]

Which is a bit odd, since the crash looks like a known issue of the 430 driver that only happens when some non-glvnd compat libs are installed, which should not be the case here. Did you at any time use a 430 .run installer? Otherwise the 16.04 430 driver package is incorrectly packaged.

Yes, I ran a 430 runfile downloaded from the official NVIDIA website a long time (about a month and a half) ago. Is that the reason? What should I do to fix it?
Actually, the driver did work at that time, though it took a lot of effort then too. I first installed 418 and 430 from the ppa; in both cases the driver worked but I got a login loop. I purged them and tried the 430 runfile, which failed (I forget the details). Finally I tried the 418 runfile with the --no-opengl-files option: the X server worked, but nvidia-smi showed no processes and nvidia-settings couldn’t load info from any available system. Then I installed cuda through the ppa, which succeeded, and everything came alive. A few days ago my kernel upgraded itself automatically and the NVIDIA driver broke. I was naive enough to immediately uninstall almost everything (NVIDIA driver, cuda). Now I suspect that the uninstallation was incomplete, and that this is causing the current problems.

Might be. During install there’s a question about installing non-glvnd compat libs; if you chose Y there, that broke it.
Please post the output of
ls -l /usr/lib/libGL*
and
ls -l /usr/lib/nvidia-430/libGL*

Thanks for your reply. The output is too long to type out by hand, so I took a picture and attached it to this reply. Sorry for the inconvenience.
Besides, I don’t recall seeing such an option during any of the installations I’ve been through. (Of course, maybe I just forgot.)

This actually looks like a mistake in the packaging, nothing to do with your previous installation attempts. Maybe this can be fixed manually.
cd /usr/lib/nvidia-430
sudo rm libGL.so.1
sudo ln -s libGL.so.1.7.0 libGL.so.1
then reboot.
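
A slightly more defensive variant of the same fix is sketched below. It shows what the link pointed at before changing it and refuses to create a dangling link. The function name fix_libgl_link is hypothetical; the default paths are simply the ones from the commands above and may differ on other installs.

```shell
# Hypothetical helper: make LINK point at TARGET inside directory DIR,
# reporting the previous link target first.
# Defaults mirror the manual commands: /usr/lib/nvidia-430, libGL.so.1.7.0, libGL.so.1
fix_libgl_link() {
    dir=${1:-/usr/lib/nvidia-430}
    target=${2:-libGL.so.1.7.0}
    link=${3:-libGL.so.1}

    # Refuse to create a dangling link if the target library is absent.
    if [ ! -e "$dir/$target" ]; then
        echo "missing: $dir/$target" >&2
        return 1
    fi

    # Show what the existing link pointed at, if it is a symlink.
    if [ -L "$dir/$link" ]; then
        echo "before: $link -> $(readlink "$dir/$link")"
    fi

    # Remove the old link (or file), then create a relative symlink.
    rm -f "$dir/$link"
    ln -s "$target" "$dir/$link"
    echo "after:  $link -> $(readlink "$dir/$link")"
}
```

For the real directory this has to run with root privileges, for example from the sudo -s prompt used earlier in the thread, followed by a reboot as described.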

Thanks for your reply. This helps, but it doesn’t solve the problem completely.
The good news is that the graphical console comes alive, and nvidia-smi works fine and shows processes correctly.
The bad news is that there is a login loop, as I suspected before.