Consistent lockup on resume from suspend with 355.11

I normally suspend/resume my desktop when I’m not using it. Since upgrading to 355.06, I’ve seen the system lock up on resume every single time.

This is with a GTX 960, and I get the following Xid, which shows up on the screen on top of white noise:

62, 88e1(903c) 00000000 00000000

Kernel is 4.1.3
nvidia-bug-report.log.gz (135 KB)

Your Xid errors are incomplete.

It should look something like this.

Also, it would be better if you attached the nvidia-bug-report log.

I know what they should look like, but I’m stuck copying down by hand what I can read off the screen of a locked-up machine.

Can you take a pic with your phone? Also you can run the bug report after rebooting.

I’ve added the nvidia-bug-report for whatever it’s worth. I’ll try and take a photo next time but it doesn’t communicate anything meaningful that I didn’t already type - the only information I left out was the PCI bus address.

Try to suspend from a text terminal, e.g. Ctrl + Alt + F2 → login → sudo su - → echo mem > /sys/power/state
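For anyone following along, here is a minimal sketch of that last step as a small shell function. The name suspend_to_ram and its optional file argument are made up for illustration (the argument only exists so the logic can be tried on a scratch file); on a real system it would be run without arguments from a root shell.

```shell
#!/bin/sh
# suspend_to_ram is a hypothetical helper name, not a standard tool.
# The optional argument is only there so the logic can be dry-run on a scratch file.
suspend_to_ram() {
    state_file="${1:-/sys/power/state}"
    # /sys/power/state lists the sleep states the kernel offers, e.g. "freeze mem disk".
    if grep -qw mem "$state_file"; then
        # Writing "mem" asks the kernel to suspend; this needs root on the real sysfs file.
        echo mem > "$state_file"
    else
        echo "the mem sleep state is not listed in $state_file" >&2
        return 1
    fi
}

# Usage, from a root shell on a text console:
#   suspend_to_ram
```

Note that a plain "sudo echo mem > /sys/power/state" would not work, because the redirection is performed by the unprivileged shell; that is presumably why the suggestion switches to a root shell first.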

Same problem on an ASUS ROG G751J with a GeForce 970M on Ubuntu Wily. I’m also unable to pull logs, because I have to hard shut down the machine to recover.

You can run the log generator after rebooting. The purpose of generating the log is to determine the system configuration. Sometimes the Xorg and kernel logs from the previous session are saved and still available.
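A rough sketch of what that could look like after a reboot. The helper name find_xid and the file name prev-kernel.log are made up for illustration, and whether the previous boot's logs survive depends on the distro (journalctl needs systemd with persistent journal storage; the Xorg path assumes the usual /var/log location).

```shell
#!/bin/sh
# find_xid is a hypothetical helper: print the NVIDIA Xid lines found in a
# saved kernel log file passed as the first argument.
find_xid() {
    grep 'NVRM: Xid' "$1"
}

# Typical use after rebooting (commented out because it depends on the system):
#   sudo nvidia-bug-report.sh                # writes nvidia-bug-report.log.gz
#   journalctl -k -b -1 > prev-kernel.log    # kernel log of the previous boot, if kept
#   find_xid prev-kernel.log
#   less /var/log/Xorg.0.log.old             # Xorg log from the previous session
```

If the machine hard-locked before the log was flushed to disk, the Xid line may simply not be there, in which case a photo of the screen is still the only record.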

You don’t seem to be an actual developer? Where can I file a real bug report?

http://nvidia.custhelp.com/app/answers/detail/a_id/44/~/where-can-i-get-support-for-linux-drivers%3F

Mind that they never reply to the e-mails sent to the addresses mentioned on this page.

Same laptop model. On Gentoo Linux, all is rosy with the 352.30 driver. With the 355.30 driver all is fine. With the 355.06-r1 driver I get a hard lock on resume and screen corruption.

I’ll file a bug report against this. It’s quite easy to switch driver versions under Gentoo after all…

It’s exactly the same with 355.11 - so now the official primary release driver is unusable :-/

I’ve been seeing a fair few more near-hangs (display doesn’t come back up, hard drive light blinks ineffectually a few times a minute) on resume from suspend since going to 355 too.

philipl, I see intel_iommu=on iommu=on. Try with the IOMMU off and then test…
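To double-check which of those options the running kernel was actually booted with, something like the sketch below should do. The helper name has_param is made up; its optional second argument only exists so it can be tried against a saved copy of a command line instead of /proc/cmdline.

```shell
#!/bin/sh
# has_param is a hypothetical helper: succeed if the exact parameter given as
# $1 appears on the kernel command line (default: the running kernel's).
has_param() {
    # Split the command line into one parameter per line, then match whole lines only,
    # so that "iommu=on" does not also match "intel_iommu=on".
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Usage:
#   has_param intel_iommu=on && echo "intel_iommu=on is set"
#   has_param iommu=on && echo "iommu=on is set"
```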

Actually, I just reverted to 352 from 355 and I’m still getting the lockups on resume, so maybe they’re due to going from kernel 3.16 to 3.19, which I did around the same time …

I tried turning it off but it didn’t help - same result. It also stopped my NIC working, which surprised me.

Hm. Still getting occasional lockups on resume from suspend with every combination I’ve tried of 3.16/3.19 and 352/355. Between this and the “can’t downclock when three displays are connected” issue, I have to say I’m not very happy with how much switching from Intel Mesa to the nVidia binary driver has hurt my power consumption, given how many basic power-saving capabilities no longer work reliably …

Any idea which logfiles I ought to check to get an idea of what’s going on here? Is any of this stuff being targeted for a fix? Xorg/Wayland-related?

Think I’ve isolated this – checked one of the Xorg logs after a failed resume and it seemed to be attempting to use my motherboard Intel graphics on resume even though it’s explicitly disabled and never loaded by the system otherwise. Not sure but I think it may have to do with my having added GRUB_GFXPAYLOAD_LINUX=keep to /etc/default/grub and installing v86d to get graphical boot back after having switched to nvidia…
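For anyone wanting to check for the same setup, a small sketch follows. The helper name gfxpayload_setting is made up for illustration, and the Xorg log path assumes the usual /var/log location; what the Intel-related lines look like will vary by system.

```shell
#!/bin/sh
# gfxpayload_setting is a hypothetical helper: print the uncommented
# GRUB_GFXPAYLOAD_LINUX line from a GRUB defaults file, if there is one.
gfxpayload_setting() {
    grep '^GRUB_GFXPAYLOAD_LINUX=' "${1:-/etc/default/grub}"
}

# Usage:
#   gfxpayload_setting || echo "GRUB_GFXPAYLOAD_LINUX is not set"
#   grep -i intel /var/log/Xorg.0.log.old    # look for Intel driver activity after a failed resume
```

Changes to /etc/default/grub only take effect once grub.cfg is regenerated (update-grub on Debian/Ubuntu, grub-mkconfig -o /boot/grub/grub.cfg elsewhere).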

I reverted to 352.30 and don’t have the problem, but I do on 352.41

In any case, I’ve confirmed my problem is gone by removing v86d. Not sure why that would only occasionally cause the machine to try to enable the Intel graphics on resume, but good to know it’s not an Nvidia issue.