Freezing after resume GTX 1060

Since installing my new GeForce GTX 1060 (Palit Super Jetstream) I have noticed strange behavior after resuming. On the 375 drivers I got a corrupted screen. With the latest 375 and 378 drivers, the window manager now freezes a few seconds after resuming.

I’m running openSUSE Leap 42.2 with the latest SUSE kernel and KDE. The mainboard is an ASUS Z170-K with the latest BIOS, a Skylake 6700 and 16GB RAM. No problems under Win10. I also tried blacklisting the Nouveau kernel modules and added the suggested kernel command-line options. Nothing helped. Resume doesn’t really work.

cu
Gargi
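When a blacklist is involved, it can help to confirm which GPU kernel modules are actually loaded. The following is a minimal sketch, assuming a Linux system where /proc/modules lists one loaded module per line with the module name as the first field; the function name loaded_gpu_modules is made up for illustration.

```python
def loaded_gpu_modules(proc_modules_text):
    """Return the sorted names of loaded nouveau/nvidia* kernel modules.

    proc_modules_text is the content of /proc/modules, where the first
    whitespace-separated field of each line is the module name.
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    # Keep only the open source driver (nouveau) and the proprietary
    # driver modules, whose names all start with "nvidia".
    return sorted(n for n in names if n == "nouveau" or n.startswith("nvidia"))
```

On the affected machine this could be called as loaded_gpu_modules(open("/proc/modules").read()). If "nouveau" still shows up in the result, the blacklist is not taking effect.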

What’s the make and model # of the RAM you are using?

Hi! I got 2x8GB PC2666 CL15 Kingston HyperX Fury RAM.

HX426C15FBK2/16

Though the HX426C15FBK2/16 kit is qualified by Kingston for use with the ASUS/ASmobile Z170-K…

HX426C15FBK2/16

FURY Memory Black - 16GB Kit* (2x8GB) - DDR4 2666MHz CL15 DIMM
Part Number: HX426C15FBK2/16
Specs: DDR4 , 2666MHz , CL15 , 1.2V , Unbuffered , http://www.kingston.com/dataSheets/HX426C15FBK2_16.pdf
Timings: 2666MHz, 15-17-17, 1.2V

ValueRAM for ASUS/ASmobile Z170-K Motherboard
http://www.kingston.com/us/memory/search?DeviceType=2&Mfr=ASU&Line=Z170-K&Model=93074

…there are QC complaints:

HyperX Fury 16GB (2 x 8GB) DDR4 2666 RAM (Desktop Memory) CL15 XMP Black DIMM (288-Pin) HX426C15FBK2/16-Newegg.com
https://www.newegg.com/Product/Product.aspx?Item=N82E16820104573

Also, the HyperX Fury’s heat spreader obscures the identity of the DIMM’s constituent chips:

A correlation between RAM and GPU problems on DDR4 Intel systems? - NVIDIA Developer Forums
https://devtalk.nvidia.com/default/topic/996607/linux/a-correlation-between-ram-and-gpu-problems-on-ddr4-intel-systems-/

I don’t yet know enough about the interrelationship between DDR4 RAM and graphics cards on Intel systems per se, but IME if the RAM is flaky (for whatever reason) then nothing else is going to work right.

Have you tested those HX426C15FBK2/16s?

Memtest86+ - Advanced Memory Diagnostic Tool
http://www.memtest.org/

What’s more, as RAM operating voltages continue to decline with each successive generation of memory technology (while RAM totals often escalate), the likelihood of bit-flips (which can cause random software errors, silent file corruption and security issues) in non-ECC RAM increases:

The following lecture is by Artem Dinaburg, who works for Raytheon Company, a major U.S. defense contractor:

"It turns out that non-ECC RAM is actually a security risk, as bit flips can be exploited. “Bit-squatting” from Black Hat 2011:

Mar 15, 2013
Blackhat 2011 - Bit-squatting: DNS Hijacking without exploitation - YouTube

Bitsquatting: DNS Hijacking without exploitation
http://dinaburg.org/bitsquatting.html

“…As the graph above shows, ECC RAM has a much lower failure rate than non-ECC RAM. The ~1% failure rate of the Kingston non-ECC RAM is still very, very good (which is why we primarily use Kingston), but the ECC RAM is even better at an average .24% failure rate…”

November 5, 2013
Advantages of ECC Memory - Puget Custom Computers
http://www.pugetsystems.com/labs/articles/Advantages-of-ECC-Memory-520/

Related:

May 13, 2014
ECC and REG ECC Memory Performance - Puget Custom Computers
https://www.pugetsystems.com/labs/articles/ECC-and-REG-ECC-Memory-Performance-560/

IMO anyone who is planning a new build would do well to invest in ECC RAM-supporting hardware.

I already tested the system with memtest right after assembling it. No errors found. Some months later I retested it, and again memtest found no errors. Under Windows there are also no problems. I’m a gamer, so I sometimes do gaming sessions of up to 4 hours without pausing the machine, and it has no problems. I also do normal work on it. I’m not really sure, but I also had a GTX 960 on this board for 3 months, and I can’t remember any problems under Linux like the ones I now get with kernel version 4.x and a Pascal GPU. Also, it is just the resume after sleep; I have no other problems under Linux with my GTX 1060.
So I don’t really believe it is the RAM, but who knows :)

cu
Gargi

Then I suppose we can rule out the RAM.

I have zero experience with OpenSUSE. Someone else will have to chime in.

BTW:

If you have a problem, PLEASE read this first - NVIDIA Developer Forums

This problem persists with the latest beta drivers. I have attached the bug report. Hopefully you can find a fix for it. I also have EFI with Secure Boot on. I’m running Windows 10 Pro on this machine too, so I want to keep EFI and Secure Boot.

Thanks for your help!

cu
Gargi

nvidia-bug-report.log.gz (106 KB)

Hi Gargi, I think the issue only hits when you resume from suspend. I see this in your log:

[ 68.546697] PM: Finishing wakeup.
.
.
[ 99.990662] NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 00000080 00000000 00000005 00000024
[ 99.990753] NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 00000080 00000000 00000005 00000024
.
.

[ 222.786165] nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
[ 224.788261] nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000917e:0:0
[ 226.788304] nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000927c:0:0

We are already tracking this issue under 200273112
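For anyone who wants to check their own kernel log for the same pattern, here is a minimal sketch that lists NVIDIA driver messages logged after a "PM: Finishing wakeup." line, together with how many seconds after the wakeup they appeared. It assumes log lines in the bracketed-timestamp format quoted above; the function name events_after_resume is made up for illustration and is not part of any NVIDIA tool.

```python
import re

# "[ 68.546697] message" -> (timestamp, message)
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")
# "NVRM: Xid (PCI:0000:01:00): 56, ..." -> Xid number
XID_RE = re.compile(r"NVRM: Xid \([^)]+\): (\d+)")


def events_after_resume(log_text):
    """Return (seconds_after_wakeup, xid_or_None, message) tuples.

    Only NVRM Xid lines and nvidia-modeset lines that follow a
    "PM: Finishing wakeup." line are reported.
    """
    wakeup = None
    events = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip lines without a kernel timestamp
        stamp, message = float(match.group(1)), match.group(2)
        if "PM: Finishing wakeup" in message:
            wakeup = stamp  # measure later events from the latest resume
            continue
        if wakeup is None:
            continue  # nothing is reported before the first resume
        xid = XID_RE.search(message)
        if xid:
            events.append((round(stamp - wakeup, 3), int(xid.group(1)), message))
        elif message.startswith("nvidia-modeset:"):
            events.append((round(stamp - wakeup, 3), None, message))
    return events
```

Feeding it the lines quoted above would report the first Xid 56 roughly 31 seconds after the wakeup finished. An empty result does not prove the driver is healthy; it only means none of these particular messages were logged after a resume.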

Yes, absolutely right. It happens only when resuming. Thanks for your help! I’ll wait for one of the next updates then.

Gargi

I’m facing the same issue now.

Intel Core i5-6200U, GeForce 930MX, Linux Mint 18.1, KDE Plasma desktop, driver 381, kernel 4.4.0-75-generic.

There is no problem when using the open source driver. Previously I had Kubuntu 17.04 installed, and I thought the problem was caused by that. I then installed Linux Mint with the KDE desktop, and the same issue persists. Note that I have already tried 375 and 378 as well.

Even if you can’t fix it yet, please at least suggest a workaround. This is very annoying now.

The same issue has been reported in many threads:

https://devtalk.nvidia.com/default/topic/962231/linux/resume-from-suspend-freezes-system-gtx-970-arch-linux-kernel-4-4-4-7-nvidia-370-/post/5061996/#5061996
https://devtalk.nvidia.com/default/topic/982664/linux/resume-from-suspend-broke-after-upgrading-to-a-1060/
https://devtalk.nvidia.com/default/topic/977703/linux/resume-from-suspend-has-issues-since-switched-to-gtx1060/post/5027107/#5027107
https://devtalk.nvidia.com/default/topic/984297/linux/extremely-bad-performance-when-waking-up-from-hibernation-on-gtx1060/

This issue is fixed, and the fix will be available in the next driver release.

Hi Sandip,

Good to hear that. Could you tell us roughly when the next driver will be released?

Please note that I am not using a GTX 1060, but a 930MX on an Asus R558U laptop.

Intel Core i5-6200U, GeForce 930MX, Linux Mint 18.1, KDE Plasma desktop, driver 381, kernel 4.4.0-75-generic

Thanks,
Jones

    Hi Sandip,

    I updated to 381.22, still the same issue. I also tried 375.66. Please let us know what’s going on. It’s been a while…

    Thanks,
    Jones

    I have also tested both versions, and they fix the freezes after resume in my case. I have only one little glitch left: after resume there are some colored dots around the icon text on my KDE desktop. Moving the icons a bit makes the dots disappear, with no negative side effects. See screenshot.

    cu
    Gargi
    disorded.jpg


    Hi Gargi,

    Thanks for the reply. But I still have the same issue.

    Sandip, could you look into this? If you need any info, let me know.

    Hi Jones, did you see any errors in the logs? Can you provide reproduction steps for this issue? Please attach an nvidia bug report as soon as the issue hits. Which desktop environment are you running: KDE, GNOME, Unity or something else? It would be good if you could share a video recording showing this issue. Are you running any application on the desktop?

    Hi Sandip,

    I am running KDE. I am not sure how to interpret syslog, but I have saved it and uploaded it to the folder linked below, together with a recorded video of the issue: https://drive.google.com/open?id=0B3aF7WUKvMO4TE5fVlFobXg3RzA

    Sorry about the late reply. Please look into this asap nevertheless.

    Jones

    Hi Sandip,

    Please note that:
    - suspend works when using Intel graphics (low power) in PRIME settings
    - suspend works when using the open source nouveau driver

    Regards,
    Jones

    There is no NVIDIA-related error in the log. The issue may be somewhere else.

    Hi Sandip/Jones,
    I’m having the same problem, and as with Jones, coming back from suspend/hibernate works well when using Intel (Power Saving Mode) in PRIME settings. The problem occurs when using NVIDIA (Performance Mode).
    It sounds like the machine is coming back, but the screen remains black.
    My system info:

    ASUS laptop
    Intel Core i7-6500U CPU @ 2.50GHz x 4
    GeForce 940M/PCIe/SSE2
    Ubuntu 16.04 (4.10.0-33-generic) [64 bit]
    NVIDIA Driver version: 375.66
    X.Org 11.0

    Thanks for any guidance,
    Boris