Newest and beta Linux driver causing segmentation fault (core dumped) on all Skylake platforms

I just recently upgraded to Skylake: a 6700K, a Maximus VIII Hero with the latest BIOS, and a 980 Ti. Any GUI program crashes with a segmentation fault when it exits. Console-based applications do not.

It seems to be hitting users on systems that use NVIDIA hardware with the newest official NVIDIA drivers. With older NVIDIA drivers the issue goes away.

Here are some links to reports of the issue.

https://bbs.archlinux.org/viewtopic.php?pid=1576426
https://bbs.archlinux.org/viewtopic.php?id=200583
https://bbs.archlinux.org/viewtopic.php?id=202545
https://bbs.archlinux.org/viewtopic.php?id=202059

A temporary fix is to downgrade the NVIDIA drivers if you have that option, to compile glibc without --enable-lock-elision, or to try the patch from one of the Arch threads that's meant to work around the lock.

I tried the latest beta drivers with both the 4.2 kernel and the newly released 4.3 kernel and still get the fault when glibc has elision enabled.

Not sure if it's really NVIDIA or if it's Intel; I'm just asking here since NVIDIA seems to be a part of it.

I can confirm this issue, same problem here on my system after hardware upgrade to Skylake.
After a hardware upgrade (new mainboard + CPU) on an existing Linux installation, many applications are crashing and dumping core with a “general protection” trap. Example:
“traps: manjaro-setting[2091] general protection ip:7f10d10397e0 sp:7ffda2e13708 error:0 in libpthread-2.22.so[7f10d1027000+18000]”
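
For anyone comparing their own log lines against the one above, here is a minimal sketch that pulls the fields out of such a "traps:" line and computes how far the faulting instruction pointer lies from the start of the mapping the kernel reports. The regular expression is modelled only on the line quoted in this post, and the names TRAP_RE and parse_trap are made up for illustration. The offset is relative to the reported mapping, which is not necessarily the load base of the library file.

```python
import re

# Modelled on the single trap line quoted in this thread; other kernel
# versions may format the line differently (assumption).
TRAP_RE = re.compile(
    r"traps: (?P<comm>\S+)\[(?P<pid>\d+)\] (?P<kind>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+)"
    r"(?: in (?P<module>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_trap(line):
    """Return a dict describing the trap line, or None if it does not match."""
    m = TRAP_RE.search(line)
    if m is None:
        return None
    info = {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "kind": m["kind"],
        "module": m["module"],
        "offset": None,
    }
    if m["base"] is not None:
        # Distance of the faulting instruction from the start of the mapping.
        info["offset"] = int(m["ip"], 16) - int(m["base"], 16)
    return info


line = (
    "traps: manjaro-setting[2091] general protection ip:7f10d10397e0 "
    "sp:7ffda2e13708 error:0 in libpthread-2.22.so[7f10d1027000+18000]"
)
info = parse_trap(line)
print(info["module"], hex(info["offset"]))  # libpthread-2.22.so 0x127e0
```

If several reports show the same module and the same offset with the same glibc build, that at least suggests they are the same crash. Resolving the offset to a function name would need the matching library binary and its symbols.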

Obviously, applications are affected when closing (trying to remove a lock that doesn’t exist any more?). The number and frequency of core dumps effectively render the system unusable!
My hardware is composed of a MSI H170A GAMING PRO mainboard, Intel Skylake i5-6500 CPU and GTX 950 card.
I tried various system configurations, including kernels 4.2.5 and 4.3.0, and different BIOS versions that loaded microcode versions 0x33 or 0x49. The errors persisted.
I’m using the latest Manjaro package of the “non-free Nvidia 352.55” drivers.
As a last resort, I recompiled the current glibc and lib32-glibc sources with the “--enable-lock-elision=no” option, installed these libs, rebooted, and all such errors are gone!
MSI support hotline says they are using the latest microcode (0x49) that was provided by Intel, and can’t do anything else about it for the time being.

This issue is making it so GDM and GNOME won't even function for me, what a shame.

Seg Faults are random as all hell, I just got one after upgrading a package via pacman.

/tmp/alpm_UKmKWD/.INSTALL: line 1: 21757 Segmentation fault (core dumped) usr/lib/vlc/vlc-cache-gen -f /usr/lib/vlc/plugins

This happens on Ubuntu as well, but only once an alternative has been configured to use the NVIDIA libraries for EGL. For me, the crashes mostly seem to occur when the program quits, but it also causes the lock screen on Plasma 5 to crash in such a way as to always return to the lock screen after typing the password, which makes the system rather unusable.

Some people associated with Arch Linux have collected more information about the bug here: FS#46064 : [nvidia-libgl] segfault when using TSX (__lll_unlock_elision)

There are some processors that have buggy TSX-NI (which always causes the lock elision segfaults), but the people in the thread above determined that this is a different problem that apparently began with the 352 drivers and continues up through 358.16.
nvidia-bug-report.log.gz (103 KB)
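
Since the discussion turns on TSX and microcode revisions, a quick way to see what a given machine reports is to read /proc/cpuinfo. The sketch below assumes the TSX features appear as the "hle" and "rtm" entries on the "flags" line and that the microcode revision is on a "microcode" line, as on x86 Linux kernels. The function name tsx_status and the sample text are made up for illustration and are not a capture from any system in this thread.

```python
def tsx_status(cpuinfo_text):
    """Report TSX flags and microcode revision for the first CPU listed."""
    flags = None
    microcode = None
    for raw in cpuinfo_text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # blank separator lines between CPUs
        key = key.strip()
        if key == "flags" and flags is None:
            flags = set(value.split())
        elif key == "microcode" and microcode is None:
            microcode = value.strip()
    flags = flags or set()
    return {"hle": "hle" in flags, "rtm": "rtm" in flags, "microcode": microcode}


# Hypothetical sample text, shaped like /proc/cpuinfo output.
sample = (
    "processor\t: 0\n"
    "microcode\t: 0x49\n"
    "flags\t\t: fpu vme sse2 avx2 hle rtm\n"
    "\n"
    "processor\t: 1\n"
    "microcode\t: 0x49\n"
    "flags\t\t: fpu vme sse2 avx2 hle rtm\n"
)
status = tsx_status(sample)
print(status)
```

On a real system the input would be the contents of /proc/cpuinfo, for example tsx_status(open("/proc/cpuinfo").read()). A CPU that reports neither flag should not be taking the lock elision path at all, so crashes there would point to something else.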

same here on openSUSE - please ask if you also need a bug report

FYI: I temporarily installed the Mesa-libgl libs over the NVidia ones to get things working for now

FYI: a temporary solution with the current NVidia drivers is the following (checked for openSUSE Leap):

Add

/lib64/noelision

to /etc/ld.so.conf

from Bug 957061 – [nvidia binary] segfault when using TSX (__lll_unlock_elision) (affects plasma5 screen unlocking)
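
If you want to script the ld.so.conf change rather than edit the file by hand, something like the following could work. It is only a sketch: add_ld_path is a made-up helper, it works on the file's text so it can be tried without touching the system, and it appends the directory at the end because the post above does not say where in the file the line should go.

```python
def add_ld_path(conf_text, directory="/lib64/noelision"):
    """Return conf_text with `directory` listed, adding it only if missing."""
    # Compare against each line with any trailing '#' comment removed.
    entries = [ln.split("#", 1)[0].strip() for ln in conf_text.splitlines()]
    if directory in entries:
        return conf_text  # already present, leave the file unchanged
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + directory + "\n"


# Hypothetical starting contents of /etc/ld.so.conf.
sample_conf = "include /etc/ld.so.conf.d/*.conf\n/usr/local/lib64\n"
updated = add_ld_path(sample_conf)
print(updated, end="")
```

Writing the result back to /etc/ld.so.conf needs root, and the linker cache has to be rebuilt with ldconfig before the change takes effect. Already running programs keep the libraries they have loaded, so a re-login or reboot is the simplest way to be sure everything picks up the change.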

See also these 2 bugs for Debian.

libegl1-nvidia: Programs crash due to elisian-unlock on skylake processor with nvidia driver 352.63-1 (experimental)
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=807244

libc6: lock elision hazard on Intel Broadwell and Skylake
https://bugs.debian.org/cgi-bin/bugreport.cgi?archive=no&bug=800574