nvidia_drv sometimes segfaults

Hi,

I’m using NVIDIA driver 418.43 on Arch Linux with GNOME. Generally the system works fine, but sometimes the screen is black after boot. I think there are multiple bugs here.

The attached logs show several segfaults and errors:

[drm:nv_drm_fence_context_create_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate fence signaling event
Stack trace of thread 745:
                                          #0  0x00007f192642830d _Unwind_IteratePhdrCallback (libgcc_s.so.1)
                                          #1  0x00007f19282dcfdf dl_iterate_phdr (libc.so.6)
                                          #2  0x00007f1926429456 _Unwind_Find_FDE (libgcc_s.so.1)
                                          #3  0x00007f19264259d4 uw_frame_state_for (libgcc_s.so.1)
                                          #4  0x00007f1926426bd0 uw_init_context_1 (libgcc_s.so.1)
                                          #5  0x00007f1926427a0c _Unwind_Backtrace (libgcc_s.so.1)
                                          #6  0x00007f19282afcc6 __backtrace (libc.so.6)
                                          #7  0x00005609aeae826d xorg_backtrace (Xorg)
                                          #8  0x00005609aeae83a9 n/a (Xorg)
                                          #9  0x00007f19281dde00 __restore_rt (libc.so.6)
                                          #10 0x00007f192822895d _int_free (libc.so.6)
                                          #11 0x00007f19256d0fd4 n/a (nvidia_drv.so)
                                          #12 0x00007f1925bc23c9 n/a (nvidia_drv.so)
Stack trace of thread 1498:
                                           #0  0x00007fa0e3c2dd7f raise (libc.so.6)
                                           #1  0x00007fa0e3c18672 abort (libc.so.6)
                                           #2  0x0000562247ad8b5a OsAbort (Xorg)
                                           #3  0x0000562247acf5cf FatalError (Xorg)
                                           #4  0x0000562247add3ee n/a (Xorg)
                                           #5  0x00007fa0e3c2de00 __restore_rt (libc.so.6)
                                           #6  0x00007fa0e3dfcd56 _dl_fixup (ld-linux-x86-64.so.2)
                                           #7  0x00007fa0e3e037ae _dl_runtime_resolve_xsavec (ld-linux-x86-64.so.2)
                                           #8  0x00007fa0de2d061c n/a (libglamoregl.so)
                                           #9  0x0000562247b5959e n/a (Xorg)
                                           #10 0x0000562247b29b55 n/a (Xorg)
                                           #11 0x0000562247b6e42f RRCrtcSet (Xorg)
                                           #12 0x0000562247b6ec71 ProcRRSetCrtcConfig (Xorg)
                                           #13 0x00007fa0e1619c8d n/a (nvidia_drv.so)

This happens on GNOME 3.32.

Here you can find other info

The problems reported in this issue happened on the same system but with GNOME 3.30, before updating to 3.32.

I hope these issues will be fixed soon,

thanks!

nvidia-bug-report.log.gz (1.13 MB)
journal.log (711 KB)

I have the same problem right after an update, using KDE Plasma 5.15.3 on 64-bit Arch Linux:

NVIDIA driver version 418.56

Logs:

[  949.231271] [drm:nv_drm_fence_context_create_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate fence signaling event                                                                                        
[  949.231402] [drm:nv_drm_fence_context_create_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate fence signaling event                                                                                        
[  949.231621] BUG: scheduling while atomic: Xorg/5354/0x00000003                                                                                                                                                                            
[  949.231622] Modules linked in: ccm snd_hda_codec_hdmi nls_iso8859_1 nls_cp437 vfat fat arc4 ofpart cmdlinepart iwlmvm intel_spi_platform pn544_mei i915 intel_spi spi_nor mac80211 nvidia_drm(POE) mei_phy pn544 nvidia_modeset(POE) hci m
td nfc intel_rapl iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp iwlwifi toshiba_wmi kvm_intel wmi_bmof mxm_wmi btusb kvmgt btrtl vfio_mdev mdev btbcm btintel vfio_iommu_type1 vfio nvidia(POE) bluetooth kvm u
vcvideo snd_usb_audio snd_hda_codec_idt mousedev joydev snd_hda_codec_generic i2c_algo_bit videobuf2_vmalloc irqbypass cfg80211 psmouse snd_usbmidi_lib ledtrig_audio videobuf2_memops input_leds intel_cstate videobuf2_v4l2 snd_rawmidi drm
_kms_helper videobuf2_common snd_seq_device ecdh_generic snd_hda_intel intel_uncore drm snd_hda_codec intel_rapl_perf                                                                                                                        
[  949.231651] audit: type=1701 audit(1554120682.511:1366): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=5354 comm="Xorg" exe="/usr/lib/Xorg" sig=11 res=1                                                                                 
[  949.231652]  snd_hda_core pcspkr snd_hwdep lpc_ich snd_pcm alx ipmi_devintf intel_gtt ipmi_msghandler snd_timer agpgart mei_me mdio syscopyarea snd sysfillrect sysimgblt toshiba_acpi fb_sys_fops i2c_i801 soundcore sparse_keymap toshib
a_bluetooth mei ie31200_edac rfkill industrialio evdev pcc_cpufreq mac_hid wmi battery ac vmmon(OE) vmw_vmci videodev media crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 fscrypto dm_crypt algif_skcipher af_alg hi$
_generic usbhid hid uas usb_storage dm_mod sr_mod sd_mod cdrom crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw atkbd libps2 ahci libahci aesni_intel xhci_pci libata xhci_hcd aes_x86_64 crypto_simd scsi_mod crypt$
 ehci_pci glue_helper ehci_hcd i8042 serio
[  949.231681] Preemption disabled at:          
[  949.231682] [<0000000000000000>]           (null)
[  949.231685] CPU: 7 PID: 5354 Comm: Xorg Tainted: P        W  OE     5.0.5-arch1-1-ARCH #1
[  949.231686] Hardware name: TOSHIBA SATELLITE L50-A-1EH/VG10S, BIOS 1.90 09/19/2014
[  949.231686] Call Trace:                  
[  949.231695]  dump_stack+0x5c/0x80    
[  949.231700]  __schedule_bug.cold.13+0x38/0x51
[  949.231703]  __schedule+0x5d9/0x8b0     
[  949.231714]  ? record_times+0x16/0xb0
[  949.231716]  schedule+0x32/0x80     
[  949.231719]  schedule_timeout+0x311/0x4a0       
[  949.231720]  ? resched_curr+0x23/0xd0
[  949.231721]  ? check_preempt_curr+0x7a/0x90
[  949.231723]  wait_for_common+0x15f/0x190
[  949.231724]  ? wake_up_q+0x70/0x70                                                                                                                                                                                                       
[  949.231728]  do_coredump+0x35d/0xe98    
[  949.231732]  ? generic_perform_write+0x133/0x1c0
[  949.231736]  get_signal+0x3cf/0x6e0         
[  949.231738]  ? page_fault+0x8/0x30             
[  949.231741]  do_signal+0x36/0x650
[  949.231743]  ? _raw_spin_unlock_irqrestore+0x20/0x40
[  949.231745]  ? force_sig_fault+0x59/0x80                                                                                                                                                                                                 
[  949.231746]  ? page_fault+0x8/0x30                                                                                                                                                                                                       
[  949.231748]  exit_to_usermode_loop+0xbf/0xe0                                                                                                                                                                                             
[  949.231750]  prepare_exit_to_usermode+0x64/0x90                                                                                                                                                                                          
[  949.231751]  retint_user+0x8/0x8                                                                                                                                                                                                         
[  949.231754] RIP: 0033:0x7fee2fa6630d                                                                                                                                                                                                     
[  949.231755] Code: 58 b8 ff ff ff ff 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 1f 40 00 48 8b 4f 20 48 3b 0d 1d 6d 00 00 4c 8b 47 28 0f 84 9b 00 00 00 <48> 89 0d 0c 6d 00 00 48 8d 0d d5 6d 00 00 4c 89 05 86 6d 00 00 4c                      
[  949.231756] RSP: 002b:00007ffe89866e70 EFLAGS: 00013212                                                                                                                                                                                  
[  949.231757] RAX: 0000562b1591a040 RBX: 00007ffe89866f20 RCX: 0000000000000058                                                                                                                                                            
[  949.231757] RDX: 00007ffe89866fb0 RSI: 0000000000000040 RDI: 00007ffe89866f20                                                                                                                                                            
[  949.231758] RBP: 0000000000000058 R08: 0000000000000000 R09: 0000000000000000                                                                                                                                                            
[  949.231759] R10: 0000562b1591a000 R11: 0000000000000000 R12: 00007fee31a1d000                                                                                                                                                            
[  949.231760] R13: 00007ffe89867250 R14: 0000000000000000 R15: 00007fee31a1e120       

cpuinfo:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
stepping : 3
microcode : 0x25
cpu MHz : 917.151
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb st
ibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts flush_l1d
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips : 4791.82
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:


uname -a:

Linux hell.dfer.io 5.0.5-arch1-1-ARCH #1 SMP PREEMPT Wed Mar 27 17:53:10 UTC 2019 x86_64 GNU/Linux

On my system the problem seems solved after updating to 418.56; no more login issues since 21/03.

I spoke too soon; the issue happened again twice in one day.
journal.log (319 KB)
nvidia-bug-report.log.gz (1.13 MB)

The call trace is different, but still related.
It usually happens late (my PC had been running for 3 days before a crash), but today it happened right after boot.

Arch Linux with GNOME; maybe related to Wine and fullscreen games.

Last one:

[  217.093121] [drm:nv_drm_fence_context_create_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate fence signaling event
[  217.093204] [drm:nv_drm_fence_context_create_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate fence signaling event
[  217.093527] BUG: scheduling while atomic: Xorg/926/0x00000003
[  217.093528] Modules linked in: fuse rfcomm cmac ccm bnep jc42 nls_iso8859_1 nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal arc4 intel_powerclamp rtl8723be coretemp btcoexist kvm_intel rtl8723_common kvm rtl_pci rtlwifi mac80211 irqbypass btusb btrtl btbcm joydev btintel mousedev cfg80211 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio bluetooth snd_hda_codec_hdmi snd_soc_rt5640 uvcvideo snd_hda_intel ecdh_generic snd_hda_codec i915 crct10dif_pclmul snd_soc_rl6231 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_hda_core snd_soc_core rfkill videodev ofpart snd_compress ac97_bus iTCO_wdt iTCO_vendor_support media r8169 snd_pcm_dmaengine cmdlinepart snd_hwdep crc32_pclmul intel_spi_platform intel_spi snd_pcm snd_timer i2c_algo_bit spi_nor ghash_clmulni_intel snd mei_hdcp intel_gtt realtek aesni_intel mtd mei_me soundcore lpc_ich i2c_i801 aes_x86_64 mei libphy rtsx_pci_ms memstick ie31200_edac crypto_simd cryptd psmouse glue_helper mxm_wmi wmi pcspkr
[  217.093552]  intel_cstate input_leds intel_uncore evdev mac_hid intel_rapl_perf pcc_cpufreq battery ac sg crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 dm_mod sd_mod rtsx_pci_sdmmc mmc_core serio_raw ahci atkbd libps2 libahci libata xhci_pci crc32c_intel ehci_pci scsi_mod xhci_hcd ehci_hcd rtsx_pci i8042 serio nvidia_drm(POE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm agpgart nvidia_uvm(OE) nvidia_modeset(POE) nvidia(POE) ipmi_devintf ipmi_msghandler
[  217.093565] Preemption disabled at:
[  217.093566] [<0000000000000000>]           (null)
[  217.093567] CPU: 3 PID: 926 Comm: Xorg Tainted: P           OE     5.1.15-arch1-1-ARCH #1
[  217.093568] Hardware name: DNS P65_P67SE                       /P65_P67SE                       , BIOS 1.03.09 03/05/2015
[  217.093569] Call Trace:
[  217.093574]  dump_stack+0x5c/0x80
[  217.093577]  __schedule_bug.cold+0x44/0x51
[  217.093579]  __schedule+0x6ee/0x8c0
[  217.093581]  schedule+0x3c/0x80
[  217.093584]  exit_to_usermode_loop+0x89/0xc0
[  217.093585]  do_syscall_64+0x152/0x190
[  217.093587]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  217.093588] RIP: 0033:0x7f0df0a0221b
[  217.093589] Code: 0f 1e fa 48 8b 05 75 8c 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 45 8c 0c 00 f7 d8 64 89 01 48
[  217.093590] RSP: 002b:00007ffc47057948 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  217.093591] RAX: 0000000000000000 RBX: 0000555f2114c300 RCX: 00007f0df0a0221b
[  217.093592] RDX: 00007ffc47057980 RSI: 00000000c02064a5 RDI: 000000000000000d
[  217.093592] RBP: 00007ffc47057980 R08: 0000555f2114bef0 R09: 0000555f2114c0f0
[  217.093593] R10: 0000555f2114bef0 R11: 0000000000000246 R12: 00000000c02064a5
[  217.093593] R13: 000000000000000d R14: 0000555f2114b550 R15: 0000555f211419e0

Old one:

Jun 18 02:24:32 archlinux /usr/lib/gdm-x-session[889]: (EE) client bug: timer event16 keyboard: offset negative (-128ms)
Jun 18 02:24:32 archlinux /usr/lib/gdm-x-session[889]: randr: falling back to unsynchronized pixmap sharing
Jun 18 02:24:32 archlinux /usr/lib/gdm-x-session[889]: (WW) NVIDIA(0): Failed to allocate 1920x1080+0+0 head surface: out of memory.
Jun 18 02:24:32 archlinux kernel: [drm:nv_drm_fence_context_create_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate fence signaling event
Jun 18 02:24:32 archlinux kernel: [drm:nv_drm_fence_context_create_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate fence signaling event
Jun 18 02:24:32 archlinux kernel: BUG: scheduling while atomic: Xorg/891/0x00000003
Jun 18 02:24:32 archlinux kernel: Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device hid_generic usbhid hid ccm fuse rfcomm cmac bnep jc42 nls_iso8859_1 nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm arc4 irqbypass rtl8723be uvcvideo joydev mousedev videobuf2_vmalloc btcoexist rtl8723_common btusb rtl_pci snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic videobuf2_memops videobuf2_v4l2 ledtrig_audio rtlwifi i915 crct10dif_pclmul crc32_pclmul snd_soc_rt5640 videobuf2_common btrtl ghash_clmulni_intel mac80211 btbcm snd_hda_intel videodev btintel snd_soc_rl6231 aesni_intel rtsx_pci_ms r8169 bluetooth media snd_soc_core i2c_algo_bit snd_hda_codec cfg80211 ecdh_generic snd_compress ac97_bus snd_hda_core intel_gtt mei_hdcp rfkill snd_hwdep snd_pcm_dmaengine iTCO_wdt ofpart iTCO_vendor_support memstick snd_pcm mei_me cmdlinepart aes_x86_64 realtek crypto_simd snd_timer intel_spi_platform intel_spi libphy spi_nor snd i2c_i801 soundcore cryptd mtd
Jun 18 02:24:32 archlinux kernel:  mei lpc_ich ie31200_edac glue_helper mxm_wmi psmouse wmi pcspkr intel_cstate input_leds intel_uncore mac_hid pcc_cpufreq evdev intel_rapl_perf ac battery sg crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 dm_mod sd_mod rtsx_pci_sdmmc ahci mmc_core libahci serio_raw atkbd libps2 libata xhci_pci scsi_mod crc32c_intel ehci_pci xhci_hcd rtsx_pci ehci_hcd i8042 serio nvidia_drm(POE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm agpgart nvidia_uvm(OE) nvidia_modeset(POE) nvidia(POE) ipmi_devintf ipmi_msghandler
Jun 18 02:24:32 archlinux kernel: Preemption disabled at:
Jun 18 02:24:32 archlinux kernel: [<0000000000000000>]           (null)
Jun 18 02:24:32 archlinux kernel: CPU: 2 PID: 891 Comm: Xorg Tainted: P           OE     5.1.6-arch1-1-ARCH #1
Jun 18 02:24:32 archlinux kernel: Hardware name: DNS P65_P67SE                       /P65_P67SE                       , BIOS 1.03.09 03/05/2015
Jun 18 02:24:32 archlinux kernel: Call Trace:
Jun 18 02:24:32 archlinux kernel:  dump_stack+0x5c/0x80
Jun 18 02:24:32 archlinux kernel:  __schedule_bug.cold.13+0x38/0x51
Jun 18 02:24:32 archlinux kernel:  __schedule+0x5d9/0x8b0
Jun 18 02:24:32 archlinux kernel:  ? __audit_syscall_exit+0x24b/0x2b0
Jun 18 02:24:32 archlinux kernel:  schedule+0x32/0x80
Jun 18 02:24:32 archlinux kernel:  exit_to_usermode_loop+0x9d/0xe0
Jun 18 02:24:32 archlinux kernel:  do_syscall_64+0x157/0x180
Jun 18 02:24:32 archlinux kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jun 18 02:24:32 archlinux kernel: RIP: 0033:0x7f94d5710cbb
Jun 18 02:24:32 archlinux kernel: Code: 0f 1e fa 48 8b 05 a5 d1 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 d1 0c 00 f7 d8 64 89 01 48
Jun 18 02:24:32 archlinux kernel: RSP: 002b:00007ffcfd04d3c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jun 18 02:24:32 archlinux kernel: RAX: 0000000000000000 RBX: 0000555bf78db300 RCX: 00007f94d5710cbb
Jun 18 02:24:32 archlinux kernel: RDX: 00007ffcfd04d400 RSI: 00000000c02064a5 RDI: 000000000000000d
Jun 18 02:24:32 archlinux kernel: RBP: 00007ffcfd04d400 R08: 0000555bf78daef0 R09: 0000555bf78db0f0
Jun 18 02:24:32 archlinux kernel: R10: 0000555bf78daef0 R11: 0000000000000246 R12: 00000000c02064a5
Jun 18 02:24:32 archlinux kernel: R13: 000000000000000d R14: 0000555bf78da550 R15: 0000555bf78d09e0
Jun 18 02:24:33 archlinux /usr/lib/gdm-x-session[889]: (EE) client bug: timer event16 keyboard: offset negative (-329ms)
Jun 18 02:24:33 archlinux /usr/lib/gdm-x-session[889]: (EE) client bug: timer event16 keyboard: offset negative (-181ms)
Jun 18 02:24:33 archlinux /usr/lib/gdm-x-session[889]: (EE) client bug: timer event16 keyboard: offset negative (-32ms)

Omg can someone look into this please?
Same exact error here. It happens about one in two times I use the

light-locker-command -l

command, and the same with physlock, so it seems to have something to do with TTY switching.
The error is gone once I disable the “nvidia-drm.modeset=1” kernel parameter (set it to 0), but then I get terrible screen tearing.
So I have to either compromise my security by using

dm-tool lock

which can be bypassed with a single key press, or deal with the screen tearing and pretend my computer is a Fallout terminal?
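
For anyone comparing setups, here is a minimal sketch for checking whether nvidia-drm kernel mode setting is currently on. It assumes the nvidia_drm module is loaded and exposes the parameter at /sys/module/nvidia_drm/parameters/modeset; the function name `modeset_state` and its optional path argument are made up for illustration.

```shell
#!/bin/bash
# Print "enabled", "disabled" or "unknown" for nvidia-drm modeset.
# Assumption: the parameter lives at the sysfs path below once nvidia_drm is loaded.
modeset_state() {
    local param_file="${1:-/sys/module/nvidia_drm/parameters/modeset}"
    if [ ! -r "$param_file" ]; then
        echo "unknown"
        return 1
    fi
    # Boolean module parameters are normally shown as Y/N; accept 1/0 as well.
    case "$(cat "$param_file")" in
        Y|1) echo "enabled" ;;
        N|0) echo "disabled" ;;
        *)   echo "unknown" ;;
    esac
}

modeset_state || true
```

If this prints "disabled" while the crashes still occur, the modeset setting is probably not the trigger on that machine.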

I noticed that the issue is gone when I do:

xrandr --output eDP-1-1 --set "PRIME Synchronization" 0

So I ended up writing a fork of light-locker with additional launch parameters --before-lock and --after-lock that execute shell scripts to disable and enable PRIME sync before and after locking, respectively.
https://github.com/sooqua/light-locker/tree/develop
Hope this helps someone.
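
In case the output name is not eDP-1-1 on your machine, here is a rough sketch for listing which outputs expose the property and what it is currently set to. It assumes `xrandr --prop` prints each output at the start of a line and its properties on indented lines below it; the helper name `prime_sync_outputs` is hypothetical.

```shell
#!/bin/bash
# Read `xrandr --prop` output on stdin and print "<output> <value>" for every
# output that has a "PRIME Synchronization" property.
prime_sync_outputs() {
    awk '
        # Output lines start in column 0: "<name> connected ..." or "<name> disconnected ..."
        /^[^ \t]+ (connected|disconnected)/ { output = $1 }
        # Property lines are indented; the last field is the current value.
        /^[ \t]+PRIME Synchronization:/     { print output, $NF }
    '
}

if command -v xrandr >/dev/null 2>&1; then
    xrandr --prop | prime_sync_outputs
fi
```

The name in the first column is what goes after `--output` in the command above.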

Trying it now, thanks for the tip!

Maybe this is obvious, but you need to add a sleep after disabling PRIME sync in the --before-lock script and before enabling it in the --after-lock script, something like:

~/.scripts/sh/before-lock.sh

#!/bin/bash
xrandr --output YOUR_OUTPUT --set "PRIME Synchronization" 0
sleep 1

~/.scripts/sh/after-lock.sh

#!/bin/bash
sleep 1
xrandr --output YOUR_OUTPUT --set "PRIME Synchronization" 1

$ light-locker --lock-after-screensaver=30 --no-late-locking --lock-on-suspend --idle-hint --before-lock "$HOME/.scripts/sh/before-lock.sh" --after-lock "$HOME/.scripts/sh/after-lock.sh"

// PKGBUILD

Out of curiosity, have you tried running xrandr with the --nograb flag, without the sleeps?

--nograb
              Apply  the  modifications without grabbing the screen. It avoids
              to block other applications during the update but it might  also
              cause some applications that detect screen resize to receive old
              values.

Just tried that, Xorg still crashes without sleeps.

It was still crashing from time to time, and (after accidentally f’ing up my whole system by messing up mkinitcpio modules, trying to recover with a live CD and forgetting to umount before rebooting) I think I finally found a fix.
/etc/X11/xorg.conf.d/20-nvidia.conf

Section "Module"
    Load "modesetting"
EndSection

Section "ServerLayout"
    Identifier "layout"
    Screen 0 "nvidia"
    Inactive "intel"
EndSection

Section "Device"
    Identifier "nvidia"
    Driver "nvidia"
    BusID "PCI:1:0:0"
EndSection

Section "Screen"
    Identifier "nvidia"
    Device "nvidia"
    Option "AllowEmptyInitialConfiguration" "Yes"
EndSection

Section "Device"
    Identifier "intel"
    Driver "modesetting"
    BusID "PCI:0:2:0"
    #Option "DRI" "False"
    Option "AccelMethod" "None"
    Option "NoAccel" "True"
    #Option "TearFree" "True"
    #Option "Tiling" "True"
    #Option "SwapbuffersWait" "True"
EndSection

Section "Screen"
    Identifier "intel"
    Device "intel"
EndSection

Replace the BusID values with your own PCI addresses, of course.
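
One thing that is easy to get wrong here: lspci prints addresses in hexadecimal (bus:device.function), while the BusID in xorg.conf is written in decimal. Below is a small sketch of the conversion, assuming the short lspci form without a PCI domain prefix; the helper name `to_xorg_busid` is made up.

```shell
#!/bin/bash
# Convert an lspci address such as "01:00.0" (hex) into an Xorg BusID
# such as "PCI:1:0:0" (decimal).
# Assumption: the address has no domain prefix (i.e. not "0000:01:00.0").
to_xorg_busid() {
    local addr="$1" bus rest dev func
    bus=${addr%%:*}      # part before the colon
    rest=${addr#*:}      # "device.function"
    dev=${rest%%.*}
    func=${rest#*.}
    # printf parses the 0x-prefixed values as hex and prints them as decimal.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$func"
}

# List the graphics devices together with their Xorg-style BusID.
if command -v lspci >/dev/null 2>&1; then
    lspci | grep -Ei 'vga|3d|display' | while read -r addr rest; do
        echo "$(to_xorg_busid "$addr")  $rest"
    done
fi
```

For the config above, `to_xorg_busid 01:00.0` gives PCI:1:0:0 and `to_xorg_busid 00:02.0` gives PCI:0:2:0. The difference only shows up for bus numbers of 0a and above.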

Nope. Still crashes with

Failed to allocate fence signaling event

In addition to 20-nvidia.conf I created
/etc/modprobe.d/blacklist.conf

blacklist nouveau
install nouveau /bin/true

blacklist ttm
install ttm /bin/true

blacklist nvidiafb
install nvidiafb /bin/true

blacklist ipmi_msghandler
install ipmi_msghandler /bin/true

And ran
sudo mkinitcpio -P
Did not experience crashes for the past 3 days. Let’s see how it goes…
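
To double-check after a reboot that the blacklist actually took effect, something like the following could be used. It is only a sketch: it assumes `lsmod` prints a header line followed by one module per line, and the helper name `loaded_blacklisted` is hypothetical.

```shell
#!/bin/bash
# Read `lsmod` output on stdin and print every module from the argument list
# that is still loaded.
loaded_blacklisted() {
    awk -v names="$*" '
        BEGIN {
            n = split(names, list, " ")
            for (i = 1; i <= n; i++) want[list[i]] = 1
        }
        # Skip the "Module Size Used by" header, then match on the module name.
        NR > 1 && ($1 in want) { print $1 }
    '
}

if command -v lsmod >/dev/null 2>&1; then
    lsmod | loaded_blacklisted nouveau ttm nvidiafb ipmi_msghandler
fi
```

No output means none of the four modules is loaded. Any name that does show up was pulled in despite the blacklist.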

Nope, still crashes. NVIDIA please

Check out optimus-manager (Askannz/optimus-manager on GitHub), a Linux program to handle GPU switching on Optimus laptops. Its Configuration section states that you can specify commands to be run before/after loading/unloading modules, which may or may not be helpful. I had the same issue; it even crashes when plugging/unplugging an HDMI cable.

FYI I switched to the linux-lts kernel in Arch and this problem went away (so far)

Which kernel version are you using now? Thanks!

Sorry for the delay. I’ve finally been able to try to repro this. I tried the repro scenario (a laptop with PRIME with synchronization) with both lightdm+light-locker+xfce and gdm+gnome on an updated Arch Linux, including the Arch-distributed linux 5.3.13.1-1 and nvidia 440.36-2. I tried booting/rebooting, logging in/out, and locking/unlocking in all these scenarios and was unable to repro in any of them.

listsw5yds, dfmartinez, sooqua751, or any others, are you still experiencing this?

wpierce, thanks for your test.

It is less frequent now, but it still happens sometimes.

I have an external monitor connected via HDMI. The issue happens more frequently on my laptop (ThinkPad P1) if I plug in the power cord soon after powering up the laptop (before GRUB loads or soon after it loads).