NVENC fails on latest ArchLinux NVIDIA driver 430.26

Bug occurs with latest nvidia packages on ArchLinux:
nvidia-dkms 430.14-6
nvidia-utils 430.14-1
lib32-nvidia-utils 430.14-1
opencl-nvidia 430.14-1

When using NVENC to encode H.264 video with FFmpeg or OBS Studio, the program crashes with a segfault when the packages listed above are installed. The bug does not occur after downgrading them to version 418.74-1.
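One way to check whether a given driver is affected is to run a short NVENC encode and look at how ffmpeg exits. The sketch below is an assumption, not the command used in this report: it encodes ffmpeg's synthetic `testsrc` input with `h264_nvenc`, discards the output, and relies on Python reporting a process killed by a signal as a negative return code. The function names are made up for illustration.

```python
import shutil
import signal
import subprocess


def build_nvenc_test_command(seconds=5):
    """Return an ffmpeg argv that encodes a synthetic clip with h264_nvenc.

    The lavfi testsrc input and the null output are illustrative choices;
    the original report used a real test video that is not reproduced here.
    """
    return [
        "ffmpeg", "-hide_banner", "-nostdin",
        "-f", "lavfi",
        "-i", f"testsrc=duration={seconds}:size=1280x720:rate=30",
        "-c:v", "h264_nvenc",
        "-f", "null", "-",
    ]


def describe_exit(returncode):
    """Translate a subprocess return code into a short verdict."""
    if returncode == 0:
        return "ok"
    if returncode == -signal.SIGSEGV:
        return "segfault"
    if returncode < 0:
        return f"killed by signal {-returncode}"
    return f"failed with exit code {returncode}"


def run_nvenc_test():
    """Run the test encode; returns None if ffmpeg is not installed."""
    if shutil.which("ffmpeg") is None:
        return None
    result = subprocess.run(build_nvenc_test_command(), capture_output=True)
    return describe_exit(result.returncode)


if __name__ == "__main__":
    print(run_nvenc_test())
```

Running the same script before and after a driver downgrade gives a quick comparison without involving OBS.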

Note that though I use the “linux-hardened” kernel by default, I’ve tested this with the stock linux kernel too, and the situation is the same.

EDIT: Forgot to mention system specs. Here is a brief summary that may be quicker to parse:
stephen@ArchLighthouse

OS: Arch Linux x86_64
Host: DX4870
Kernel: 5.0.19.a-1-hardened
Uptime: 22 mins
Packages: 1374 (pacman)
Shell: bash 5.0.7
Resolution: 1920x1080, 1920x1080
WM: awesome
Theme: Adwaita [GTK2/3]
Icons: Adwaita [GTK2/3]
CPU: Intel i7-3770 (8) @ 3.900GHz
GPU: NVIDIA GeForce GTX 660 Ti
Memory: 824MiB / 32116MiB

EDIT2: Just realized you can attach the nvidia-bug-report to the post, so I moved the link to the copy uploaded on my site into this edit note: [url]https://seodisparate.com/static/uploads/nvidia-bug-report_2019-05-31.log.gz[/url]
nvidia-bug-report.log.gz (262 KB)

Do you have a backtrace of the segfault?

To get some more detail in the backtrace, I built ffmpeg from its git repo after checking out tag “n4.1.3”.

[url]https://seodisparate.com/static/uploads/backtrace_ffmpeg_nvenc_2019-06-01.txt[/url]

EDIT: I can confirm that ffmpeg can successfully process the test video when using the downgraded packages.
backtrace_ffmpeg_nvenc_2019-06-01.txt (5.53 KB)

Just tested OBS/ffmpeg again with NVENC on the latest NVIDIA drivers on ArchLinux; anything using NVENC still crashes with a segmentation fault. Note that I’m using an NVIDIA GeForce GTX 660 Ti graphics card. Maybe NVENC support for this card has been dropped?

nvidia-dkms 430.26-1
nvidia-utils 430.26-1
lib32-nvidia-utils 430.26-1

Can confirm: ArchLinux, GTX 760, 430.26.

I am also having the exact same problem. Completely up to date on Arch. Card is a GTX 770. Hitting the record or stream button with NVENC selected causes the program (OBS) to close immediately with no entry into the logs.

Hi.
Thanks for reporting the issue, and sorry for the inconvenience caused.
We are able to reproduce the issue internally and will look into this.

For future reference, this is tracked as 200528432.

Thanks.

I can reproduce the OBS-based issue on OpenSUSE Tumbleweed with OBS Studio from both the Packman repository and the AppImage ([url]https://github.com/probonopd/obs-studio/releases/download/continuous/OBS_Studio-987ccdd-x86_64.AppImage[/url]).

Choose NVENC in Recording, start recording and it segfaults.

Driver version: 430.26-14.1, from NVIDIA repos for Tumbleweed. Card: GeForce GTX 760

mandar_godse,
Try the AppImage above on any Linux distro with 430.* driver and 760-770 cards, you should be able to reproduce with at least some, judging by answers above.

UPD. sorry, I read that as “we are not able to reproduce”. Disregard.

Any update on this? Kind of crazy that it’s still broken.

Please verify with the latest 435 release driver.

https://devtalk.nvidia.com/default/topic/1060977/announcements-and-news/-linux-solaris-and-freebsd-driver-435-17-beta-release-/

Thanks.

I just built some ArchLinux NVIDIA packages after editing their PKGBUILDs, namely:
nvidia-dkms (and nvidia from same split ArchLinux package)
nvidia-utils (and opencl-nvidia from same split ArchLinux package)
lib32-nvidia-utils (and lib32-opencl-nvidia from same split ArchLinux package)

I mainly changed pkgver to “435.17” and pkgrel to “1”. I also had to change a few more things in the nvidia-utils PKGBUILD to get it to work: “nvidia_icd.json.template” no longer exists, but the PKGBUILD tries to patch it with sed and also tries to install it in the install section, so I replaced “nvidia_icd.json.template” with “nvidia_icd.json” for that install line.
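The edits described above can be sketched as a small text transformation. This is only an illustration: the `bump_pkgbuild` helper and the shortened `SAMPLE` fragment are hypothetical and do not reproduce the real Arch PKGBUILD, and the sketch covers only the pkgver/pkgrel bump and the install-line change, not the other adjustments mentioned.

```python
import re


def bump_pkgbuild(text, pkgver, pkgrel="1"):
    """Return PKGBUILD text with pkgver/pkgrel replaced and the ICD install line fixed."""
    text = re.sub(r"^pkgver=.*$", lambda m: f"pkgver={pkgver}", text, flags=re.MULTILINE)
    text = re.sub(r"^pkgrel=.*$", lambda m: f"pkgrel={pkgrel}", text, flags=re.MULTILINE)
    fixed_lines = []
    for line in text.split("\n"):
        # Only install lines are touched, matching the change described in the post.
        if line.lstrip().startswith("install ") and "nvidia_icd.json.template" in line:
            line = line.replace("nvidia_icd.json.template", "nvidia_icd.json")
        fixed_lines.append(line)
    return "\n".join(fixed_lines)


# Hypothetical, heavily shortened PKGBUILD fragment used only for illustration.
SAMPLE = """pkgver=430.26
pkgrel=1
package() {
    install -D -m644 nvidia_icd.json.template "${pkgdir}/usr/share/vulkan/icd.d/nvidia_icd.json"
}
"""

if __name__ == "__main__":
    print(bump_pkgbuild(SAMPLE, "435.17"))
```

After a version bump like this, the source checksums in a real PKGBUILD would also need to be regenerated before building.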

After installing the updated packages with the new drivers, I tested encoding with nvenc and decoding with nvdec.

obs-studio no longer crashes when encoding, and neither does ffmpeg. mpv can also successfully play video decoded with nvdec.

I think it’s safe to say the new drivers fix this issue.

Thanks for the fix

I am also having the exact same problem.

Ubuntu 18.04.3
ffmpeg 4.2
Driver Version: 435.21
CUDA Version: 10.1
Nvenc version 9.1

Capture Device: DeckLink Mini Recorder 4K
GPU: NVidia GeForce GTX 1660

Encoding crashes without ffmpeg logging any error.

[decklink @ 0x55dc42b4ac00] Frame received (#17250) - Valid (5529600B) - QSize 0.000000MB
frame=17251 fps= 25 q=26.0 size=  420858kB time=00:11:29.95 bitrate=4997.0kbits/s speed=   1x
frame=17264 fps= 25 q=26.0 size=  421170kB time=00:11:30.50 bitrate=4996.7kbits/s speed=   1x
[decklink @ 0x55dc42b4ac00] Frame received (#17275) - Valid (5529600B) - QSize 0.000000MB
frame=17277 fps= 25 q=26.0 size=  421484kB time=00:11:31.01 bitrate=4996.7kbits/s speed=   1x
[decklink @ 0x55dc42b4ac00] Frame received (#17300) - Valid (5529600B) - QSize 63.378754MB
[decklink @ 0x55dc42b4ac00] Frame received (#17325) - Valid (5529600B) - QSize 195.402374MB
[decklink @ 0x55dc42b4ac00] Frame received (#17350) - Valid (5529600B) - QSize 327.425995MB
[decklink @ 0x55dc42b4ac00] Frame received (#17375) - Valid (5529600B) - QSize 459.449615MB
[decklink @ 0x55dc42b4ac00] Frame received (#17400) - Valid (5529600B) - QSize 591.473236MB
[decklink @ 0x55dc42b4ac00] Frame received (#17425) - Valid (5529600B) - QSize 723.496857MB
[decklink @ 0x55dc42b4ac00] Frame received (#17450) - Valid (5529600B) - QSize 855.520477MB
[decklink @ 0x55dc42b4ac00] Frame received (#17475) - Valid (5529600B) - QSize 987.544098MB
[decklink @ 0x55dc42b4ac00] Decklink input buffer overrun!
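The log above shows the capture queue staying empty until frame #17275 and then growing steadily until the buffer overruns. A rough sketch for extracting that pattern is given below; the regular expression is an assumption fitted to the lines in this post, and the helper names are made up for illustration.

```python
import re

# Matches lines such as:
# [decklink @ ...] Frame received (#17300) - Valid (5529600B) - QSize 63.378754MB
FRAME_RE = re.compile(r"Frame received \(#(\d+)\).*QSize ([\d.]+)MB")

# "Frame received" lines copied from the log in this post.
LOG = """\
[decklink @ 0x55dc42b4ac00] Frame received (#17250) - Valid (5529600B) - QSize 0.000000MB
[decklink @ 0x55dc42b4ac00] Frame received (#17275) - Valid (5529600B) - QSize 0.000000MB
[decklink @ 0x55dc42b4ac00] Frame received (#17300) - Valid (5529600B) - QSize 63.378754MB
[decklink @ 0x55dc42b4ac00] Frame received (#17325) - Valid (5529600B) - QSize 195.402374MB
[decklink @ 0x55dc42b4ac00] Frame received (#17350) - Valid (5529600B) - QSize 327.425995MB
[decklink @ 0x55dc42b4ac00] Frame received (#17375) - Valid (5529600B) - QSize 459.449615MB
[decklink @ 0x55dc42b4ac00] Frame received (#17400) - Valid (5529600B) - QSize 591.473236MB
[decklink @ 0x55dc42b4ac00] Frame received (#17425) - Valid (5529600B) - QSize 723.496857MB
[decklink @ 0x55dc42b4ac00] Frame received (#17450) - Valid (5529600B) - QSize 855.520477MB
[decklink @ 0x55dc42b4ac00] Frame received (#17475) - Valid (5529600B) - QSize 987.544098MB
"""


def queue_samples(log_text):
    """Return (frame_number, queue_size_mb) tuples from decklink log lines."""
    return [(int(n), float(size)) for n, size in FRAME_RE.findall(log_text)]


def first_growth(samples):
    """Return the first sample at which the queue is no longer empty, or None."""
    for frame, size in samples:
        if size > 0:
            return frame, size
    return None


def growth_per_step(samples):
    """Queue size differences between consecutive non-empty samples, in MB."""
    sizes = [size for _, size in samples if size > 0]
    return [round(b - a, 3) for a, b in zip(sizes, sizes[1:])]


if __name__ == "__main__":
    samples = queue_samples(LOG)
    print(first_growth(samples))
    print(growth_per_step(samples))
```

On these lines the queue grows by about 132 MB for every 25 frames logged, roughly 5.28 MB per frame. That is close to the 5529600-byte frame size in the log, which would be consistent with captured frames piling up unconsumed once the encoder stops.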

and syslog:

kernel: [ 1240.080358] NVRM: GPU at PCI:0000:01:00: GPU-6caba61c-b997-efe4-1f42-012b974eadb5
kernel: [ 1240.080362] NVRM: GPU Board Serial Number:
kernel: [ 1240.080369] NVRM: Xid (PCI:0000:01:00): 31, pid=2196, Ch 00000038, intr 00000000. MMU Fault: ENGINE NVENC0 HUBCLIENT_NVENC0 faulted @ 0x0_00001000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
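For anyone collecting these reports, the Xid line can be broken into fields with a short parser. The sketch below is fitted to the single line in this post, so the regular expressions and field names are assumptions rather than a general NVRM log format. NVIDIA's Xid documentation lists Xid 31 as a GPU memory page fault.

```python
import re

XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>\d+), (?P<detail>.*)"
)
FAULT_RE = re.compile(
    r"MMU Fault: ENGINE (?P<engine>\S+) (?P<client>\S+) "
    r"faulted @ (?P<addr>\S+)\. "
    r"Fault is of type (?P<fault>\S+) (?P<access>\S+)"
)

# The Xid line copied from the syslog excerpt in this post.
SAMPLE_LINE = (
    "kernel: [ 1240.080369] NVRM: Xid (PCI:0000:01:00): 31, pid=2196, "
    "Ch 00000038, intr 00000000. MMU Fault: ENGINE NVENC0 HUBCLIENT_NVENC0 "
    "faulted @ 0x0_00001000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ"
)


def parse_xid(line):
    """Return a dict of fields from an NVRM Xid line, or None if it does not match."""
    match = XID_RE.search(line)
    if match is None:
        return None
    info = {
        "pci": match["pci"],
        "xid": int(match["xid"]),
        "pid": int(match["pid"]),
    }
    # MMU fault details are optional; other Xid types carry different text.
    fault = FAULT_RE.search(match["detail"])
    if fault is not None:
        info.update(fault.groupdict())
    return info


if __name__ == "__main__":
    print(parse_xid(SAMPLE_LINE))
```

Here the parsed fields point at the NVENC0 engine performing a virtual read at a low address, which ties the fault to the encoder rather than to the capture card.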