BUG: nvarguscamerasrc Segmentation fault

Hi,

This simple code snippet crashes with a segmentation fault (tested on JetPack 4.3):

https://paste.gnome.org/pw1vudabr

A quick fix would be highly appreciated.

Stacktrace:

#0  0x0000007fb6f233f4 in find_notify (object=0x7fa8ff850e, object=0x7fa8ff850e, data=0x0, notify=0x0, match_notify=0, quark=2577) at gstminiobject.c:369
#1  0x0000007fb6f233f4 in gst_mini_object_set_qdata (object=0x7fa8ff850e, quark=2577, data=0x0, destroy=0x0) at gstminiobject.c:672
#2  0x0000007fb7853948 in  () at /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstnvarguscamerasrc.so
#3  0x0000007fb75bb98c in  () at /usr/lib/aarch64-linux-gnu/libglib-2.0.so.0
#4  0x0000007fa8ff8598 in  ()

hello arne.caspari,

Thanks for sharing the simple code snippet for reference; we’re able to reproduce the issue locally.
We’ll check this internally.

hello arne.caspari,

I’ve reverted the code that triggers the segmentation fault.
Please replace the Argus library with devtalk1070455_Feb10.tar.gz as a temporary solution.
We’ll also investigate the fix, thanks.
devtalk1070455_Feb10.tar.gz (29 KB)

Hello Jerry,

I am having the same problem as Caspari, and I tried your temporary solution. Indeed nvarguscamerasrc no longer segfaults, but it seems I am getting black images (empty buffers) from GStreamer after I set the pipeline to play again. The pipeline I am using for testing is:

nvarguscamerasrc ! nvvidconv ! videoconvert ! video/x-raw, format=RGB ! appsink

Is there anything else I should try?
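
For anyone trying to narrow down a report like this, it may help to tell a zero-length buffer apart from a buffer full of zero bytes. The following is only a minimal sketch, assuming a Python application using the PyGObject GStreamer bindings and samples pulled from the appsink; the helper names `sample_to_bytes` and `classify_frame` are made up for illustration and are not part of any NVIDIA or GStreamer API.

```python
def sample_to_bytes(sample):
    """Copy the payload of a Gst.Sample (e.g. pulled from appsink) into bytes.

    GStreamer is imported lazily so the classifier below can be used
    and tested without the bindings installed.
    """
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    buf = sample.get_buffer()
    ok, info = buf.map(Gst.MapFlags.READ)
    if not ok:
        return b""
    try:
        return bytes(info.data)
    finally:
        buf.unmap(info)


def classify_frame(data, threshold=0):
    """Return 'empty', 'black' or 'ok' for raw frame bytes.

    'black' means every byte is <= threshold (hypothetical default: 0).
    """
    if len(data) == 0:
        return "empty"
    if max(data) <= threshold:
        return "black"
    return "ok"
```

Logging the result of `classify_frame(sample_to_bytes(sample))` for the first few frames after restarting the pipeline would show whether the frames are missing or merely dark.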

hello ricardo.juarez,

I’ve confirmed locally that both the gst pipeline and the Python script in comment #1 work with the temporary approach.
Could you please share the failure messages? May I also know which sensor you’re using?
thanks

So this is very similar if not the exact issue I complained about in R32.1. It seems it still exists in R32.3.1.

What is the time table for a fix for this?

EDIT: Note that in R32.1 not only did nvargus-daemon segfault and/or hang, but there were spurious bus messages that would occur if you restarted a pipeline using GStreamer.

One workaround I found is to recreate the pipeline from scratch on every retry (that sorta works).
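
A rough sketch of that workaround, assuming a Python application using the PyGObject GStreamer bindings: the pipeline string is the test pipeline quoted earlier in this thread, while `run_with_rebuild`, `build_pipeline`, `teardown_pipeline` and the retry/delay values are hypothetical names and numbers chosen for illustration. Note that the teardown still sets the old pipeline to NULL, which is the call reported to hang in this thread, so this is not a fix for the underlying bug.

```python
import time

# Test pipeline quoted earlier in the thread.
PIPELINE = ("nvarguscamerasrc ! nvvidconv ! videoconvert ! "
            "video/x-raw, format=RGB ! appsink")


def _gst():
    """Import and initialise GStreamer lazily."""
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst
    Gst.init(None)
    return Gst


def build_pipeline(description=PIPELINE):
    """Create a brand-new pipeline object from the description."""
    return _gst().parse_launch(description)


def teardown_pipeline(pipeline):
    """Release the pipeline's resources before it is discarded."""
    pipeline.set_state(_gst().State.NULL)


def run_with_rebuild(run_once, build=build_pipeline,
                     teardown=teardown_pipeline, retries=3, delay=1.0):
    """Call run_once(pipeline) with a freshly built pipeline per attempt."""
    last_error = None
    for _ in range(retries):
        pipeline = build()  # never reuse a pipeline from a failed attempt
        try:
            return run_once(pipeline)
        except Exception as exc:
            last_error = exc
        finally:
            teardown(pipeline)
        time.sleep(delay)  # hypothetical pause before the next attempt
    raise RuntimeError("all attempts failed") from last_error
```

Here `run_once` is whatever function sets the pipeline to PLAYING and consumes frames; it should raise an exception when the capture fails so that the next attempt starts from a new pipeline.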

Hi all,
Please try the attached lib to fix the problem.
libgstnvarguscamerasrc_tx2_xaver.so.txt (81.5 KB)

Hello Shane,

It looks like this last update fixes the problem for me. When are you planning to officially release this version of the Argus camera plugin?

The next release should include this fix.

ShaneCCC, is this fix good for the TX2? I would assume so.

The patch for TX2 is at the link below.
The patch for both of them will be included in the next release.

https://devtalk.nvidia.com/default/topic/1062196/jetson-tx2/nvargus-daemon-freeze-hang-on-pipeline-stop-on-r32-1/post/5429629/#5429629

So I am using r32.3.1. How do I determine whether this patch is in my system, and if it is not, how do I get it?

What type of debugging do I need to do to determine why setting my pipeline to NULL is still hanging?

Also, I put in code to kill -9 my app, and now I get Argus errors if I restart my app too quickly (see the Argus errors in my post today).

So does sudo apt update; sudo apt upgrade pull in these new fixes?

Thanks
Terry
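
One way to check at least part of this is to compare the installed plugin with a downloaded attachment byte for byte. The sketch below is only an illustration: the installed path is the one shown in the stack trace at the top of this thread, and `sha256_of` and `same_file` are hypothetical helper names. It can only say whether the installed file is identical to the file you downloaded; it cannot tell whether a given release or an apt upgrade contains the fix.

```python
import hashlib
from pathlib import Path

# Plugin path as it appears in the stack trace earlier in the thread.
INSTALLED = Path(
    "/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstnvarguscamerasrc.so"
)


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(installed, candidate):
    """True if both files have identical contents."""
    return sha256_of(installed) == sha256_of(candidate)
```

For example, `same_file(INSTALLED, "libgstnvarguscamerasrc_tx2_xaver.so.txt")` would report whether the attachment posted above is what is currently installed, assuming the attachment was saved in the current directory.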

So on TX2 with r32.3.1 I get this:
Mar 31 13:16:17 BaseSystem_0_5 nvargus-daemon[6724]: === VideoSystem.notStripped[6771]: Connection closed (7FA53811D0)=== VideoSystem.notStripped[6771]: WARNING: CameraProvider was not destroyed before client connection terminated.=== VideoSystem.notStripped[6771]: The client may have abnormally terminated. Destroying CameraProvider…=== VideoSystem.notStripped[6771]: CameraProvider destroyed (0x7fa078f780)=== VideoSystem.notStripped[6771]: WARNING: Cleaning up 1 outstanding requests…=== VideoSystem.notStripped[6771]: WARNING: Cleaning up 1 outstanding streams…SCF: Error InvalidState: 4 buffers still pending during EGLStreamProducer destruction (propagating from src/services/gl/EGLStreamProducer.cpp, function freeBuffers(), line 305)
Mar 31 13:16:17 BaseSystem_0_5 nvargus-daemon[6724]: SCF: Error InvalidState: (propagating from src/services/gl/EGLStreamProducer.cpp, function ~EGLStreamProducer(), line 50)
Mar 31 13:16:17 BaseSystem_0_5 nvargus-daemon[6724]: === VideoSystem.notStripped[6771]: WARNING: Cleaning up 1 outstanding stream settings…=== VideoSystem.notStripped[6771]: WARNING: Cleaning up 1 outstanding sessions…(NvCameraUtils) Error InvalidState: Mutex not initialized (/dvs/git/dirty/git-master_linux/camera/core_scf/src/services/gl/EGLStreamProducer.cpp:497) (in Mutex.cpp, function lock(), line 79)
Mar 31 13:16:17 BaseSystem_0_5 nvargus-daemon[6724]: SCF: Error BadParameter: Buffer is not pending (in src/services/gl/EGLStreamProducer.cpp, function presentBufferInternal(), line 501)
Mar 31 13:16:17 BaseSystem_0_5 nvargus-daemon[6724]: (NvCameraUtils) Error InvalidState: Mutex has not been initialized (in Mutex.cpp, function unlock(), line 88)
Mar 31 13:16:17 BaseSystem_0_5 nvargus-daemon[6724]: SCF: Error BadParameter: (propagating from src/services/gl/EGLStreamProducer.cpp, function presentBuffer(), line 486)
Mar 31 13:16:17 BaseSystem_0_5 nvargus-daemon[6724]: SCF: Error BadParameter: (propagating from src/components/CaptureContainerImpl.cpp, function returnBuffer(), line 447)
Mar 31 13:16:17 BaseSystem_0_5 nvargus-daemon[6724]: SCF: Error BadParameter: (propagating from src/components/stages/BufferReturnStage.h, function doExecute(), line 43)
Mar 31 13:16:17 BaseSystem_0_5 nvargus-daemon[6724]: SCF: Error BadParameter: Sending critical error event (in src/api/Session.cpp, function sendErrorEvent(), line 990)
Mar 31 13:16:17 BaseSystem_0_5 nvargus-daemon[6724]: (NvCameraUtils) Error InvalidState: Mutex not initialized (/dvs/git/dirty/git-master_linux/camera/core_scf/src/services/gl/EGLStreamProducer.cpp:399) (in Mutex.cpp, function lock(), line 79)
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.709000] host1x 13e10000.host1x: cdma_handle_timeout: timeout: 49 (15700000.vi_2) clientid 390311, HW thresh 239841, done 239842
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.721066] host1x 13e10000.host1x: cdma_handle_timeout: timeout: 48 (15700000.vi_1) clientid 390311, HW thresh 239848, done 239848
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.733222] host1x 13e10000.host1x: cdma_handle_timeout: timeout: 47 (15700000.vi_0) clientid 390311, HW thresh 242617, done 242617
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745298] ---- mlocks ----
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745326]
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745329] ---- syncpts ----
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745340] id 2 (disp_a) min 199687 max 199687 refs 1 (previous client : )
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745346] id 3 (disp_b) min 210233 max 210233 refs 1 (previous client : )
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745352] id 5 (disp_d) min 40 max 40 refs 1 (previous client : )
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745357] id 6 (disp_e) min 1 max 1 refs 1 (previous client : )
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745362] id 7 (disp_f) min 1 max 1 refs 1 (previous client : )
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745367] id 8 (vblank0) min 21493670 max 0 refs 1 (previous client : )
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745374] id 11 (dsi) min 120 max 0 refs 1 (previous client : )
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745379] id 12 (vblank1) min 21441121 max -4 refs 1 (previous client : )
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745396] id 23 (gp10b_507) min 2516690 max 2516690 refs 1 (previous client : )
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745401] id 24 (gp10b_506) min 490 max 490 refs 1 (previous client : )
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745407] id 25 (gp10b_505) min 1478384 max 1478384 refs 1 (previous client : gp10b_505)
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745414] id 28 (gp10b_504) min 6 max 6 refs 1 (previous client : )
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745426] id 35 (15600000.isp_nvargus-daemon_0) min 479708 max 479708 refs 3 (previous client : 15600000.isp_nvargus-daemon_0)
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745432] id 36 (15600000.isp_nvargus-daemon_1) min 247352 max 247352 refs 3 (previous client : 15600000.isp_nvargus-daemon_1)
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745438] id 37 (15600000.isp_nvargus-daemon_2) min 250819 max 250819 refs 3 (previous client : 15600000.isp_nvargus-daemon_2)
Mar 31 13:16:18 BaseSystem_0_5 kernel: [357375.745443] id 38 (15600000.isp_nvargus-daemon_3) min 239842 max 239842 refs 3 (previous client : 15600000.isp_nvargus-daemon_3)
Mar 31 13:16:18 BaseSystem_0_5 systemd[1]: nvargus-daemon.service: Main process exited, code=killed, status=11/SEGV
Mar 31 13:16:18 BaseSystem_0_5 systemd[1]: nvargus-daemon.service: Failed with result ‘signal’.
Mar 31 13:16:18 BaseSystem_0_5 systemd[1]: nvargus-daemon.service: Service hold-off time over, scheduling restart.
Mar 31 13:16:18 BaseSystem_0_5 systemd[1]: nvargus-daemon.service: Scheduled restart job, restart counter is at 1.
Mar 31 13:16:18 BaseSystem_0_5 systemd[1]: Stopped Argus daemon.
Mar 31 13:16:18 BaseSystem_0_5 systemd[1]: Started Argus daemon.
Mar 31 13:17:01 BaseSystem_0_5 CRON[6930]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)

After consulting with Lumenera on whether this is their camera driver problem, my GStreamer app is not reliable. When is nvargus-daemon going to be fixed?

Is there any way to get information on this? How do I determine if the fix is in my system? How should the fix get to my system?

How can I debug why calling gst_element_set_state with NULL hangs?

How and when will this be fixed? It has been over a YEAR!

Terry
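
As a starting point for debugging a hanging state change, the blocking call can be run under a watchdog that reports where the application is stuck. This is a minimal sketch, assuming a Python application like the script in comment #1; `call_with_watchdog` and its timeout are hypothetical. `faulthandler` only shows Python frames, so seeing inside the plugin or nvargus-daemon would still need a native debugger such as the one used for the stack trace at the top of this thread.

```python
import faulthandler
import sys
import threading


def call_with_watchdog(func, timeout=5.0, dump_to=sys.stderr):
    """Run func() in a daemon thread and wait at most `timeout` seconds.

    Returns (True, result) if func finished in time. Otherwise dumps the
    Python stack of every thread to `dump_to` and returns (False, None);
    the stuck thread is left running because it cannot be cancelled.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = func()
        except BaseException as exc:  # re-raised in the caller below
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        faulthandler.dump_traceback(file=dump_to, all_threads=True)
        return False, None
    if "error" in outcome:
        raise outcome["error"]
    return True, outcome.get("value")
```

With PyGObject this could wrap the state change as `call_with_watchdog(lambda: pipeline.set_state(Gst.State.NULL))`, where `pipeline` is the application's own pipeline object.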

Any new information?

Terry

Hi terrysu50z,

Please open a new topic for your issue. Thanks.