When I call cuGraphicsEGLRegisterImage() from multiple threads, and each thread also has to initialize some buffers, the program crashes with this log: pthread_mutex_lock.c:349: __pthread_mutex_lock_full: Assertion `INTERNAL_SYSCALL_ERRNO (e, __err) != EDEADLK || (kind != PTHREAD_MUTEX_ERRORCHECK_NP && kind != PTHREAD_MUTEX_RECURSIVE_NP)' failed.
Or it gets stuck. If no error occurs, please run it a few more times.
If I don't initialize the buffers, it runs fine.
The demo is in the attachment. You need to change the camera URL yourself.
We hit a different error when testing your application:
...
Link pads from element decodebin4 to nvvconv2.
Link pads from element decodebin0 to nvvconv3.
test: eglframeconsumer.cpp:78: CUeglFrame* EGLFrameConsumer::fetch(): Assertion `eglFrame.eglColorFormat == CU_EGL_COLOR_FORMAT_RGBA' failed.
test: eglframeconsumer.cpp:78: CUeglFrame* EGLFrameConsumer::fetch(): Assertion `eglFrame.eglColorFormat == CU_EGL_COLOR_FORMAT_RGBA' failed.
Aborted (core dumped)
We do not see this error on our side. You can comment out or delete lines 77-82 in eglframeconsumer.cpp; that is our own assertion on the output format. Or you can try an .mp4 file.
We still cannot reproduce this issue.
The application hangs a few seconds after starting.
NvMMLiteBlockCreate : Block : BlockType = 260
Allocating new output: 1280x720 (x 10), ThumbnailMode = 0
OPENMAX: HandleNewStreamFormat: 3528: Send OMX_EventPortSettingsChanged: nFrameWidth = 1280, nFrameHeight = 720
Link pads from element decodebin1 to nvvconv5.
do init.
do init.
do init.
do init.
do init.
do init.
do init.
do init.
We will test this sample on another platform to double check.
Thanks.
The sample does get stuck. By “hang”, do you mean “stuck”? I think it may be a deadlock. This is just one of the symptoms; if you try more times, or create more threads, you will also see “pthread_mutex_lock.c:349: __pthread_mutex_lock_full: Assertion `INTERNAL_SYSCALL_ERRNO (e, __err) != EDEADLK || (kind != PTHREAD_MUTEX_ERRORCHECK_NP && kind != PTHREAD_MUTEX_RECURSIVE_NP)' failed.”
We have also tried JetPack 3.3, and the same thing happens there. Could you explain why the program gets stuck? When the program crashes, we can restart it, but when it gets stuck, we cannot tell that it is in an abnormal state, so we cannot restart it.
There is a possible cause, but it still needs further investigation:
The deadlock occurs when tid1 reads the mutex owner ID at the moment another thread is updating it.
1. Two threads compete for a userspace mutex, and their thread IDs differ only in the lowest byte:
Ex.
tid1 = 0xaabbcc00
tid2 = 0xaabbcc01
2. Scenario
STEP1. Thread2 tid=0xaabbcc01 acquires the mutex
STEP2. Thread1 tid=0xaabbcc00 attempts to acquire mutex
STEP3. libpthread detects the mutex is not free
STEP4. libpthread invokes FUTEX_LOCK_PI from tid=0xaabbcc00
STEP5. Thread2 tid=0xaabbcc01 releases the mutex
3. Race occurs
STEP1. mutex tid is 0xaabbcc01
STEP2. CPU X can start reading the userspace mutex owner field byte by byte
STEP3. CPU Y updates the 4 bytes of the mutex tid to 0x00000000 ← the update lands while CPU X is partway through reading the owner field
STEP4. CPU X reads the remaining bytes from the new value
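The race steps above can be illustrated with a small single-threaded simulation. This is not libpthread or kernel code: the function name `torn_read` is hypothetical, and the order in which the bytes are read (three high bytes before the update, lowest byte after it) is an assumption chosen to match the example thread IDs.

```cpp
#include <cstdint>

// Deterministic simulation of a torn 4-byte read of the mutex owner field.
// Assumption: the reader picks up the three high bytes before the owner
// field is rewritten and the lowest byte afterwards.
std::uint32_t torn_read(std::uint32_t old_owner, std::uint32_t new_owner) {
    std::uint32_t high = old_owner & 0xFFFFFF00u;  // bytes read before the update
    std::uint32_t low  = new_owner & 0x000000FFu;  // byte read after the update
    return high | low;                             // value the reader ends up with
}
```

With the old owner 0xaabbcc01 (tid2) and the new value 0x00000000, `torn_read` returns 0xaabbcc00, which equals tid1. Under this assumption thread 1 would appear to already own the mutex it is trying to lock, which would be consistent with the EDEADLK assertion in the log.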
We are still checking internally for a possible solution and will update you later.
In a multithreaded program, cuEGLStreamConsumerConnect and cuEGLStreamConsumerDisconnect may also hang when we connect and disconnect frequently. In our project cameras can be added or removed at runtime, so we have to connect and disconnect this way.
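One experiment that could narrow this down is to funnel every connect, disconnect, and register call through a single process-wide lock, so that application threads never enter those calls concurrently. The sketch below is untested against this issue and is not a confirmed fix; `egl_call_mutex` and `serialized` are hypothetical names, and the driver may still contend on its own internal mutex from other threads.

```cpp
#include <mutex>
#include <utility>

// Hypothetical process-wide lock shared by all EGL/CUDA interop call sites.
inline std::mutex& egl_call_mutex() {
    static std::mutex m;  // initialized once, thread-safe since C++11
    return m;
}

// Runs `fn` while holding the shared lock and returns whatever it returns.
template <typename F>
auto serialized(F&& fn) -> decltype(fn()) {
    std::lock_guard<std::mutex> lock(egl_call_mutex());
    return std::forward<F>(fn)();
}
```

A call site would wrap the existing call in a lambda, for example `serialized([&] { return cuEGLStreamConsumerDisconnect(&conn); })`, where `conn` stands for the application's own connection handle. If the hang disappears with this in place, that would suggest the problem depends on concurrent entry into these calls.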