Asymmetrical and variable H.264 end-to-end latency

rel_28.2.1

The sender, a TX2, generates an H.264 stream using:

gst-launch-1.0 v4l2src device=/dev/video0 ! 'video/x-raw, format=UYVY, width=1920, height=1080, framerate=60/1' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=NV12' ! omxh264enc ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=x.y.z.a port=n sync=false ttl-mc=6

The receiver, a TX2i, displays the stream using:

gst-launch-1.0 udpsrc port=n address=x.y.z.a ! tsdemux ! queue max-size-time=0 max-size-bytes=0 max-size-buffers=0 ! h264parse ! omxh264dec ! nvoverlaysink sync=false

The IGMP multicast group address (x.y.z.a) and port (n) are masked.

The latency is measured using a Raspberry Pi with two light sensors, one driven by a flashing block in the source video and the other by the display monitor. This provides a reasonably accurate measurement every second.

If the receiver is started before the sender, the latency is small and stable, at about 160 milliseconds.

If the sender is started before the receiver, the latency starts at about 280 milliseconds and slowly decreases to 160 milliseconds over about a minute.

I have tried setting iframeinterval=10 and insert-sps-pps=true, without effect.
Changing the queue parameters and even removing the queue has no effect.
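
For reference, the sender was modified roughly like this (both properties are set on omxh264enc; everything else is unchanged):

gst-launch-1.0 v4l2src device=/dev/video0 ! 'video/x-raw, format=UYVY, width=1920, height=1080, framerate=60/1' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=NV12' ! omxh264enc iframeinterval=10 insert-sps-pps=true ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=x.y.z.a port=n sync=false ttl-mc=6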

Clocks are at their defaults.

This post may be relevant: https://devtalk.nvidia.com/default/topic/1032771/jetson-tx2/no-encoder-perfomance-improvement-before-after-jetson_clocks-sh/post/5255605/#5255605

Hi,
You may set async=false on udpsink and nvoverlaysink and test again.
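
For example (a sketch; only the sink settings change, the rest of each pipeline stays the same):

... ! udpsink host=x.y.z.a port=n sync=false async=false ttl-mc=6
... ! nvoverlaysink sync=false async=false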

The patch enables the encoder to always run at the maximum clock. It should also help in this case.

There can be some delay from the source, so we suggest comparing with videotestsrc.
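
For example, a sketch of the sender with the camera replaced by a live test source (assuming videotestsrc can negotiate the same caps):

gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, format=UYVY, width=1920, height=1080, framerate=60/1' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=NV12' ! omxh264enc ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=x.y.z.a port=n sync=false ttl-mc=6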

async=false has no impact.

I do not see how changes to the encoder would be helpful in this particular situation.
The excess delay occurs when the decoder joins an existing stream.
The encoder is then in steady state and generating frames at a constant rate.
The observed delay does not depend on how long the encoder has been running, but only on how long the decoder has been running.
The same behavior is seen when the decoder is repeatedly restarted over several hours.

The problem appears to lie in the decoder's behavior when the first packets it sees fall in the middle of the stream. One would expect it to simply discard the incomplete data and start decoding at the first I-frame.
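
One way to check where the extra delay accumulates (a sketch, untested here) is to run the receiver with -v and identity elements around the decoder, so the arrival of the first buffers before and after decoding is logged:

gst-launch-1.0 -v udpsrc port=n address=x.y.z.a ! tsdemux ! h264parse ! identity name=pre-dec silent=false ! omxh264dec ! identity name=post-dec silent=false ! nvoverlaysink sync=false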

Hi,
UDP may not be reliable. You may try RTSP in TCP mode. A user has shared a patch to modify test-mp4:
https://devtalk.nvidia.com/default/topic/1062748/deepstream-sdk/nvvideoconvert-crashes-on-rtsp-input-src-crop-x-y-w-h-pipeline/post/5387790/#5387790

You may apply it to test-launch and try.
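
For example, a sketch with the test-launch example from gst-rtsp-server (the stream path and IP are illustrative):

# server on the TX2
./test-launch "( videotestsrc is-live=true ! omxh264enc insert-sps-pps=true ! h264parse ! rtph264pay name=pay0 pt=96 )"

# client on the TX2i, forcing TCP transport
gst-launch-1.0 rtspsrc location=rtsp://<sender-ip>:8554/test protocols=tcp ! rtph264depay ! h264parse ! omxh264dec ! nvoverlaysink sync=false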

I require UDP multicast.
RTSP over TCP is not an option.

Hi,
In the UDP case, if the decoder has not received the codec-specific data (SPS/PPS in H.264), it drops the received stream until the codec data is received. This may introduce extra latency.
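
If that is the cause, repeating the parameter sets more often should shorten the time a late-joining receiver waits. A sketch of the sender (assuming h264parse's config-interval property, in seconds, is honored in this pipeline):

gst-launch-1.0 v4l2src device=/dev/video0 ! 'video/x-raw, format=UYVY, width=1920, height=1080, framerate=60/1' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=NV12' ! omxh264enc insert-sps-pps=true ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse config-interval=1 ! mpegtsmux alignment=7 ! udpsink host=x.y.z.a port=n sync=false ttl-mc=6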