Asymmetrical and variable H.264 end-to-end latency

rel_28.2.1

The sender, a TX2, generates an H.264 stream using:

gst-launch-1.0 v4l2src device=/dev/video0 ! 'video/x-raw, format=UYVY, width=1920, height=1080, framerate=60/1' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=NV12' ! omxh264enc ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=x.y.z.a port=n sync=false ttl-mc=6

The receiver, a TX2i, displays the stream using:

gst-launch-1.0 udpsrc port=n address=x.y.z.a ! tsdemux ! queue max-size-time=0 max-size-bytes=0 max-size-buffers=0 ! h264parse ! omxh264dec ! nvoverlaysink sync=false

The IGMP multicast group address (x.y.z.a) and port (n) are masked.

The latency is measured using a Raspberry Pi with two light sensors, one driven by a flashing block in the source video and the other by the display monitor. This provides a reasonably accurate measurement every second.

If the receiver is started before the sender, the latency is small and stable, at about 160 milliseconds.

If the sender is started before the receiver, the latency starts at about 280 milliseconds and slowly decreases to 160 milliseconds over about a minute.

I have tried setting iframeinterval=10 and insert-sps-pps=true, without effect.
Changing the queue parameters and even removing the queue has no effect.
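
For reference, the sender was modified roughly like this (both properties are set on omxh264enc; everything else is unchanged):

gst-launch-1.0 v4l2src device=/dev/video0 ! 'video/x-raw, format=UYVY, width=1920, height=1080, framerate=60/1' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=NV12' ! omxh264enc iframeinterval=10 insert-sps-pps=true ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=x.y.z.a port=n sync=false ttl-mc=6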

Clocks are at their defaults.

This post may be relevant: https://devtalk.nvidia.com/default/topic/1032771/jetson-tx2/no-encoder-perfomance-improvement-before-after-jetson_clocks-sh/post/5255605/#5255605

Hi,
You may set async=false on udpsink and nvoverlaysink and test again.
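
For example (a sketch; only the sink settings change, the rest of each pipeline stays the same):

... ! udpsink host=x.y.z.a port=n sync=false async=false ttl-mc=6
... ! nvoverlaysink sync=false async=false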

The patch enables the encoder to always run at the maximum clock. It should also help in this case.

There can be some delay from the source, so we suggest comparing with videotestsrc.
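
For example, a sketch of the sender with the camera replaced by a live test source (assuming videotestsrc can negotiate the same caps):

gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, format=UYVY, width=1920, height=1080, framerate=60/1' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=NV12' ! omxh264enc ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=x.y.z.a port=n sync=false ttl-mc=6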

async=false has no impact.

I do not see how changes to the encoder would be helpful in this particular situation.
The excess delay occurs when the decoder joins an existing stream.
The encoder is then in steady state and generating frames at a constant rate.
The observed delay does not depend on how long the encoder has been running, but only on how long the decoder has been running.
The same behavior is seen when the decoder is repeatedly restarted over several hours.

The problem appears to lie in the decoder's behavior when the first packets it sees fall in the middle of the stream. One would expect it to simply discard the incomplete data and start decoding at the first I-frame.
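
One way to check where the extra delay accumulates (a sketch, untested here) is to run the receiver with -v and identity elements around the decoder, so the arrival of the first buffers before and after decoding is logged:

gst-launch-1.0 -v udpsrc port=n address=x.y.z.a ! tsdemux ! h264parse ! identity name=pre-dec silent=false ! omxh264dec ! identity name=post-dec silent=false ! nvoverlaysink sync=false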

Hi,
UDP may not be reliable. You may try RTSP in TCP mode. A user has shared a patch to modify test-mp4:
https://devtalk.nvidia.com/default/topic/1062748/deepstream-sdk/nvvideoconvert-crashes-on-rtsp-input-src-crop-x-y-w-h-pipeline/post/5387790/#5387790

You may apply it to test-launch and try.
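
For example, a sketch with the test-launch example from gst-rtsp-server (the stream path and IP are illustrative):

# server on the TX2
./test-launch "( videotestsrc is-live=true ! omxh264enc insert-sps-pps=true ! h264parse ! rtph264pay name=pay0 pt=96 )"

# client on the TX2i, forcing TCP transport
gst-launch-1.0 rtspsrc location=rtsp://<sender-ip>:8554/test protocols=tcp ! rtph264depay ! h264parse ! omxh264dec ! nvoverlaysink sync=false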

I require UDP multicast.
RTSP over TCP is not an option.

Hi,
In the UDP case, if the decoder has not received the codec-specific data (SPS/PPS in H.264), it drops the received stream until the codec data is received. This may introduce extra latency.
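
If that is the cause, repeating the parameter sets more often should shorten the time a late-joining receiver waits. A sketch of the sender (assuming h264parse's config-interval property, in seconds, is honored in this pipeline):

gst-launch-1.0 v4l2src device=/dev/video0 ! 'video/x-raw, format=UYVY, width=1920, height=1080, framerate=60/1' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=NV12' ! omxh264enc insert-sps-pps=true ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse config-interval=1 ! mpegtsmux alignment=7 ! udpsink host=x.y.z.a port=n sync=false ttl-mc=6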