I’m attempting to get very low video latency, say below 80 ms, between two TX1 boards using the provided camera. One TX1 is connected via Ethernet and displays the video; the other TX1 is connected via WiFi and will be encoding and sending the video stream. We have it set up like this because we’ll eventually want to use a TX1 on a hexacopter/drone.
So far I’ve only been able to get the latency down to roughly a quarter of a second using the following commands. Note that I had to build gstreamer myself to get the rtph265pay and rtph265depay elements, as discussed in the “GStreamer RTP H.265 elements missing” thread.
On the display side I run the following via SSH, because that was easier to do, hence exporting the DISPLAY environment variable:
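The original receive command isn’t preserved in this thread; a minimal sketch of the kind of receiver this describes, assuming the stream arrives as RTP/H.265 over UDP on port 5000 (the port and caps are my assumptions, not the poster’s exact values):

```shell
# Running over SSH, so point at the board's local X display first.
export DISPLAY=:0

# Hypothetical receiver: depacketize RTP H.265, decode with the TX1's
# hardware decoder (omxh265dec), and render with the overlay sink.
# sync=false asks the sink not to wait on buffer timestamps, which
# usually shaves latency at the cost of smoothness.
gst-launch-1.0 udpsrc port=5000 \
    caps='application/x-rtp, media=video, encoding-name=H265' \
    ! rtph265depay ! h265parse ! omxh265dec ! nvoverlaysink sync=false
```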
As you may have noticed, I am vertically flipping the video image (the nvvidconv flip-method=6 part) because for whatever reason the camera’s video comes in flipped, and I’m undoing that odd result.
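For reference, a sketch of what the matching sender side typically looks like, assuming the onboard camera via nvcamerasrc and the TX1’s hardware H.265 encoder; the flip-method=6 property mentioned above sits on nvvidconv, and the host address is a placeholder:

```shell
# Hypothetical sender on the camera TX1: capture, un-flip, encode, packetize.
# Replace 192.168.1.10 with the display board's address.
gst-launch-1.0 nvcamerasrc \
    ! 'video/x-raw(memory:NVMM), width=1920, height=1080, framerate=30/1' \
    ! nvvidconv flip-method=6 \
    ! omxh265enc \
    ! rtph265pay config-interval=1 \
    ! udpsink host=192.168.1.10 port=5000
```

config-interval=1 makes the payloader re-send the VPS/SPS/PPS headers periodically, so a receiver that joins mid-stream can still start decoding.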
The overall network latency is low in our test setup since the boards are only about 10 feet from the wireless access point, and it’s on a channel that doesn’t seem to be congested. We’re seeing an average of 3 ms over 50 pings between the boards, with the occasional 150 ms blip you normally see over wireless.
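The 50-ping average can be pulled straight out of ping’s summary line; a small sketch (the target address is a placeholder):

```shell
# ping prints a final summary like:
#   rtt min/avg/max/mdev = 2.1/3.0/150.2/20.9 ms
# Splitting that line on '/' puts the average in field 5.
ping -c 50 192.168.1.10 | tail -1 \
    | awk -F'/' '{ print "avg rtt:", $5, "ms" }'
```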
So there you have it. I’m curious whether anybody has suggestions or ideas on how to reduce the latency down into our target range.
I’ve never associated wireless with low latency. However, I’m wondering if your data uses a lot of small frames?
Basically, TCP with several small amounts of data might delay sending a frame until either a timeout occurs or more data arrives (Nagle’s algorithm), increasing latency. With larger data (consistent with video) I’d instead expect the frame to be fragmented and then reassembled (the reverse of waiting for more data). Should your data be large enough, jumbo frames might actually reduce latency. Whether or not you can set jumbo frames on wireless I don’t know; I’ve not heard of anyone trying. You could test it 100% wired first.
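Testing the jumbo-frame idea on a wired link is straightforward; a sketch, assuming eth0 as the interface name and a placeholder peer address:

```shell
# Raise the MTU on both ends (the switch and both NICs must support jumbo frames).
sudo ip link set dev eth0 mtu 9000

# Verify large packets actually pass without fragmentation:
#   -M do    forbid fragmentation
#   -s 8972  9000-byte MTU minus 20 bytes IP header and 8 bytes ICMP header
ping -M do -s 8972 -c 3 192.168.1.10
```

If the second command reports “message too long”, some hop on the path is still at the default 1500-byte MTU.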
The other thing I’d wonder about is whether your use case can live with some dropped data. TCP’s reliability requires two-way communication to validate whether a resend is needed; on wireless that’s an extra penalty. If you can live with some dropped data, you might try UDP.
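In GStreamer terms the TCP/UDP switch is just swapping the sink/source pair; a sketch, with placeholder addresses, assuming the same RTP/H.265 stream discussed above:

```shell
# Hypothetical UDP sender: drop-tolerant but low latency.
gst-launch-1.0 nvcamerasrc ! 'video/x-raw(memory:NVMM)' ! omxh265enc \
    ! rtph265pay ! udpsink host=192.168.1.10 port=5000 sync=false async=false

# The TCP equivalent replaces the last element with, e.g.:
#   ... ! tcpserversink host=0.0.0.0 port=5000
# and the receiver's udpsrc with tcpclientsrc, trading latency for reliability.
```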
A few things you could do just to get a better understanding of how different features might affect the latency:
Maximise CPU/GPU/EMC clocks.
Try 720p and 480p to see if it affects latency.
Try without nvvidconv flip-method=6 (you could do that more easily on the rendering side anyway).
Try rtpjitterbuffer instead of queue.
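The first and last suggestions above can be sketched as follows; the jetson_clocks script ships with L4T (its path varies by release), and the receiver pipeline is my assumption about what the poster is running:

```shell
# Pin CPU/GPU/EMC clocks to their maximums.
sudo ~/jetson_clocks.sh

# On the receiver, use an rtpjitterbuffer with a small, explicit latency
# budget (milliseconds) instead of a plain queue.
gst-launch-1.0 udpsrc port=5000 \
    caps='application/x-rtp, media=video, encoding-name=H265' \
    ! rtpjitterbuffer latency=10 \
    ! rtph265depay ! h265parse ! omxh265dec ! nvoverlaysink sync=false
```

rtpjitterbuffer defaults to a 200 ms buffer, which by itself would blow the 80 ms budget; setting latency explicitly trades reordering tolerance for delay.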
I’m streaming H264 on a Jetson TK1 from an RC car and I’m also very interested in the latency. I got something like 150 ms end-to-end (I put a blinking LED in front of the camera and measured, with light sensors, the time between the LED turning on and the monitor showing it).
I have found the RTP payloader/depayloader can add significant buffering/latency. At least with H264 I had a much lower-latency pipeline using an mpegts container, aggregating mpegts packets up to the limit of a UDP frame (see: http://tipok.org.ua/taxonomy/term/125) and sending them directly over UDP without any RTP wrapper.
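A sketch of that mpegts-over-UDP approach, adapted here for H.265 with placeholder host/port; mpegtsmux’s alignment property packs seven 188-byte TS packets per buffer (7 × 188 = 1316 bytes), which fits inside a standard 1500-byte UDP frame:

```shell
# Hypothetical sender: wrap the encoded stream in MPEG-TS instead of RTP.
gst-launch-1.0 nvcamerasrc ! 'video/x-raw(memory:NVMM)' ! omxh265enc \
    ! h265parse ! mpegtsmux alignment=7 \
    ! udpsink host=192.168.1.10 port=5000 sync=false

# Matching receiver:
#   gst-launch-1.0 udpsrc port=5000 ! tsdemux ! h265parse \
#       ! omxh265dec ! nvoverlaysink sync=false
```

Note that H.265 in mpegts needs a gstreamer new enough to support it (1.8.0, per the comment below).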
In the case of H265, it is important to mention that the same Jetson TX1 was generating and consuming the stream.
As for how I measured the latency: I captured a stopwatch rendered on my laptop and displayed the captured stream on another screen, then took a picture of both and subtracted the values. In this case I would think that the payloader adds latency; I wonder if one could optimize it using NEON.
I saw in the wiki that you mentioned there is no gstreamer H265 decoder for the PC… you can actually get a good decoder from the libav gstreamer integration: avdec_h265. I know it is available as of (at least) the 1.7.2 binaries.
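On the PC side, a sketch of a software-decode receiver using avdec_h265 from gst-libav; the port and caps are assumptions matching the RTP examples earlier in the thread:

```shell
# Hypothetical PC receiver: software H.265 decode via the libav plugin.
gst-launch-1.0 udpsrc port=5000 \
    caps='application/x-rtp, media=video, encoding-name=H265' \
    ! rtph265depay ! h265parse ! avdec_h265 \
    ! videoconvert ! autovideosink sync=false
```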
Also, gstreamer 1.8.0 has mpegts support for H265 with proper UDP packetization.
I am trying this now, in a somewhat different scenario: 4 aggregated cellular LTE links, 6 cameras, and a TX2. Will report back here. I need to find a good way to measure latency.
I am trying:
Over aggregated 4 carrier LTE cellular link
Over local WiFi
Streaming from one Jetson to 6 monitors on one PC vs. to 6 independent Raspberry Pi3’s
Using just one 1080p stream and visually comparing against our atomic clock, it ‘looks’ to be about 500 ms.
Which LTE modems/dongles are you using with the Jetson? Do you know of a Jetson-compatible LTE USB dongle that works in Europe? Nvidia doesn’t seem to “officially” support any cellular modem/dongle.