DeepStream SDK FAQ

1. Reading the source code is the best way to get the answer.
2. Users of DeepStream 4.0 need a C/C++ background.
3. More resources:
https://docs.nvidia.com/metropolis/

nvinfer / streamMux / DeMux

  1. Gst-Nvinfer source code diagram

  2. How to get original NV12 frame buffer
    https://devtalk.nvidia.com/default/topic/1060956/deepstream-sdk/access-frame-pointer-in-deepstream-app/post/5375214/#5375214
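    In short, you can map the GstBuffer in a pad probe and read the batched NvBufSurface. Below is a minimal sketch along the lines of the linked post; it assumes a buffer probe placed downstream of nvstreammux and needs gstnvdsmeta.h and nvbufsurface.h:

static GstPadProbeReturn
frame_buffer_probe (GstPad * pad, GstPadProbeInfo * info, gpointer u_data)
{
  GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);
  GstMapInfo map;

  if (gst_buffer_map (buf, &map, GST_MAP_READ)) {
    /* With NVMM memory, the mapped data is a batched NvBufSurface */
    NvBufSurface *surface = (NvBufSurface *) map.data;
    NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);

    for (NvDsMetaList * l = batch_meta->frame_meta_list; l; l = l->next) {
      NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l->data;
      /* NV12 frame of this source; on dGPU, dataPtr is device memory and
       * must be copied (e.g. with cudaMemcpy) before CPU access */
      NvBufSurfaceParams *params = &surface->surfaceList[frame_meta->batch_id];
      g_print ("source %u: %ux%u, pitch %u, dataPtr %p\n",
          frame_meta->source_id, params->width, params->height,
          params->pitch, params->dataPtr);
    }
    gst_buffer_unmap (buf, &map);
  }
  return GST_PAD_PROBE_OK;
}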

  3. How to get detection confidence
    https://devtalk.nvidia.com/default/topic/1060849/deepstream-sdk/deepstream-v4-zero-confidence-problem/?offset=2#5372609
    https://devtalk.nvidia.com/default/topic/1058661/deepstream-sdk/nvinfer-is-not-populating-confidence-field-in-nvdsobjectmeta-ds-4-0-/post/5373361/#5373361
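    Once nvinfer attaches object metadata, the confidence is available on NvDsObjectMeta. A minimal sketch, inside a buffer probe downstream of nvinfer (note the linked threads explain that in DS 4.0 some clustering modes leave this field at zero, so a patch or a different cluster mode may be needed):

NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
for (NvDsMetaList * l_frame = batch_meta->frame_meta_list; l_frame;
    l_frame = l_frame->next) {
  NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
  for (NvDsMetaList * l_obj = frame_meta->obj_meta_list; l_obj;
      l_obj = l_obj->next) {
    NvDsObjectMeta *obj_meta = (NvDsObjectMeta *) l_obj->data;
    g_print ("class %d confidence %f\n", obj_meta->class_id,
        obj_meta->confidence);
  }
}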

  4. The nvinfer config option “model-color-format” is defined in nvdsinfer_context.h and parsed in gstnvinfer_property_parser.cpp
    nvinfer supports not only BGR/RGB but also GRAY and other formats.

/**
 * Enum for color formats.
 */
typedef enum
{
    /** 24-bit interleaved R-G-B */
    NvDsInferFormat_RGB,
    /** 24-bit interleaved B-G-R */
    NvDsInferFormat_BGR,
    /** 8-bit Luma */
    NvDsInferFormat_GRAY,
    /** 32-bit interleaved R-G-B-A */
    NvDsInferFormat_RGBA,
    /** 32-bit interleaved B-G-R-x */
    NvDsInferFormat_BGRx,
    NvDsInferFormat_Unknown = 0xFFFFFFFF,
} NvDsInferFormat;
  5. How to deploy a different algorithm on each stream
diff --git a/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp b/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp
old mode 100644
new mode 100755
index c6867c87..cc70840c
--- a/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp
+++ b/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp
@@ -601,8 +601,10 @@ gst_nvinfer_sink_event (GstBaseTransform * trans, GstEvent * event)
     /* New source added in the pipeline. Create a source info instance for it. */
     guint source_id;
     gst_nvevent_parse_pad_added (event, &source_id);
-    nvinfer->source_info->emplace (source_id, GstNvInferSourceInfo ());
-  }
+    if (!nvinfer->process_full_frame && /* source_id is what you want for this SGIE */) {
+        nvinfer->source_info->emplace (source_id, GstNvInferSourceInfo ());
+      }
+    }
 
   if ((GstNvEventType) GST_EVENT_TYPE (event) == GST_NVEVENT_PAD_DELETED) {
     /* Source removed from the pipeline. Remove the related structure. */
@@ -1409,6 +1411,8 @@ gst_nvinfer_process_objects (GstNvInfer * nvinfer, GstBuffer * inbuf,
 
     /* Find the source info instance. */
     auto iter = nvinfer->source_info->find (frame_meta->pad_index);
+
+    /* If the source_id is not found, the object will be ignored */
     if (iter == nvinfer->source_info->end ()) {
       GST_WARNING_OBJECT
           (nvinfer,
  6. How to get/update source_id
    https://devtalk.nvidia.com/default/topic/1062520/deepstream-sdk/getting-source-stream-id-from-nvosd-plugin/post/5380861/#5380861
    How to set source_id in streammux?

  7. FP16 model issue
    If some weights in the model are outside the FP16 range, the UFF parser will fail with an error like the one below:

NvDsInferContext[UID 1]:initialize(): Trying to create engine from model files
ERROR nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger: NvDsInferContext[UID 1]:log(): UffParser: Parser error: bn_conv1/moving_variance: Weight 110542.968750 at index 8 is outside of [-65504.000000, 65504.000000]. Please try running the parser in a higher precision mode and setting the builder to fp16 mode instead.
NvDsInferCudaEngineGetFromTltModel: Failed to parse UFF model

In order to fix this issue, we can apply this patch to the nvinfer source code and build a new libnvds_infer.so to replace the original one:

--- a/src/utils/nvdsinfer/nvdsinfer_context_impl.cpp
+++ b/src/utils/nvdsinfer/nvdsinfer_context_impl.cpp
@@ -1851,7 +1851,7 @@ NvDsInferContextImpl::generateTRTModel(
         }
 
         if (!uffParser->parse(initParams.uffFilePath,
-                    *network, modelDataType))
+                    *network,DataType::kFLOAT))

8. Here’s a simple example CUDA kernel for cropping an image: https://github.com/dusty-nv/jetson-video/blob/master/cuda/cudaCrop.cu

  9. How to deploy the Mask R-CNN model (GitHub - matterport/Mask_RCNN: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow) with a ResNet50 backbone and a classNum change in the h5 model?
    https://devtalk.nvidia.com/default/topic/1031938/deepstream-sdk/converting-mask-rcnn-to-tensor-rt/post/5416100/#5416100

  10. How to disable object detection for different sources?
    https://devtalk.nvidia.com/default/topic/1068016/deepstream-sdk/can-deepstream-select-to-enable-or-disable-object-detection-for-different-sources/post/5410165/#5410165

metadata / msgconv / msgbroker / codec / ds-app
1. Sample of adding metadata
https://devtalk.nvidia.com/default/topic/1061083/deepstream-sdk/attaching-custom-type-metadata-to-gstreamer-buffer-on-src-pad-causing-sudden-crash/post/5374690/#5374690
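The pattern from that post (and from the deepstream-user-metadata-test sample) looks roughly like the sketch below; the guint payload is a hypothetical stand-in for your own struct:

static gpointer
copy_user_meta (gpointer data, gpointer user_data)
{
  NvDsUserMeta *user_meta = (NvDsUserMeta *) data;
  guint *src = (guint *) user_meta->user_meta_data;
  guint *dst = (guint *) g_malloc (sizeof (guint));
  *dst = *src;
  return (gpointer) dst;
}

static void
release_user_meta (gpointer data, gpointer user_data)
{
  NvDsUserMeta *user_meta = (NvDsUserMeta *) data;
  g_free (user_meta->user_meta_data);
  user_meta->user_meta_data = NULL;
}

/* inside a src-pad buffer probe, once per frame_meta: */
NvDsUserMeta *user_meta = nvds_acquire_user_meta_from_pool (batch_meta);
user_meta->user_meta_data = g_malloc0 (sizeof (guint));   /* your payload */
user_meta->base_meta.meta_type = NVDS_USER_META;
user_meta->base_meta.copy_func = copy_user_meta;
user_meta->base_meta.release_func = release_user_meta;
nvds_add_user_meta_to_frame (frame_meta, user_meta);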

2. Sample of customizing gst-dsexample:
https://devtalk.nvidia.com/default/topic/1061422/deepstream-sdk/how-to-crop-the-image-and-save/post/5375174/#5375174

3. Sample config file of running a single RTSP source:
https://devtalk.nvidia.com/default/topic/1058086/deepstream-sdk/how-to-run-rtp-camera-in-deepstream-on-nano/post/5366807/#5366807

4. Sample of accessing NvBufSurface
https://devtalk.nvidia.com/default/topic/1061205/deepstream-sdk/rtsp-camera-access-frame-issue/post/5377678/#5377678

5. Use the GST_PAD_PROBE_DROP macro to drop the buffer in the attached probe.
Refer to Pipeline manipulation for the example:

static GstPadProbeReturn
event_probe_cb (GstPad * pad, GstPadProbeInfo * info, gpointer user_data)
{
    return GST_PAD_PROBE_DROP;
}
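
Attach it to whichever pad should drop its buffers (a minimal sketch; element stands for your own element):

GstPad *pad = gst_element_get_static_pad (element, "src");
gst_pad_add_probe (pad, GST_PAD_PROBE_TYPE_BUFFER,
    event_probe_cb, NULL, NULL);
gst_object_unref (pad);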

6. Add dsexample in the ds-test1 app
https://devtalk.nvidia.com/default/topic/1065406/deepstream-sdk/enable-dsexample-in-test-app/?offset=3#5398407

7. Optical flow
Optical flow is supported only on Jetson AGX Xavier and Turing GPUs (T4 / RTX 2080, etc.). It does not work on Jetson Nano or GTX GPUs.

8. How can we set “drop-frame-interval” greater than 30?
a. Find and download the L4T sources from https://developer.nvidia.com/embedded/downloads#?search=source
gst-nvvideo4linux2_src.tbz2 is in public_sources.tbz2
b. Apply the patches “0001-gstv4l2dec-Fix-high-CPU-usage-in-drop-frame.patch” and “0002-gst-v4l2dec-Increase-Drop-Frame-Interval.patch”
c. Build a new libgstnvvideo4linux2.so and replace /usr/lib/$(ARCH)/gstreamer-1.0/libgstnvvideo4linux2.so

0001-gstv4l2dec-Fix-high-CPU-usage-in-drop-frame.patch

From 5d8d5a0977473eae89c0f310171d2c7060e24eb6 Mon Sep 17 00:00:00 2001
From: vpagar <vpagar@nvidia.com>
Date: Thu, 5 Dec 2019 16:04:02 +0530
Subject: [PATCH 1/2] gstv4l2dec: Fix high CPU usage in drop-frame

In case of drop-frame-interval, in LL v4l2 implementation a
thread in low level v4l2 lib which sends buffer to block and
a callback thread spins between themselves causing high CPU percentage
usage over the period.
This CL drops frame at the gstreamer level and LL v4l2 does not handle
dropping frames.

Unit-Test:
gst-launch-1.0 multifilesrc location= sample_720p.h264 \
! h264parse ! nvv4l2decoder drop-frame-interval=3 ! fakesink
and check CPU percentage usage in htop, it should stay stable.

Bug 200562189

Change-Id: I9af22745501d6a9892c341cb640dac16f8641763
---
 gst-v4l2/gstv4l2videodec.c | 23 ++++++++++++++++++++++-
 gst-v4l2/gstv4l2videodec.h |  1 +
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/gst-v4l2/gstv4l2videodec.c b/gst-v4l2/gstv4l2videodec.c
index 5531f9d..f8c62f2 100644
--- a/gst-v4l2/gstv4l2videodec.c
+++ b/gst-v4l2/gstv4l2videodec.c
@@ -593,6 +593,9 @@ gst_v4l2_video_dec_start (GstVideoDecoder * decoder)
   gst_v4l2_object_unlock (self->v4l2output);
   g_atomic_int_set (&self->active, TRUE);
   self->output_flow = GST_FLOW_OK;
+#if USE_V4L2_TARGET_NV
+  self->decoded_picture_cnt = 0;
+#endif
 
   return TRUE;
 }
@@ -704,6 +707,11 @@ gst_v4l2_video_dec_set_format (GstVideoDecoder * decoder,
     }
   }
 
+#if 0
+  /* *
+   * TODO: From low level library remove support of drop frame interval after
+   * analyzing high CPU utilization in initial implementation.
+   * */
   if (self->drop_frame_interval != 0) {
     if (!set_v4l2_video_mpeg_class (self->v4l2output,
         V4L2_CID_MPEG_VIDEODEC_DROP_FRAME_INTERVAL,
@@ -712,6 +720,7 @@ gst_v4l2_video_dec_set_format (GstVideoDecoder * decoder,
       return FALSE;
     }
   }
+#endif
 #ifndef USE_V4L2_TARGET_NV_CODECSDK
   if (self->disable_dpb != DEFAULT_DISABLE_DPB) {
     if (!set_v4l2_video_mpeg_class (self->v4l2output,
@@ -1141,10 +1150,21 @@ gst_v4l2_video_dec_loop (GstVideoDecoder * decoder)
       gst_caps_unref(reference);
     }
 
-    ret = gst_video_decoder_finish_frame (decoder, frame);
+#if USE_V4L2_TARGET_NV
+    if ((self->drop_frame_interval == 0) ||
+        (self->decoded_picture_cnt % self->drop_frame_interval == 0))
+        ret = gst_video_decoder_finish_frame (decoder, frame);
+    else
+        ret = gst_video_decoder_drop_frame (GST_VIDEO_DECODER (self), frame);
 
     if (ret != GST_FLOW_OK)
       goto beach;
+
+    self->decoded_picture_cnt += 1;
+#else
+    ret = gst_video_decoder_finish_frame (decoder, frame);
+#endif
+
   } else {
     GST_WARNING_OBJECT (decoder, "Decoder is producing too many buffers");
     gst_buffer_unref (buffer);
@@ -1696,6 +1716,7 @@ gst_v4l2_video_dec_init (GstV4l2VideoDec * self)
   self->skip_frames = DEFAULT_SKIP_FRAME_TYPE;
   self->nvbuf_api_version_new = DEFAULT_NVBUF_API_VERSION_NEW;
   self->drop_frame_interval = 0;
+  self->decoded_picture_cnt = 0;
   self->num_extra_surfaces = DEFAULT_NUM_EXTRA_SURFACES;
 #ifndef USE_V4L2_TARGET_NV_CODECSDK
   self->disable_dpb = DEFAULT_DISABLE_DPB;
diff --git a/gst-v4l2/gstv4l2videodec.h b/gst-v4l2/gstv4l2videodec.h
index 50d07c5..5015c30 100644
--- a/gst-v4l2/gstv4l2videodec.h
+++ b/gst-v4l2/gstv4l2videodec.h
@@ -71,6 +71,7 @@ struct _GstV4l2VideoDec
   GstFlowReturn output_flow;
   guint64 frame_num;
 #ifdef USE_V4L2_TARGET_NV
+  guint64 decoded_picture_cnt;
   guint32 skip_frames;
   guint32 drop_frame_interval;
   gboolean nvbuf_api_version_new;
-- 
2.17.1

0002-gst-v4l2dec-Increase-Drop-Frame-Interval.patch

From 52665605036144ac20628c95e52fdd82edae71b9 Mon Sep 17 00:00:00 2001
From: vpagar <vpagar@nvidia.com>
Date: Wed, 11 Dec 2019 11:56:25 +0530
Subject: [PATCH 2/2] gst-v4l2dec: Increase Drop Frame Interval

Bug 200575866

Change-Id: If5576683c0fad95832595838d032d3145b88ea36
---
 gst-v4l2/gstv4l2videodec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gst-v4l2/gstv4l2videodec.c b/gst-v4l2/gstv4l2videodec.c
index f8c62f2..00d7740 100644
--- a/gst-v4l2/gstv4l2videodec.c
+++ b/gst-v4l2/gstv4l2videodec.c
@@ -1807,7 +1807,7 @@ gst_v4l2_video_dec_class_init (GstV4l2VideoDecClass * klass)
           "Drop frames interval",
           "Interval to drop the frames,ex: value of 5 means every 5th frame will be given by decoder, rest all dropped",
           0,
-          30, 30,
+          G_MAXUINT, G_MAXUINT,
           G_PARAM_READWRITE | G_PARAM_STATIC_STRINGS | GST_PARAM_MUTABLE_READY));
 
   g_object_class_install_property (gobject_class, PROP_NUM_EXTRA_SURFACES,
-- 
2.17.1

9. deepstream-app options
Refer to https://devtalk.nvidia.com/default/topic/1069070/deepstream-sdk/labels-disappear-with-multiple-sources/post/5415523/#5415523

nvtracker standalone user sample

https://devtalk.nvidia.com/default/topic/1066252/deepstream-sdk/klt-nvmot-usage/

get nvtracker history

https://devtalk.nvidia.com/default/topic/1061798/deepstream-sdk/how-to-obtain-previous-states-of-tracked-object-/

Fix for a memory accumulation bug in GstBaseParse
A memory accumulation bug was found in GStreamer’s Base Parse class which potentially affects all codec parsers provided by GStreamer. This bug is seen only with long duration seekable streams (mostly containerized files e.g. mp4). This does not affect live sources like RTSP. We have filed an issue on GStreamer’s gitlab project (gstbaseparse: High memory usage in association index for long duration files (#468) · Issues · GStreamer / gstreamer · GitLab).

Temporary fix

  1. Check the exact GStreamer version installed on the system.

$ gst-inspect-1.0 --version

gst-inspect-1.0 version 1.14.5

GStreamer 1.14.5

https://launchpad.net/distros/ubuntu/+source/gstreamer1.0

  2. Clone the GStreamer repo and check out the tag corresponding to the installed version

$ git clone git@gitlab.freedesktop.org:gstreamer/gstreamer.git

$ cd gstreamer

$ git checkout 1.14.5

  3. Make sure build dependencies are installed

$ sudo apt install libbison-dev build-essential flex debhelper

  4. Run autogen.sh and the configure script

$ ./autogen.sh --noconfigure

$ ./configure --prefix=$(pwd)/out # Don’t want to overwrite system libs

  5. Save the following patch to a file
diff --git a/libs/gst/base/gstbaseparse.c b/libs/gst/base/gstbaseparse.c
index 41adf130e..ffc662a45 100644
--- a/libs/gst/base/gstbaseparse.c
+++ b/libs/gst/base/gstbaseparse.c
@@ -1906,6 +1906,9 @@ gst_base_parse_add_index_entry (GstBaseParse * parse, guint64 offset,
   GST_LOG_OBJECT (parse, "Adding key=%d index entry %" GST_TIME_FORMAT
       " @ offset 0x%08" G_GINT64_MODIFIER "x", key, GST_TIME_ARGS (ts), offset);
 
+  if (!key)
+    goto exit;
+
   if (G_LIKELY (!force)) {
 
     if (!parse->priv->upstream_seekable) {
  6. Apply the patch

$ cat patch.txt | patch -p1

  7. Build the sources

$ make -j$(nproc) && make install

  8. Back up the distribution-provided library and copy in the newly built library. Adjust the library name for your version. On Jetson, replace x86_64-linux-gnu with aarch64-linux-gnu.

$ sudo cp /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0.1405.0 ${HOME}/libgstbase-1.0.so.0.1405.0.backup

$ sudo cp out/lib/libgstbase-1.0.so.0.1405.0 /usr/lib/x86_64-linux-gnu/

[DS5.0 xx_All_App] For DS 5.0 DP: how to integrate the nvdsanalytics plugin in the C deepstream-app

  1. Create an nvdsanalytics bin in /opt/nvidia/deepstream/deepstream-5.0/sources/apps/apps-common/src
  2. Refer to deepstream_dsexample.c and create deepstream_nvdsanalytics.c similarly
  3. Modify deepstream_app.h to add the nvdsanalytics bin instance and its config to the structures
  4. Update deepstream_config_file_parser.c to parse the nvdsanalytics config from the configuration file
  5. Update deepstream_app.c to add the nvdsanalytics bin to the pipeline; the ideal location is right after the tracker
  6. Create a new cpp file with a process_meta function declared with extern “C”; this parses the meta for nvdsanalytics. Refer to the sample nvdsanalytics test app probe call for the creation of the function (see the sketch below)
  7. Add the probe in deepstream_app_main.c after the nvdsanalytics bin
  8. Modify the Makefile to compile the cpp file and deepstream_app_main.c with g++ using the -fpermissive flag, and link deepstream-app using g++

These are rough steps; additional modifications in the header files are required.

For DS 5.0 GA, support for meta access will be added.
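
In the meantime, here is a minimal sketch of the process_meta function from step 6, modeled on the deepstream-nvdsanalytics-test probe; the fields come from nvds_analytics_meta.h, and the file should be compiled with g++:

#include <glib.h>
#include "gstnvdsmeta.h"
#include "nvds_analytics_meta.h"

extern "C" void
process_meta (NvDsBatchMeta * batch_meta)
{
  for (NvDsMetaList * l_frame = batch_meta->frame_meta_list; l_frame;
      l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
    for (NvDsMetaList * l_user = frame_meta->frame_user_meta_list; l_user;
        l_user = l_user->next) {
      NvDsUserMeta *user_meta = (NvDsUserMeta *) l_user->data;
      if (user_meta->base_meta.meta_type == NVDS_USER_FRAME_META_NVDSANALYTICS) {
        NvDsAnalyticsFrameMeta *analytics_meta =
            (NvDsAnalyticsFrameMeta *) user_meta->user_meta_data;
        /* per-ROI object counts and cumulative line-crossing counts */
        for (auto & roi : analytics_meta->objInROIcnt)
          g_print ("ROI %s: %u objects\n", roi.first.c_str (), roi.second);
        for (auto & lc : analytics_meta->objLCCumCnt)
          g_print ("Line %s: %lu crossings\n", lc.first.c_str (),
              (gulong) lc.second);
      }
    }
  }
}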

DeepStream 5.0 Manual for YoloV4

  • The original Yolo implementation in DeepStream, via a CUDA kernel, is based on the old Yolo models (v2, v3), so it may not suit new Yolo models like YoloV4. Location: /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/kernels.cu

  • Instead, we embed the Yolo layer into the TensorRT engine while converting the Darknet or PyTorch model into an engine, before deploying to DeepStream. With this new solution, the old Yolo CUDA kernel in DeepStream is no longer used.

You can try the following steps to make DeepStream work with YoloV4:

  1. Go to https://github.com/Tianxiaomo/pytorch-YOLOv4 to generate a TensorRT engine according to this workflow: Darknet or PyTorch → ONNX → TensorRT.
  2. Add the following C++ functions to objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo.cpp and rebuild libnvdsinfer_custom_impl_Yolo.so
  3. Here are configuration files for you as references (you may have to update them a little to suit your environment):
    config_infer_primary_yoloV4.txt (3.4 KB)
    deepstream_app_config_yoloV4.txt (3.8 KB)
static NvDsInferParseObjectInfo convertBBoxYoloV4(const float& bx1, const float& by1, const float& bx2,
                                     const float& by2, const uint& netW, const uint& netH)
{
    NvDsInferParseObjectInfo b;
    // Restore coordinates to network input resolution

    float x1 = bx1 * netW;
    float y1 = by1 * netH;
    float x2 = bx2 * netW;
    float y2 = by2 * netH;

    x1 = clamp(x1, 0, netW);
    y1 = clamp(y1, 0, netH);
    x2 = clamp(x2, 0, netW);
    y2 = clamp(y2, 0, netH);

    b.left = x1;
    b.width = clamp(x2 - x1, 0, netW);
    b.top = y1;
    b.height = clamp(y2 - y1, 0, netH);

    return b;
}

static void addBBoxProposalYoloV4(const float bx1, const float by1, const float bx2, const float by2,
                     const uint& netW, const uint& netH, const int maxIndex,
                     const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
{
    NvDsInferParseObjectInfo bbi = convertBBoxYoloV4(bx1, by1, bx2, by2, netW, netH);
    if (bbi.width < 1 || bbi.height < 1) return;

    bbi.detectionConfidence = maxProb;
    bbi.classId = maxIndex;
    binfo.push_back(bbi);
}

static std::vector<NvDsInferParseObjectInfo>
decodeYoloV4Tensor(
    const float* boxes, const float* scores,
    const uint num_bboxes, NvDsInferParseDetectionParams const& detectionParams,
    const uint& netW, const uint& netH)
{
    std::vector<NvDsInferParseObjectInfo> binfo;

    uint bbox_location = 0;
    uint score_location = 0;
    for (uint b = 0; b < num_bboxes; ++b)
    {
        float bx1 = boxes[bbox_location];
        float by1 = boxes[bbox_location + 1];
        float bx2 = boxes[bbox_location + 2];
        float by2 = boxes[bbox_location + 3];

        float maxProb = 0.0f;
        int maxIndex = -1;

        for (uint c = 0; c < detectionParams.numClassesConfigured; ++c)
        {
            float prob = scores[score_location + c];
            if (prob > maxProb)
            {
                maxProb = prob;
                maxIndex = c;
            }
        }

        if (maxProb > detectionParams.perClassPreclusterThreshold[maxIndex])
        {
            addBBoxProposalYoloV4(bx1, by1, bx2, by2, netW, netH, maxIndex, maxProb, binfo);
        }

        bbox_location += 4;
        score_location += detectionParams.numClassesConfigured;
    }

    return binfo;
}

extern "C" bool NvDsInferParseCustomYoloV4(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList)
{
    if (NUM_CLASSES_YOLO != detectionParams.numClassesConfigured)
    {
        std::cerr << "WARNING: Num classes mismatch. Configured:"
                  << detectionParams.numClassesConfigured
                  << ", detected by network: " << NUM_CLASSES_YOLO << std::endl;
    }

    std::vector<NvDsInferParseObjectInfo> objects;

    const NvDsInferLayerInfo &boxes = outputLayersInfo[0]; // num_boxes x 4
    const NvDsInferLayerInfo &scores = outputLayersInfo[1]; // num_boxes x num_classes

    // 3 dimensional: [num_boxes, 1, 4]
    assert(boxes.inferDims.numDims == 3);
    // 2 dimensional: [num_boxes, num_classes]
    assert(scores.inferDims.numDims == 2);

    // The second dimension should be num_classes
    assert(detectionParams.numClassesConfigured == scores.inferDims.d[1]);
    
    uint num_bboxes = boxes.inferDims.d[0];

    // std::cout << "Network Info: " << networkInfo.height << "  " << networkInfo.width << std::endl;

    std::vector<NvDsInferParseObjectInfo> outObjs =
        decodeYoloV4Tensor(
            (const float*)(boxes.buffer), (const float*)(scores.buffer), num_bboxes, detectionParams,
            networkInfo.width, networkInfo.height);

    objects.insert(objects.end(), outObjs.begin(), outObjs.end());

    objectList = objects;

    return true;
}

1. [DS5.0GA_Jetson_dGPU_Plugin] Measure the FPS of the pipeline

2. [DS5.0GA_Jetson_dGPU_Plugin] Dump the Inference Input

3. [DS5_Jetson_dGPU_Plugin] Dump the Inference outputs

  • Apply the attached dump_dsinfer_raw_TRT_infer_outputs.txt (1.8 KB) to /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer/
  • Build libnvds_infer.so and replace /opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_infer.so

4. [DS5.0GA_Jetson_App] Rotate camera input image with NvBufSurfTransform() API

5. [DS5.0GA_App] Generate GStreamer Pipeline Graph
Use one of the methods below, according to your application type, to generate the GStreamer pipeline graph.

5.1 deepstream-app
 Run "export GST_DEBUG_DUMP_DOT_DIR=/tmp/" before the deepstream-app command, e.g.
 $ sudo apt-get install graphviz
 $ export GST_DEBUG_DUMP_DOT_DIR=/tmp/
 $ deepstream-app -c deepstream_app_config_yoloV2.txt
 $ cd /tmp/
 $ dot -Tpng 0.03.47.898178403-ds-app-playing.dot >~/0.03.47.898178403-ds-app-playing.png  # the PNG file contains the graph

5.2 gstreamer command line
Run "export GST_DEBUG_DUMP_DOT_DIR=/tmp/" before the gst-launch command, e.g.
  $ sudo apt-get install graphviz
  $ export GST_DEBUG_DUMP_DOT_DIR=/tmp/
  $ gst-launch-1.0 ....
  $ cd /tmp/
  $ dot -Tpng 0.03.47.898178403-ds-app-playing.dot >~/0.03.47.898178403-ds-app-playing.png  # the PNG file contains the graph

 5.3 DeepStream application
  For example:
  5.3.1 add "g_setenv("GST_DEBUG_DUMP_DOT_DIR", "/tmp", TRUE);" before gst_init()
  5.3.2 add "GST_DEBUG_BIN_TO_DOT_FILE_WITH_TS(GST_BIN(gst_objs.pipeline), GST_DEBUG_GRAPH_SHOW_ALL, "demo-app-pipeline");" at the point where you want to export the dot file, e.g. when switching to PLAYING
   Note: you also need to include the header file #include <gio/gio.h>
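
Putting 5.3 together, a minimal self-contained sketch (the videotestsrc pipeline is just a placeholder for your own):

#include <gst/gst.h>
#include <gio/gio.h>

int
main (int argc, char *argv[])
{
  /* must be set before gst_init() */
  g_setenv ("GST_DEBUG_DUMP_DOT_DIR", "/tmp", TRUE);
  gst_init (&argc, &argv);

  GstElement *pipeline =
      gst_parse_launch ("videotestsrc num-buffers=100 ! fakesink", NULL);
  gst_element_set_state (pipeline, GST_STATE_PLAYING);

  /* wait for PLAYING, then export /tmp/<timestamp>-demo-app-pipeline.dot */
  gst_element_get_state (pipeline, NULL, NULL, GST_CLOCK_TIME_NONE);
  GST_DEBUG_BIN_TO_DOT_FILE_WITH_TS (GST_BIN (pipeline),
      GST_DEBUG_GRAPH_SHOW_ALL, "demo-app-pipeline");

  gst_element_set_state (pipeline, GST_STATE_NULL);
  gst_object_unref (pipeline);
  return 0;
}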

 5.4 Python DeepStream
  Refer to https://forums.developer.nvidia.com/t/python-deepstream-program-not-generating-dot-file/163837/8?u=mchi

6. [DS 5.0.1_All_Plugin] Tracker FAQ topic: Deepstream Tracker FAQ

7. [DS 5.0GA_All_App] Enable Latency measurement for deepstream sample apps

  1. If you are using deepstream-app, to check the component latency directly, you need to set the env:
     export NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1
     export NVDS_ENABLE_LATENCY_MEASUREMENT=1
  2. If you are using other deepstream sample apps such as deepstream-test3, you need to apply the following patch and set the env:
     export NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1
     export NVDS_ENABLE_LATENCY_MEASUREMENT=1
diff --git a/apps/deepstream/sample_apps/deepstream-test3/deepstream_test3_app.c b/apps/deepstream/sample_apps/deepstream-test3/deepstream_test3_app.c
index 426bd69..c7c2472 100644
--- a/apps/deepstream/sample_apps/deepstream-test3/deepstream_test3_app.c
+++ b/apps/deepstream/sample_apps/deepstream-test3/deepstream_test3_app.c
@@ -26,6 +26,7 @@
 #include <math.h>
 #include <string.h>
 #include <sys/time.h>
+#include <stdlib.h>

 #include "gstnvdsmeta.h"
 //#include "gstnvstreammeta.h"
@@ -73,6 +74,42 @@ gchar pgie_classes_str[4][32] = { "Vehicle", "TwoWheeler", "Person",

 //static guint probe_counter = 0;

+typedef struct {
+  GMutex *lock;
+  int num_sources;
+}LatencyCtx;
+
+static GstPadProbeReturn
+latency_measurement_buf_prob(GstPad * pad, GstPadProbeInfo * info, gpointer u_data)
+{
+  LatencyCtx *ctx = (LatencyCtx *) u_data;
+  static int batch_num = 0;
+  guint i = 0, num_sources_in_batch = 0;
+  if(nvds_enable_latency_measurement)
+  {
+    GstBuffer *buf = (GstBuffer *) info->data;
+    NvDsFrameLatencyInfo *latency_info = NULL;
+    g_mutex_lock (ctx->lock);
+    latency_info = (NvDsFrameLatencyInfo *)
+      calloc(1, ctx->num_sources * sizeof(NvDsFrameLatencyInfo));
+    g_print("\n************BATCH-NUM = %d**************\n",batch_num);
+    num_sources_in_batch = nvds_measure_buffer_latency(buf, latency_info);
+
+    for(i = 0; i < num_sources_in_batch; i++)
+    {
+      g_print("Source id = %d Frame_num = %d Frame latency = %lf (ms) \n",
+          latency_info[i].source_id,
+          latency_info[i].frame_num,
+          latency_info[i].latency);
+    }
+    free(latency_info); /* avoid leaking the per-batch allocation */
+    g_mutex_unlock (ctx->lock);
+    batch_num++;
+  }
+
+  return GST_PAD_PROBE_OK;
+}
+
 /* tiler_sink_pad_buffer_probe  will extract metadata received on OSD sink pad
  * and update params for drawing rectangle, object information etc. */

@@ -107,9 +143,9 @@ tiler_src_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info,
                 num_rects++;
             }
         }
-          g_print ("Frame Number = %d Number of objects = %d "
-            "Vehicle Count = %d Person Count = %d\n",
-            frame_meta->frame_num, num_rects, vehicle_count, person_count);
+          // g_print ("Frame Number = %d Number of objects = %d "
+          //   "Vehicle Count = %d Person Count = %d\n",
+          //   frame_meta->frame_num, num_rects, vehicle_count, person_count);
 #if 0
         display_meta = nvds_acquire_display_meta_from_pool(batch_meta);
         NvOSD_TextParams *txt_params  = &display_meta->text_params;
@@ -383,7 +419,7 @@ main (int argc, char *argv[])
 #ifdef PLATFORM_TEGRA
   transform = gst_element_factory_make ("nvegltransform", "nvegl-transform");
 #endif
-  sink = gst_element_factory_make ("nveglglessink", "nvvideo-renderer");
+  sink = gst_element_factory_make ("fakesink", "nvvideo-renderer");

   if (!pgie || !tiler || !nvvidconv || !nvosd || !sink) {
     g_printerr ("One element could not be created. Exiting.\n");
@@ -467,6 +503,18 @@ gst_bin_add_many (GST_BIN (pipeline), queue1, pgie, queue2, tiler, queue3,
         tiler_src_pad_buffer_probe, NULL, NULL);
   gst_object_unref (tiler_src_pad);

+  GstPad *sink_pad =  gst_element_get_static_pad (nvosd, "src");
+  if (!sink_pad)
+    g_print ("Unable to get src pad\n");
+  else {
+    LatencyCtx *ctx = (LatencyCtx *)g_malloc0(sizeof(LatencyCtx));
+    ctx->lock = (GMutex *)g_malloc0(sizeof(GMutex));
+    ctx->num_sources = num_sources;
+    gst_pad_add_probe (sink_pad, GST_PAD_PROBE_TYPE_BUFFER,
+        latency_measurement_buf_prob, ctx, NULL);
+  }
+  gst_object_unref (sink_pad);
+
   /* Set the pipeline to "playing" state */
   g_print ("Now playing:");
   for (i = 0; i < num_sources; i++) {

3. If you use a Python app such as deepstream_test_3.py, you need to apply the following patch and set the env:

export NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1
export NVDS_ENABLE_LATENCY_MEASUREMENT=1

pip install cffi

Then execute it like below:

python3 deepstream_test_3.py --no-display -i rtsp://"your_rtsp_uri_0" uri1
diff --git a/apps/deepstream-test3/deepstream_test_3.py b/apps/deepstream-test3/deepstream_test_3.py
index d81ec92..21d2f3b 100755
--- a/apps/deepstream-test3/deepstream_test_3.py
+++ b/apps/deepstream-test3/deepstream_test_3.py
@@ -36,6 +36,28 @@ from common.FPS import PERF_DATA
 
 import pyds
 
+from cffi import FFI
+
+ffi = FFI()
+
+clib = None
+
+ffi.cdef("""
+typedef struct
+{
+  uint32_t source_id;
+  uint32_t frame_num;
+  double comp_in_timestamp;
+  double latency;
+} NvDsFrameLatencyInfo;
+
+uint32_t nvds_measure_buffer_latency(void *buf, NvDsFrameLatencyInfo *latency_info);
+bool nvds_get_enable_latency_measurement();
+""")
+
+# Load the DeepStream GStreamer meta library that exports the latency APIs
+clib = ffi.dlopen("/opt/nvidia/deepstream/deepstream/lib/libnvdsgst_meta.so")
+
 no_display = False
 silent = False
 file_loop = False
@@ -56,6 +78,27 @@ OSD_PROCESS_MODE= 0
 OSD_DISPLAY_TEXT= 1
 pgie_classes_str= ["Vehicle", "TwoWheeler", "Person","RoadSign"]
 
+batch_num = 0
+
+def osd_src_pad_buffer_probe(pad, info, u_data):
+    number_source = u_data
+    gst_buffer = info.get_buffer()
+    if not gst_buffer:
+        print("Unable to get GstBuffer ")
+        return Gst.PadProbeReturn.OK
+    global batch_num
+    if clib.nvds_get_enable_latency_measurement():
+        print(f"************BATCH-NUM = {batch_num}**************")
+        c_gst_buf = ffi.cast("void *", hash(gst_buffer))
+        cNvDsFrameLatencyInfo = ffi.new(f"NvDsFrameLatencyInfo[{number_source}]")
+        sources = clib.nvds_measure_buffer_latency(c_gst_buf, cNvDsFrameLatencyInfo)
+        for i in range(sources):
+            print(f"Source id = {cNvDsFrameLatencyInfo[i].source_id} "
+                  f"Frame_num = {cNvDsFrameLatencyInfo[i].frame_num} "
+                  f"Frame latency = {cNvDsFrameLatencyInfo[i].latency} (ms) ")
+        batch_num += 1
+    return Gst.PadProbeReturn.OK
+
 # pgie_src_pad_buffer_probe  will extract metadata received on tiler sink pad
 # and update params for drawing rectangle, object information etc.
 def pgie_src_pad_buffer_probe(pad,info,u_data):
@@ -199,7 +242,7 @@ def create_source_bin(index,uri):
         return None
     return nbin
 
-def main(args, requested_pgie=None, config=None, disable_probe=False):
+def main(args, requested_pgie=None, config=None, disable_probe=True):
     global perf_data
     perf_data = PERF_DATA(len(args))
 
@@ -380,6 +423,12 @@ def main(args, requested_pgie=None, config=None, disable_probe=False):
             # perf callback function to print fps every 5 sec
             GLib.timeout_add(5000, perf_data.perf_print_callback)
 
+    osd_src_pad=nvosd.get_static_pad("src")
+    if not osd_src_pad:
+        sys.stderr.write(" Unable to get src pad \n")
+    else:
+        osd_src_pad.add_probe(Gst.PadProbeType.BUFFER, osd_src_pad_buffer_probe, number_sources)
+
     # List the sources
     print("Now playing...")
     for i, source in enumerate(args):

8. [DS 5.0GA_All_App] Enable Perf measurement (FPS) for deepstream sample apps

  1. If you are using deepstream-app, you can add enable-perf-measurement=1 under the Application group in the config file
  2. If you are using other deepstream sample apps such as deepstream-test2, you can apply the following patch to enable it
diff --git a/sources/apps/sample_apps/deepstream-test2/deepstream_test2_app.c b/sources/apps/sample_apps/deepstream-test2/deepstream_test2_app.c
index a2231acf535b4826adb766ed28f3aa80294c7f82..e37d7504ed07c9db77e5d3cdac2c4943fd0d1010 100755
--- a/sources/apps/sample_apps/deepstream-test2/deepstream_test2_app.c
+++ b/sources/apps/sample_apps/deepstream-test2/deepstream_test2_app.c
@@ -28,6 +28,7 @@
 #include <string.h>
 
 #include "gstnvdsmeta.h"
+#include "deepstream_perf.h"
 
 #define PGIE_CONFIG_FILE  "dstest2_pgie_config.txt"
 #define SGIE1_CONFIG_FILE "dstest2_sgie1_config.txt"
@@ -51,6 +52,29 @@
  * based on the fastest source's framerate. */
 #define MUXER_BATCH_TIMEOUT_USEC 40000
 
+#define MAX_STREAMS 64
+
+typedef struct
+{
+    /** identifies the stream ID */
+    guint32 stream_index;
+    gdouble fps[MAX_STREAMS];
+    gdouble fps_avg[MAX_STREAMS];
+    guint32 num_instances;
+    guint header_print_cnt;
+    GMutex fps_lock;
+    gpointer context;
+
+    /** Test specific info */
+    guint32 set_batch_size;
+}DemoPerfCtx;
+
+
+typedef struct {
+  GMutex *lock;
+  int num_sources;
+}LatencyCtx;
+
 gint frame_number = 0;
 /* These are the strings of the labels for the respective models */
 gchar sgie1_classes_str[12][32] = { "black", "blue", "brown", "gold", "green",
@@ -80,6 +104,67 @@ guint sgie1_unique_id = 2;
 guint sgie2_unique_id = 3;
 guint sgie3_unique_id = 4;
 
+/**
+ * callback function to print the performance numbers of each stream.
+ */
+static void
+perf_cb (gpointer context, NvDsAppPerfStruct * str)
+{
+  DemoPerfCtx *thCtx = (DemoPerfCtx *) context;
+
+  g_mutex_lock(&thCtx->fps_lock);
+  /** str->num_instances is == num_sources */
+  guint32 numf = str->num_instances;
+  guint32 i;
+
+  for (i = 0; i < numf; i++) {
+    thCtx->fps[i] = str->fps[i];
+    thCtx->fps_avg[i] = str->fps_avg[i];
+  }
+  thCtx->context = thCtx;
+  g_print ("**PERF: ");
+  for (i = 0; i < numf; i++) {
+    g_print ("%.2f (%.2f)\t", thCtx->fps[i], thCtx->fps_avg[i]);
+  }
+  g_print ("\n");
+  g_mutex_unlock(&thCtx->fps_lock);
+}
+
+/**
+ * callback function to print the latency of each component in the pipeline.
+ */
+
+static GstPadProbeReturn
+latency_measurement_buf_prob(GstPad * pad, GstPadProbeInfo * info, gpointer u_data)
+{
+  LatencyCtx *ctx = (LatencyCtx *) u_data;
+  static int batch_num = 0;
+  guint i = 0, num_sources_in_batch = 0;
+  if(nvds_enable_latency_measurement)
+  {
+    GstBuffer *buf = (GstBuffer *) info->data;
+    NvDsFrameLatencyInfo *latency_info = NULL;
+    g_mutex_lock (ctx->lock);
+    latency_info = (NvDsFrameLatencyInfo *)
+      calloc(1, ctx->num_sources * sizeof(NvDsFrameLatencyInfo));
+    g_print("\n************BATCH-NUM = %d**************\n",batch_num);
+    num_sources_in_batch = nvds_measure_buffer_latency(buf, latency_info);
+
+    for(i = 0; i < num_sources_in_batch; i++)
+    {
+      g_print("Source id = %d Frame_num = %d Frame latency = %lf (ms) \n",
+          latency_info[i].source_id,
+          latency_info[i].frame_num,
+          latency_info[i].latency);
+    }
+    free(latency_info); /* avoid leaking the per-batch allocation */
+    g_mutex_unlock (ctx->lock);
+    batch_num++;
+  }
+
+  return GST_PAD_PROBE_OK;
+}
+
 /* This is the buffer probe function that we have registered on the sink pad
  * of the OSD element. All the infer elements in the pipeline shall attach
  * their metadata to the GstBuffer, here we will iterate & process the metadata
@@ -144,9 +228,9 @@ osd_sink_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info,
         nvds_add_display_meta_to_frame(frame_meta, display_meta);
     }
 
-    g_print ("Frame Number = %d Number of objects = %d "
-            "Vehicle Count = %d Person Count = %d\n",
-            frame_number, num_rects, vehicle_count, person_count);
+    // g_print ("Frame Number = %d Number of objects = %d "
+    //         "Vehicle Count = %d Person Count = %d\n",
+    //         frame_number, num_rects, vehicle_count, person_count);
     frame_number++;
     return GST_PAD_PROBE_OK;
 }
@@ -586,6 +670,30 @@ main (int argc, char *argv[])
     gst_pad_add_probe (osd_sink_pad, GST_PAD_PROBE_TYPE_BUFFER,
         osd_sink_pad_buffer_probe, NULL, NULL);
 
+  GstPad *sink_pad =  gst_element_get_static_pad (nvvidconv1, "src");
+  if (!sink_pad)
+    g_print ("Unable to get sink pad\n");
+  else {
+    LatencyCtx *ctx = (LatencyCtx *)g_malloc0(sizeof(LatencyCtx));
+    ctx->lock = (GMutex *)g_malloc0(sizeof(GMutex));
+    ctx->num_sources = argc - 2;
+    gst_pad_add_probe (sink_pad, GST_PAD_PROBE_TYPE_BUFFER,
+        latency_measurement_buf_prob, ctx, NULL);
+  }
+  gst_object_unref (sink_pad);
+
+  GstPad *tiler_pad =  gst_element_get_static_pad (nvtiler, "sink");
+  if (!tiler_pad)
+    g_print ("Unable to get tiler_pad pad\n");
+  else {
+    NvDsAppPerfStructInt *str =  (NvDsAppPerfStructInt *)g_malloc0(sizeof(NvDsAppPerfStructInt));
+    DemoPerfCtx *perf_ctx = (DemoPerfCtx *)g_malloc0(sizeof(DemoPerfCtx));
+    g_mutex_init(&perf_ctx->fps_lock);
+    str->context = perf_ctx;
+    enable_perf_measurement (str, tiler_pad, argc-2, 1, 0, perf_cb);
+  }
+  gst_object_unref (tiler_pad);
+
   /* Set the pipeline to "playing" state */
   g_print ("Now playing: %s\n", argv[1]);
   gst_element_set_state (pipeline, GST_STATE_PLAYING);

9. [DS 5.0GA_Jetson_App] Capture HW & SW Memory Leak log
nvmemstat.py.txt (4.7 KB)

  1. Download the attachment above to the Jetson device and rename it to nvmemstat.py
  2. Install the “lsof” tool
    $ sudo apt-get install lsof
  3. Run your application on Jetson in one terminal or in the background
  4. Run this script with the command:
    $ sudo python3 nvmemstat.py -p PROGRAM_NAME   # replace PROGRAM_NAME with the name of your application from step 3
    This script monitors hardware memory, SW memory, etc.
  5. Share the log on the topic for further triage

10. [ALL_Jetson_plugin] Using Jetson GStreamer plugins with DeepStream
For users of DeepStream on Jetson (JetPack), there are some GStreamer plugins that are hardware accelerated on Jetson but are not listed in the DeepStream plugin list GStreamer Plugin Overview — DeepStream 6.2 Release documentation.

Some of these plugins can be used in the DeepStream pipeline to extend DeepStream's functionality, while some of them are not compatible with DeepStream SDK.

The basic document for the GStreamer accelerated plugins is Multimedia — Jetson Linux Developer Guide 34.1 documentation (nvidia.com)

DeepStream compatible plugins:

  • nvegltransform: NvEGLTransform

Typical usage:

gst-launch-1.0 uridecodebin uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt ! nvtracker tracker-width=640 tracker-height=480 ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=config_tracker_NvDCF_perf.yml enable-batch-process=1 ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! nvmultistreamtiler ! nvdsosd ! nvvideoconvert ! nvegltransform ! nveglglessink

  • nvarguscamerasrc: NvArgusCameraSrc

Typical usage:

gst-launch-1.0 nvarguscamerasrc bufapi-version=true sensor-id=0 ! 'video/x-raw(memory:NVMM),width=640,height=480,framerate=30/1,format=NV12' ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt ! nvtracker tracker-width=640 tracker-height=480 ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=config_tracker_NvDCF_perf.yml enable-batch-process=1 ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! nvmultistreamtiler ! nvdsosd ! nvvideoconvert ! nvegltransform ! nveglglessink

For DeepStream 6.2 GA, the pipeline should be

gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! 'video/x-raw(memory:NVMM),width=640,height=480,framerate=30/1,format=NV12' ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt ! nvtracker tracker-width=640 tracker-height=480 ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=config_tracker_NvDCF_perf.yml enable-batch-process=1 ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! nvmultistreamtiler ! nvdsosd ! nvvideoconvert ! nvegltransform ! nveglglessink

The related topic in forum:

Segfault when nvvideoconvert and nvv4l2h265enc are used together - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums

  • nvv4l2camerasrc: nvv4l2camerasrc: NvV4l2CameraSrc

Typical usage:

gst-launch-1.0 nvv4l2camerasrc device=/dev/video0 bufapi-version=1 ! 'video/x-raw(memory:NVMM),width=1920,height=1080,framerate=60/1' ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12' ! mx.sink_0 nvv4l2camerasrc device=/dev/video1 bufapi-version=1 ! 'video/x-raw(memory:NVMM),width=1920,height=1080,framerate=60/1' ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12' ! mx.sink_1 nvstreammux width=1920 height=1080 batch-size=2 live-source=1 name=mx ! nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=2 ! nvvideoconvert ! nvmultistreamtiler width=1920 height=1080 rows=1 columns=2 ! nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink sync=0

The related topic in forum:
Low camera frame rate - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums

  • nvdrmvideosink: Nvidia Drm Video Sink

Typical pipeline:
gst-launch-1.0 filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! nvvideoconvert ! m.sink_0 nvstreammux name=m batch-size=1 width=1920 height=1080 ! nvinfer config-file-path= /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt ! nvdrmvideosink conn_id=0 plane_id=1 set_mode=0 -e

The related topic in forum:
Which videosink for Jetson TX2 in EGLFS? - Jetson & Embedded Systems / Jetson TX2 - NVIDIA Developer Forums

  • nv3dsink: Nvidia 3D sink

Typical pipeline:
gst-launch-1.0 filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! nvvideoconvert ! m.sink_0 nvstreammux name=m batch-size=1 width=1920 height=1080 ! nvinfer config-file-path= /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt ! nv3dsink sync=false

Note: The nv3dsink plugin is a window-based rendering sink and is based on X11.

  • nvoverlaysink: OpenMax Video Sink

Typical pipeline:

gst-launch-1.0 filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 ! qtdemux ! h264parse ! nvv4l2decoder bufapi-version=1 ! nvvideoconvert ! m.sink_0 nvstreammux name=m batch-size=1 width=1920 height=1080 ! nvinfer config-file-path= /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt ! nvoverlaysink sync=0

Note: The nvoverlaysink plugin is deprecated in L4T release 32.1. Please use nvdrmvideosink or nv3dsink for rendering gst-v4l2 decoder output.

DeepStream Incompatible Plugins

  • nvcompositor: NvCompositor

Typical pipeline:
gst-launch-1.0 nvcompositor name=comp sink_0::xpos=0 sink_0::ypos=0 sink_0::width=960 sink_0::height=540 sink_1::xpos=960 sink_1::ypos=0 sink_1::width=960 sink_1::height=540 sink_2::xpos=0 sink_2::ypos=540 sink_2::width=1920 sink_2::height=540 ! nvegltransform ! nveglglessink \ filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! comp. \ filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! comp. \ filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! comp. -e

The related topic in forum:
How to Customize layout from Nvmultistream-tiler module from DeepStream - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums

12. [DS 5.0GA/DS6.2_Jetson_App]: Dump NV12 NvBufSurface into a YUV file
Each NV12 NvBufSurface contains two semi-planes (Y and interleaved UV) that are not contiguous in memory.
gstnvinfer_dump_NV12_NvBufSurface.patch (4.9 KB)

This is a sample change to /opt/nvidia/deepstream/deepstream-5.1/sources/gst-plugins/gst-nvinfer/gstnvinfer.cpp to dump the NV12 NvBufSurface before it is transformed to RGB data.
After getting the YUV file, we can view it at https://rawpixels.net/.

Note: Also verified with DeepStream 6.2 on Jetson
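
For reference, the core of such a dump looks like the sketch below (Jetson; each semi-plane is copied row by row because rows are padded out to pitch; the attached patch is the authoritative version):

#include <stdio.h>
#include <stdint.h>
#include "nvbufsurface.h"

static void
dump_nv12_surface (NvBufSurface * surf, int idx, FILE * fp)
{
  if (NvBufSurfaceMap (surf, idx, -1, NVBUF_MAP_READ) != 0)
    return;
  NvBufSurfaceSyncForCpu (surf, idx, -1);   /* sync HW writes to the CPU */

  NvBufSurfaceParams *p = &surf->surfaceList[idx];
  /* plane 0 = Y, plane 1 = interleaved UV */
  for (uint32_t plane = 0; plane < p->planeParams.num_planes; plane++) {
    uint8_t *data = (uint8_t *) p->mappedAddr.addr[plane];
    uint32_t row_bytes =
        p->planeParams.width[plane] * p->planeParams.bytesPerPix[plane];
    for (uint32_t row = 0; row < p->planeParams.height[plane]; row++)
      fwrite (data + row * p->planeParams.pitch[plane], 1, row_bytes, fp);
  }
  NvBufSurfaceUnMap (surf, idx, -1);
}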

13. [DS 5.x_All_App] How to access and modify the NvBufSurface

Refer to Deepstream sample code snippet - #3 by bcao

14. [All_Jetson_App] Check memory leakage with valgrind

  1. Install valgrind with the command below
    $ sudo apt-get install valgrind valgrind-dbg
  2. Run the application with the command below
    $ valgrind --tool=memcheck --leak-check=full --num-callers=100 --show-leak-kinds=definite,indirect --track-origins=yes ./app
15. [DSx_All_App] Debug Tips for DeepStream Accuracy Issues
    Ensure the image pre-processing before inference aligns with the pre-processing used in training.
    15.1 Confirm your model has good accuracy in training and in inference outside DeepStream
    15.2 nvinfer
    When deploying an ONNX model to DeepStream with the nvinfer plugin, confirm the nvinfer parameters below are set correctly to align with the corresponding settings used in training.
    15.2.1 Input scale & offset
    1). net-scale-factor =
    2). offsets
    These two parameters are applied as described in the nvinfer doc: y = net-scale-factor * (x - mean), where x is the input pixel value and mean is the per-channel value taken from offsets. For example, net-scale-factor=0.0039215697906911373 (1/255) with no offsets maps pixel values from 0..255 to 0..1.
    15.2.2 Input Order
    1). network-input-order= // 0:NCHW 1:NHWC
    2). infer-dims= // if network-input-order=1, i.e. NHWC, infer-dims must be specified, otherwise, nvinfer can’t detect input dims automatically
    3). model-color-format= // 0: RGB 1: BGR 2: GRAY
    15.2.3 scale and padding
    1). maintain-aspect-ratio= // whether to maintain aspect ratio while scaling input
    2). symmetric-padding= // whether to pad the image symmetrically while scaling input. By default, padding is asymmetric and the image is scaled to the top-left corner.
    15.2.4 inference precision
    1). network-mode= // 0: FP32 1: INT8 2: FP16. If INT8 accuracy is not good, try FP16 or FP32
    15.2.5 threshold
    1). threshold=
    2). pre-cluster-threshold=
    3). post-cluster-threshold=
    Above are some highlighted parameters for a quick accuracy check. For more detailed information, please refer to the nvinfer doc - Gst-nvinfer — DeepStream 6.2 Release documentation
    15.2.6 Avoid missing objects close to the border of the image (version 6.2 and above)
    1). crop-objects-to-roi-boundary=1
    15.3 Dump the input or output of nvinfer
    See these two items in DeepStream SDK FAQ - #9 by mchi:
    2. [DS5.0GA_Jetson_dGPU_Plugin] Dump the Inference Input ==> compare the input between DeepStream and your own standalone inference/training app
    3. [DS5_Jetson_dGPU_Plugin] Dump the Inference outputs ==> then apply your own parser offline to check this output data
    15.4 Try to remove or replace plugins that introduce extra conversions.
    15.4.1 The following two pipelines produce the same result before the Gst-nvstreammux plugin; you can choose the second one to avoid the extra conversions caused by videoconvert and Gst-nvvideoconvert.

    multifilesrc->jpegdec->videoconvert->nvvideoconvert->nvstreammux->nvinfer…
    multifilesrc->nvjpegdec->nvstreammux->nvinfer…
    

    15.4.2 If you use Gst-nvstreammux, set the width and height parameters to align with the video.
    We suggest trying the new Gst-nvstreammux instead of Gst-nvstreammux, especially when the video width or height is not a multiple of 4. If you use the new Gst-nvstreammux, ensure that all sources have the same resolution.

16. [DeepStream 6.0 GA] Python binding installation

Download the wheel files directly from Releases · NVIDIA-AI-IOT/deepstream_python_apps · GitHub