Seeking suggestions for a faster way of saving images with jetson-inference camera processing

Hello,

I am currently using DetectNet from jetson-inference to detect cars. Specifically, what I am trying to achieve is to save some of the frames based on a few events, for example, when a specific class is detected in that frame.

I tried to leverage the save image functions in one of the utils headers in the repo https://github.com/dusty-nv/jetson-inference/blob/8ed492bfdc9e1b98f7711cd5ae224ce588635656/util/loadImage.cpp#L30

However, this function takes some time (~100 ms). The DetectNet model trained for my use case runs at 10 FPS on the TX1. With the added save functionality, the FPS drops further, which is not really acceptable since there is a greater chance of missing the actual events.

With this in mind, does anyone have experience/suggestions on saving the images in a way that would take less time? Perhaps a different format, or raw values? I imagine some time is spent encoding the image too, since they are saved in JPEG format.

Any help is really appreciated.

Thank you!

Hi bhargavK, the saveImage() wrapper uses the Qt image library and the CPU to encode JPEG/PNG. It’s not intended for realtime use.

If you had an SSD or high-speed SD card (UHS-II) plugged into your Jetson, you could conceivably dump the raw or YUV data to disk. Alternatively, you can modify the detectnet-camera program to use GStreamer to encode the video as H.264/H.265 (GStreamer uses the TX2 hardware codec).
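For example, a command-line pipeline along these lines should record the onboard camera to an H.264-encoded MP4 with the hardware encoder (untested here; the exact caps depend on your camera and L4T version, so treat it as a starting point):

gst-launch-1.0 nvcamerasrc num-buffers=300 ! 'video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, format=(string)I420, framerate=(fraction)30/1' ! omxh264enc ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! qtmux ! filesink location=test.mp4 -e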

To do that, get started by consulting the L4T Accelerated GStreamer Guide for example GStreamer pipelines. You could save the camera data separately, out of band, using one of those pipelines from the command line. Or, if it needs to be integrated with your program, then you will need to use the GStreamer 'appsrc' element to send the video frames from C/C++ into GStreamer. Please see these relevant topics about using appsrc:

Thanks for the links, dusty_nv.

As I mentioned, I want to save only some frames based on certain events, so I think I will have to figure out something using GStreamer appsrc. To reiterate, I am trying to save images, not the entire video.

The links seem to have some leads, will try to understand how I could leverage these.

In addition to the video codecs, the NVIDIA-accelerated GStreamer elements also include nvjpegenc for hw-accelerated jpeg encoding. See this image encoding example pipeline from the L4T Accelerated GStreamer Guide:

gst-launch-1.0 videotestsrc num-buffers=1 ! 'video/x-raw, width=(int)640, height=(int)480, format=(string)I420' ! nvjpegenc ! filesink location=test.jpg -e

Hi @dusty_nv,

Thanks for the suggestions and relevant links. I was able to write a simple pipeline in C and have provided my code below.

#include <stdio.h>
#include <string.h>
#include <fstream>
#include <unistd.h>
#include <stdlib.h>
#include <gst/gst.h>
#include <gst/app/gstappsrc.h>

typedef struct {
    GstPipeline* pipeline;
//    GstElement* src;
    GstAppSrc* src;
    GstElement* filter;
    GstElement* encoder;
    GstElement* sink;

    GstClockTime timestamp;
    guint sourceid;
} gst_app_t;

static gst_app_t gst_app;

int main() {
    gst_app_t* app = &gst_app;
    GstStateChangeReturn state_ret;
    gst_init(NULL, NULL);   // initialize GStreamer
    app->timestamp = 0;     // set timestamp to 0

    // Create pipeline and pipeline elements
    app->pipeline = (GstPipeline*)gst_pipeline_new("mypipeline");
    app->src      = (GstAppSrc*) gst_element_factory_make("appsrc", "mysrc");
    app->filter   = gst_element_factory_make("capsfilter", "myfilter");
    app->encoder  = gst_element_factory_make("nvjpegenc", "myjpeg");
    app->sink     = gst_element_factory_make("filesink", NULL);

    if (!app->pipeline || !app->src || !app->filter ||
        !app->encoder  || !app->sink) {
        printf("Error creating pipeline elements!\n");
        return -1;
    }

    // Attach elements to the pipeline
    gst_bin_add_many(
        GST_BIN(app->pipeline),
        (GstElement*) app->src,
        app->filter,
        app->encoder,
        app->sink,
        NULL);

    // Set pipeline element attributes
//  g_object_set(app->src, "num-buffers", 1, NULL);
    GstCaps* filtercaps = gst_caps_new_simple("video/x-raw",
        "format", G_TYPE_STRING, "I420",
        "width", G_TYPE_INT, 640,
        "height", G_TYPE_INT, 360,
        "framerate", GST_TYPE_FRACTION, 1, 1,
        NULL);
    g_object_set(G_OBJECT(app->filter), "caps", filtercaps, NULL);
    gst_caps_unref(filtercaps);   // the capsfilter holds its own reference now
    g_object_set(G_OBJECT(app->sink), "location", "output.jpg", NULL);

    // Link elements together
    g_assert(gst_element_link_many(
        (GstElement*) app->src,
        app->filter,
        app->encoder,
        app->sink,
        NULL));

    // Play the pipeline
    state_ret = gst_element_set_state((GstElement*)app->pipeline, GST_STATE_PLAYING);
    g_assert(state_ret != GST_STATE_CHANGE_FAILURE);

    // Open the input file
    FILE* fp = fopen("test.yuv", "rb");
    g_assert(fp != NULL);

    // Allocate a buffer for one 640x360 I420 frame (12 bits per pixel)
    size_t fsize = 640*360*3/2;
    char* filebuf = (char*)malloc(fsize);
    if (filebuf == NULL) {
        printf("memory error\n");
        fclose(fp);
        return -1;
    }

    // Read the frame into the buffer
    size_t bytesread = fread(filebuf, 1, fsize, fp);
    fclose(fp);
    g_assert(bytesread == fsize);

    // Actual databuffer
    GstBuffer *pushbuffer;
    GstFlowReturn ret;

    // Wrap the data
    pushbuffer = gst_buffer_new_wrapped(filebuf, fsize);

    ret = gst_app_src_push_buffer( app->src, pushbuffer); //Push data into pipeline

    g_assert(ret ==  GST_FLOW_OK);

    GstBus* bus = gst_element_get_bus((GstElement*) app->pipeline);
    GstMessage* msg = gst_bus_timed_pop_filtered(bus, GST_CLOCK_TIME_NONE, (GstMessageType)(GST_MESSAGE_ERROR | GST_MESSAGE_EOS));

    /* Parse message */
    if (msg != NULL) {
        GError *err;
        gchar *debug_info;

        switch (GST_MESSAGE_TYPE (msg)) {
              case GST_MESSAGE_ERROR:
                    gst_message_parse_error (msg, &err, &debug_info);
                    g_printerr ("Error received from element %s: %s\n", GST_OBJECT_NAME (msg->src), err->message);
                    g_printerr ("Debugging information: %s\n", debug_info ? debug_info : "none");
                    g_clear_error (&err);
                    g_free (debug_info);
                    break;
              case GST_MESSAGE_EOS:
                    g_print ("End-Of-Stream reached.\n");
                    break;
              default:
                    /* We should not reach here because we only asked for ERRORs and EOS */
                    g_printerr ("Unexpected message received.\n");
                    break;
        }
        gst_message_unref (msg);
    }
    // Declare EOS on the appsrc
    gst_app_src_end_of_stream(GST_APP_SRC(app->src));

    // Free resources
    gst_object_unref(bus);
    gst_element_set_state((GstElement*)app->pipeline, GST_STATE_NULL);
    gst_object_unref(app->pipeline);

    return 0;
}

I generated the test.yuv file by using the following command.

gst-launch-1.0 videotestsrc num-buffers=1 ! 'video/x-raw, width=(int)640, height=(int)360, format=(string)I420, framerate=(fraction)1/1' ! filesink location=test.yuv -e

The code successfully writes the output to disk but it never ends unless I press ^c. (Maybe I am missing some checks/conditions?)

Moreover, as I had mentioned in my question, I want to utilize this in my detectnet script. So perhaps I can add a class to replicate the same functionality and pass imgCPU/imgCUDA instead of 'filebuf' in the code above? (from https://github.com/dusty-nv/jetson-inference/blob/8ed492bfdc9e1b98f7711cd5ae224ce588635656/detectnet-camera/detectnet-camera.cpp#L178)

In my GStreamer RTSP pipeline for detectnet, I am using the NV12 format, and I notice that nvjpegenc does not support NV12 if the data is not in (memory:NVMM). Is there an alternative that you would suggest?

I hope I am able to explain myself clearly. Looking forward to the suggestions.

Many thanks!

Yes, you’ll want to pass your imgCPU pointer to your appsrc element, where it will get ingested into the rest of your GStreamer pipeline.
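A rough sketch of that push could look like the following (untested; app->src and the frame size come from your code above, and it assumes the frame data is already in the raw format your caps declare):

// Sketch: copy one frame from imgCPU into a new GstBuffer and hand it to the appsrc.
// frameSize is width * height * bytes-per-pixel for the format declared in your caps.
static GstFlowReturn push_frame(GstAppSrc* src, const void* imgCPU, gsize frameSize)
{
    GstBuffer* buffer = gst_buffer_new_allocate(NULL, frameSize, NULL); // buffer owned by GStreamer
    gst_buffer_fill(buffer, 0, imgCPU, frameSize);                      // copy the frame so imgCPU can be reused
    return gst_app_src_push_buffer(src, buffer);                        // appsrc takes ownership of the buffer
}

Copying keeps the buffer ownership simple; if the extra memcpy matters, gst_buffer_new_wrapped_full() can wrap the pointer without copying, but then you have to manage the lifetime of imgCPU yourself.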

You can find which input formats it supports by running the gst-inspect-1.0 nvjpegenc command; that should list all the info for the element.

Right, my question was after the 'inspection'. Here is the sink pad information from gst-inspect:

SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-raw(memory:NVMM)
                 format: { I420, NV12 }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw
                 format: { I420, YV12, YUY2, UYVY, Y41B, Y42B, YVYU, Y444, RGB, BGR, RGBx, xRGB, BGRx, xBGR, GRAY8 }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]

Whereas my GStreamer pipeline is as follows:

gst-launch-1.0 rtspsrc location=rtsp://uname:pass@ip/axis-media/media.amp ! decodebin ! nvvidconv ! video/x-raw, format=NV12, width=640, height=360 ! appsink name=mysink

This is because I am leveraging your ConvertRGBA with NV12 format.

So, to make my question more clear: I think imgCPU will have data in NV12 format, and hence it cannot be passed to the encoding pipeline above using nvjpegenc. Sorry if these are too many questions; I have very little experience with GStreamer and the different video representation formats.

Thanks!

The camera video is in NV12 until it’s converted to RGB here:

https://github.com/dusty-nv/jetson-inference/blob/8ed492bfdc9e1b98f7711cd5ae224ce588635656/detectnet-camera/detectnet-camera.cpp#L184

if( !camera->ConvertRGBA(imgCUDA, &imgRGBA) )
       printf("detectnet-camera:  failed to convert from NV12 to RGBA\n");

After this, use imgRGBA; that should be easier to work with. Also, you will want to modify the above code to the following:

if( !camera->ConvertRGBA(imgCUDA, &imgRGBA, true) )
       printf("detectnet-camera:  failed to convert from NV12 to RGBA\n");

This modification will make the imgRGBA pointer accessible from both the CPU and GPU (without the 'true' parameter, it is GPU-only).
imgRGBA is in float4 format (with pixel values 0.0-255.0f), so you may want to convert it to unsigned char for GStreamer.
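For example, a simple CPU-side conversion could be a loop like this (just a sketch with made-up names; with zero-copy enabled you could equally do this in a small CUDA kernel):

// Sketch: convert a width x height float4 RGBA image (pixel values 0.0-255.0f)
// into packed 8-bit RGBA suitable for pushing into GStreamer.
// float4 comes from the CUDA headers (e.g. <cuda_runtime.h>).
void rgba32fToRGBA8(const float4* in, unsigned char* out, int width, int height)
{
    for (int i = 0; i < width * height; i++) {
        out[i * 4 + 0] = (unsigned char)in[i].x;  // R
        out[i * 4 + 1] = (unsigned char)in[i].y;  // G
        out[i * 4 + 2] = (unsigned char)in[i].z;  // B
        out[i * 4 + 3] = (unsigned char)in[i].w;  // A
    }
}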

If you prefer to use another format, like YUV I420 or YV12, there are additional CUDA-accelerated colorspace conversion functions available here: https://github.com/dusty-nv/jetson-inference/blob/master/util/cuda/cudaYUV.h

Thanks, @dusty_nv!

I will enable zero-copy so that I can access the pointer from the CPU. I will probably have to use the colorspace conversion you mentioned, because running the following two simple pipelines gives different results (attached).

gst-launch-1.0 videotestsrc num-buffers=1 ! video/x-raw, width=640, height=360, format=I420 ! nvjpegenc ! filesink location=i420.jpg
gst-launch-1.0 videotestsrc num-buffers=1 ! video/x-raw, width=640, height=360, format=RGBx ! nvjpegenc ! filesink location=rgba.jpg

Whereas the following gives an error:

gst-launch-1.0 videotestsrc num-buffers=1 ! video/x-raw, width=640, height=360, format=RGBx ! nvvidconv ! video/x-raw, format=I420 ! nvjpegenc ! filesink location=rgbaconv.jpg
WARNING: erroneous pipeline: could not link videotestsrc0 to nvvconv0

i420.jpg
rgba.jpg

I ended up using the Tegra MMAPI because I understood it better than GStreamer app development.

Here are the nvvidconv examples from the Accelerated GStreamer User Guide document:

VIDEO FORMAT CONVERSION WITH GSTREAMER-1.0

The NVIDIA proprietary nvvidconv Gstreamer-1.0 plug-in allows conversion between
OSS (raw) video formats and NVIDIA video formats. The nvvidconv plug-in currently
supports the format conversions described in this section.

raw-yuv Input Formats
Currently nvvidconv supports the I420, UYVY, YUY2, YVYU, NV12, GRAY8, BGRx,
and RGBA raw-yuv input formats.

gst-launch-1.0 videotestsrc ! 'video/x-raw, format=(string)UYVY, width=(int)1280, height=(int)720' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=(string)I420' ! omxh264enc ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! qtmux ! filesink location=test.mp4 -e

raw-gray Input Formats
Currently nvvidconv supports the GRAY8 raw-gray input format.

gst-launch-1.0 videotestsrc ! 'video/x-raw, format=(string)GRAY8, width=(int)1280, height=(int)720' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=(string)I420' ! omxh264enc ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! qtmux ! filesink location=test.mp4 -e

Yes, the MM API via V4L2, as a low-level API, is another way to achieve your multimedia development needs.

Thanks chijen,

I am familiar with the GStreamer pipeline using the command line tool. I had some difficulties implementing the same/similar pipeline using C API and integrating with my project, so I used the MMAPI.

(I think we posted around the same time, which made me confused about your previous reply. If you were responding to the post in #9, my question was more specific to the nvjpegenc element.)

bhargavK,
You are right. My post was taken straight from our official document demonstrating nvvidconv, which is the element you hit the error with. Whether you use JPEG encoding or video encoding, that is the flow after the format conversion.
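For example, a pipeline along the lines of your earlier test, but with a supported raw input format and the conversion going through NVMM memory, should link (untested here; the resolution and filename are just placeholders):

gst-launch-1.0 videotestsrc num-buffers=1 ! 'video/x-raw, width=(int)640, height=(int)360, format=(string)BGRx' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=(string)I420' ! nvjpegenc ! filesink location=converted.jpg -e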

"I had some difficulties implementing the same/similar pipeline using C API and integrating with my project, so I used the MMAPI.
=> If you have a specific question, we'd be happy to explore it. Thanks again.

Thanks, I might have misread BGRx as RGBx in the gst-inspect-1.0 output.

I just had trouble getting started with GStreamer application development and integrating it with my current project. It felt like it would take more time to understand app development with GStreamer than to use the MMAPI.

However, I will keep learning GStreamer on the side and will post any questions I have.

Really appreciate all the help I get from this forum.

Yes, like any other framework, GStreamer is a multimedia framework on Linux and it will take time to get familiar with its programming model. However, there are quite a lot of tutorial materials on the web to explore.