Usage of NvBuffer APIs

Hi Folks,

I would like to encode the output of my OpenCV algorithms on the TX1. I am reading frames using the Argus API and feeding them to my OpenCV algorithms. The output frames of the algorithm are then given to a GStreamer encode pipeline via ‘appsrc’.

The output of the OpenCV algorithms is in YUV420 format. When these output frames are given to the encoder, the encoded output seems to have its U and V planes switched, and the encoded video frames (as decoded by mplayer) are very blocky.

Could someone please help debug why the encoded output (which is a legal MP4 bitstream) is so blocky?

My code can be found at - GitHub - pcgamelore/SingleCameraPlaceholder: Sample code the read Jetson Tx1/Tx2 cameras, encode, process the frame, and encode processed output.

Thanks,

The input to the encoder can be I420 or NV12. It looks like you are not giving the encoder input in the correct format.

Hi DaneLLL

I am initializing the encoder input format to I420:

m_pgstVideoMeta       = gst_buffer_add_video_meta_full(m_pgstBuffer,GST_VIDEO_FRAME_FLAG_NONE, GST_VIDEO_FORMAT_I420, imageWidth,imageHeight, 3, m_offset, m_stride );

Furthermore, the buffers (actually the luma component) that go into the encoder can be displayed correctly using imshow…

//cv::imshow("img",img);

I would like to try to display all of the Y, U and V components of the buffer that goes into the encoder, in aaDebug.cpp::start_feed(). Is there a way to display/render an image from separate Y, U and V pointers?
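Something like this is what I have in mind; just a rough sketch of one way to view separate Y/U/V pointers with OpenCV, assuming tightly packed planes (pitch == width), not code from the repo:

#include <cstdint>
#include <cstring>
#include <opencv2/opencv.hpp>

// Rough sketch: pack the three planes into one contiguous I420 Mat and
// convert to BGR for display. Assumes pitch == width (no row padding).
static void showI420(const uint8_t *y, const uint8_t *u, const uint8_t *v,
                     int width, int height)
{
    cv::Mat i420(height * 3 / 2, width, CV_8UC1);
    uint8_t *dst = i420.data;
    std::memcpy(dst, y, width * height);                              // Y plane
    std::memcpy(dst + width * height, u, width * height / 4);         // U plane
    std::memcpy(dst + width * height * 5 / 4, v, width * height / 4); // V plane

    cv::Mat bgr;
    cv::cvtColor(i420, bgr, cv::COLOR_YUV2BGR_I420);
    cv::imshow("yuv", bgr);
    cv::waitKey(1);
}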

It would be great if you could try this code out.

Thanks,

Hi dumbogeorge,
You are not using appsrc. This looks like a bug in your code:

GstElement *videoSource = gst_element_factory_make("nveglstreamsrc", NULL);

Hi DaneLLL,

There are two encoders in the code. One encodes the input to the OpenCV algorithm, and the other encodes the output of the algorithm. The encoder code you are referring to encodes the input of the OpenCV algorithm.

I am having issues with the output encoder. Please take a look at common/aaDebug.cpp; there the encoder uses appsrc:

m_pappsrc              = (GstAppSrc*)gst_element_factory_make("appsrc", "aa-appsrc");

The data path is:

Camera → Queue → OCVConsumer → outputEncoderQ (in aaOCVConsumer.cpp) → this queue is popped in aaDebug.cpp::start_feed() and given to the encoder.

Thanks,

Hi dumbogeorge,
1. autovideoconvert should not be required.
2. Please dump one YUV frame and check it in a YUV viewer (such as 7yuv: http://blog.datahammer.de/).

Hi DaneLLL,

  1. It segfaults if we remove autovideoconvert. That probably needs to be debugged, but for now we are focused on getting the images encoded correctly. I have checked in code for this; please create a directory named ‘yuvframes’, and all YUV images will be dumped into it.

  2. I dump the frames (YUV) just before feeding them to the encoder, and they look visually fine. Even so, the encoded video is highly blocky. Could something be wrong with the encoder? It would be great if you could reproduce this.

Thanks,

Please refer to the attached a.cpp demonstrating appsrc ! omxh265enc ! filesink:

$ g++ -Wall -std=c++11  a.cpp -o test $(pkg-config --cflags --libs gstreamer-app-1.0) -ldl
$ gst-launch-1.0 videotestsrc num-buffers=1 ! video/x-raw,format=I420,width=2048,height=1080 ! filesink location=a.yuv
$ ./test
$ gst-launch-1.0 filesrc location=a.h265 ! h265parse ! omxh265dec ! nvoverlaysink

I don’t see any issue with the I420 generated by videotestsrc. Please compare that YUV with yours.
a.cpp (2.46 KB)

Hi DaneLLL
Thanks for your code. If I were to encode frames captured from the camera directly, maybe using libargus, I guess I would need to:

  1. Modify the launch string (so that I can read the buffer parked in memory by the camera, which is subsequently modified by our OpenCV algorithm) from
launch_stream
    << "appsrc name=mysource ! "
    << "video/x-raw,width="<< w <<",height="<< h <<",framerate=30/1,format=I420 ! "
    << "omxh265enc ! video/x-h265,stream-format=byte-stream ! "
    << "filesink location=a.h265 ";

to -

launch_stream
    << "appsrc name=mysource ! "
    << "video/x-raw(memory:NVMM),width="<< w <<",height="<< h <<",framerate=30/1,format=I420 ! "
    << "omxh265enc ! video/x-h265,stream-format=byte-stream ! "
    << "filesink location=a.h265 ";
  2. How would I run feed_function ‘on demand’, rather than every 33 ms in a loop as you have it here?
for (int i=0; i<150; i++) {
        feed_function(nullptr);
        usleep(33333);
    }

I guess we can use the ‘need-data’ signal of appsrc; is that right?
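Something like this is what I have in mind (just a rough sketch, assuming the appsrc keeps the name "mysource" from your launch string and feed_function() pushes one buffer per call, as in a.cpp):

#include <gst/gst.h>
#include <gst/app/gstappsrc.h>

// Assumed to exist (from a.cpp): pushes exactly one buffer into appsrc.
extern gboolean feed_function(gpointer user_data);

// Called whenever appsrc wants more data: push one frame on demand.
static void on_need_data(GstElement *appsrc, guint unused_size, gpointer user_data)
{
    (void)unused_size;
    feed_function(user_data);
}

static void connect_need_data(GstElement *pipeline)
{
    // "mysource" is the appsrc name from the launch string above.
    GstElement *appsrc = gst_bin_get_by_name(GST_BIN(pipeline), "mysource");
    g_signal_connect(appsrc, "need-data", G_CALLBACK(on_need_data), NULL);
    gst_object_unref(appsrc);
}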

Thanks

Hi dumbogeorge,
For Argus → gstreamer pipeline, please refer to tegra_multimedia_api\argus\samples\gstVideoEncode

Here is also a post for reference:
[url]https://devtalk.nvidia.com/default/topic/1025961/jetson-tx2/adding-overlay-to-the-tegra-camera-api-argus-quot-gstvideoencode-quot-sample/post/5219519/#5219519[/url]

appsrc only takes CPU buffers (video/x-raw).

Not sure, but appsrc should be able to run in active and passive modes. Other users may share their experience with it.

Hi DaneLLL

Please note that I am first processing the frame on the CPU (i.e., accessing the frames on the CPU) and then giving it to the encoder via appsrc.

I am not sure my method of mapping the frame on the CPU is the right one. In the code linked above, this is how I am mapping the camera frame on the CPU (aaCamCapture.cpp):

char *m_datamem  = (char *)mmap(NULL, fsize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, params.offset[0]);
char *m_datamemU = (char *)mmap(NULL, fsizeU,PROT_READ | PROT_WRITE, MAP_SHARED, fd, params.offset[1]);
char *m_datamemV = (char *)mmap(NULL, fsizeV,PROT_READ | PROT_WRITE, MAP_SHARED, fd, params.offset[2]);

If I write Y into a file,

fwrite(m_datamem, sizeof(char), fsize, fp);

and then display the image as grayscale, it comes out fine. However, when I also write U and V into the same file like this:

fwrite(m_datamem,  sizeof(char), fsize,  fp);
fwrite(m_datamemU, sizeof(char), fsizeU, fp);
fwrite(m_datamemV, sizeof(char), fsizeV, fp);

and then display the result as a YUV color image, the colors do not come out well. Could the data be partially stuck in the CPU caches? In that case the DDR buffer would not be updated and the encoder would not pick up the right data.
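One more thing I want to rule out: if the pitch reported by NvBufferGetParams() is larger than the image width, dumping fsize bytes per plane would also include the row padding, which by itself would scramble a tight I420 viewer. A rough sketch of a pitch-aware dump (the helper and the assumed chroma plane sizes are mine, not from the repo):

#include <cstdio>

// Rough sketch: write one plane row by row, dropping the pitch padding,
// so the dumped file is tightly packed I420 (viewable in 7yuv).
static void dump_plane(FILE *fp, const char *data, int width, int height, int pitch)
{
    for (int row = 0; row < height; ++row)
        fwrite(data + row * pitch, 1, width, fp);
}

// Assumed usage for I420 (chroma planes are width/2 x height/2):
//   dump_plane(fp, m_datamem,  imageWidth,     imageHeight,     params.pitch[0]);
//   dump_plane(fp, m_datamemU, imageWidth / 2, imageHeight / 2, params.pitch[1]);
//   dump_plane(fp, m_datamemV, imageWidth / 2, imageHeight / 2, params.pitch[2]);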

Do I need to be using APIs like

int NvBufferMemMap (int dmabuf_fd, unsigned int plane, NvBufferMemFlags memflag, void **pVirtAddr);
int NvBufferMemSyncForCpu (int dmabuf_fd, unsigned int plane, void **pVirtAddr);
int NvBufferMemUnMap (int dmabuf_fd, unsigned int plane, void **pVirtAddr);

As you suggested in this thread -

https://devtalk.nvidia.com/default/topic/1025494/how-to-receive-csi-camera-frame-in-unified-memory-buffer/

Thanks

Hi dumbogeorge,
I know I420/NV12 is supported in appsink from OpenCV 3.3; OpenCV 3.2 only supports gray and BGR.
Here is a post about 3.3:
[url]https://devtalk.nvidia.com/default/topic/1024245/jetson-tx2/opencv-3-3-and-integrated-camera-problems-/post/5210735/#5210735[/url]
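For reference, a rough sketch of reading camera frames into OpenCV through appsink with BGR conversion (my own example, not taken from the linked post; it assumes OpenCV is built with GStreamer support, and the pipeline string is a typical TX1/TX2 one):

#include <opencv2/opencv.hpp>
#include <string>

int main()
{
    // Typical onboard-camera pipeline: convert to BGR on the way into appsink.
    const std::string pipeline =
        "nvcamerasrc ! video/x-raw(memory:NVMM), width=1920, height=1080, "
        "format=I420, framerate=30/1 ! nvvidconv ! "
        "video/x-raw, format=BGRx ! videoconvert ! "
        "video/x-raw, format=BGR ! appsink";

    cv::VideoCapture cap(pipeline);
    if (!cap.isOpened())
        return -1;

    cv::Mat frame;
    while (cap.read(frame))
    {
        cv::imshow("camera", frame);
        if (cv::waitKey(1) == 27)  // Esc to quit
            break;
    }
    return 0;
}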

We do not have experience with all cases. Other users may share their experience, if any.

Hi DaneLLL,
I am not able to see the connection with OpenCV here.

I used

int NvBufferMemMap (int dmabuf_fd, unsigned int plane, NvBufferMemFlags memflag, void **pVirtAddr);
int NvBufferMemSyncForCpu (int dmabuf_fd, unsigned int plane, void **pVirtAddr);
int NvBufferMemUnMap (int dmabuf_fd, unsigned int plane, void **pVirtAddr);

and the encode corruption is gone, but the output is still very blocky. I suspect it is a CPU cache issue. It does not show up with your code in #8 because you are always feeding the same image; if you feed video from a file instead of a static image, you would see the blocky/poor-quality encode.

Could you please explain the difference between NvBufferMemSyncForCpu() and NvBufferMemSyncForDevice()? Which device is being referred to here? Would using NvBufferMemSyncForCpu() guarantee that data is not stuck in the CPU caches, so that the memory buffers given to the encoder (via appsrc) are coherent with the data in the CPU caches?

Thanks


Hi dumbogeorge,
“device” means HW blocks on the TX1 such as the GPU and encoders.

Description of NvBufferMemSyncForDevice():

This should be called after CPU writes to memory and before HW access it, to avoid HW getting stale data from memory. In other words, before HW can take over ownership of buffer from CPU.
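A rough sketch of the expected sequence when the CPU writes a plane that HW (e.g. the encoder) will read afterwards (my own example, not from a sample; the helper name is made up):

#include <cstddef>
#include <cstring>
#include "nvbuf_utils.h"

// Rough sketch: CPU fills plane 0 of a dmabuf, then hands ownership to HW.
void cpu_write_then_hand_to_hw(int dmabuf_fd, const void *src, size_t size)
{
    void *vaddr = NULL;

    // Map the plane for CPU write access.
    NvBufferMemMap(dmabuf_fd, 0, NvBufferMem_Write, &vaddr);

    // Make sure the CPU sees the current memory contents before touching them.
    NvBufferMemSyncForCpu(dmabuf_fd, 0, &vaddr);

    std::memcpy(vaddr, src, size);

    // Flush CPU caches so HW (the encoder) does not read stale data.
    NvBufferMemSyncForDevice(dmabuf_fd, 0, &vaddr);

    NvBufferMemUnMap(dmabuf_fd, 0, &vaddr);
}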

Hi DaneLLL,

After using NvBufferMemSyncForDevice(), I am still unable to get rid of the blocking artifacts in the encoded output file. I am not sure the U and V buffers are being properly flushed from the CPU caches to the encoder.

Would you be kind enough either to try my code in GitHub - pcgamelore/SingleCameraPlaceholder: Sample code the read Jetson Tx1/Tx2 cameras, encode, process the frame, and encode processed output.
or to feed your code from #8 with moving images (rather than the same static image) from CPU/appsrc, to see whether your encoder output is correct?

Thanks

Hi dumbogeorge, you don’t assign PTS correctly. This may generate a bitstream with an incorrect bitrate.

Hi DaneLLL,

I am pretty sure an encoder like MSENC would NOT modulate its rate control/bitrate based on PTS. Anyway, I have fixed the PTS update, as you do in your code. Please check; I have a feeling that either:

  1. the encoder is getting stale data from the CPU (the data is still in the CPU caches) when it reads input frames, or
  2. the encoder is not able to make use of the pitch/offsets of the NvBuffer. I use
NvBufferParams params;
NvBufferGetParams(fd, &params);

to get the parameters of the buffer given by the camera, and I pass those values to the encoder via:

gsize m_offset[3];
gint  m_stride[3];
m_offset[0] = framedata.nvBuffParams.offset[0];
m_offset[1] = framedata.nvBuffParams.offset[1];
m_offset[2] = framedata.nvBuffParams.offset[2];
m_stride[0] = framedata.nvBuffParams.pitch[0];
m_stride[1] = framedata.nvBuffParams.pitch[1];
m_stride[2] = framedata.nvBuffParams.pitch[2];

int size        = imageWidth * imageHeight * 1.5;
m_pgstBuffer    = gst_buffer_new_wrapped_full((GstMemoryFlags)0, (gpointer)(framedata.dataY), size, 0, size, NULL, NULL);
m_pgstVideoMeta = gst_buffer_add_video_meta_full(m_pgstBuffer, GST_VIDEO_FRAME_FLAG_NONE, GST_VIDEO_FORMAT_I420, imageWidth, imageHeight, 3, m_offset, m_stride);

Thanks

Hi dumbogeorge,
NvBuffer is not supported in a gstreamer pipeline. You need to allocate a CPU buffer of size width x height x 1.5 for YUV420.

If you use NvBuffer, you have to use tegra_multimedia_api.
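A rough sketch of that copy (the helper name and the tight I420 layout are my assumptions; the source pointers and pitches would come from NvBufferMemMap()/NvBufferGetParams() as earlier in the thread):

#include <cstdint>
#include <cstring>

// Rough sketch: copy pitched NvBuffer planes into a tightly packed I420
// CPU buffer of size width * height * 3 / 2, suitable for wrapping into
// a GstBuffer for appsrc (no custom offsets/strides needed then).
static void packI420(uint8_t *dst,
                     const uint8_t *srcY, const uint8_t *srcU, const uint8_t *srcV,
                     int width, int height, int pitchY, int pitchU, int pitchV)
{
    for (int r = 0; r < height; ++r)            // Y: width bytes per row
        std::memcpy(dst + r * width, srcY + r * pitchY, width);
    dst += width * height;

    for (int r = 0; r < height / 2; ++r)        // U: width/2 bytes per row
        std::memcpy(dst + r * (width / 2), srcU + r * pitchU, width / 2);
    dst += (width / 2) * (height / 2);

    for (int r = 0; r < height / 2; ++r)        // V: width/2 bytes per row
        std::memcpy(dst + r * (width / 2), srcV + r * pitchV, width / 2);
}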

Thanks DaneLLL for the quick response.

Can you please suggest how to read a camera frame via the Argus API (from the CPU) and pass it to the encoder? My problem is that the camera uses memory alignment such that pitch != width. Also, there is a gap in memory between the end of the Y buffer and the start of the U buffer, and similarly between the end of the U buffer and the start of the V buffer.

Thanks

Hi dumbogeorge,
You should run tegra_multimedia_api\samples\10_camera_recording.