[MMAPI] Decode more than one H264 file, get problem when trying to modify 00_video_decode

Hi li_lin and DaneLLL,
I am following up on this topic. I now have 28.2 on a TX2, and I made the same modification to 00_video_decode as li_lin.
The code is below (I added a while loop wrapping everything and moved the return statement outside it):

int
main(int argc, char *argv[])
{
while(1) { /* I changed here !!!!!!!!!!!!!! */
    context_t ctx;
    int ret = 0;
    int error = 0;
    uint32_t current_file = 0;
    uint32_t i;
    bool eos = false;
    char *nalu_parse_buffer = NULL;
    NvApplicationProfiler &profiler = NvApplicationProfiler::getProfilerInstance();

    set_defaults(&ctx);

    if (parse_csv_args(&ctx, argc, argv))
    {
        fprintf(stderr, "Error parsing commandline arguments\n");
        return -1;
    }

    ctx.dec = NvVideoDecoder::createVideoDecoder("dec0");
    TEST_ERROR(!ctx.dec, "Could not create decoder", cleanup);

    if (ctx.stats)
    {
        profiler.start(NvApplicationProfiler::DefaultSamplingInterval);
        ctx.dec->enableProfiling();
    }

    // Subscribe to Resolution change event
    ret = ctx.dec->subscribeEvent(V4L2_EVENT_RESOLUTION_CHANGE, 0, 0);
    TEST_ERROR(ret < 0, "Could not subscribe to V4L2_EVENT_RESOLUTION_CHANGE",
               cleanup);

    if (ctx.input_nalu)
    {
        nalu_parse_buffer = new char[CHUNK_SIZE];
    }
    else
    {
        // Set V4L2_CID_MPEG_VIDEO_DISABLE_COMPLETE_FRAME_INPUT control to false
        // so that application can send chunks of encoded data instead of forming
        // complete frames. This needs to be done before setting format on the
        // output plane.
        ret = ctx.dec->disableCompleteFrameInputBuffer();
        TEST_ERROR(ret < 0,
                "Error in decoder disableCompleteFrameInputBuffer", cleanup);
    }

    // Set format on the output plane
    ret = ctx.dec->setOutputPlaneFormat(ctx.decoder_pixfmt, CHUNK_SIZE);
    TEST_ERROR(ret < 0, "Could not set output plane format", cleanup);

    // V4L2_CID_MPEG_VIDEO_DISABLE_DPB should be set after output plane
    // set format
    if (ctx.disable_dpb)
    {
        ret = ctx.dec->disableDPB();
        TEST_ERROR(ret < 0, "Error in decoder disableDPB", cleanup);
    }

    if (ctx.enable_metadata || ctx.enable_input_metadata)
    {
        ret = ctx.dec->enableMetadataReporting();
        TEST_ERROR(ret < 0, "Error while enabling metadata reporting", cleanup);
    }

    if (ctx.skip_frames)
    {
        ret = ctx.dec->setSkipFrames(ctx.skip_frames);
        TEST_ERROR(ret < 0, "Error while setting skip frames param", cleanup);
    }

    // Query, Export and Map the output plane buffers so that we can read
    // encoded data into the buffers
    if (ctx.memory_type == V4L2_MEMORY_MMAP)
        ret = ctx.dec->output_plane.setupPlane(V4L2_MEMORY_MMAP, 10, true, false);
    else if (ctx.memory_type == V4L2_MEMORY_USERPTR)
        ret = ctx.dec->output_plane.setupPlane(V4L2_MEMORY_USERPTR, 10, false, true);

    TEST_ERROR(ret < 0, "Error while setting up output plane", cleanup);

    ctx.in_file = (std::ifstream **)malloc(sizeof(std::ifstream *)*ctx.file_count);
    for (uint32_t i = 0 ; i < ctx.file_count ; i++)
    {
        ctx.in_file[i] = new ifstream(ctx.in_file_path[i]);
        TEST_ERROR(!ctx.in_file[i]->is_open(), "Error opening input file", cleanup);
    }

    if (ctx.out_file_path)
    {
        ctx.out_file = new ofstream(ctx.out_file_path);
        TEST_ERROR(!ctx.out_file->is_open(), "Error opening output file",
                   cleanup);
    }

#ifndef USE_NVBUF_TRANSFORM_API
    if (ctx.out_file || (!ctx.disable_rendering && !ctx.stats))
    {
        // Create converter to convert from BL to PL for writing raw video
        // to file
        ctx.conv = NvVideoConverter::createVideoConverter("conv0");
        TEST_ERROR(!ctx.conv, "Could not create video converter", cleanup);
        ctx.conv->output_plane.
            setDQThreadCallback(conv0_output_dqbuf_thread_callback);
        ctx.conv->capture_plane.
            setDQThreadCallback(conv0_capture_dqbuf_thread_callback);

        if (ctx.stats)
        {
            ctx.conv->enableProfiling();
        }
    }
#endif

    ret = ctx.dec->output_plane.setStreamStatus(true);
    TEST_ERROR(ret < 0, "Error in output plane stream on", cleanup);

    pthread_create(&ctx.dec_capture_loop, NULL, dec_capture_loop_fcn, &ctx);

    if (ctx.copy_timestamp && ctx.input_nalu) {
      ctx.timestamp = (ctx.start_ts * MICROSECOND_UNIT);
      ctx.timestampincr = (MICROSECOND_UNIT * 16) / ((uint32_t) (ctx.dec_fps * 16));
    }

    // Read encoded data and enqueue all the output plane buffers.
    // Exit loop in case file read is complete.
    i = 0;
    while (!eos && !ctx.got_error && !ctx.dec->isInError() &&
           i < ctx.dec->output_plane.getNumBuffers())
    {
        struct v4l2_buffer v4l2_buf;
        struct v4l2_plane planes[MAX_PLANES];
        NvBuffer *buffer;

        memset(&v4l2_buf, 0, sizeof(v4l2_buf));
        memset(planes, 0, sizeof(planes));

        buffer = ctx.dec->output_plane.getNthBuffer(i);
        if ((ctx.decoder_pixfmt == V4L2_PIX_FMT_H264) ||
                (ctx.decoder_pixfmt == V4L2_PIX_FMT_H265))
        {
            if (ctx.input_nalu)
            {
                read_decoder_input_nalu(ctx.in_file[current_file], buffer, nalu_parse_buffer,
                        CHUNK_SIZE, &ctx);
            }
            else
            {
                read_decoder_input_chunk(ctx.in_file[current_file], buffer);
            }
        }
        if (ctx.decoder_pixfmt == V4L2_PIX_FMT_VP9)
        {
            ret = read_vp9_decoder_input_chunk(&ctx, buffer);
            if (ret != 0)
                cerr << "Couldn't read VP9 chunk" << endl;
        }
        v4l2_buf.index = i;
        v4l2_buf.m.planes = planes;
        v4l2_buf.m.planes[0].bytesused = buffer->planes[0].bytesused;

        if (ctx.input_nalu && ctx.copy_timestamp && ctx.flag_copyts)
        {
          v4l2_buf.flags |= V4L2_BUF_FLAG_TIMESTAMP_COPY;
          ctx.timestamp += ctx.timestampincr;
          v4l2_buf.timestamp.tv_sec = ctx.timestamp / (MICROSECOND_UNIT);
          v4l2_buf.timestamp.tv_usec = ctx.timestamp % (MICROSECOND_UNIT);
        }

        if (v4l2_buf.m.planes[0].bytesused == 0)
        {
            if (ctx.bQueue)
            {
                current_file++;
                if(current_file != ctx.file_count)
                {
                    continue;
                }
            }
            if(ctx.bLoop)
            {
                current_file = current_file % ctx.file_count;
                continue;
            }
        }
        // It is necessary to queue an empty buffer to signal EOS to the decoder
        // i.e. set v4l2_buf.m.planes[0].bytesused = 0 and queue the buffer
        ret = ctx.dec->output_plane.qBuffer(v4l2_buf, NULL);
        if (ret < 0)
        {
            cerr << "Error Qing buffer at output plane" << endl;
            abort(&ctx);
            break;
        }
        if (v4l2_buf.m.planes[0].bytesused == 0)
        {
            eos = true;
            cout << "Input file read complete" << endl;
            break;
        }
        i++;
    }

    // Since all the output plane buffers have been queued, we first need to
    // dequeue a buffer from output plane before we can read new data into it
    // and queue it again.
    while (!eos && !ctx.got_error && !ctx.dec->isInError())
    {
        struct v4l2_buffer v4l2_buf;
        struct v4l2_plane planes[MAX_PLANES];
        NvBuffer *buffer;

        memset(&v4l2_buf, 0, sizeof(v4l2_buf));
        memset(planes, 0, sizeof(planes));

        v4l2_buf.m.planes = planes;

        ret = ctx.dec->output_plane.dqBuffer(v4l2_buf, &buffer, NULL, -1);
        if (ret < 0)
        {
            cerr << "Error DQing buffer at output plane" << endl;
            abort(&ctx);
            break;
        }

        if ((v4l2_buf.flags & V4L2_BUF_FLAG_ERROR) && ctx.enable_input_metadata)
        {
            v4l2_ctrl_videodec_inputbuf_metadata dec_input_metadata;

            ret = ctx.dec->getInputMetadata(v4l2_buf.index, dec_input_metadata);
            if (ret == 0)
            {
                ret = report_input_metadata(&ctx, &dec_input_metadata);
                if (ret == -1)
                {
                  cerr << "Error with input stream header parsing" << endl;
                }
            }
        }

        if ((ctx.decoder_pixfmt == V4L2_PIX_FMT_H264) ||
                (ctx.decoder_pixfmt == V4L2_PIX_FMT_H265))
        {
            if (ctx.input_nalu)
            {
                read_decoder_input_nalu(ctx.in_file[current_file], buffer, nalu_parse_buffer,
                        CHUNK_SIZE, &ctx);
            }
            else
            {
                read_decoder_input_chunk(ctx.in_file[current_file], buffer);
            }
        }
        if (ctx.decoder_pixfmt == V4L2_PIX_FMT_VP9)
        {
            ret = read_vp9_decoder_input_chunk(&ctx, buffer);
            if (ret != 0)
                cerr << "Couldn't read VP9 chunk" << endl;
        }
        v4l2_buf.m.planes[0].bytesused = buffer->planes[0].bytesused;

        if (ctx.input_nalu && ctx.copy_timestamp && ctx.flag_copyts)
        {
          v4l2_buf.flags |= V4L2_BUF_FLAG_TIMESTAMP_COPY;
          ctx.timestamp += ctx.timestampincr;
          v4l2_buf.timestamp.tv_sec = ctx.timestamp / (MICROSECOND_UNIT);
          v4l2_buf.timestamp.tv_usec = ctx.timestamp % (MICROSECOND_UNIT);
        }

        if (v4l2_buf.m.planes[0].bytesused == 0)
        {
            if (ctx.bQueue)
            {
                current_file++;
                if(current_file != ctx.file_count)
                {
                    continue;
                }
            }
            if(ctx.bLoop)
            {
                current_file = current_file % ctx.file_count;
                continue;
            }
        }
        ret = ctx.dec->output_plane.qBuffer(v4l2_buf, NULL);
        if (ret < 0)
        {
            cerr << "Error Qing buffer at output plane" << endl;
            abort(&ctx);
            break;
        }
        if (v4l2_buf.m.planes[0].bytesused == 0)
        {
            eos = true;
            cout << "Input file read complete" << endl;
            break;
        }
    }

    // After sending EOS, all the buffers from output plane should be dequeued.
    // and after that capture plane loop should be signalled to stop.
    while (ctx.dec->output_plane.getNumQueuedBuffers() > 0 &&
           !ctx.got_error && !ctx.dec->isInError())
    {
        struct v4l2_buffer v4l2_buf;
        struct v4l2_plane planes[MAX_PLANES];

        memset(&v4l2_buf, 0, sizeof(v4l2_buf));
        memset(planes, 0, sizeof(planes));

        v4l2_buf.m.planes = planes;
        ret = ctx.dec->output_plane.dqBuffer(v4l2_buf, NULL, NULL, -1);
        if (ret < 0)
        {
            cerr << "Error DQing buffer at output plane" << endl;
            abort(&ctx);
            break;
        }

        if ((v4l2_buf.flags & V4L2_BUF_FLAG_ERROR) && ctx.enable_input_metadata)
        {
            v4l2_ctrl_videodec_inputbuf_metadata dec_input_metadata;

            ret = ctx.dec->getInputMetadata(v4l2_buf.index, dec_input_metadata);
            if (ret == 0)
            {
                ret = report_input_metadata(&ctx, &dec_input_metadata);
                if (ret == -1)
                {
                  cerr << "Error with input stream header parsing" << endl;
                  abort(&ctx);
                  break;
                }
            }
        }
    }

    // Signal EOS to the decoder capture loop
    ctx.got_eos = true;
#ifndef USE_NVBUF_TRANSFORM_API
    if (ctx.conv)
    {
        ctx.conv->capture_plane.waitForDQThread(-1);
    }
#endif

    if (ctx.stats)
    {
        profiler.stop();
        ctx.dec->printProfilingStats(cout);
#ifndef USE_NVBUF_TRANSFORM_API
        if (ctx.conv)
        {
            ctx.conv->printProfilingStats(cout);
        }
#endif
        if (ctx.renderer)
        {
            ctx.renderer->printProfilingStats(cout);
        }
        profiler.printProfilerData(cout);
    }

cleanup:
    if (ctx.dec_capture_loop)
    {
        pthread_join(ctx.dec_capture_loop, NULL);
    }
#ifndef USE_NVBUF_TRANSFORM_API
    if (ctx.conv && ctx.conv->isInError())
    {
        cerr << "Converter is in error" << endl;
        error = 1;
    }
#endif
    if (ctx.dec && ctx.dec->isInError())
    {
        cerr << "Decoder is in error" << endl;
        error = 1;
    }

    if (ctx.got_error)
    {
        error = 1;
    }

    // The decoder destructor does all the cleanup, i.e. set streamoff on output and capture planes,
    // unmap buffers, tell the decoder to deallocate buffers (reqbufs ioctl with count = 0),
    // and finally call v4l2_close on the fd.
    delete ctx.dec;
#ifndef USE_NVBUF_TRANSFORM_API
    delete ctx.conv;
#endif
    // Similarly, EglRenderer destructor does all the cleanup
    delete ctx.renderer;
    for (uint32_t i = 0 ; i < ctx.file_count ; i++)
      delete ctx.in_file[i];
    delete ctx.out_file;
#ifndef USE_NVBUF_TRANSFORM_API
    delete ctx.conv_output_plane_buf_queue;
#else
    if(ctx.dst_dma_fd != -1)
    {
        NvBufferDestroy(ctx.dst_dma_fd);
        ctx.dst_dma_fd = -1;
    }
#endif
    delete[] nalu_parse_buffer;
    free (ctx.in_file);
    for (uint32_t i = 0 ; i < ctx.file_count ; i++)
      free (ctx.in_file_path[i]);
    free (ctx.in_file_path);
    free(ctx.out_file_path);
    if (error)
    {
        cout << "App run failed" << endl;
    }
    else
    {
        cout << "App run was successful" << endl;
    }
} /* I changed here !!!!!!!!!!!!!! */
    return 0; /* I changed here !!!!!!!!!!!!!! */
}

And I run the program like:

./video_decode H264 /home/nvidia/Videos/car_1080p_10fps.h264

Basically, I want it to decode and render one H264 file over and over again. But after the 1st iteration it no longer works as intended and emits errors:

nvbuf_utils: nvbuffer Payload Type not supported
NvBufferGetParams failed for src_dmabuf_fd
nvbuffer_transform Failed
Transform failed
... (the same four lines repeat indefinitely)

I have to Ctrl+C it. So, what's going wrong here?

And if I want to decode more than one H264 file, how can I make that work? Right now I can think of 2 ways, as workarounds to the problem mentioned above:
Guess 1: I should put the files in a list and then pass them one by one to the NvVideoDecoder, just like this sample does, but never clean up the decoder and converter. Right? (See the sketch after these guesses.)
Guess 2: I can create multiple pairs of decoder and converter, each pair created on the main thread (NvVideoConverter cannot be created on a worker thread via ctx.conv = NvVideoConverter::createVideoConverter("conv0");, right?). But this method won't work with more than 8 decoder/converter pairs, because 16 video devices is the default upper limit for MMAPI. Right?
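To make Guess 1 concrete, here is roughly the helper I have in mind. This is an untested sketch that would slot into video_decode_main.cpp: fill_from_files is a hypothetical name, while read_decoder_input_chunk, NvBuffer and the in_file array are the sample's own.

static bool
fill_from_files(std::ifstream **files, uint32_t file_count,
                uint32_t *current_file, NvBuffer *buffer)
{
    // Keep one decoder alive and just advance to the next input file
    // whenever the current one hits EOF, instead of destroying and
    // recreating the decoder per file.
    while (*current_file < file_count)
    {
        read_decoder_input_chunk(files[*current_file], buffer);
        if (buffer->planes[0].bytesused > 0)
            return true;    // got encoded data from the current file
        (*current_file)++;  // current file exhausted, try the next one
    }
    return false;           // all files consumed; caller queues bytesused = 0 (EOS)
}

And for Guess 2, creating the pairs would look something like this (again an untested sketch; num_streams is a hypothetical count, and I assume createVideoDecoder simply returns NULL once no more video devices are available):

#include <vector>
#include <cstdio>

std::vector<NvVideoDecoder *> decoders;
for (uint32_t n = 0; n < num_streams; ++n)
{
    char name[16];
    snprintf(name, sizeof(name), "dec%u", n);   // "dec0", "dec1", ...
    NvVideoDecoder *dec = NvVideoDecoder::createVideoDecoder(name);
    if (!dec)
        break;              // presumably out of video devices here
    decoders.push_back(dec);
}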

Or, could you please kindly give comments on my guesses or give a hint? A big thx! :) :) :)

And one more question: if I decode more than one file via Guess 1, how can I tell which decoded frames came from file A and which from file B? One idea is sketched below.
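The idea (untested, and it assumes one frame per input buffer, i.e. the NALU input path the sample already uses for timestamp copy; frame_num_in_file is a hypothetical counter): tag each output-plane buffer with the file index through the timestamp, since with V4L2_BUF_FLAG_TIMESTAMP_COPY the decoder copies the output-plane timestamp to the matching decoded frame on the capture plane.

// Before qBuffer on the output plane:
v4l2_buf.flags |= V4L2_BUF_FLAG_TIMESTAMP_COPY;
v4l2_buf.timestamp.tv_sec  = current_file;        // which file this buffer came from
v4l2_buf.timestamp.tv_usec = frame_num_in_file++; // hypothetical per-file frame counter

// In dec_capture_loop_fcn, v4l2_buf.timestamp.tv_sec of each dequeued
// frame then identifies its source file.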

Hi fengxueem,

We tried to reproduce your issue first by copying your code into "video_decode_main.cpp", but make failed.
What steps are we missing? Or could you share your full code?

Thanks!

Hi carolyuu,

Sorry to say this, but my boss just switched me to another project and I handed the TX2 over to my co-workers, so I cannot access the source code right now. But I think I changed just three lines, which you can find in the previous post:

  1. line 4: while(1) { /* I changed here !!! */
  2. line 425: } /* I changed here !!! */
  3. line 426: return 0; /* I changed here !!! */

And everything else stays the same as the 00_video_decode sample pre-installed by JetPack 3.2. I will post the full code ASAP. If you still cannot reproduce the problem, maybe it's the compiler: I use the GCC Tool Chain Sources for 64-bit BSP version 28.2, release date 2018/03/08, downloaded from https://developer.nvidia.com/embedded/downloads#?search=cross.
Still, thanks for your reply. You guys are really helpful; I've learnt a lot from these forums.

Hi fengxueem,
For decoding multiple streams, please refer to tegra_multimedia_api/samples/backend.

Please disable TensorRT if you don’t need it.
In backend/Makefile:

ENABLETRT ?= 0

Please apply the attached prebuilt lib on r28.2 and give it a try.

libtegrav4l2.so.txt (155 KB)

Hi DaneLLL,

I am not the OP of this thread, but I was seeing the same behavior he mentioned when destroying and recreating the decoder several times. Thanks for posting the library file in #5. I can confirm that it seems to fix this issue in preliminary (short-duration) testing. Do you know whether this issue is also fixed in any of the later L4T releases, or whether I will have trouble if I update to a newer L4T release and want to use the fix from this library?

Thanks,

Chris Richardson

Hi Chris, the issue is also present on r28.2.1. If you use r28.2.1, please also apply the prebuilt lib.

Hi DaneLLL,

Thanks for the information; I will carry this library locally for the time being. If possible, please update this thread when a fix is included with a newer version of L4T.

Thanks,

Chris Richardson