segfault when using libnvjpeg to decode

I’m trying to use libnvjpeg directly to decode a JPEG image.

I don’t want to use gstreamer because neither jpegdec nor nvjpegdec is able to give me the decoded image in the original pixel format.

I have a JPEG image that is 1920*1080, encoded as YUV422 planar.
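
For reference, the buffer arithmetic for this format can be sketched as below. This is a minimal illustration, assuming 8-bit samples and chroma that is halved horizontally only (full height), with the three planes stored back to back. `PlaneLayout` and `yuv422p_layout` are hypothetical names made up for this example.

```cpp
#include <cstddef>

// Hypothetical helper type: byte layout of one 8-bit YUV422 planar frame,
// with the Y, U, and V planes stored back to back in a single buffer.
struct PlaneLayout {
    std::size_t stride[3];  // bytes per row of Y, U, V
    std::size_t offset[3];  // start of each plane within the buffer
    std::size_t total;      // size of the whole frame in bytes
};

PlaneLayout yuv422p_layout(std::size_t width, std::size_t height) {
    PlaneLayout p{};
    // 4:2:2 halves the chroma width but keeps the full height.
    const std::size_t chroma_width = (width + 1) / 2;
    p.stride[0] = width;
    p.stride[1] = chroma_width;
    p.stride[2] = chroma_width;
    p.offset[0] = 0;
    p.offset[1] = p.offset[0] + p.stride[0] * height;
    p.offset[2] = p.offset[1] + p.stride[1] * height;
    p.total = p.offset[2] + p.stride[2] * height;
    return p;
}
```

For a 1920*1080 frame this gives strides of 1920, 960, and 960 bytes and a total of 1920 * 1080 * 2 bytes.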

So I wrote a test program that will decode using jpeg_read_raw_data() calls. It works fine when I compile and link it with the regular libjpeg, but segfaults when I compile and link it with libnvjpeg.

Here is the test image: https://www.dropbox.com/s/gacnmdyrnufgbol/yuv422_planar.jpg?dl=0

And here is the source code:

/**
 * To compile:
 * 1. Download the gstjpeg source tar ball:
 *    $ cd && wget http://developer.nvidia.com/embedded/dlc/l4t-Jetson-TK1-Gstjpeg-Sources-R21-5
 * 2. Untar it:
 *    $ tar -xjf gstjpeg_src.tbz2
 * 3. Compile the test program:
 *    $ g++-4.8 -std=c++11 -o tegrajpeg_raw_accelerated tegrajpeg_raw.cpp -I/home/ubuntu/gstjpeg_src/nv_headers -L/usr/lib/arm-linux-gnueabihf/tegra/ -lnvjpeg -O3
 *    or
 *    $ g++-4.8 -std=c++11 -o tegrajpeg_raw tegrajpeg_raw.cpp -ljpeg -O3
 */

/* Standard library */
#include <chrono>
#include <cstring>
#include <fstream>
#include <iostream>
#include <iterator>
#include <memory>
#include <string>
#include <vector>

/* libjpeg */
#include "jpeglib.h"

using namespace std;
using namespace ::std::chrono;

vector<char> read_jpg_file() {
    string filepath{"/home/ubuntu/yuv422_planar.jpg"};
    vector<char> jpg_buffer;

    ifstream jpg_file(filepath, ios::binary | ios::ate);

    if (jpg_file.is_open()) {
        jpg_buffer.reserve(jpg_file.tellg());

        jpg_file.seekg(0, ios::beg);

        jpg_buffer.assign(istreambuf_iterator<char>(jpg_file), istreambuf_iterator<char>());
        jpg_file.close();
    }
    else {
        cerr << "Error reading yuv422_planar.jpg!" << endl;
    }

    return jpg_buffer;
}

int main(int argc, char** argv) {
    // read a jpeg file
    vector<char> jpg_buffer = read_jpg_file();  //<char> because ifstream works with char, not unsigned char. :(
    if (jpg_buffer.empty()) {
        return 1;  // nothing to decode
    }
    unsigned char* output_buffer = new unsigned char[1920 * 1080 * 2];

    int i, j;

    unsigned char**lines[3];
    unsigned char*y[4 * DCTSIZE] = {NULL, };
    unsigned char*u[4 * DCTSIZE] = {NULL, };
    unsigned char*v[4 * DCTSIZE] = {NULL, };
    int v_samp[3];
    unsigned char *base[3], *last[3];

    // How many bytes per row of Y, U, and V data.
    const int stride[3] = {1920, 960, 960}; 

    const unsigned int height{1080};

    lines[0] = y;
    lines[1] = u;
    lines[2] = v;

    struct jpeg_decompress_struct cinfo;
    struct jpeg_error_mgr jerr;

    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_decompress(&cinfo);

    jpeg_mem_src(&cinfo, reinterpret_cast<unsigned char*>(jpg_buffer.data()), jpg_buffer.size());

    jpeg_read_header(&cinfo, TRUE);  // libjpeg's own boolean constant
    
    cinfo.raw_data_out = 1;

    jpeg_start_decompress(&cinfo);

    v_samp[0] = cinfo.comp_info[0].v_samp_factor;
    v_samp[1] = cinfo.comp_info[1].v_samp_factor;
    v_samp[2] = cinfo.comp_info[2].v_samp_factor;

    // Starting positions of the Y, U, and V components in the output_buffer
    base[0] = output_buffer;            // Y
    base[1] = base[0] + 1920 * 1080;    // U
    base[2] = base[1] + 960 * 1080;     // V
    
    for (i = 0; i < height; i += v_samp[0] * DCTSIZE) { 
        for (j = 0; j < (v_samp[0] * DCTSIZE); j++) {
            lines[0][j] = base[0] + (i + j) * stride[0];
            lines[1][j] = base[1] + (i + j) * stride[1];
            lines[2][j] = base[2] + (i + j) * stride[2];
        }

        jpeg_read_raw_data(&cinfo, lines, v_samp[0] * DCTSIZE);
    }

    jpeg_finish_decompress(&cinfo);

    ofstream out_yuv{"/home/ubuntu/out.yuv", ::std::ios::out | ::std::ios::binary};
    out_yuv.write(reinterpret_cast<const char*>(output_buffer), 1920 * 1080 * 2);
    out_yuv.close();

    jpeg_destroy_decompress(&cinfo);

    delete[] output_buffer;
}

Could somebody take a look and tell me why the libnvjpeg version would segfault? I looked at the gstreamer plug-in source code and I don’t see anything special there, so I don’t understand why it segfaults when I try to use libnvjpeg in my own code.

This is not a use case we support on TK1. Other users, please share your experience.

In addition, I know that the gstreamer nvjpegdec (and hence libnvjpeg) can decode this image. I have tried using gst-launch-1.0 with nvjpegdec to decode the image and it works, although it returns the image data downsampled to I420 instead of Y42B (YUV422 planar).

I got my code working.

It looks like the implementation of jpeg_mem_src() is a bit broken in libnvjpeg.so. It creates a jpeg_source_mgr that is invalid for use with jpeg_read_raw_data(), causing segfaults as jpeg_read_raw_data() tries to copy jpeg data to be decoded.

So I had to manually create my own jpeg_source_mgr and implement my own init_source(), fill_input_buffer(), skip_input_data(), resync_to_restart(), and term_source() functions.

Unfortunately, the performance is still not great. It still takes 30-35 ms to decode a 1920*1080 image, whereas I know that the gstreamer plugin can decode much faster than that by leveraging NVMM.
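
For anyone who wants to reproduce a timing like this, one way to measure it is sketched below. It is an assumption about the method, not a description of how the figure above was obtained, and `elapsed_ms` is a hypothetical helper name.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: run `work` once and return the elapsed wall-clock
// time in milliseconds, using a monotonic clock.
template <typename Work>
double elapsed_ms(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrapping only the span from jpeg_read_header() through jpeg_finish_decompress() in the callable keeps file I/O out of the measurement, and averaging over several runs smooths out first-call overhead.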

Thank you so much for sharing. I got it working on a Jetson Nano. For anyone using this code as a reference for YUV422P decoding, simply add

cinfo.do_fancy_upsampling = FALSE;
cinfo.do_block_smoothing = FALSE;

after

jpeg_read_header(&cinfo, true)

I guess this is needed because jpeg_read_header() in nvjpeg sets some defaults of its own.