How to load and deserialize a .engine file?

My device is a Jetson TX2. I used DeepStream to accelerate my YOLOv3-tiny model, and the DeepStream sample generated an .engine file for me. I want to load and deserialize this .engine file with the TensorRT C++ API.
I referenced this file:
https://github.com/dusty-nv/jetson-inference/blob/e168b746d540fd739a6fa6ba652b5a5e8c6ffa08/tensorNet.cpp
However, I ran into these problems when deserializing the .engine file:
getPluginCreator could not find plugin LReLU_TRT version 1 namespace
Cannot deserialize plugin LReLU_TRT
getPluginCreator could not find plugin LReLU_TRT version 1 namespace
Cannot deserialize plugin LReLU_TRT
getPluginCreator could not find plugin LReLU_TRT version 1 namespace
Cannot deserialize plugin LReLU_TRT
getPluginCreator could not find plugin LReLU_TRT version 1 namespace
Cannot deserialize plugin LReLU_TRT
getPluginCreator could not find plugin LReLU_TRT version 1 namespace
Cannot deserialize plugin LReLU_TRT
getPluginCreator could not find plugin LReLU_TRT version 1 namespace
Cannot deserialize plugin LReLU_TRT
getPluginCreator could not find plugin LReLU_TRT version 1 namespace
Cannot deserialize plugin LReLU_TRT
getPluginCreator could not find plugin LReLU_TRT version 1 namespace
Cannot deserialize plugin LReLU_TRT
getPluginCreator could not find plugin LReLU_TRT version 1 namespace
Cannot deserialize plugin LReLU_TRT
getPluginCreator could not find plugin YoloLayerV3_TRT version 1 namespace
Cannot deserialize plugin YoloLayerV3_TRT
getPluginCreator could not find plugin LReLU_TRT version 1 namespace
Cannot deserialize plugin LReLU_TRT
getPluginCreator could not find plugin LReLU_TRT version 1 namespace
Cannot deserialize plugin LReLU_TRT
getPluginCreator could not find plugin YoloLayerV3_TRT version 1 namespace
Cannot deserialize plugin YoloLayerV3_TRT

Here is my code:
// Needs <fstream>, <sstream>, <iostream>, and NvInfer.h; CREATE_INFER_BUILDER
// and CREATE_INFER_RUNTIME are the macros from the referenced tensorNet.cpp
// (wrappers around nvinfer1::createInferBuilder / createInferRuntime).
bool mEnableFP16 = true;
bool mOverride16 = false;
std::stringstream gieModelStream;
gieModelStream.seekg(0, gieModelStream.beg);

const char* cache_path = "/home/nvidia/Documents/Documents/deepstreamYOLO/tensorrtengine/model_b1_fp16.engine";
std::ifstream cache(cache_path, std::ios::binary); // the engine is a binary blob

if( !cache )
{
    std::cout << "file doesn't exist!" << std::endl;
}
else
{
    std::cout << "loading network profile from cache..." << std::endl;
    gieModelStream << cache.rdbuf();
    cache.close();
    // test for half FP16 support
    nvinfer1::IBuilder* builder = CREATE_INFER_BUILDER(gLogger);

    if( builder != NULL )
    {
        mEnableFP16 = !mOverride16 && builder->platformHasFastFp16();
        printf( "platform %s FP16 support.\n", mEnableFP16 ? "has" : "does not have");
        builder->destroy();
    }
}

printf("%s loaded\n", cache_path);

/*
 * create runtime inference engine execution context
 */
nvinfer1::IRuntime* infer = CREATE_INFER_RUNTIME(gLogger);

if( !infer )
{
    printf("failed to create InferRuntime\n");
    return 0;
}

// support for stringstream deserialization was deprecated in TensorRT v2
// instead, read the stringstream into a memory buffer and pass that to TRT.
gieModelStream.seekg(0, std::ios::end);
const int modelSize = gieModelStream.tellg();
gieModelStream.seekg(0, std::ios::beg);
printf("the model size is %d\n",modelSize);
void* modelMem = malloc(modelSize);

if( !modelMem )
{
    printf("failed to allocate %i bytes to deserialize model\n", modelSize);
    return 0;
}

gieModelStream.read((char*)modelMem, modelSize);
// NOTE: no plugin factory is passed here (third argument is NULL)
nvinfer1::ICudaEngine* engine = infer->deserializeCudaEngine(modelMem, modelSize, NULL);
free(modelMem);

if( !engine )
{
    printf("failed to create CUDA engine\n");
    return 0;
}

Hi,

The errors indicate that TensorRT cannot find the corresponding plugin implementations:

Cannot deserialize plugin LReLU_TRT
Cannot deserialize plugin YoloLayerV3_TRT

Since some YOLO layers are not natively supported by TensorRT, plugin implementations of them are provided for building the YOLO model:
https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/blob/master/yolo/lib/plugin_factory.cpp#L34

Please also pass the plugin factory when deserializing:
https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/blob/3a8957b2d985d7fc2498a0f070832eb145e809ca/yolo/lib/trt_utils.cpp#L249

nvinfer1::ICudaEngine* engine = runtime->deserializeCudaEngine(modelMem, modelSize, pluginFactory);

Thanks.