Faster RCNN and variable input size

Hello,

I have successfully used TensorRT 6 to optimize and run a FasterRCNN model with input size 1000 x 600 with a static TRT engine.
It works fine.

However, my images can have a size and aspect ratio other than 1000 x 600.
I have two cases:
1- the size of the video stream has a different aspect ratio
2- the processed image is the result of a previous detection algorithm.
The size of the detected objects always changes.
As a result, I need to detect objects in these “sub images” of different sizes and aspect ratios.

In both cases, if I just resize my images to 1000 x 600, my precision decreases because objects are warped.

For example, in the TensorFlow object detection module, we can use a resizer called “keep aspect ratio”, which computes the resized dimensions so that

  • the aspect ratio of the image is kept
  • the resized dimensions are in a predefined range [min, max]

This is possible since the FasterRCNN algorithm can be fed with any input image size.
This can be done for training and at inference time.
As a result, 1000 and 600 are not fixed input sizes, but the max / min input dimensions.
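
To make the resizer concrete, here is a minimal Python sketch of the size computation described above. It is not the TensorFlow implementation, only an illustration of the rule: the function name `keep_aspect_ratio_size` is made up for this example, and the defaults of 600 and 1000 are taken from the sizes mentioned in this post.

```python
def keep_aspect_ratio_size(height, width, min_dim=600, max_dim=1000):
    """Return (new_height, new_width) with the aspect ratio preserved.

    The shorter side is scaled to min_dim, unless that would push the
    longer side past max_dim; in that case the longer side is scaled
    to max_dim instead.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    short_side, long_side = min(height, width), max(height, width)
    scale = min_dim / short_side
    if long_side * scale > max_dim:
        # Fitting the short side would overshoot, so fit the long side.
        scale = max_dim / long_side
    return round(height * scale), round(width * scale)


if __name__ == "__main__":
    for h, w in [(480, 640), (500, 2000), (300, 300)]:
        print((h, w), "->", keep_aspect_ratio_size(h, w))
```

With this rule the network input is different for almost every image, which is why a single fixed 1000 x 600 engine does not match the training-time behaviour.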

As a result, I have a CNN for which the input image dimensions can change.
I read about dynamic shapes in the samples. But if I understood correctly, the sample just creates a “resize engine”, and the CNN input size is fixed.

So my question is: is it possible to have a FasterRCNN TRT engine that can be fed with any input size (set at inference time), as explained above? How can I handle the aspect ratio problem? Did I miss the solution in the samples?

Hi,

Please refer to the links below:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#work_dynamic_shapes
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/sampleDynamicReshape

I would also recommend using TRT 7:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-700/tensorrt-release-notes/tensorrt-7.html#rel_7-0-0

You can use the “trtexec” command line tool to measure performance and possibly locate bottlenecks.
When working from an ONNX model, the network definition must be created with the explicitBatch flag set.

Please find the links below for your reference:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#trtexec
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec#example-4-running-an-onnx-model-with-full-dimensions-and-dynamic-shapes
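
As a rough sketch of what the dynamic-shape options in that trtexec example look like for this use case, the Python helper below only assembles the command-line flags (`--explicitBatch`, `--minShapes`, `--optShapes`, `--maxShapes`). Several things here are assumptions and not taken from your model: the input tensor name `image`, the file name `faster_rcnn.onnx`, the NCHW layout, and the chosen min / opt / max bounds. It also does not check whether every layer of a given FasterRCNN export supports dynamic height and width.

```python
def dynamic_shape_flags(input_name, min_hw, opt_hw, max_hw, channels=3, batch=1):
    """Build trtexec flags for one input with dynamic height and width.

    Each *_hw argument is a (height, width) pair; NCHW layout is assumed.
    """
    for lo, mid, hi in zip(min_hw, opt_hw, max_hw):
        if not lo <= mid <= hi:
            raise ValueError("need min <= opt <= max in every dimension")

    def spec(hw):
        h, w = hw
        return f"{input_name}:{batch}x{channels}x{h}x{w}"

    return [
        "--explicitBatch",
        f"--minShapes={spec(min_hw)}",
        f"--optShapes={spec(opt_hw)}",
        f"--maxShapes={spec(max_hw)}",
    ]


if __name__ == "__main__":
    # Hypothetical tensor name, file name and bounds, for illustration only.
    flags = dynamic_shape_flags("image", (300, 300), (600, 1000), (1000, 1000))
    print(" ".join(["trtexec", "--onnx=faster_rcnn.onnx"] + flags))
```

The maximum is set to 1000 in both dimensions so that portrait and landscape inputs both fit in the same profile; the minimum depends on how extreme the aspect ratios of your crops can be.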

Thanks