Make all TensorRT optimizations compatible with 3D convolution

damien.menigaux · March 9, 2020, 3:33pm

Hi.

TensorRT 7.0 adds support for 3D convolution, but 8-bit integer (INT8) quantization still isn’t available for it, as shown in the “Layer and precision” section of: Support Matrix :: NVIDIA Deep Learning TensorRT Documentation
However, INT8 quantization accounts for a large part of TensorRT's performance gains.

In fact, we should be able to expect roughly a 15-fold performance gain with TensorRT (based on what I obtain with 2D models on various hardware), whereas with what is available right now I could only obtain about a 1.5-fold gain.

Are there plans to make the rest of the TensorRT optimizations available for 3D convolution?

SunilJB · March 9, 2020, 4:50pm

Hi,

Please keep an eye on the TensorRT links below for any new announcements or release-related details:
https://developer.nvidia.com/tensorrt
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/index.html

Thanks

damien.menigaux · June 4, 2020, 8:11am

I see this is still not the case. Are there any plans to allow quantization and other optimizations for 3D conv in TensorRT?

copah · June 10, 2020, 10:14pm

@SunilJB

Any updates on the roadmap for optimizing 3D convolutions?

damien.menigaux · September 3, 2020, 2:40pm

Hi. Any updates on making 3D convolutions compatible with all optimizations?

damien.menigaux · April 9, 2021, 1:47pm

Still very interested. Has there been any progress on this?

damien.menigaux · April 15, 2021, 3:36pm

There has been progress. Since TensorRT 7.2, Tensor Cores can be used to speed up INT8 inference of 3D conv layers. This provides some speedup on select GPUs.

See: “3D layers show no speedup in INT8 on cuda cores” · Issue #1176 · NVIDIA/TensorRT · GitHub

But not all optimizations are available yet. There is still no INT8 speedup for 3D conv on Pascal GPUs, or on any other GPU without Tensor Cores, whereas all GPUs see a huge speedup on 2D models when comparing INT8 vs FP16.
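
For anyone wanting to check which case their GPU falls into, here is a rough Python sketch. It is not a TensorRT API: the function names are made up for illustration, and the threshold is my assumption that INT8 Tensor Cores start at compute capability 7.2 (Xavier) and 7.5 (Turing), with Pascal (6.x) having no Tensor Cores and Volta 7.0 having FP16-only ones. Please verify against NVIDIA's hardware support matrix before relying on it.

```python
from typing import Tuple


def has_int8_tensor_cores(compute_capability: Tuple[int, int]) -> bool:
    """Guess whether a GPU has INT8-capable Tensor Cores.

    Assumption (not taken from TensorRT itself): INT8 Tensor Cores are
    available from compute capability 7.2 upward. Pascal (6.x) has no
    Tensor Cores, and Volta 7.0 Tensor Cores only handle FP16.
    """
    return compute_capability >= (7, 2)


def expect_3d_conv_int8_speedup(compute_capability: Tuple[int, int]) -> str:
    """Hypothetical helper summarizing what this thread reports for TensorRT >= 7.2."""
    if has_int8_tensor_cores(compute_capability):
        return "INT8 3D conv may be accelerated via Tensor Cores"
    return "no INT8 speedup expected for 3D conv"


if __name__ == "__main__":
    # Compute capabilities of a few well-known GPUs.
    gpus = {
        "GTX 1080 Ti (Pascal)": (6, 1),
        "V100 (Volta)": (7, 0),
        "T4 (Turing)": (7, 5),
        "A100 (Ampere)": (8, 0),
    }
    for name, cc in gpus.items():
        print(f"{name}: {expect_3d_conv_int8_speedup(cc)}")
```

The compute capability can be read from `nvidia-smi` or the CUDA device properties. The check only says whether a speedup is possible, not how large it will be on a given model.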