YOLOv3 FPS on Xavier

Hi,

The environment we used is DeepStream 4.0 on Xavier (16G); the OS is Ubuntu 18.04 and the GPU frequency has been set to the highest.

We ran the official YOLOv3 model following the tutorial, using the command “deepstream-app -c deepstream_app_config_yoloV3.txt”. We get only 18 FPS, and the GPU utilization is 99%. (We use FP16 precision.)

The FLOPs of YOLOv3 as reported by Darknet are about 60 BFLOPs (i.e. 60 GFLOPs) per frame, and the Xavier (16G) is rated at 11 TFLOPS (FP16); at face value that is 11 TFLOPS / 60 GFLOPs ≈ 180 frames per second, so YOLOv3 should reach a far higher FPS on Xavier.

Are there any special optimization techniques we should use, or any configurations we may not have set properly?

thx,
ybpei

Hi,

Have you maximized the device performance first?

sudo nvpmodel -m 0
sudo jetson_clocks

Thanks.

Yes, we have maximized the device performance, but the FPS is still 18, far lower than expected.

Hi,

Would you mind trying INT8 mode to see if it helps?

Thanks.

I am running YOLOv3 in INT8 on DeepStream 4.0 on Xavier and am currently at 27 FPS. If I run this same CUDA-optimized YOLOv3 model on my RTX 2070 (FP16) PC, I get around 33 FPS using Python and about 60 FPS in C++. Is this expected? I would have thought that Xavier in INT8 mode would run YOLOv3 at least on par with the RTX 2070.

Hi,

We got around 26 FPS on the Jetson Xavier.

Please note that Xavier is of the same generation as the RTX 2070.
With limited resources in cores and bandwidth (it is an embedded system), the YOLO result looks reasonable to me.

There are some ways to improve the pipeline from the model side.
You can try reducing the model input size to cut the inference time; note that YOLOv3 input dimensions must be multiples of 32. A rough cost estimate is given after the snippet below.

Edit the [net] section of yolov3.cfg:

width=416
height=416
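
As a back-of-the-envelope estimate (not a measurement): convolution cost scales with input area, so going from the sample’s default 608×608 down to 416×416 cuts the per-frame compute to roughly (416/608)² ≈ 47%, a bit under half. If a serialized TensorRT engine has already been generated, you may also need to delete the cached engine file so it is rebuilt for the new input size.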

Thanks.

Hi AastaLLL,

I do agree that reducing the input size works, but then you sacrifice accuracy on smaller objects as well. I notice that even at 608, the accuracy isn’t as good in INT8 as it is in FP16, which I guess shouldn’t really be a surprise.

Many thanks, though, for taking the time to do the test. At least your results matching mine tells me I am not messing this up.

Hi,

Please note that INT8 operation requires calibration beforehand:
Sample Support Guide :: NVIDIA Deep Learning TensorRT Documentation

Calibration should be redone whenever the model or the platform changes.
I’m not sure whether the calibration file in the release was generated on Xavier.
It’s recommended to run the calibration process first.

Thanks.

Hi AastaLLL,

Would you be able to guide me on how I can go about this? Is there a sample showing how to create a calibration table for YOLOv3 on Xavier? The link you provided gives an overview of what calibration is, plus samples on MNIST, but I’m not sure everything would carry over to YOLOv3 on Xavier. A sample using the DeepStream package would also be great.

Hi AastaLLL,

Any updates on how I can create my own calibration table for YOLO running on Xavier?

Hi,

For TensorRT 5.1, you will need to use IInt8EntropyCalibrator2 to calibrate the YOLO model.
Here is a relevant topic that can give you more information:
TensorRT YOLO Int8 on GTX 1080ti - TensorRT - NVIDIA Developer Forums
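
To make the interface concrete, below is a minimal sketch of an IInt8EntropyCalibrator2 implementation against the TensorRT 5.x C++ API. The class name YoloInt8Calibrator and the loadNextCalibrationBatch() helper are hypothetical placeholders; the loader stub must be replaced with code that feeds images preprocessed exactly as at inference time.

// Minimal INT8 calibrator sketch for TensorRT 5.x.
// YoloInt8Calibrator and loadNextCalibrationBatch() are hypothetical
// placeholders, not part of the DeepStream sample.
#include "NvInfer.h"
#include <cuda_runtime_api.h>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

class YoloInt8Calibrator : public nvinfer1::IInt8EntropyCalibrator2
{
public:
    YoloInt8Calibrator(int batchSize, int c, int h, int w, const std::string& cacheFile)
        : mBatchSize(batchSize),
          mInputCount(static_cast<size_t>(batchSize) * c * h * w),
          mCacheFile(cacheFile)
    {
        cudaMalloc(&mDeviceInput, mInputCount * sizeof(float));
    }

    ~YoloInt8Calibrator() override { cudaFree(mDeviceInput); }

    int getBatchSize() const override { return mBatchSize; }

    // TensorRT calls this repeatedly; return false once the data is exhausted.
    bool getBatch(void* bindings[], const char* names[], int nbBindings) override
    {
        (void)names; (void)nbBindings;
        std::vector<float> host(mInputCount);
        if (!loadNextCalibrationBatch(host))
            return false;
        cudaMemcpy(mDeviceInput, host.data(), mInputCount * sizeof(float),
                   cudaMemcpyHostToDevice);
        bindings[0] = mDeviceInput;  // assumes a single input binding
        return true;
    }

    // Reuse an existing calibration table so calibration runs only once
    // per model/platform combination.
    const void* readCalibrationCache(size_t& length) override
    {
        mCache.clear();
        std::ifstream in(mCacheFile, std::ios::binary);
        if (in.good())
            mCache.assign(std::istreambuf_iterator<char>(in),
                          std::istreambuf_iterator<char>());
        length = mCache.size();
        return length ? mCache.data() : nullptr;
    }

    void writeCalibrationCache(const void* cache, size_t length) override
    {
        std::ofstream out(mCacheFile, std::ios::binary);
        out.write(static_cast<const char*>(cache), length);
    }

private:
    // Hypothetical loader: fill `batch` with the next batch of preprocessed
    // calibration images and return true, or false when no data remains.
    bool loadNextCalibrationBatch(std::vector<float>& batch)
    {
        (void)batch;   // replace with real image loading + YOLO preprocessing
        return false;  // no data in this sketch
    }

    int mBatchSize;
    size_t mInputCount;
    std::string mCacheFile;
    void* mDeviceInput{nullptr};
    std::vector<char> mCache;
};

The calibrator is attached to the builder before the engine is built (builder->setInt8Mode(true); builder->setInt8Calibrator(&calibrator); in the TensorRT 5 API), and TensorRT persists the resulting calibration table through writeCalibrationCache(), so later builds on the same model and platform can reuse it.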

Thanks.

Hi AastaLLL,

Thanks for the guidance. I will look into this after I close out a couple of projects i have ongoing.

Hi AastaLLL:

I’m doing a similar project using YOLO with DeepStream on Jetson Xavier. I’ve heard from one of your colleagues that DeepStream’s sample YOLO is only optimized for the Tegra series, not Jetson Xavier.
You can see the info here:

The article suggests that because DeepStream’s YOLO is not optimized for Jetson Xavier, the frame rate would be low. You said that at NVIDIA you could achieve a frame rate of around 26 FPS; however, the link above suggests that by using the “nvyolo” plugin, YOLOv3 can run at around 50 FPS. I am a little bit confused!

Hi ralvarezl8d7l,

I modified yolov3.cfg, so I need to create a new calibration table for it.
Did you succeed in generating a new calibration table?
Can you guide me on how to do this?

Thanks.

Hi mic1123key5,

Sorry, I unfortunately haven’t had the time to work on this again yet.