Recap on the TensorFlow Object Detection API on the TX2
Hi users,
I just wanted to summarize developers' experience and share some tips about the TensorFlow Object Detection API on the TX2.
At the moment I am only talking about what is actually doable and what is not, with a focus on inference rather than training.
Based on browsing the forum, my own experience, and other resources, this is what I have understood.
Let's consider only the available pretrained frozen graphs.

The TF Object Detection API can be used for inference with the ssd_mobilenet_v1 architecture at approximately 5-8 fps. The Faster R-CNN ResNet pretrained models seem to cause OOM errors (in my experience, all of them). Was anybody able to run one of the Faster R-CNN ResNet models? If yes, could you share some tips?
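
For reference, this is roughly the kind of timing loop I mean (just a sketch; the model path is a placeholder, and a real pipeline would feed camera frames instead of a dummy array):

import time
import numpy as np
import tensorflow as tf

# Load the frozen ssd_mobilenet_v1 detection graph.
graph_def = tf.GraphDef()
with tf.gfile.GFile("ssd_mobilenet_v1_coco/frozen_inference_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")

with tf.Session(graph=graph) as sess:
    image = np.zeros((1, 300, 300, 3), dtype=np.uint8)  # dummy frame
    fetches = [graph.get_tensor_by_name(name + ":0")
               for name in ("detection_boxes", "detection_scores",
                            "detection_classes", "num_detections")]
    sess.run(fetches, feed_dict={"image_tensor:0": image})  # warm-up
    start = time.time()
    for _ in range(100):
        sess.run(fetches, feed_dict={"image_tensor:0": image})
    print("fps: %.1f" % (100 / (time.time() - start)))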

Second question.
Does the conversion to TensorRT also have an impact on memory usage?
What I mean is the following: even if I am not able to run a specific network architecture from the TF Object Detection API on the TX2 due to OOM errors, I could potentially train the model on a different, more powerful machine, export the trained graph to UFF format (through the Python API), then transfer it to the TX2, where it can be imported using the C++ API for inference. Does that sound correct? Performance would certainly benefit from a TF-to-TensorRT conversion, but I am not sure about memory usage.
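
Roughly, the export step I have in mind would look like this (just a sketch, assuming all ops in the graph are supported by the UFF converter; the path and output node names are placeholders and may differ depending on the exported model):

# Convert a frozen TF Object Detection API graph to UFF on the training machine.
# Assumes the `uff` package that ships with TensorRT is installed there.
import uff

uff.from_tensorflow_frozen_model(
    frozen_file="exported_model/frozen_inference_graph.pb",
    output_nodes=["detection_boxes", "detection_scores",
                  "detection_classes", "num_detections"],
    output_filename="faster_rcnn_resnet50.uff")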
I am considering this as an option because I've noticed a FasterRcnnResnet50 network in jetson-inference's DetectNet.

Thanks for your contribution!

#1
Posted 01/12/2018 04:41 PM   
Answer Accepted by Forum Admin
Hi,

Thanks for sharing.

We are also checking the TensorFlow Object Detection API.
We appreciate you sharing your experience with us.

Although TensorFlow can run ssd_mobilenet_v1 in GPU mode correctly, we find that the GPU utilization is pretty low.
Do you also see this issue?
Could you share the tegrastats output captured while you run inference with ssd_mobilenet_v1?
sudo ./tegrastats


For your second question:
1. The workflow is correct. The only concern is that we do not yet support the custom layer API for UFF users.
If there is an unsupported layer in your model, there is no workaround (WAR) to run that layer with TensorRT.

2. TensorRT supports FP16 mode, which can cut memory usage in half and should be extremely helpful for your use case.
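
For example, FP16 can be enabled with one builder flag when building the engine from a UFF file. This is only a sketch using the TensorRT Python bindings from later TensorRT releases (on the TX2 the equivalent C++ builder call is setHalf2Mode/setFp16Mode); the input/output node names and shapes are placeholders:

# Sketch: build a TensorRT engine from a UFF model with FP16 enabled.
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()
parser = trt.UffParser()

parser.register_input("image_tensor", (3, 300, 300))  # CHW input shape (placeholder)
parser.register_output("detection_out")               # placeholder output node
parser.parse("model.uff", network)

builder.max_batch_size = 1
builder.max_workspace_size = 1 << 28   # 256 MB scratch space
builder.fp16_mode = True               # run layers in FP16 where possible

engine = builder.build_cuda_engine(network)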

Thanks.

#2
Posted 01/15/2018 03:29 AM   
Hi AastaLLL,


I will soon be looking into the TensorFlow Object Detection API with TensorRT (for the TX2).

Some models of interest are:

ssd_mobilenet_v1
ssd_inception_v2
faster_rcnn_inception_v2


Do you have any links specific to using the TensorFlow Object Detection API with TensorRT to get me started?

Thanks
vtaranti

#3
Posted 02/19/2018 08:19 PM   
Hi,

It's recommended to first check whether the layers of your model are supported by TensorRT.

We have listed the supported layers for the UFF parser and the TensorRT engine in detail here:
UFF parser: http://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#tfops
TensorRT engine: http://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#layers
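
For example, a quick way to list the op types used in a frozen graph, so you can compare them against the supported ops above (just a sketch; the graph path is a placeholder):

import tensorflow as tf

# Load the frozen graph and print the unique op types it contains.
graph_def = tf.GraphDef()
with tf.gfile.GFile("frozen_inference_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

for op in sorted({node.op for node in graph_def.node}):
    print(op)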

Thanks.

#4
Posted 02/21/2018 09:10 AM   
Hi guys,

are there any new projects like https://github.com/NVIDIA-Jetson/tf_to_trt_image_classification, but for object detection models using TensorRT?

Regarding the performance of the Object Detection API:
I have been working on this topic over the last 4 months and also started at around 4 fps for MobileNet SSD.
But now I am able to achieve up to 30 fps with the same model on the Jetson; you can have a look at my GitHub and try it out: https://github.com/GustavZ/realtime_object_detection

What I am now interested in is making Mask R-CNN run on the Jetson. Did anybody get it working on the Jetson successfully? Maybe through compression/binarization techniques? Is it documented which Mask R-CNN layers TensorRT lacks? Probably a lot...

Anyway, it would be nice to hear about your experiences.

Follow my Jetson Projects: https://github.com/GustavZ

#5
Posted 04/06/2018 08:28 AM   
Hi,

Sorry, we are not familiar with Mask R-CNN.
But you can find the detailed list of layers supported by TensorRT 4 here:
http://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#layers

Thanks.

#6
Posted 04/10/2018 06:58 AM   