I am attempting to implement YOLOv3 Tiny on the PX2, but have been running into a lot of issues. Originally I was trying to get Darknet and OpenCV working with the GMSL cameras, but abandoned that route to work with the NvMedia and DriveWorks APIs instead. From my understanding, this can be accomplished by implementing the network in a supported framework, in this case TensorFlow, and then generating a TensorRT runtime engine for use with DriveWorks. I came across an implementation of YOLOv3 Tiny on GitHub (https://github.com/khanhhhh/tiny-yolo-tensorflow), successfully trained the network on COCO, and froze the graph. I am currently stuck at the stage of generating a UFF model from this frozen graph, as I am getting the following error:
Traceback (most recent call last):
File "./pbTouff.py", line 3, in <module>
uff.from_tensorflow_frozen_model("./output_graph.pb", output_nodes=["TRAINER/h"], preprocessor=None, output_filename="yolov3-tiny.uff")
File "/home/reach/.local/lib/python3.6/site-packages/uff/converters/tensorflow/conversion_helpers.py", line 149, in from_tensorflow_frozen_model
return from_tensorflow(graphdef, output_nodes, preprocessor, **kwargs)
File "/home/reach/.local/lib/python3.6/site-packages/uff/converters/tensorflow/conversion_helpers.py", line 62, in from_tensorflow
gs.extras.process_softmax(dynamic_graph)
File "/home/reach/.local/lib/python3.6/site-packages/graphsurgeon/extras.py", line 112, in process_softmax
node_chains = dynamic_graph.find_node_chains_by_op(op_chain)
File "/home/reach/.local/lib/python3.6/site-packages/graphsurgeon/StaticGraph.py", line 239, in find_node_chains_by_op
matching_chain = find_matching_chain(node, chain)
File "/home/reach/.local/lib/python3.6/site-packages/graphsurgeon/StaticGraph.py", line 224, in find_matching_chain
input_node = self.node_map[input_name]
KeyError: 'TRAINER/split:1'
Makefile:10: recipe for target 'uff' failed
make: *** [uff] Error 1
After scouring the forums, it would appear that this is an issue with the tf.split layers in the graph, as these are not currently supported (per this post: https://devtalk.nvidia.com/default/topic/1033744/jetson-tx2/can-tf-split-layer-be-converted-to-tensorrt-/). The moderator did give me one glimmer of hope, though: they said that unsupported layers can be implemented in TensorRT with NVIDIA’s Plugin API. I spent a bit of time looking at the documentation and examples, but I still don’t really understand how I would go about implementing this through the Plugin API. It seems like it should be relatively simple, since all the layer does is split the input tensor into multiple output tensors, but I’m not sure where to begin. What I am looking for is confirmation that I am even on the right track, as well as a nudge in the right direction for implementing this layer myself. Does NVIDIA (or anyone on the forum) have a tutorial that could walk me through implementing a custom TensorRT layer for TensorFlow? Thanks everyone for your time and assistance.
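For reference, the computation such a plugin would have to reproduce is straightforward: tf.split just partitions a tensor along one axis into equal-sized slices. A minimal NumPy sketch of that computation (this is not the TensorRT Plugin API, just the math the plugin would implement):

```python
# Sketch of the operation a custom "split" plugin must reproduce:
# partition the input tensor along one axis into equal slices.
import numpy as np

def split_like_tf(x, num_splits, axis):
    """Emulate tf.split(x, num_splits, axis=axis): return num_splits
    equal-sized slices of x along the given axis."""
    assert x.shape[axis] % num_splits == 0, "axis not evenly divisible"
    return np.split(x, num_splits, axis=axis)

# Example: a (1, 6, 4) tensor split into 2 tensors of shape (1, 3, 4)
x = np.arange(24, dtype=np.float32).reshape(1, 6, 4)
parts = split_like_tf(x, 2, axis=1)
print([p.shape for p in parts])  # [(1, 3, 4), (1, 3, 4)]
```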
Dear maxe2470,
If you are using DRIVE PX2: the latest PDK has DriveWorks 1.2, which does not support TensorRT custom layer plugins. Note that there are no further releases targeted at DRIVE PX2.
But on the DRIVE AGX platforms, the latest PDK has DriveWorks 1.5, which supports TensorRT custom layer plugins. We have a sample (sample_dnn_plugin) to demonstrate this. We have also included a YOLOv3 ONNX-based sample as part of TensorRT (https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#yolov3_onnx).
For documentation on implementing custom layers for TensorRT, please check https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#extending and look at our TensorRT samples.
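For a split layer specifically, the plugin mainly has to report the number and dimensions of its outputs and copy contiguous slices of the input in enqueue(). A rough outline of the relevant methods from the TensorRT plugin interface (pseudocode only, details simplified, not a complete implementation):

```
// Pseudocode sketch of a "split" plugin (IPluginV2-style methods, simplified)
class SplitPlugin:
    numSplits, axis            // plugin parameters

    getNbOutputs():
        return numSplits       // one output tensor per split

    getOutputDimensions(index, inputDims):
        dims = inputDims[0]
        dims[axis] = dims[axis] / numSplits   // each output gets an equal slice
        return dims

    enqueue(batchSize, inputs, outputs, stream):
        // for each output i, copy the i-th slice of the input buffer
        // into outputs[i] (e.g. with cudaMemcpyAsync, or a small CUDA
        // kernel when the split axis is not the outermost dimension)
```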
Thank you for your reply to my question and the link to the example.
I work with a Drive PX2 and want to run a TensorRT engine with custom layers on it, according to the information you provided in Figure 2 of the following link:
Dear maxe2470,
Yes. You can write custom plugin layers and create a TensorRT engine; please check the TensorRT samples on this. But note that this model cannot be integrated into the DriveWorks object detector sample, as custom layers are not supported in DW 1.2.
Thank you for the quick response. That is good information to know.
To make sure I understand, then: there is no way to utilize a TensorRT engine with custom layers in DW 1.2, even if I don’t use the object detector sample?
If this is the case, how am I able to work with GMSL cameras and YOLOv3 on Drive PX2?
Dear maxe2470,
In this case, you can get the dwImageCUDA object from the camera and pass its CUDA buffer as input to TensorRT to perform inference and obtain the output bounding boxes. That is, you call the TensorRT APIs directly to load your model, build the network, and run inference inside the DW sample.
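In pseudocode, the flow inside such a DW sample would be roughly as follows (the function names here are illustrative, not the exact DW/TensorRT API names):

```
// Illustrative pseudocode only -- names approximate the DW/TensorRT APIs
engine  = loadTensorRTEngine("yolov3-tiny.engine")   // deserialize a prebuilt engine
context = engine.createExecutionContext()

loop:
    frame     = readFrameFromCameraSensor()          // GMSL camera via the DW sensor API
    imageCUDA = getImageCUDA(frame)                  // dwImageCUDA: pixels already on the GPU
    input     = preprocess(imageCUDA)                // resize/normalize to the network input
    context.enqueue(input, output, cudaStream)       // TensorRT inference on the CUDA buffer
    boxes     = decodeYoloOutput(output)             // bounding boxes from the network output
    render(boxes)
```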
Do you mean that we can’t load a model with custom layers into the object detector sample, but we can use dwImageCUDA from GMSL as input to TensorRT to perform inference?
Dear Allenchen,
To use a DNN model with the DW APIs, it has to be generated using the tensorRT_optimization tool. Passing plugin layers as a parameter to that tool is supported from DW 1.5 onwards, whereas the last release for DRIVE PX2 has DW 1.2.