sampleUffSSD with custom ssd_mobilenet_v1 model

Ubuntu 18.04
CUDA 10.0.117
CUDNN 7.3.1.20
Python 3.6
Tensorflow 1.12.0+nv19.1 (official from https://developer.download.nvidia.com/compute/redist/jp/v411)
TensorRT 5.0.3.2
Jetson AGX Xavier, L4T R31.1

Hi.
I was able to convert the ssd_mobilenet_v1_coco_2018_01_28 model to UFF and run it with sampleUffSSD using the following config.py:

import graphsurgeon as gs
import tensorflow as tf

Input = gs.create_node("Input",
    op="Placeholder",
    dtype=tf.float32,
    shape=[1, 3, 300, 300])
PriorBox = gs.create_plugin_node(name="GridAnchor", op="GridAnchor_TRT",
    numLayers=6,
    minSize=0.2,
    maxSize=0.95,
    aspectRatios=[1.0, 2.0, 0.5, 3.0, 0.33],
    variance=[0.1,0.1,0.2,0.2],
    featureMapShapes=[19, 10, 5, 3, 2, 1])
NMS = gs.create_plugin_node(name="NMS", op="NMS_TRT",
    shareLocation=1,
    varianceEncodedInTarget=0,
    backgroundLabelId=0,
    confidenceThreshold=1e-8,
    nmsThreshold=0.6,
    topK=100,
    keepTopK=100,
    numClasses=91,
    inputOrder=[0, 2, 1],
    confSigmoid=1,
    isNormalized=1,
    scoreConverter="SIGMOID")
concat_priorbox = gs.create_node(name="concat_priorbox", op="ConcatV2", dtype=tf.float32, axis=2)
concat_box_loc = gs.create_plugin_node("concat_box_loc", op="FlattenConcat_TRT", dtype=tf.float32, axis=1, ignoreBatch=0)
concat_box_conf = gs.create_plugin_node("concat_box_conf", op="FlattenConcat_TRT", dtype=tf.float32, axis=1, ignoreBatch=0)

namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
    "Preprocessor": Input,
    # "ToFloat": Input,
    # "image_tensor": Input,
    "MultipleGridAnchorGenerator/Concatenate": concat_priorbox,
    "concat": concat_box_loc,
    "concat_1": concat_box_conf
}

namespace_remove = {
    "ToFloat",
    "image_tensor",
    "Preprocessor/map/TensorArrayStack_1/TensorArrayGatherV3"
}

def preprocess(dynamic_graph):
    # remove the unrelated or error layers
    dynamic_graph.remove(dynamic_graph.find_nodes_by_path(namespace_remove), remove_exclusive_dependencies=False)

    # Now create a new graph by collapsing namespaces
    dynamic_graph.collapse_namespaces(namespace_plugin_map)
    # Remove the outputs, so we just have a single output node (NMS).
    dynamic_graph.remove(dynamic_graph.graph_outputs, remove_exclusive_dependencies=False)

    # Remove the Squeeze to avoid "Assertion `isPlugin(layerName)' failed"
    Squeeze = dynamic_graph.find_node_inputs_by_name(dynamic_graph.graph_outputs[0], 'Squeeze')
    dynamic_graph.forward_inputs(Squeeze)
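
For reference, the conversion with this config uses the stock convert_to_uff script, pointing -p at the file above:

convert_to_uff frozen_inference_graph.pb -O NMS -p config.py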

But I was unable to do the same for my custom ssd_mobilenet_v1 model, fine-tuned on top of ssd_mobilenet_v1_coco_2017_11_17 for my own 4 classes. The config.py used was:

import graphsurgeon as gs
import tensorflow as tf

Input = gs.create_node("Input",
    op="Placeholder",
    dtype=tf.float32,
    shape=[1, 3, 300, 300])
PriorBox = gs.create_plugin_node(name="GridAnchor", op="GridAnchor_TRT",
    numLayers=6,
    minSize=0.2,
    maxSize=0.95,
    aspectRatios=[1.0, 2.0, 0.5, 3.0, 0.33],
    variance=[0.1,0.1,0.2,0.2],
    featureMapShapes=[19, 10, 5, 3, 2, 1])
NMS = gs.create_plugin_node(name="NMS", op="NMS_TRT",
    shareLocation=1,
    varianceEncodedInTarget=0,
    backgroundLabelId=0,
    confidenceThreshold=1e-8,
    nmsThreshold=0.6,
    topK=100,
    keepTopK=100,
    numClasses=5,
    inputOrder=[2, 0, 1],
    confSigmoid=1,
    isNormalized=1,
    scoreConverter="SIGMOID")
concat_priorbox = gs.create_node(name="concat_priorbox", op="ConcatV2", dtype=tf.float32, axis=2)
concat_box_loc = gs.create_plugin_node("concat_box_loc", op="FlattenConcat_TRT", dtype=tf.float32, axis=1, ignoreBatch=0)
concat_box_conf = gs.create_plugin_node("concat_box_conf", op="FlattenConcat_TRT", dtype=tf.float32, axis=1, ignoreBatch=0)

namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
    "Preprocessor": Input,
    # "ToFloat": Input,
    # "image_tensor": Input,
    "Concatenate": concat_priorbox,
    "concat_1": concat_box_loc,
    "concat": concat_box_conf
}

namespace_remove = {
    "ToFloat",
    "image_tensor",
    "Preprocessor/map/TensorArrayStack_1/TensorArrayGatherV3",
    "Preprocessor/stack_1",
    "Preprocessor/ResizeImage/stack_1"
}

def preprocess(dynamic_graph):
    # remove the unrelated or error layers
    dynamic_graph.remove(dynamic_graph.find_nodes_by_path(namespace_remove), remove_exclusive_dependencies=False)

    # Now create a new graph by collapsing namespaces
    dynamic_graph.collapse_namespaces(namespace_plugin_map)
    # Remove the outputs, so we just have a single output node (NMS).
    dynamic_graph.remove(dynamic_graph.graph_outputs, remove_exclusive_dependencies=False)

    # Remove the Squeeze to avoid "Assertion `isPlugin(layerName)' failed"
    Squeeze = dynamic_graph.find_node_inputs_by_name(dynamic_graph.graph_outputs[0], 'Squeeze')
    dynamic_graph.forward_inputs(Squeeze)

Here I changed

numClasses=5 (I have 4 custom classes)
inputOrder=[2, 0, 1] (the input nodes to NMS are ['concat_priorbox', 'concat_box_conf', 'concat_box_loc'])

The number of classes in sampleUffSSD was modified accordingly.

"concat": concat_box_loc,
"concat_1": concat_box_conf

was changed to

"concat_1": concat_box_loc,
"concat": concat_box_conf

because of different node names.

According to https://devtalk.nvidia.com/default/topic/1037412/tensorrt/sampleuffssd-with-newer-tensorflow-models-2018-/post/5295479/
I decreased the number of inputs from 4 to 3 by adding

"Preprocessor/stack_1",
"Preprocessor/ResizeImage/stack_1"

to namespace_remove.

Also changed

"MultipleGridAnchorGenerator/Concatenate": concat_priorbox,

to

"Concatenate": concat_priorbox,

because of different node names.
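
These names can be double-checked by inspecting the frozen graph with graphsurgeon before writing config.py. A minimal sketch, assuming frozen_inference_graph.pb is in the working directory:

import graphsurgeon as gs

graph = gs.DynamicGraph("frozen_inference_graph.pb")
# List concat-like nodes together with their inputs; this shows which of
# "concat" / "concat_1" carries box locations vs. class confidences, and
# therefore in what order the collapsed nodes will feed NMS.
for node in graph.as_graph_def().node:
    if "concat" in node.name.lower():
        print(node.name, node.op, list(node.input))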

But sampleUffSSD fails with this error:

Begin parsing model...
End parsing model...
Begin building engine...
sample_uff_ssd: nmsPlugin.cpp:135: virtual void nvinfer1::plugin::DetectionOutput::configureWithFormat(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, nvinfer1::DataType, nvinfer1::PluginFormat, int): Assertion `numPriors * numLocClasses * 4 == inputDims[param.inputOrder[0]].d[0]' failed.
Aborted (core dumped)
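
For what it's worth, the failing assertion is plain arithmetic on the box-location input. A minimal sanity check, assuming the standard ssd_mobilenet_v1 anchor layout (an assumption: 3 boxes per cell on the 19x19 feature map, 6 on the remaining five):

feature_map_shapes = [19, 10, 5, 3, 2, 1]
boxes_per_cell = [3, 6, 6, 6, 6, 6]  # assumed layout -- verify against your model
num_priors = sum(s * s * b for s, b in zip(feature_map_shapes, boxes_per_cell))
print(num_priors)      # 1917
# With shareLocation=1, numLocClasses == 1, so the plugin expects the input
# at inputOrder[0] (the flattened box locations) to have d[0] == num_priors * 4.
print(num_priors * 4)  # 7668

So if inputOrder[0] does not point at concat_box_loc, this assertion fails even when the rest of the config is right.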

Any ideas on what else should be modified?

Corresponding files can be found here:
pb - https://drive.google.com/open?id=1sq_fkR3Zi4NNuTMZSTdnwDTCP0ikVdK7,
uff - https://drive.google.com/open?id=1y8ndKp-m8t6wXhvNW9kq_quofDrXUvDR,
pbtxt - https://drive.google.com/file/d/1BuT074bc6Mdxy7xRo9WEF_9WIcU51s1f

numClasses should be 4

My network produces 4 classes. Adding background (following the sample config) gives me 5 classes.
In any case, with numClasses=4 and OUTPUT_CLS_SIZE=4 (in sampleUffSSD.cpp) I get the same error.

Hello,

Did you find any solution to the above problem? I am facing similar issues. Any help would be greatly appreciated.

Thanks.

No, unfortunately.

Hello @pkolomiets,

I have been able to convert a MobileNet V2 model, with help from another topic. Please check https://devtalk.nvidia.com/default/topic/1050465/jetson-nano/how-to-write-config-py-for-converting-ssd-mobilenetv2-to-uff-format/post/5331376/?offset=7#5331770

Hope it helps!

Hi pkolomiets,
I am also trying to convert ssd_mobilenet_v1 from .pb to .uff.
Can you please share how you got the conversion of the ssd_mobilenet_v1_coco_2018_01_28 model to UFF?

My errors are as follows:

saran@athena:~/Documents/embassy$ python3 convert_to_uff.py …/ssd_mobilenet_v1_coco_2018_01_28/frozen_inference_graph.pb -o mobilenet_ssd.uff -O 'num_detections' -O 'detection_boxes' -O 'detection_scores' -O 'detection_classes'
Loading …/ssd_mobilenet_v1_coco_2018_01_28/frozen_inference_graph.pb
NOTE: UFF has been tested with TensorFlow 1.12.0. Other versions are not guaranteed to work
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
UFF Version 0.6.3
=== Automatically deduced input nodes ===
[name: "image_tensor"
op: "Placeholder"
attr {
  key: "dtype"
  value {
    type: DT_UINT8
  }
}
attr {
  key: "shape"
  value {
    shape {
      dim {
        size: -1
      }
      dim {
        size: -1
      }
      dim {
        size: -1
      }
      dim {
        size: 3
      }
    }
  }
}
]

Using output node num_detections
Using output node detection_boxes
Using output node detection_scores
Using output node detection_classes
Converting to UFF graph
Warning: No conversion function registered for layer: TensorArrayGatherV3 yet.
Converting Postprocessor/BatchMultiClassNonMaxSuppression/map/TensorArrayStack_2/TensorArrayGatherV3 as custom op: TensorArrayGatherV3
Warning: No conversion function registered for layer: Exit yet.
Converting Postprocessor/BatchMultiClassNonMaxSuppression/map/while/Exit_3 as custom op: Exit
Warning: No conversion function registered for layer: Switch yet.
Converting Postprocessor/BatchMultiClassNonMaxSuppression/map/while/Switch_3 as custom op: Switch
Warning: No conversion function registered for layer: LoopCond yet.
Converting Postprocessor/BatchMultiClassNonMaxSuppression/map/while/LoopCond as custom op: LoopCond
Warning: No conversion function registered for layer: Less yet.
Converting Postprocessor/BatchMultiClassNonMaxSuppression/map/while/Less as custom op: Less
Warning: No conversion function registered for layer: Enter yet.
Converting Postprocessor/BatchMultiClassNonMaxSuppression/map/while/Less/Enter as custom op: Enter
Warning: No conversion function registered for layer: TensorArrayGatherV3 yet.
Converting Preprocessor/map/TensorArrayStack/TensorArrayGatherV3 as custom op: TensorArrayGatherV3
Warning: No conversion function registered for layer: Exit yet.
Converting Preprocessor/map/while/Exit_1 as custom op: Exit
Warning: No conversion function registered for layer: Switch yet.
Converting Preprocessor/map/while/Switch_1 as custom op: Switch
Warning: No conversion function registered for layer: LoopCond yet.
Converting Preprocessor/map/while/LoopCond as custom op: LoopCond
Warning: No conversion function registered for layer: Less yet.
Converting Preprocessor/map/while/Less as custom op: Less
Warning: No conversion function registered for layer: Enter yet.
Converting Preprocessor/map/while/Less/Enter as custom op: Enter
Warning: No conversion function registered for layer: Cast yet.
Converting ToFloat as custom op: Cast
Warning: No conversion function registered for layer: Merge yet.
Converting Preprocessor/map/while/Merge as custom op: Merge
Warning: No conversion function registered for layer: NextIteration yet.
Converting Preprocessor/map/while/NextIteration as custom op: NextIteration
Warning: No conversion function registered for layer: Switch yet.
Converting Preprocessor/map/while/Switch as custom op: Switch
Warning: No conversion function registered for layer: Enter yet.
Converting Preprocessor/map/while/Enter as custom op: Enter
Warning: No conversion function registered for layer: Merge yet.
Converting Preprocessor/map/while/Merge_1 as custom op: Merge
Warning: No conversion function registered for layer: NextIteration yet.
Converting Preprocessor/map/while/NextIteration_1 as custom op: NextIteration
Warning: No conversion function registered for layer: TensorArrayWriteV3 yet.
Converting Preprocessor/map/while/TensorArrayWrite/TensorArrayWriteV3 as custom op: TensorArrayWriteV3
Warning: No conversion function registered for layer: ResizeBilinear yet.
Converting Preprocessor/map/while/ResizeImage/ResizeBilinear as custom op: ResizeBilinear
Warning: No conversion function registered for layer: TensorArrayReadV3 yet.
Converting Preprocessor/map/while/TensorArrayReadV3 as custom op: TensorArrayReadV3
Warning: No conversion function registered for layer: Enter yet.
Converting Preprocessor/map/while/TensorArrayReadV3/Enter_1 as custom op: Enter
Warning: No conversion function registered for layer: TensorArrayScatterV3 yet.
Converting Preprocessor/map/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3 as custom op: TensorArrayScatterV3
Warning: No conversion function registered for layer: TensorArrayV3 yet.
Converting Preprocessor/map/TensorArray as custom op: TensorArrayV3
Warning: No conversion function registered for layer: Range yet.
Converting Preprocessor/map/TensorArrayUnstack/range as custom op: Range
Warning: No conversion function registered for layer: Enter yet.
Converting Preprocessor/map/while/TensorArrayReadV3/Enter as custom op: Enter
Traceback (most recent call last):
  File "convert_to_uff.py", line 93, in <module>
    main()
  File "convert_to_uff.py", line 89, in main
    debug_mode=args.debug
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/conversion_helpers.py", line 233, in from_tensorflow_frozen_model
    return from_tensorflow(graphdef, output_nodes, preprocessor, **kwargs)
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/conversion_helpers.py", line 181, in from_tensorflow
    debug_mode=debug_mode)
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/converter.py", line 94, in convert_tf2uff_graph
    uff_graph, input_replacements, debug_mode=debug_mode)
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/converter.py", line 79, in convert_tf2uff_node
    op, name, tf_node, inputs, uff_graph, tf_nodes=tf_nodes, debug_mode=debug_mode)
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/converter.py", line 41, in convert_layer
    fields = cls.parse_tf_attrs(tf_node.attr)
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/converter.py", line 222, in parse_tf_attrs
    return {key: cls.parse_tf_attr_value(val) for key, val in attrs.items() if val is not None and val.WhichOneof('value') is not None}
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/converter.py", line 222, in <dictcomp>
    return {key: cls.parse_tf_attr_value(val) for key, val in attrs.items() if val is not None and val.WhichOneof('value') is not None}
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/converter.py", line 218, in parse_tf_attr_value
    return cls.convert_tf2uff_field(code, val)
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/converter.py", line 190, in convert_tf2uff_field
    return TensorFlowToUFFConverter.convert_tf2numpy_dtype(val)
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/converter.py", line 103, in convert_tf2numpy_dtype
    return tf.as_dtype(dtype).as_numpy_dtype
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py", line 129, in as_numpy_dtype
    return _TF_TO_NP[self._type_enum]
KeyError: 20

I got the conversion running using the command below:
convert_to_uff frozen_inference_graph.pb -O NMS -p config.py

After this conversion, I copied the .uff to …/data/ssd/sample_ssd_relu6.uff and tried running:

ign@ignnano:/usr/src/tensorrt/bin$ ./sample_uff_ssd
…/data/ssd/sample_ssd_relu6.uff
Begin parsing model...
End parsing model...
Begin building engine...
Killed

I also tried debugging and found that it is getting killed at the location below:
157 ICudaEngine* engine = builder->buildCudaEngine(*network);

Any ideas on how I can proceed?

I finally got the sample code working with the inception_ssd model.
But with the ssd_mobilenet_v1_coco_2018_01_28 model converted to UFF, it gives this error:

ign@ignnano:/usr/src/tensorrt/bin$ ./sample_uff_ssd
…/data/ssd/sample_ssd_relu6.uff
Begin parsing model...
ERROR: UFFParser: Graph error: Cycle graph detected
ERROR: sample_uff_ssd: Fail to parse
sample_uff_ssd: sampleUffSSD.cpp:535: int main(int, char**): Assertion `tmpEngine != nullptr' failed.
Aborted

NVIDIA team / pkolomiets, can you please help?
The .uff file is attached.

frozen_inference_graph.zip (24.2 MB)

Hello NVIDIA team,
Can you please respond?

Hello vinaybk, we are triaging and will keep you updated.

Hi, vinaybk.
Please try this config file for ssd_mobilenet_v1_coco_2018_01_28.

import graphsurgeon as gs
import tensorflow as tf

Input = gs.create_node("Input",
    op="Placeholder",
    dtype=tf.float32,
    shape=[1, 3, 300, 300])
PriorBox = gs.create_plugin_node(name="GridAnchor", op="GridAnchor_TRT",
    numLayers=6,
    minSize=0.2,
    maxSize=0.95,
    aspectRatios=[1.0, 2.0, 0.5, 3.0, 0.33],
    variance=[0.1,0.1,0.2,0.2],
    featureMapShapes=[19, 10, 5, 3, 2, 1])
NMS = gs.create_plugin_node(name="NMS", op="NMS_TRT",
    shareLocation=1,
    varianceEncodedInTarget=0,
    backgroundLabelId=0,
    confidenceThreshold=1e-8,
    nmsThreshold=0.6,
    topK=100,
    keepTopK=100,
    numClasses=91,
    inputOrder=[1, 2, 0],
    confSigmoid=1,
    isNormalized=1,
    scoreConverter="SIGMOID")
concat_priorbox = gs.create_node(name="concat_priorbox", op="ConcatV2", dtype=tf.float32, axis=2)
concat_box_loc = gs.create_plugin_node("concat_box_loc", op="FlattenConcat_TRT", dtype=tf.float32, axis=1, ignoreBatch=0)
concat_box_conf = gs.create_plugin_node("concat_box_conf", op="FlattenConcat_TRT", dtype=tf.float32, axis=1, ignoreBatch=0)

namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
    "Preprocessor": Input,
    "ToFloat": Input,
    "image_tensor": Input,
    "Concatenate": concat_priorbox,
    "Squeeze": concat_box_loc,
    "concat_1": concat_box_conf
}

def preprocess(dynamic_graph):
    # Now create a new graph by collapsing namespaces
    dynamic_graph.collapse_namespaces(namespace_plugin_map)
    # Remove the outputs, so we just have a single output node (NMS).
    dynamic_graph.remove(dynamic_graph.graph_outputs, remove_exclusive_dependencies=False)
    # Disconnect the Input node from NMS, as it expects to have only 3 inputs.
    dynamic_graph.find_nodes_by_op("NMS_TRT")[0].input.remove("Input")
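
To verify that the surgery leaves NMS with exactly the three inputs the plugin asserts on (and in the order that inputOrder indexes), a check like this can be appended at the end of preprocess (a sketch using the names above):

    # The NMS plugin asserts nbInputDims == 3; the order printed here is the
    # order that inputOrder indexes into.
    print(dynamic_graph.find_nodes_by_op("NMS_TRT")[0].input)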

Thanks pkolomiets.
I used your config.py file above and got this error:

python3 convert_to_uff.py …/ssd_mobilenet_v1_coco_2018_01_28/frozen_inference_graph.pb -O NMS -p …/config_ssdmobilenet_v1.py
Loading …/ssd_mobilenet_v1_coco_2018_01_28/frozen_inference_graph.pb
NOTE: UFF has been tested with TensorFlow 1.12.0. Other versions are not guaranteed to work
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
WARNING: To create TensorRT plugin nodes, please use the create_plugin_node function instead.
WARNING: To create TensorRT plugin nodes, please use the create_plugin_node function instead.
UFF Version 0.6.3
=== Automatically deduced input nodes ===
[name: "Input"
op: "Placeholder"
input: "image_tensor:0"
attr {
  key: "dtype"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "shape"
  value {
    shape {
      dim {
        size: 1
      }
      dim {
        size: 3
      }
      dim {
        size: 300
      }
      dim {
        size: 300
      }
    }
  }
}
]

Using output node NMS
Converting to UFF graph
Warning: No conversion function registered for layer: NMS_TRT yet.
Converting NMS as custom op: NMS_TRT
Warning: No conversion function registered for layer: FlattenConcat_TRT yet.
Converting concat_box_conf as custom op: FlattenConcat_TRT
Traceback (most recent call last):
  File "convert_to_uff.py", line 93, in <module>
    main()
  File "convert_to_uff.py", line 89, in main
    debug_mode=args.debug
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/conversion_helpers.py", line 233, in from_tensorflow_frozen_model
    return from_tensorflow(graphdef, output_nodes, preprocessor, **kwargs)
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/conversion_helpers.py", line 181, in from_tensorflow
    debug_mode=debug_mode)
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/converter.py", line 94, in convert_tf2uff_graph
    uff_graph, input_replacements, debug_mode=debug_mode)
  File "/usr/local/lib/python3.5/site-packages/uff/converters/tensorflow/converter.py", line 72, in convert_tf2uff_node
    inp_node = tf_nodes[inp_name]
KeyError: 'image_tensor'

The old config that you posted at the beginning of this topic works for the UFF conversion, but fails when running the sampleUffSSD code.

vinaybk,
A successful conversion to UFF doesn't mean the UFF file is correct for inference.
You need to look through the frozen file's layer names in order to find the correct config.py options.
Try the -l, -L, and -t options of the convert_to_uff script to see layer names and their order.
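
For instance (commands in the same form as used in this thread; -l lists the node names, and -t additionally writes out a preprocessed pbtxt like the one attached below):

convert_to_uff frozen_inference_graph.pb -l
convert_to_uff frozen_inference_graph.pb -O NMS -p config.py -t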
I'm not sure why my last config doesn't work in your case; results change from version to version of TF and TRT. My current setup is TF 1.13.1, TRT 5.1.2, CUDA 10.0.
The output of my convert_to_uff run doesn't have this line:

input: "image_tensor:0"

So you can probably try using "image_tensor:0" instead of "image_tensor" in config.py.

For example, attached is my pbtxt file created using the -t option.
frozen_inference_graph.zip (7.2 KB)

The UFF conversion went through after changing to image_tensor:0. But with the -l option I see one node fewer than expected: it was supposed to have 350 nodes and it produced 349.
Then, to try out this .uff, I took it to the Nano and ran sample_uff_ssd, and it crashed.

I am also using TensorRT 5.1.2, CUDA 10.0, and TensorFlow 1.13.

Can you please share your .uff ?

My UFF file.
https://drive.google.com/file/d/1S7BSBBlVnK7Drc52kaPCNcdp9jOiYIgA/view

Thanks pkolomiets.
The UFF that you shared works with the sample code.
But mine does not seem to, although I am using the same ssd_mobilenet_v1_coco_2018_01_28.

The error that I am getting with my UFF is:

ign@ignnano:/usr/src/tensorrt/bin$ sudo ./sample_uff_ssd
[sudo] password for ign:
…/data/ssd/sample_ssd_relu6.uff
Begin parsing model...
End parsing model...
Begin building engine...
sample_uff_ssd: nmsPlugin.cpp:54: virtual nvinfer1::Dims nvinfer1::plugin::DetectionOutput::getOutputDimensions(int, const nvinfer1::Dims*, int): Assertion `nbInputDims == 3' failed.
Aborted

Well, I can only recommend looking carefully through the layer input/output names and the order of the output layers.

sure

+1