sampleUffSSD with newer tensorflow models (2018)

Hello, I’ve been trying to convert the code from sampleUffSSD to use newer TensorFlow models (>=1.8) instead of the sample COCO model you reference.

The problem is that when I convert with the current sample config.py:

import graphsurgeon as gs
import tensorflow as tf

Input = gs.create_node("Input",
    op="Placeholder",
    dtype=tf.float32,
    shape=[1, 3, 300, 300])
PriorBox = gs.create_node("PriorBox",
    numLayers=6,
    minScale=0.2,
    maxScale=0.95,
    aspectRatios=[1.0, 2.0, 0.5, 3.0, 0.33],
    layerVariances=[0.1,0.1,0.2,0.2],
    featureMapShapes=[19, 10, 5, 3, 2, 1])
NMS = gs.create_node("NMS",
    scoreThreshold=1e-8,
    iouThreshold=0.6,
    maxDetectionsPerClass=100,
    maxTotalDetections=100,
    numClasses=91,
    scoreConverter="SIGMOID")
concat_priorbox = gs.create_node("concat_priorbox", dtype=tf.float32, axis=2)
concat_box_loc = gs.create_node("concat_box_loc")
concat_box_conf = gs.create_node("concat_box_conf")

namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
    "Preprocessor": Input,
    "ToFloat": Input,
    "image_tensor": Input,
    "MultipleGridAnchorGenerator/Concatenate": concat_priorbox,
    "concat": concat_box_loc,
    "concat_1": concat_box_conf
}

def preprocess(dynamic_graph):
    # Now create a new graph by collapsing namespaces
    dynamic_graph.collapse_namespaces(namespace_plugin_map)
    # Remove the outputs, so we just have a single output node (NMS).
    dynamic_graph.remove(dynamic_graph.graph_outputs, remove_exclusive_dependencies=False)

I get an NMS node with a different number of inputs:

Old TF model:
Loading ssd_inception_v2_coco_2017_frozen_inference_graph.pb
Using output node NMS
Converting to UFF graph
Warning: No conversion function registered for layer: NMS yet.
Converting as custom op NMS NMS
name: "NMS"
op: "NMS"
input: "concat_box_loc"
input: "concat_priorbox"
input: "concat_box_conf"

New TF model:
Loading ssd_inception_v2_coco_2018_01_28_frozen_inference_graph.pb
Using output node NMS
Converting to UFF graph
Warning: No conversion function registered for layer: NMS yet.
Converting as custom op NMS NMS
name: "NMS"
op: "NMS"
input: "Input"
input: "Squeeze"
input: "concat_priorbox"
input: "concat_box_conf"

So when I try to load the converted UFF file, I get an assertion failure complaining about the number of inputs:

cp conversion/ssd_inception_v2_coco_2018_01_28_frozen_inference_graph.pb.uff data/ssd/sample_ssd.uff && ./bin/sample_uff_ssd
data/ssd/sample_ssd.uff
Begin parsing model...
End parsing model...
Begin building engine...
sample_uff_ssd: NvPluginSSD.cu:713: virtual nvinfer1::Dims nvinfer1::plugin::DetectionOutput::getOutputDimensions(int, const nvinfer1::Dims*, int): Assertion `nbInputDims == 3' failed.
Aborted (core dumped)
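(As far as I can tell, the DetectionOutput plugin behind the NMS node expects exactly three inputs, box locations, class confidences and prior boxes, which is what the nbInputDims == 3 assertion checks; the 2018 graph feeds it four.)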

Here is a link to the list of nodes in the newer model: https://gist.github.com/NikolasMarkou/48553938699c8e9b8d903cf0e46870ac

The list of nodes was generated with the following snippet:

import argparse
import tensorflow as tf

def load_graph(frozen_graph_filename):
    # Load the protobuf file from disk and parse it to retrieve the
    # unserialized graph_def
    with tf.gfile.GFile(frozen_graph_filename, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    # Then import the graph_def into a new Graph and return it
    with tf.Graph().as_default() as graph:
        # The name argument would prefix every op/node in the graph;
        # since we load everything into a fresh graph, it is not needed
        tf.import_graph_def(graph_def)

    return graph

if __name__ == '__main__':
    # Allow the user to pass the filename as an argument
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", default="frozen_inference_graph.pb", type=str, help="Frozen model file to import")
    args = parser.parse_args()

    graph = load_graph(args.model)

    for counter, n in enumerate(graph.as_graph_def().node):
        print('%d | %s' % (counter, n.name))
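
Usage, assuming the snippet is saved as list_nodes.py (the filename is just an example):

python list_nodes.py --model ssd_inception_v2_coco_2018_01_28_frozen_inference_graph.pb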


I’m running into the same problem. Could you tell me how to solve it? Thanks!

You must make sure the ‘NMS’ node has only 3 inputs. The ‘Input’ node may be linked to ‘NMS’ because the ‘Preprocessor’ node in the PB file frozen from TensorFlow is connected to the ‘Postprocessor’ node. You can delete that link with graphsurgeon so that ‘NMS’ has only 3 inputs. Alternatively, you can add

namespace_remove = {
    "Preprocessor/stack_1",
    "Preprocessor/ResizeImage/stack_1",
}

before "def preprocess(dynamic_graph):", and add

dynamic_graph.remove(dynamic_graph.find_nodes_by_path(namespace_remove), remove_exclusive_dependencies=False)

as the first line of "def preprocess(dynamic_graph):".

This will help.
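
Putting the two pieces together, a minimal sketch of the changed part of config.py (a sketch only; the node names come from the graph dump above, so adjust them to your model):

namespace_remove = {
    "Preprocessor/stack_1",
    "Preprocessor/ResizeImage/stack_1",
}

def preprocess(dynamic_graph):
    # Drop the Preprocessor nodes that feed into the Postprocessor,
    # so the NMS node ends up with exactly 3 inputs
    dynamic_graph.remove(dynamic_graph.find_nodes_by_path(namespace_remove), remove_exclusive_dependencies=False)
    # Now create a new graph by collapsing namespaces
    dynamic_graph.collapse_namespaces(namespace_plugin_map)
    # Remove the outputs, so we just have a single output node (NMS)
    dynamic_graph.remove(dynamic_graph.graph_outputs, remove_exclusive_dependencies=False)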

Yeah, that’s right! I solved it by replacing the original config.py code with the following:

namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
    "Preprocessor": Input,
    # "ToFloat": Input,
    # "image_tensor": Input,
    "MultipleGridAnchorGenerator/Concatenate": concat_priorbox,
    #"Concatenate/concat": concat_priorbox,
    "concat": concat_box_loc,
    "concat_1": concat_box_conf,
}

namespace_remove = {
    "ToFloat",
    "image_tensor",
    "Preprocessor/map/TensorArrayStack_1/TensorArrayGatherV3",
}

def preprocess(dynamic_graph):
    # remove the unrelated or error layers
    dynamic_graph.remove(dynamic_graph.find_nodes_by_path(namespace_remove), remove_exclusive_dependencies=False)

    # Now create a new graph by collapsing namespaces
    dynamic_graph.collapse_namespaces(namespace_plugin_map)
    # Remove the outputs, so we just have a single output node (NMS).
    dynamic_graph.remove(dynamic_graph.graph_outputs, remove_exclusive_dependencies=False)

    # Remove the Squeeze to avoid "Assertion `isPlugin(layerName)' failed"
    Squeeze = dynamic_graph.find_node_inputs_by_name(dynamic_graph.graph_outputs[0], 'Squeeze')
    dynamic_graph.forward_inputs(Squeeze)

This worked with both the 2017 and 2018 models.
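
For anyone following along, the conversion itself can then be run with the convert-to-uff tool that ships with the UFF package (flags may differ slightly between versions; check convert-to-uff -h):

convert-to-uff ssd_inception_v2_coco_2018_01_28_frozen_inference_graph.pb -O NMS -p config.py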

Thanks a lot, houhongyi!

I’ve hit another problem! Dear houhongyi, can you take a look for me?

I changed the fixed_shape_resizer of ssd_inception_v2 from 300x300 to 600x600 and trained on my dataset. When I use the model in TensorRT to detect an image of size 1280x720, it returns the following result:

Begin parsing model…
End parsing model…
Begin building engine…
End building engine…
Num batches 2
Data Size 5529600
*** deserializing
input_c: 3 input_h: 1280 input_w: 720
Cuda failure: 11
Aborted (core dumped)

I have changed the INPUT_H and INPUT_W values in the BatchStreamPPM.h file. If I detect images smaller than 600x600, there is no error. Does this mean that the images I use should not exceed 600x600?

The version of TensorRT I use is TensorRT-5.0.0.10.

Can you understand Chinese? If yes, I’m willing to discuss in Chinese.

Please upload your config.py.
There is a parameter in config.py that defines the input size.
I suspect the size you defined is wrong.

config.py:

import graphsurgeon as gs
import tensorflow as tf

Input = gs.create_node("Input",
    op="Placeholder",
    dtype=tf.float32,
    shape=[1, 3, 600, 600])
PriorBox = gs.create_plugin_node(name="GridAnchor", op="GridAnchor_TRT",
    numLayers=6,
    minSize=0.2,
    maxSize=0.95,
    aspectRatios=[1.0, 2.0, 0.5, 3.0, 0.33],
    variance=[0.1,0.1,0.2,0.2],
    #featureMapShapes=[19, 10, 5, 3, 2, 1])
    featureMapShapes=[38, 19, 10, 5, 3, 2])
NMS = gs.create_plugin_node(name="NMS", op="NMS_TRT",
    shareLocation=1,
    varianceEncodedInTarget=0,
    backgroundLabelId=0,
    confidenceThreshold=1e-8,
    nmsThreshold=0.6,
    topK=100,
    keepTopK=100,
    numClasses=5,
    inputOrder=[0, 2, 1],
    confSigmoid=1,
    isNormalized=1,
    scoreConverter="SIGMOID")

concat_priorbox = gs.create_node(name="concat_priorbox", op="ConcatV2", dtype=tf.float32, axis=2)
concat_box_loc = gs.create_plugin_node("concat_box_loc", op="FlattenConcat_TRT", dtype=tf.float32, axis=1, ignoreBatch=0)
concat_box_conf = gs.create_plugin_node("concat_box_conf", op="FlattenConcat_TRT", dtype=tf.float32, axis=1, ignoreBatch=0)

namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
    "Preprocessor": Input,
    # "ToFloat": Input,
    # "image_tensor": Input,
    #"MultipleGridAnchorGenerator/Concatenate": concat_priorbox,
    "Concatenate/concat": concat_priorbox,
    "concat": concat_box_loc,
    "concat_1": concat_box_conf,
}

namespace_remove = {
    "ToFloat",
    "image_tensor",
    "Preprocessor/map/TensorArrayStack_1/TensorArrayGatherV3",
}

def preprocess(dynamic_graph):
    # remove the unrelated or error layers
    dynamic_graph.remove(dynamic_graph.find_nodes_by_path(namespace_remove), remove_exclusive_dependencies=False)

    # Now create a new graph by collapsing namespaces
    dynamic_graph.collapse_namespaces(namespace_plugin_map)
    # Remove the outputs, so we just have a single output node (NMS).
    dynamic_graph.remove(dynamic_graph.graph_outputs, remove_exclusive_dependencies=False)

    # Remove the Squeeze to avoid "Assertion `isPlugin(layerName)' failed"
    Squeeze = dynamic_graph.find_node_inputs_by_name(dynamic_graph.graph_outputs[0], 'Squeeze')
    dynamic_graph.forward_inputs(Squeeze)


'''
namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
    "Preprocessor": Input,
    "ToFloat": Input,
    "image_tensor": Input,
    "Concatenate/concat": concat_priorbox,
    #"concat": concat_box_loc,
    "concat": concat_box_loc,
    "concat_1": concat_box_conf
}

namespace_remove = {
    "image_tensor",
}

def preprocess(dynamic_graph):
    # Now create a new graph by collapsing namespaces
    dynamic_graph.collapse_namespaces(namespace_plugin_map)
    # Remove the outputs, so we just have a single output node (NMS).
    dynamic_graph.remove(dynamic_graph.graph_outputs, remove_exclusive_dependencies=False)
    # Remove the Squeeze to avoid "Assertion `isPlugin(layerName)' failed"
    Squeeze = dynamic_graph.find_node_inputs_by_name(dynamic_graph.graph_outputs[0], 'Squeeze')
    dynamic_graph.forward_inputs(Squeeze)
    # Remove the Input to avoid "Assertion `isPlugin(layerName)' failed"
    #Input = dynamic_graph.find_node_inputs_by_name(dynamic_graph.graph_outputs[3], 'Input')
    #dynamic_graph.forward_inputs(Input)
'''

pipeline.config:

model {
  ssd {
    num_classes: 4
    image_resizer {
      fixed_shape_resizer {
        height: 600
        width: 600
      }
    }
    feature_extractor {
      type: "ssd_inception_v2"
      depth_multiplier: 1.0
      min_depth: 16
      conv_hyperparams {
        regularizer {
          l2_regularizer {
            weight: 3.99999989895e-05
          }
        }
        initializer {
          truncated_normal_initializer {
            mean: 0.0
            stddev: 0.0299999993294
          }
        }
        activation: RELU_6
        batch_norm {
          decay: 0.999700009823
          center: true
          scale: true
          epsilon: 0.0010000000475
          train: true
        }
      }
      override_base_feature_extractor_hyperparams: true
    }
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    box_predictor {
      convolutional_box_predictor {
        conv_hyperparams {
          regularizer {
            l2_regularizer {
              weight: 3.99999989895e-05
            }
          }
          initializer {
            truncated_normal_initializer {
              mean: 0.0
              stddev: 0.0299999993294
            }
          }
          activation: RELU_6
        }
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.800000011921
        kernel_size: 3
        box_code_size: 4
        apply_sigmoid_to_scores: false
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.20000000298
        max_scale: 0.949999988079
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.333299994469
        reduce_boxes_in_lowest_layer: true
      }
    }
    post_processing {
      batch_non_max_suppression {
        score_threshold: 9.99999993923e-09
        iou_threshold: 0.600000023842
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
    normalize_loss_by_num_matches: true
    loss {
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_loss {
        weighted_sigmoid {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.990000009537
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
  }
}
train_config {
  batch_size: 6
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
  optimizer {
    rms_prop_optimizer {
      learning_rate {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.00400000018999
          decay_steps: 800720
          decay_factor: 0.949999988079
        }
      }
      momentum_optimizer_value: 0.899999976158
      decay: 0.899999976158
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "ssd_inception_v2_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  num_steps: 20000
}
train_input_reader {
  label_map_path: "training/ssd_inception_v2_whsyxt/whsyxt_label_map.pbtxt"
  tf_record_input_reader {
    input_path: "data/whsyxt_train.tfrecord"
  }
}
eval_config {
  num_examples: 8000
  max_evals: 10
  use_moving_averages: false
}
eval_input_reader {
  label_map_path: "training/ssd_inception_v2_whsyxt/whsyxt_label_map.pbtxt"
  shuffle: false
  num_readers: 1
  tf_record_input_reader {
    input_path: "data/whsyxt_validation.tfrecord"
  }
}

I am Chinese!
houhongyi, can you explain to me what is going on?
Looking at the program, the error happens at inference time, at the following statement:

// DMA the input to the GPU,  execute the batch asynchronously, and DMA it back:
CHECK(cudaMemcpyAsync(buffers[inputIndex], inputData, batchSize * INPUT_C * INPUT_H * INPUT_W * sizeof(float), cudaMemcpyHostToDevice, stream));

There is nothing wrong with that statement. CHECK() verifies that the call inside the parentheses executed successfully.
cudaMemcpyAsync() performs the memory copy asynchronously.
The arguments are:
buffers[inputIndex]: the destination address; buffers holds the already-allocated GPU memory addresses
inputData: the source address, i.e. the prepared image data
batchSize * INPUT_C * INPUT_H * INPUT_W * sizeof(float): the number of bytes to copy
cudaMemcpyHostToDevice: the copy direction, host memory to device memory
stream: the CUDA stream used for the copy

What could possibly be wrong with this statement?

Hello, I’d like to ask you about TensorRT acceleration. Would it be convenient to chat privately?

Hi,

I’m also trying to do the same for ssd_inception_v2 with a custom trained model (six classes). However, I’m getting errors like the one below while creating the TensorRT engine from the UFF. Could you please share your thoughts or approach to handling such errors?

[TensorRT] ERROR: UffParser: Validator error: concat_box_loc: Unsupported operation _FlattenConcat_TRT

It works after setting the path to the plugin library.
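
For reference, a minimal sketch of registering the FlattenConcat plugin before parsing the UFF file in Python, assuming TensorRT 5.x bindings and a libflattenconcat.so built from the uff_ssd sample (the library path is an example):

import ctypes
import tensorrt as trt

# Loading the shared library registers the FlattenConcat_TRT plugin creator
ctypes.CDLL("./build/libflattenconcat.so")

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
# Make all loaded plugins visible to the UFF parser
trt.init_libnvinfer_plugins(TRT_LOGGER, "")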

Hi, I’m trying to convert a TF model to TRT.

I’m able to convert the default SSD_mobilenet_v2 model and an SSD_mobilenet_v2 model trained on custom data without a problem. This works when the input size is fixed at 300x300, but if I try to change the input size to something different (e.g. 500x500), the conversion to .uff fails with the following error:

python3.6: nmsPlugin.cpp:139: virtual void nvinfer1::plugin::DetectionOutput::configureWithFormat(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, nvinfer1::DataType, nvinfer1::PluginFormat, int): Assertion `numPriors * numLocClasses * 4 == inputDims[param.inputOrder[0]].d[0]’ failed.

Can you give me instructions on how to perform the model conversion if the input size isn’t the default 300x300? baiyisheng, did you succeed in converting your model?

Your solution worked for the original SSD Inception model, but after training my custom model on top of it, I get a “UffParser: Validator error: Cast: Unsupported operation _Cast” error.

@ivan.ralasic: You need to adjust featureMapShapes=[38, 19, 10, 5, 3, 2] to match your model. The following snippet will give you the values:

import tensorflow as tf
from object_detection.models.ssd_mobilenet_v2_feature_extractor_test import SsdMobilenetV2FeatureExtractorTest

image_height = 500
image_width = 500

# Build the SSD MobileNet v2 feature extractor via the test helper
feature_extractor = SsdMobilenetV2FeatureExtractorTest()._create_feature_extractor(
    depth_multiplier=1,
    pad_to_multiple=1,
)
# A dummy input batch is enough; only the spatial shapes matter here
image_batch_tensor = tf.zeros([1, image_height, image_width, 1])

# Print the (height, width) of each feature map the SSD heads attach to
print([tuple(feature_map.get_shape().as_list()[1:3]) for feature_map in
       feature_extractor.extract_features(image_batch_tensor)])
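
Then plug the printed sizes into featureMapShapes in your config.py (for the stock 300x300 SSD Inception v2 model the sample uses [19, 10, 5, 3, 2, 1]).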

Change your namespace plugin map to this:

namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
    "Preprocessor": Input,
    "Cast": Input,
    "ToFloat": Input,
    "image_tensor": Input,
    "MultipleGridAnchorGenerator/Concatenate": concat_priorbox,
    "MultipleGridAnchorGenerator/Identity": concat_priorbox,
    "Concatenate/concat": concat_priorbox,
    "concat": concat_box_loc,
    "concat_1": concat_box_conf
}

in config.py.
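
For context, newer TensorFlow object detection exports emit a Cast node where older exports used ToFloat, which is what triggers the “Unsupported operation _Cast” error above; mapping both names to Input keeps the config working with either export.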