Hi, I am using the Transformer model in OpenSeq2Seq to run the built-in example, English-to-German (en-de) machine translation.
I follow the OpenSeq2Seq tutorial, except that I use the Transformer model instead of the NMT model.
I can train the model and run inference with TensorFlow 1.13.1 and CUDA 10.1. But I cannot run inference through the TensorRT 5 integration (TF-TRT) in TensorFlow 1.13.1.
The main error is:
ValueError: Input 0 of node ForwardPass/transformer_decoder/decode/while/layer_0/self_attention/self_attention/q/Tensordot/ReadVariableOp/Enter was passed float from ForwardPass/transformer_decoder/layer_0/self_attention/self_attention/q/kernel:0 incompatible with expected resource.
And the full error log is:
Traceback (most recent call last):
  File "/home/xxx/pycharm_proj/OpenSeq2Seq_raw/run.py", line 101, in <module>
    main()
  File "/home/xxx/pycharm_proj/OpenSeq2Seq_raw/run.py", line 81, in main
    args, base_config, config_module, base_model, hvd, checkpoint)
  File "/home/xxx/pycharm_proj/OpenSeq2Seq_raw/open_seq2seq/utils/utils.py", line 790, in create_model
    model.compile(checkpoint=checkpoint)
  File "/home/xxx/pycharm_proj/OpenSeq2Seq_raw/open_seq2seq/models/model.py", line 445, in compile
    checkpoint=checkpoint
  File "/home/xxx/pycharm_proj/OpenSeq2Seq_raw/open_seq2seq/models/model.py", line 645, in build_trt_forward_pass_graph
    maximum_cached_engines=trt_params["trt_maximum_cached_engines"]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/tensorrt/python/trt_convert.py", line 333, in create_inference_graph
    importer.import_graph_def(input_graph_def, name="")
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 430, in import_graph_def
    raise ValueError(str(e))
ValueError: Input 0 of node ForwardPass/transformer_decoder/decode/while/layer_0/self_attention/self_attention/q/Tensordot/ReadVariableOp/Enter was passed float from ForwardPass/transformer_decoder/layer_0/self_attention/self_attention/q/kernel:0 incompatible with expected resource.
Process finished with exit code 1
WHAT I TRIED: I found a suggested fix online that rewrites certain node ops in the frozen graph (RefSwitch to Switch, AssignSub to Sub, AssignAdd to Add), but it did not help. The modification is applied as follows:
# Restore checkpoint here because we have to freeze the graph
tf_saver = tf.train.Saver()
tf_saver.restore(save_path=checkpoint, sess=tf_sess)
# I ALSO TRIED TO ADD THE MODIFICATION HERE, BUT IT WAS USELESS.
frozen_graph = tf.graph_util.convert_variables_to_constants(
    tf_sess,
    tf_sess.graph_def,
    output_node_names=output_node_names,
)
num_nodes = len(frozen_graph.node)
print('Converting graph using TensorFlow-TensorRT...')

# THIS IS THE MODIFICATION
# gd = tf_sess.graph.as_graph_def()
for node in frozen_graph.node:
    if node.op == 'RefSwitch':
        node.op = 'Switch'
        # range, not xrange: the environment is Python 3.5 (see traceback)
        for index in range(len(node.input)):
            if 'moving_' in node.input[index]:
                node.input[index] = node.input[index] + '/read'
    elif node.op == 'AssignSub':
        node.op = 'Sub'
        if 'use_locking' in node.attr:
            del node.attr['use_locking']
    elif node.op == 'AssignAdd':
        node.op = 'Add'
        if 'use_locking' in node.attr:
            del node.attr['use_locking']

# THE ERROR OCCURS IN THE FOLLOWING FUNCTION CALL.
frozen_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_node_names,
    max_batch_size=trt_params["batch_size_per_gpu"],
    max_workspace_size_bytes=trt_params["trt_max_workspace_size_bytes"],
    precision_mode=trt_params["trt_precision_mode"],
    minimum_segment_size=trt_params["trt_minimum_segment_size"],
    is_dynamic_op=trt_params["trt_is_dynamic_op"],
    maximum_cached_engines=trt_params["trt_maximum_cached_engines"],
)
I would be very grateful for any help.