I have trained an SSD model using TLT, and when I try to run inference with it I get the error shown below. It appears that cuDNN failed to initialize (the log reports "Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR"), but I can't make out why.
Can anyone comment on what causes this error and/or how to get past it? Thanks in advance for any insight or suggestions.
# tlt-infer ssd -i test_images/handgun_shooter -o test_images/output -e specs/ssd_resnet10_weapons_train.txt -m output/ssd_20191104_unpruned/weights/ssd_resnet10_epoch_225.tlt -k ${NGC_API_KEY}
Using TensorFlow backend.
2019-11-05 16:39:25,879 [INFO] iva.ssd.scripts.inference: Loading experiment spec at specs/ssd_resnet10_weapons_train.txt.
2019-11-05 16:39:25,881 [INFO] /usr/local/lib/python2.7/dist-packages/iva/ssd/utils/spec_loader.pyc: Merging specification from specs/ssd_resnet10_weapons_train.txt
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2019-11-05 16:39:26,260 [WARNING] tensorflow: From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2019-11-05 16:39:27.240404: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-11-05 16:39:27.347941: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-11-05 16:39:27.349339: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x946f980 executing computations on platform CUDA. Devices:
2019-11-05 16:39:27.349356: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): GeForce GTX 1660 Ti, Compute Capability 7.5
2019-11-05 16:39:27.374523: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3000000000 Hz
2019-11-05 16:39:27.375076: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x94d86a0 executing computations on platform Host. Devices:
2019-11-05 16:39:27.375114: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
2019-11-05 16:39:27.375235: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce GTX 1660 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:01:00.0
totalMemory: 5.77GiB freeMemory: 4.88GiB
2019-11-05 16:39:27.375249: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-11-05 16:39:27.376397: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-05 16:39:27.376412: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-11-05 16:39:27.376420: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-11-05 16:39:27.376663: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4699 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
/usr/local/lib/python2.7/dist-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
warnings.warn('No training configuration found in save file: '
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
Input (InputLayer) (None, 3, 768, 1024) 0
__________________________________________________________________________________________________
conv1 (Conv2D) (None, 64, 384, 512) 9472 Input[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization) (None, 64, 384, 512) 256 conv1[0][0]
__________________________________________________________________________________________________
activation_19 (Activation) (None, 64, 384, 512) 0 bn_conv1[0][0]
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D) (None, 64, 192, 256) 36928 activation_19[0][0]
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 192, 256) 256 block_1a_conv_1[0][0]
__________________________________________________________________________________________________
activation_20 (Activation) (None, 64, 192, 256) 0 block_1a_bn_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D) (None, 64, 192, 256) 36928 activation_20[0][0]
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 192, 256) 4160 activation_19[0][0]
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 192, 256) 256 block_1a_conv_2[0][0]
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 192, 256) 256 block_1a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_9 (Add) (None, 64, 192, 256) 0 block_1a_bn_2[0][0]
block_1a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_21 (Activation) (None, 64, 192, 256) 0 add_9[0][0]
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D) (None, 128, 96, 128) 73856 activation_21[0][0]
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 96, 128) 512 block_2a_conv_1[0][0]
__________________________________________________________________________________________________
activation_22 (Activation) (None, 128, 96, 128) 0 block_2a_bn_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D) (None, 128, 96, 128) 147584 activation_22[0][0]
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 96, 128) 8320 activation_21[0][0]
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 96, 128) 512 block_2a_conv_2[0][0]
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 96, 128) 512 block_2a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_10 (Add) (None, 128, 96, 128) 0 block_2a_bn_2[0][0]
block_2a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_23 (Activation) (None, 128, 96, 128) 0 add_10[0][0]
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D) (None, 256, 48, 64) 295168 activation_23[0][0]
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 48, 64) 1024 block_3a_conv_1[0][0]
__________________________________________________________________________________________________
activation_24 (Activation) (None, 256, 48, 64) 0 block_3a_bn_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D) (None, 256, 48, 64) 590080 activation_24[0][0]
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 48, 64) 33024 activation_23[0][0]
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 48, 64) 1024 block_3a_conv_2[0][0]
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 48, 64) 1024 block_3a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_11 (Add) (None, 256, 48, 64) 0 block_3a_bn_2[0][0]
block_3a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_25 (Activation) (None, 256, 48, 64) 0 add_11[0][0]
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D) (None, 512, 48, 64) 1180160 activation_25[0][0]
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 48, 64) 2048 block_4a_conv_1[0][0]
__________________________________________________________________________________________________
activation_26 (Activation) (None, 512, 48, 64) 0 block_4a_bn_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D) (None, 512, 48, 64) 2359808 activation_26[0][0]
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 48, 64) 131584 activation_25[0][0]
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 48, 64) 2048 block_4a_conv_2[0][0]
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 48, 64) 2048 block_4a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_12 (Add) (None, 512, 48, 64) 0 block_4a_bn_2[0][0]
block_4a_bn_shortcut[0][0]
__________________________________________________________________________________________________
activation_27 (Activation) (None, 512, 48, 64) 0 add_12[0][0]
__________________________________________________________________________________________________
expand_conv1 (Conv2D) (None, 1024, 48, 64) 4719616 activation_27[0][0]
__________________________________________________________________________________________________
expand1_relu (ReLU) (None, 1024, 48, 64) 0 expand_conv1[0][0]
__________________________________________________________________________________________________
expand_conv2 (Conv2D) (None, 1024, 48, 64) 1049600 expand1_relu[0][0]
__________________________________________________________________________________________________
expand2_relu (ReLU) (None, 1024, 48, 64) 0 expand_conv2[0][0]
__________________________________________________________________________________________________
additional_map0_0 (Conv2D) (None, 256, 48, 64) 262400 expand2_relu[0][0]
__________________________________________________________________________________________________
additional_map0_0_relu (ReLU) (None, 256, 48, 64) 0 additional_map0_0[0][0]
__________________________________________________________________________________________________
additional_map0_1 (Conv2D) (None, 512, 24, 32) 1180160 additional_map0_0_relu[0][0]
__________________________________________________________________________________________________
additional_map0_1_relu (ReLU) (None, 512, 24, 32) 0 additional_map0_1[0][0]
__________________________________________________________________________________________________
additional_map1_0 (Conv2D) (None, 128, 24, 32) 65664 additional_map0_1_relu[0][0]
__________________________________________________________________________________________________
additional_map1_0_relu (ReLU) (None, 128, 24, 32) 0 additional_map1_0[0][0]
__________________________________________________________________________________________________
additional_map1_1 (Conv2D) (None, 256, 12, 16) 295168 additional_map1_0_relu[0][0]
__________________________________________________________________________________________________
additional_map1_1_relu (ReLU) (None, 256, 12, 16) 0 additional_map1_1[0][0]
__________________________________________________________________________________________________
additional_map2_0 (Conv2D) (None, 128, 12, 16) 32896 additional_map1_1_relu[0][0]
__________________________________________________________________________________________________
additional_map2_0_relu (ReLU) (None, 128, 12, 16) 0 additional_map2_0[0][0]
__________________________________________________________________________________________________
additional_map2_1 (Conv2D) (None, 256, 6, 8) 295168 additional_map2_0_relu[0][0]
__________________________________________________________________________________________________
additional_map2_1_relu (ReLU) (None, 256, 6, 8) 0 additional_map2_1[0][0]
__________________________________________________________________________________________________
additional_map3_0 (Conv2D) (None, 128, 6, 8) 32896 additional_map2_1_relu[0][0]
__________________________________________________________________________________________________
additional_map3_0_relu (ReLU) (None, 128, 6, 8) 0 additional_map3_0[0][0]
__________________________________________________________________________________________________
additional_map3_1 (Conv2D) (None, 256, 3, 4) 295168 additional_map3_0_relu[0][0]
__________________________________________________________________________________________________
additional_map3_1_relu (ReLU) (None, 256, 3, 4) 0 additional_map3_1[0][0]
__________________________________________________________________________________________________
ssd_conf_0 (Conv2D) (None, 12, 96, 128) 13836 activation_23[0][0]
__________________________________________________________________________________________________
ssd_conf_1 (Conv2D) (None, 12, 48, 64) 55308 activation_27[0][0]
__________________________________________________________________________________________________
ssd_conf_2 (Conv2D) (None, 12, 24, 32) 55308 additional_map0_1_relu[0][0]
__________________________________________________________________________________________________
ssd_conf_3 (Conv2D) (None, 12, 12, 16) 27660 additional_map1_1_relu[0][0]
__________________________________________________________________________________________________
ssd_conf_4 (Conv2D) (None, 12, 6, 8) 27660 additional_map2_1_relu[0][0]
__________________________________________________________________________________________________
ssd_conf_5 (Conv2D) (None, 12, 3, 4) 27660 additional_map3_1_relu[0][0]
__________________________________________________________________________________________________
permute_25 (Permute) (None, 96, 128, 12) 0 ssd_conf_0[0][0]
__________________________________________________________________________________________________
permute_27 (Permute) (None, 48, 64, 12) 0 ssd_conf_1[0][0]
__________________________________________________________________________________________________
permute_29 (Permute) (None, 24, 32, 12) 0 ssd_conf_2[0][0]
__________________________________________________________________________________________________
permute_31 (Permute) (None, 12, 16, 12) 0 ssd_conf_3[0][0]
__________________________________________________________________________________________________
permute_33 (Permute) (None, 6, 8, 12) 0 ssd_conf_4[0][0]
__________________________________________________________________________________________________
permute_35 (Permute) (None, 3, 4, 12) 0 ssd_conf_5[0][0]
__________________________________________________________________________________________________
ssd_loc_0 (Conv2D) (None, 24, 96, 128) 27672 activation_23[0][0]
__________________________________________________________________________________________________
ssd_loc_1 (Conv2D) (None, 24, 48, 64) 110616 activation_27[0][0]
__________________________________________________________________________________________________
ssd_loc_2 (Conv2D) (None, 24, 24, 32) 110616 additional_map0_1_relu[0][0]
__________________________________________________________________________________________________
ssd_loc_3 (Conv2D) (None, 24, 12, 16) 55320 additional_map1_1_relu[0][0]
__________________________________________________________________________________________________
ssd_loc_4 (Conv2D) (None, 24, 6, 8) 55320 additional_map2_1_relu[0][0]
__________________________________________________________________________________________________
ssd_loc_5 (Conv2D) (None, 24, 3, 4) 55320 additional_map3_1_relu[0][0]
__________________________________________________________________________________________________
conf_reshape_0 (Reshape) (None, 73728, 1, 2) 0 permute_25[0][0]
__________________________________________________________________________________________________
conf_reshape_1 (Reshape) (None, 18432, 1, 2) 0 permute_27[0][0]
__________________________________________________________________________________________________
conf_reshape_2 (Reshape) (None, 4608, 1, 2) 0 permute_29[0][0]
__________________________________________________________________________________________________
conf_reshape_3 (Reshape) (None, 1152, 1, 2) 0 permute_31[0][0]
__________________________________________________________________________________________________
conf_reshape_4 (Reshape) (None, 288, 1, 2) 0 permute_33[0][0]
__________________________________________________________________________________________________
conf_reshape_5 (Reshape) (None, 72, 1, 2) 0 permute_35[0][0]
__________________________________________________________________________________________________
permute_26 (Permute) (None, 96, 128, 24) 0 ssd_loc_0[0][0]
__________________________________________________________________________________________________
permute_28 (Permute) (None, 48, 64, 24) 0 ssd_loc_1[0][0]
__________________________________________________________________________________________________
permute_30 (Permute) (None, 24, 32, 24) 0 ssd_loc_2[0][0]
__________________________________________________________________________________________________
permute_32 (Permute) (None, 12, 16, 24) 0 ssd_loc_3[0][0]
__________________________________________________________________________________________________
permute_34 (Permute) (None, 6, 8, 24) 0 ssd_loc_4[0][0]
__________________________________________________________________________________________________
permute_36 (Permute) (None, 3, 4, 24) 0 ssd_loc_5[0][0]
__________________________________________________________________________________________________
ssd_anchor_0 (AnchorBoxes) (None, 12288, 6, 8) 0 ssd_loc_0[0][0]
__________________________________________________________________________________________________
ssd_anchor_1 (AnchorBoxes) (None, 3072, 6, 8) 0 ssd_loc_1[0][0]
__________________________________________________________________________________________________
ssd_anchor_2 (AnchorBoxes) (None, 768, 6, 8) 0 ssd_loc_2[0][0]
__________________________________________________________________________________________________
ssd_anchor_3 (AnchorBoxes) (None, 192, 6, 8) 0 ssd_loc_3[0][0]
__________________________________________________________________________________________________
ssd_anchor_4 (AnchorBoxes) (None, 48, 6, 8) 0 ssd_loc_4[0][0]
__________________________________________________________________________________________________
ssd_anchor_5 (AnchorBoxes) (None, 12, 6, 8) 0 ssd_loc_5[0][0]
__________________________________________________________________________________________________
mbox_conf (Concatenate) (None, 98280, 1, 2) 0 conf_reshape_0[0][0]
conf_reshape_1[0][0]
conf_reshape_2[0][0]
conf_reshape_3[0][0]
conf_reshape_4[0][0]
conf_reshape_5[0][0]
__________________________________________________________________________________________________
loc_reshape_0 (Reshape) (None, 73728, 1, 4) 0 permute_26[0][0]
__________________________________________________________________________________________________
loc_reshape_1 (Reshape) (None, 18432, 1, 4) 0 permute_28[0][0]
__________________________________________________________________________________________________
loc_reshape_2 (Reshape) (None, 4608, 1, 4) 0 permute_30[0][0]
__________________________________________________________________________________________________
loc_reshape_3 (Reshape) (None, 1152, 1, 4) 0 permute_32[0][0]
__________________________________________________________________________________________________
loc_reshape_4 (Reshape) (None, 288, 1, 4) 0 permute_34[0][0]
__________________________________________________________________________________________________
loc_reshape_5 (Reshape) (None, 72, 1, 4) 0 permute_36[0][0]
__________________________________________________________________________________________________
anchor_reshape_0 (Reshape) (None, 73728, 1, 8) 0 ssd_anchor_0[0][0]
__________________________________________________________________________________________________
anchor_reshape_1 (Reshape) (None, 18432, 1, 8) 0 ssd_anchor_1[0][0]
__________________________________________________________________________________________________
anchor_reshape_2 (Reshape) (None, 4608, 1, 8) 0 ssd_anchor_2[0][0]
__________________________________________________________________________________________________
anchor_reshape_3 (Reshape) (None, 1152, 1, 8) 0 ssd_anchor_3[0][0]
__________________________________________________________________________________________________
anchor_reshape_4 (Reshape) (None, 288, 1, 8) 0 ssd_anchor_4[0][0]
__________________________________________________________________________________________________
anchor_reshape_5 (Reshape) (None, 72, 1, 8) 0 ssd_anchor_5[0][0]
__________________________________________________________________________________________________
mbox_conf_sigmoid (Activation) (None, 98280, 1, 2) 0 mbox_conf[0][0]
__________________________________________________________________________________________________
mbox_loc (Concatenate) (None, 98280, 1, 4) 0 loc_reshape_0[0][0]
loc_reshape_1[0][0]
loc_reshape_2[0][0]
loc_reshape_3[0][0]
loc_reshape_4[0][0]
loc_reshape_5[0][0]
__________________________________________________________________________________________________
mbox_priorbox (Concatenate) (None, 98280, 1, 8) 0 anchor_reshape_0[0][0]
anchor_reshape_1[0][0]
anchor_reshape_2[0][0]
anchor_reshape_3[0][0]
anchor_reshape_4[0][0]
anchor_reshape_5[0][0]
__________________________________________________________________________________________________
concatenate_3 (Concatenate) (None, 98280, 1, 14) 0 mbox_conf_sigmoid[0][0]
mbox_loc[0][0]
mbox_priorbox[0][0]
__________________________________________________________________________________________________
ssd_predictions (Reshape) (None, 98280, 14) 0 concatenate_3[0][0]
==================================================================================================
Total params: 13,769,880
Trainable params: 13,754,520
Non-trainable params: 15,360
__________________________________________________________________________________________________
WARNING:tensorflow:From ./ssd/box_coder/output_decoder_layer.py:83: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2019-11-05 16:39:29,109 [WARNING] tensorflow: From ./ssd/box_coder/output_decoder_layer.py:83: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
0%| | 0/133 [00:00<?, ?it/s]2019-11-05 16:39:30.768956: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-11-05 16:39:30.777215: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
0%| | 0/133 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/usr/local/bin/tlt-infer", line 10, in <module>
sys.exit(main())
File "./common/magnet_infer.py", line 32, in main
File "./ssd/scripts/inference.py", line 173, in main
File "./ssd/scripts/inference.py", line 141, in inference
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1169, in predict
steps=steps)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training_arrays.py", line 294, in predict_loop
batch_outs = f(ins_batch)
File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
return self._call(inputs)
File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
fetched = self._callable_fn(*array_vals)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1439, in __call__
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node model_1/conv1/convolution}}]]
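
Since the final exception suggests looking for a message printed earlier, a small filter like the one below may help pull the error-level entries out of a long log such as this one. It is only a sketch: it assumes the TensorFlow C++ log prefix seen above (timestamp, a severity letter I/W/E/F, then source file and line), and `find_tf_errors` is a hypothetical helper name, not part of TLT or TensorFlow. It locates the messages; it does not diagnose why cuDNN failed.

```python
import re

# Assumed format of TensorFlow C++ log entries, based on the log above, e.g.
# 2019-11-05 16:39:30.768956: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not ...
TF_LOG_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<level>[IWEF]) "
    r"(?P<src>\S+:\d+)\] "
    r"(?P<msg>.*)"
)


def find_tf_errors(log_text, levels=("E", "F")):
    """Return (timestamp, source, message) for entries at the given severity levels."""
    hits = []
    for line in log_text.splitlines():
        # search() rather than match(): a progress bar may share the line
        # with the log entry, as it does in the output above.
        m = TF_LOG_RE.search(line)
        if m and m.group("level") in levels:
            hits.append((m.group("ts"), m.group("src"), m.group("msg")))
    return hits


if __name__ == "__main__":
    # Two lines copied from the log above: one informational, one error.
    sample = (
        "2019-11-05 16:39:27.375249: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] "
        "Adding visible gpu devices: 0\n"
        "0%| | 0/133 [00:00<?, ?it/s]2019-11-05 16:39:30.768956: E "
        "tensorflow/stream_executor/cuda/cuda_dnn.cc:334] "
        "Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR\n"
    )
    for ts, src, msg in find_tf_errors(sample):
        print("%s %s %s" % (ts, src, msg))
```

Saving the console output to a file and passing its contents to `find_tf_errors` would list only the two `cuda_dnn.cc:334` entries from this run, which are the lines the exception is pointing back to.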