Cannot run sampleUffSSD for INT8 inference, Calibration failed

Hello,

I am trying to run the sampleUffSSD example from the samples/ folder in INT8 mode. I followed the steps in README.txt, but calibration fails on the second batch. It somehow cannot read the list.txt file properly and looks for a file named “.ppm”, which of course doesn’t exist.

  1. In the /data/ssd folder, I placed 550 .ppm images resized to 300x300.
  2. In the same /data/ssd folder, I placed a list.txt file containing the names of all the .ppm images (without the .ppm extension).
  3. I ran ./sample_uff_ssd --int8, which fails during calibration of the second batch: it unexpectedly tries to read an image named “.ppm”, which doesn’t exist.

adit@gibson2:~/Downloads/TensorRT-4.0.1.6/bin$ ./sample_uff_ssd --int8
…/…/…/data/ssd/sample_ssd.uff
Begin parsing model…
End parsing model…
Begin building engine…
Batch #0 Mbatchsize #50 mBatchCount #0
Calibrating with file 000000000139.ppm
Calibrating with file 000000000285.ppm
Calibrating with file 000000000632.ppm
Calibrating with file 000000000724.ppm
Calibrating with file 000000000776.ppm
Calibrating with file 000000000785.ppm
Calibrating with file 000000000802.ppm
Calibrating with file 000000000872.ppm
Calibrating with file 000000000885.ppm
Calibrating with file 000000001000.ppm
Calibrating with file 000000001268.ppm
Calibrating with file 000000001296.ppm
Calibrating with file 000000001353.ppm
Calibrating with file 000000001425.ppm
Calibrating with file 000000001490.ppm
Calibrating with file 000000001503.ppm
Calibrating with file 000000001532.ppm
Calibrating with file 000000001584.ppm
Calibrating with file 000000001675.ppm
Calibrating with file 000000001761.ppm
Calibrating with file 000000001818.ppm
Calibrating with file 000000001993.ppm
Calibrating with file 000000002006.ppm
Calibrating with file 000000002149.ppm
Calibrating with file 000000002153.ppm
Calibrating with file 000000002157.ppm
Calibrating with file 000000002261.ppm
Calibrating with file 000000002299.ppm
Calibrating with file 000000002431.ppm
Calibrating with file 000000002473.ppm
Calibrating with file 000000002532.ppm
Calibrating with file 000000002587.ppm
Calibrating with file 000000002592.ppm
Calibrating with file 000000002685.ppm
Calibrating with file 000000002923.ppm
Calibrating with file 000000003156.ppm
Calibrating with file 000000003255.ppm
Calibrating with file 000000003501.ppm
Calibrating with file 000000003553.ppm
Calibrating with file 000000003661.ppm
Calibrating with file 000000003845.ppm
Calibrating with file 000000003934.ppm
Calibrating with file 000000004134.ppm
Calibrating with file 000000004395.ppm
Calibrating with file 000000004495.ppm
Calibrating with file 000000004765.ppm
Calibrating with file 000000004795.ppm
Calibrating with file 000000005001.ppm
Calibrating with file 000000005037.ppm
Calibrating with file 000000005060.ppm
Batch #1 Mbatchsize #50 mBatchCount #1
Calibrating with file .ppm
Calibrating with file 000000002299.ppm
Calibrating with file 000000002431.ppm
Calibrating with file 000000002473.ppm
Calibrating with file 000000002532.ppm
Calibrating with file 000000002587.ppm
Calibrating with file 000000002592.ppm
Calibrating with file 000000002685.ppm
Calibrating with file 000000002923.ppm
Calibrating with file 000000003156.ppm
Calibrating with file 000000003255.ppm
Calibrating with file 000000003501.ppm
Calibrating with file 000000003553.ppm
Calibrating with file 000000003661.ppm
Calibrating with file 000000003845.ppm
Calibrating with file 000000003934.ppm
Calibrating with file 000000004134.ppm
Calibrating with file 000000004395.ppm
Calibrating with file 000000004495.ppm
Calibrating with file 000000004765.ppm
Calibrating with file 000000004795.ppm
Calibrating with file 000000005001.ppm
Calibrating with file 000000005037.ppm
Calibrating with file 000000005060.ppm
Calibrating with file 000000005193.ppm
Calibrating with file 000000005477.ppm
Calibrating with file 000000005503.ppm
Calibrating with file 000000005529.ppm
Calibrating with file 000000005586.ppm
Calibrating with file 000000005600.ppm
Calibrating with file 000000005992.ppm
Calibrating with file 000000006012.ppm
Calibrating with file 000000006040.ppm
Calibrating with file 000000006213.ppm
Calibrating with file 000000006460.ppm
Calibrating with file 000000006471.ppm
Calibrating with file 000000006614.ppm
Calibrating with file 000000006723.ppm
Calibrating with file 000000006763.ppm
Calibrating with file 000000006771.ppm
Calibrating with file 000000006818.ppm
Calibrating with file 000000006894.ppm
Calibrating with file 000000006954.ppm
Calibrating with file 000000007088.ppm
Calibrating with file 000000007108.ppm
Calibrating with file 000000007278.ppm
Calibrating with file 000000007281.ppm
Calibrating with file 000000007386.ppm
Calibrating with file 000000007511.ppm
Calibrating with file 000000007574.ppm
terminate called after throwing an instance of ‘std::runtime_error’
what(): Could not find .ppm in data directories:
data/ssd/
data/ssd/VOC2007/
data/ssd/VOC2007/PPMImages/
data/samples/ssd/
data/samples/ssd/VOC2007/
data/samples/ssd/VOC2007/PPMImages/
Aborted (core dumped)


Attached snapshots of my directory structure and list.txt file.

@adit_bhrgv, for some reason they decided to hard-code the file name length in update() in BatchStreamPPM.h. Every file name needs to be the same length, and then you update this line accordingly:
file.seekg(((mBatchCount * mBatchSize))*14); // Updated to 14 for my files

Without this it will fail after the first batch.
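
A minimal sketch of an alternative, assuming list.txt holds one base name per line: instead of seeking by a fixed byte count, whole lines are skipped, so entries of different lengths would not throw off later batches. readBatchNames is a hypothetical helper written for illustration, not the sample's actual update() code, and how it would be wired into BatchStreamPPM.h is left open.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the TensorRT sample): returns the file
// names for batch `batchIndex` by counting lines rather than bytes.
std::vector<std::string> readBatchNames(const std::string& listPath,
                                        std::size_t batchIndex,
                                        std::size_t batchSize)
{
    assert(batchSize > 0);
    std::ifstream list(listPath);
    std::vector<std::string> names;
    std::string line;
    std::size_t entry = 0;                             // index of the current non-empty entry
    const std::size_t first = batchIndex * batchSize;  // first entry of the requested batch
    while (names.size() < batchSize && std::getline(list, line))
    {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();                           // tolerate Windows line endings
        if (line.empty())
            continue;                                  // ignore blank lines
        if (entry >= first)
            names.push_back(line + ".ppm");            // list.txt stores names without extension
        ++entry;
    }
    return names;                                      // may be short or empty near the end of the list
}
```

Rescanning the list for every batch costs a little time, but for a few hundred entries that should be negligible next to loading the images.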

However, even though it reads in all 500 of my images, my calibration still fails with this error:
“ERROR: Tensor FeatureExtractor/InceptionV2/InceptionV2/Mixed_4c/Branch_3/Conv2d_0b_1x1/Relu6/relu2 is uniformly zero; network calibration failed.”

Which makes me think my calibration images aren’t good enough? I’m not sure exactly. Let me know if your calibration succeeds or if anyone has any other suggestion.

@joe-dev Thanks for your reply.
I set file.seekg(((mBatchCount * mBatchSize))*50); // Updated to 50 for my files, since I have CAL_BATCH_SIZE = 50 files in each of NB_CAL_BATCHES = 11 batches.
BatchStreamPPM.h still can’t parse my list, though. All my ppm file names are the same length, but the parser skips part of a file name and then complains that it “could not find 00019402.ppm”.

The actual file name is 000000019402.ppm.

Please see the logs.

adit@gibson2:~/Downloads/TensorRT-4.0.1.6/samples/sampleUffSSD$ …/…/bin/sample_uff_ssd --int8
…/…/…/…/data/ssd/sample_ssd.uff
Begin parsing model…
End parsing model…
Begin building engine…
Batch #0 Mbatchsize #50 mBatchCount #0
Calibrating with file 000000000139.ppm
Calibrating with file 000000000285.ppm
Calibrating with file 000000000632.ppm
Calibrating with file 000000000724.ppm
Calibrating with file 000000000776.ppm
Calibrating with file 000000000785.ppm
Calibrating with file 000000000802.ppm
Calibrating with file 000000000872.ppm
Calibrating with file 000000000885.ppm
Calibrating with file 000000001000.ppm
Calibrating with file 000000001268.ppm
Calibrating with file 000000001296.ppm
Calibrating with file 000000001353.ppm
Calibrating with file 000000001425.ppm
Calibrating with file 000000001490.ppm
Calibrating with file 000000001503.ppm
Calibrating with file 000000001532.ppm
Calibrating with file 000000001584.ppm
Calibrating with file 000000001675.ppm
Calibrating with file 000000001761.ppm
Calibrating with file 000000001818.ppm
Calibrating with file 000000001993.ppm
Calibrating with file 000000002006.ppm
Calibrating with file 000000002149.ppm
Calibrating with file 000000002153.ppm
Calibrating with file 000000002157.ppm
Calibrating with file 000000002261.ppm
Calibrating with file 000000002299.ppm
Calibrating with file 000000002431.ppm
Calibrating with file 000000002473.ppm
Calibrating with file 000000002532.ppm
Calibrating with file 000000002587.ppm
Calibrating with file 000000002592.ppm
Calibrating with file 000000002685.ppm
Calibrating with file 000000002923.ppm
Calibrating with file 000000003156.ppm
Calibrating with file 000000003255.ppm
Calibrating with file 000000003501.ppm
Calibrating with file 000000003553.ppm
Calibrating with file 000000003661.ppm
Calibrating with file 000000003845.ppm
Calibrating with file 000000003934.ppm
Calibrating with file 000000004134.ppm
Calibrating with file 000000004395.ppm
Calibrating with file 000000004495.ppm
Calibrating with file 000000004765.ppm
Calibrating with file 000000004795.ppm
Calibrating with file 000000005001.ppm
Calibrating with file 000000005037.ppm
Calibrating with file 000000005060.ppm
Batch #1 Mbatchsize #50 mBatchCount #1
Calibrating with file 00019402.ppm
Calibrating with file 000000019432.ppm
Calibrating with file 000000019742.ppm
Calibrating with file 000000019786.ppm
Calibrating with file 000000019924.ppm
Calibrating with file 000000020059.ppm
Calibrating with file 000000020107.ppm
Calibrating with file 000000020247.ppm
Calibrating with file 000000020333.ppm
Calibrating with file 000000020553.ppm
Calibrating with file 000000020571.ppm
Calibrating with file 000000020992.ppm
Calibrating with file 000000021167.ppm
Calibrating with file 000000021465.ppm
Calibrating with file 000000021503.ppm
Calibrating with file 000000021604.ppm
Calibrating with file 000000021839.ppm
Calibrating with file 000000021879.ppm
Calibrating with file 000000021903.ppm
Calibrating with file 000000022192.ppm
Calibrating with file 000000022371.ppm
Calibrating with file 000000022396.ppm
Calibrating with file 000000022479.ppm
Calibrating with file 000000022589.ppm
Calibrating with file 000000022623.ppm
Calibrating with file 000000022705.ppm
Calibrating with file 000000022755.ppm
Calibrating with file 000000022892.ppm
Calibrating with file 000000022935.ppm
Calibrating with file 000000022969.ppm
Calibrating with file 000000023023.ppm
Calibrating with file 000000023034.ppm
Calibrating with file 000000023126.ppm
Calibrating with file 000000023230.ppm
Calibrating with file 000000023272.ppm
Calibrating with file 000000023359.ppm
Calibrating with file 000000023666.ppm
Calibrating with file 000000023751.ppm
Calibrating with file 000000023781.ppm
Calibrating with file 000000023899.ppm
Calibrating with file 000000023937.ppm
Calibrating with file 000000024021.ppm
Calibrating with file 000000024027.ppm
Calibrating with file 000000024144.ppm
Calibrating with file 000000024243.ppm
Calibrating with file 000000024567.ppm
Calibrating with file 000000024610.ppm
Calibrating with file 000000024919.ppm
Calibrating with file 000000025057.ppm
Calibrating with file 000000025096.ppm
terminate called after throwing an instance of ‘std::runtime_error’
what(): Could not find 00019402.ppm in data directories:
data/ssd/
data/ssd/VOC2007/
data/ssd/VOC2007/PPMImages/
data/samples/ssd/
data/samples/ssd/VOC2007/
data/samples/ssd/VOC2007/PPMImages/
Aborted (core dumped)

Oops… I guess you meant that 14 is the length of the file names?

@adit_bhrgv
The number is the length of each entry in list.txt, including the newline. My file is 000000_000000.ppm, and in list.txt it’s 000000_000000, so that is 13 characters + 1 for the newline, and my number is 14.

Yours should be 13, I think (12 + 1 for the newline).
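
The arithmetic can be sketched as follows. This is only an illustration, assuming every entry in list.txt has the same length and ends in a single newline byte; SeekTarget and locate are hypothetical names, not part of the sample.

```cpp
#include <cassert>
#include <cstddef>

// Where a byte offset lands in a list of fixed-length entries.
struct SeekTarget
{
    std::size_t entry;        // zero-based index of the entry the offset falls in
    std::size_t charsSkipped; // leading characters of that entry that are skipped
};

// seekFactor is the constant multiplied in the seekg() call.
// bytesPerEntry is the name length plus 1 for the newline
// (13 for 12-character names such as 000000019402).
SeekTarget locate(std::size_t batchCount, std::size_t batchSize,
                  std::size_t seekFactor, std::size_t bytesPerEntry)
{
    assert(bytesPerEntry > 0);
    const std::size_t offset = batchCount * batchSize * seekFactor;
    return {offset / bytesPerEntry, offset % bytesPerEntry};
}
```

With 13-byte entries, locate(1, 50, 50, 13) gives entry 192 with 4 characters skipped, which would leave an 8-character name and is consistent with the truncated 00019402 in the log above. locate(1, 50, 13, 13) gives entry 50 with nothing skipped, i.e. the first name of the second batch.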

@adit_bhrgv
Did that work for you? Were you able to get your calibration to succeed?

@joe-dev - Yes, the calibration succeeded with my setup, although I used 10 batches of 50 images each. I get an inference speedup of about 3 ms in INT8.

FP32 inference time: ~ 9 ms
INT8 inference time: ~ 6 ms

I am running on a GTX 1080Ti. Which GPU are you using? Also, have you resized all your images to 300x300?

@adit_bhrgv
I’m using 10 batches of 50 images each (500 unique images related to my desired test case). They have all been resized to 300x300 ppm images (converted to ppm, scaled, then cropped).

I’m running on a 1050Ti. FP32 inference works on the images as well.

For timing I’m getting FP32 inference at about 10.5 ms

Thanks for the inference numbers, that’s great to see!

Here’s the error again:
ERROR: Tensor FeatureExtractor/InceptionV2/InceptionV2/Mixed_4c/Branch_3/Conv2d_0b_1x1/Relu6/relu2 is uniformly zero; network calibration failed.

Maybe I need a wider range of images?

@joe-dev - I used 500 randomly selected images from the MSCOCO validation set. Maybe you can try that too.

Also, your error seems to come from the PriorBox_3 layer; in my case this does not cause a problem. Maybe your config file has some changes.


Snapshot of my CalibrationSSD file:

FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_2/Conv2d_0c_3x3/Relu6: 3d418f1e
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_2/Conv2d_0c_3x3/Relu6/relu1: 3d8c4c2a
FeatureExtractor/InceptionV2/InceptionV2/Mixed_4d/Branch_1/Conv2d_0a_1x1/Relu6: 3d418f1e
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_2/Conv2d_0b_3x3/Relu6/relu1: 3dde39f8
FeatureExtractor/InceptionV2/Mixed_5c_2_Conv2d_5_3x3_s2_128/Relu6/relu1: 3db2b396
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_2/Conv2d_0b_3x3/BatchNorm/FusedBatchNorm: 3e43b33d
FeatureExtractor/InceptionV2/InceptionV2/Mixed_4c/Branch_0/Conv2d_0a_1x1/BatchNorm/FusedBatchNorm: 3d3833cf
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_2/Conv2d_0a_1x1/Relu6/relu1: 3d9223f0
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_1/Conv2d_0b_3x3/Relu6: 3d418f1e
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5b/Branch_2/Conv2d_0b_3x3/Relu6: 3d418f1e
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_1/Conv2d_0b_3x3/Relu6/relu1: 3dae29b8
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_1/Conv2d_0a_1x1/Relu6/relu2: 3db6c667
FeatureExtractor/InceptionV2/InceptionV2/Mixed_4e/Branch_3/AvgPool_0a_3x3/AvgPool: 3d112e5d
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_1/Conv2d_0a_1x1/Relu6: 3d418f1e
Squeeze_1: 3d43808d
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_1/Conv2d_0a_1x1/Relu6/relu1: 3dd37043
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_0/Conv2d_0a_1x1/Relu6: 3d418f1e
FeatureExtractor/InceptionV2/InceptionV2/Mixed_4b/Branch_1/Conv2d_0a_1x1/BatchNorm/FusedBatchNorm: 3d99e231
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_0/Conv2d_0a_1x1/Relu6/relu2: 3d8d85e6
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_3/Conv2d_0b_1x1/BatchNorm/FusedBatchNorm: 3e207539
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_0/Conv2d_0a_1x1/Relu6/relu1: 3d76d030
FeatureExtractor/InceptionV2/InceptionV2/Mixed_3b/Branch_3/Conv2d_0b_1x1/Relu6/relu1: 3d26e1a7
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_2/Conv2d_0a_1x1/BatchNorm/FusedBatchNorm: 3e92b0c5
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5a/Branch_0/Conv2d_0a_1x1/Relu6/relu1: 3d8ef429
FeatureExtractor/InceptionV2/InceptionV2/Mixed_4d/Branch_1/Conv2d_0b_3x3/BatchNorm/FusedBatchNorm: 3dae78ac
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_0/Conv2d_0a_1x1/BatchNorm/FusedBatchNorm: 3e2ea510
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_2/Conv2d_0b_3x3/Relu6/relu2: 3d9c3293
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5b/Branch_3/Conv2d_0b_1x1/BatchNorm/FusedBatchNorm: 3dee5b19
FeatureExtractor/InceptionV2/InceptionV2/Mixed_4a/Branch_1/Conv2d_1a_3x3/Relu6: 3d418f1e
FeatureExtractor/InceptionV2/InceptionV2/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu6/relu1: 3d8f2ce1
PriorBox_3: 3c348982
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5a/Branch_1/Conv2d_0b_3x3/Relu6/relu1: 3dc51b59
FeatureExtractor/InceptionV2/InceptionV2/Mixed_4d/Branch_3/Conv2d_0b_1x1/Relu6/relu1: 3d5e67f9
FeatureExtractor/InceptionV2/Mixed_5c_1_Conv2d_5_1x1_64/Relu6: 3d3d4e9d
FeatureExtractor/InceptionV2/InceptionV2/Conv2d_2b_1x1/BatchNorm/FusedBatchNorm: 3e57b0d0
FeatureExtractor/InceptionV2/InceptionV2/Mixed_4c/Branch_3/Conv2d_0b_1x1/Relu6/relu1: 3d1de579
FeatureExtractor/InceptionV2/Mixed_5c_1_Conv2d_5_1x1_64/Relu6/relu1: 3d3d4686
FeatureExtractor/InceptionV2/InceptionV2/Mixed_4c/Branch_2/Conv2d_0b_3x3/Relu6/relu1: 3df180dc
FeatureExtractor/InceptionV2/InceptionV2/Mixed_5b/Branch_2/Conv2d_0b_3x3/BatchNorm/FusedBatchNorm: 3dcc3367
FeatureExtractor/InceptionV2/InceptionV2/Mixed_4a/Branch_1/Conv2d_0b_3x3/Relu6/relu1: 3d974327
FeatureExtractor/InceptionV2/InceptionV2/Mixed_4c/Branch_3/Conv2d_0b_1x1/Relu6/relu2: 3c14eb6f
concat_box_conf: 3e2734a6

Also, one more question: do you know how to convert other models such as ssd_mobilenet_v2 from TensorFlow to TensorRT (including the plugin layers), and how to change the config file? Has anyone here already done that?

Thanks

Hi adit_bhrgv,

Can you share your calibration images in google drive or dropbox?
I have the same problem as Joe… It is definitely not a configuration problem.

Hi liuluyang.colin90,

I can share my calibration images on Google Drive by this evening or tomorrow morning.

Thanks

Hi adit_bhrgv,

Thank you so much. Let me know once you have uploaded them.

Best.

Hi adit_bhrgv,

Can you share the images with me too?

I also have the problem of “ERROR: Tensor FeatureExtractor/InceptionV2/InceptionV2/Mixed_4c/Branch_3/Conv2d_0b_1x1/Relu6/relu2 is uniformly zero; network calibration failed.”

By the way, which “frozen_inference_graph” were you using? Is it “ssd_inception_v2_coco_2018_01_28”, “ssd_inception_v2_coco_2017_11_17” or an earlier version?

Best,

@adit_bhrgv

Can you share the images with me too?
Where is your Google Drive link?

Did that work very well for you?

I just tested INT8 mode.
The calibration problem comes from how the ppm image files are produced. They should be converted like this:
$ convert foo.jpg bar.ppm
$ head -n 3 *.ppm
==> bus.ppm <==
P6
300 300
255

==> cat.ppm <==
P6
300 300
255

==> dog.ppm <==
P6
300 300
255

The header of each ppm file MUST BE “P6\n300 300\n255\n”, as in the head output above.
SKIP any ppm files that do not match.
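
A header check along these lines could be scripted. The sketch below is a hypothetical helper, not part of the TensorRT sample; it assumes the three header fields are separated by whitespace and that comment lines (which some image editors add) should count as a mismatch. It does not verify the pixel data itself.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical checker: true if the file starts with a binary PPM header
// "P6", <width> <height>, 255 and has no comment lines in between.
bool hasExpectedPpmHeader(const std::string& path,
                          int expectedWidth = 300,
                          int expectedHeight = 300)
{
    assert(expectedWidth > 0 && expectedHeight > 0);
    std::ifstream in(path, std::ios::binary);
    std::string magic;
    int width = 0, height = 0, maxValue = 0;
    // Extraction fails on a missing file, a truncated header, or a '#' comment
    // where a number is expected.
    if (!(in >> magic >> width >> height >> maxValue))
        return false;
    return magic == "P6" && width == expectedWidth &&
           height == expectedHeight && maxValue == 255;
}
```

Files for which this returns false could then be left out of list.txt before calibration.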

1050Ti
$ ./sample_uff_ssd
Time taken for inference is 39.2984 ms 3 images
$ ./sample_uff_ssd --int8
Time taken for inference is 20.6677 ms. 3 images

Even after checking the headers with head -n 3 *.ppm, I still get:

ERROR: Tensor FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_3/Conv2d_0b_1x1/Relu6/relu2 is uniformly zero; network calibration failed.
sample_uff_ssd: …/builder/cudnnBuilder2.cpp:1227: nvinfer1::cudnn::Engine* nvinfer1::builder::buildEngine(nvinfer1::CudaEngineBuildConfig&, const nvinfer1::cudnn::HardwareContext&, const nvinfer1::Network&): Assertion `it != tensorScales.end()’ failed.

I tried setting CAL_BATCH_SIZE to 1 and NB_CAL_BATCHES to 2 and using only “dog.ppm” and “cat.ppm”, but I still get the same errors.

@xmgeek
I am also using a 1050Ti. I suspect this problem is related to the CUDA/cuDNN/TensorRT versions or to the TensorFlow Inception v2 model. Could you please share that information for your machine? Right now I am using “ssd_inception_v2_coco_2017_11_17”.