Hi
I tested MNIST with TensorRT 2.1.2 and got the results below:
INT8 run:400 batches of size 100 starting at 100
…
Top1: 0.9918, Top5: 1
Processing 40000 images averaged 0.00144474 ms/image and 0.144474 ms/batch.
FP32 run:400 batches of size 100 starting at 100
…
Top1: 0.9918, Top5: 1
Processing 40000 images averaged 0.00220669 ms/image and 0.220669 ms/batch.
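For context, here is a quick sanity check of the speedup implied by the ms/batch figures above (just arithmetic on the numbers reported by the tool):

```python
# Timings copied from the benchmark output above (ms per batch of 100)
fp32_ms_per_batch = 0.220669
int8_ms_per_batch = 0.144474

# INT8 over FP32 speedup implied by these measurements
speedup = fp32_ms_per_batch / int8_ms_per_batch
print(f"INT8 speedup over FP32: {speedup:.2f}x")  # ~1.53x
```

So the measured gain is roughly 1.5x, not the 3x-4x often quoted for INT8.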
I have some questions:
- Why does TensorRT run inference on 40,000 images? Shouldn't it run only on the MNIST test dataset, i.e. 10,000 images?
- Why is there no 3x to 4x improvement in ms/batch? Is it because the MNIST network is so small?