Hi all,
I have been testing with the same example as you did.
The first 3 launches gave stable results (about 25.1 ms), but the sessions after that were not stable at all.
Here are the results.
Do you know why?
1st test
nvidia@tegra-ubuntu:/usr/src/tensorrt/bin$ ./giexec --deploy=fcn_alexnet.deploy.prototxt --output=score_fr_21classes --iterations=20
deploy: fcn_alexnet.deploy.prototxt
output: score_fr_21classes
iterations: 20
Input "data": 3x720x1280
Output "score_fr_21classes": 21x17x34
name=data, bindingIndex=0, buffers.size()=2
name=score_fr_21classes, bindingIndex=1, buffers.size()=2
Average over 10 runs is 25.0843 ms.
Average over 10 runs is 25.1267 ms.
Average over 10 runs is 25.1284 ms.
Average over 10 runs is 25.1203 ms.
Average over 10 runs is 25.1188 ms.
Average over 10 runs is 25.1187 ms.
Average over 10 runs is 25.1225 ms.
Average over 10 runs is 25.1531 ms.
Average over 10 runs is 25.1162 ms.
Average over 10 runs is 25.113 ms.
Average over 10 runs is 25.1134 ms.
Average over 10 runs is 25.1206 ms.
Average over 10 runs is 25.1131 ms.
Average over 10 runs is 25.1053 ms.
Average over 10 runs is 25.1135 ms.
Average over 10 runs is 25.1162 ms.
Average over 10 runs is 25.1138 ms.
Average over 10 runs is 25.1356 ms.
Average over 10 runs is 25.2261 ms.
Average over 10 runs is 25.124 ms.
2nd test
nvidia@tegra-ubuntu:/usr/src/tensorrt/bin$ ./giexec --deploy=fcn_alexnet.deploy.prototxt --output=score_fr_21classes --iterations=20
deploy: fcn_alexnet.deploy.prototxt
output: score_fr_21classes
iterations: 20
Input "data": 3x720x1280
Output "score_fr_21classes": 21x17x34
name=data, bindingIndex=0, buffers.size()=2
name=score_fr_21classes, bindingIndex=1, buffers.size()=2
Average over 10 runs is 25.0894 ms.
Average over 10 runs is 25.1149 ms.
Average over 10 runs is 25.0964 ms.
Average over 10 runs is 25.1101 ms.
Average over 10 runs is 25.0913 ms.
Average over 10 runs is 25.1021 ms.
Average over 10 runs is 25.0962 ms.
Average over 10 runs is 25.1053 ms.
Average over 10 runs is 25.0943 ms.
Average over 10 runs is 25.0939 ms.
Average over 10 runs is 25.0825 ms.
Average over 10 runs is 25.0944 ms.
Average over 10 runs is 25.0849 ms.
Average over 10 runs is 25.1064 ms.
Average over 10 runs is 25.0914 ms.
Average over 10 runs is 25.0953 ms.
Average over 10 runs is 25.0917 ms.
Average over 10 runs is 25.1148 ms.
Average over 10 runs is 25.092 ms.
Average over 10 runs is 25.0918 ms.
3rd test
nvidia@tegra-ubuntu:/usr/src/tensorrt/bin$ ./giexec --deploy=fcn_alexnet.deploy.prototxt --output=score_fr_21classes --iterations=20
deploy: fcn_alexnet.deploy.prototxt
output: score_fr_21classes
iterations: 20
Input "data": 3x720x1280
Output "score_fr_21classes": 21x17x34
name=data, bindingIndex=0, buffers.size()=2
name=score_fr_21classes, bindingIndex=1, buffers.size()=2
Average over 10 runs is 25.1215 ms.
Average over 10 runs is 25.1118 ms.
Average over 10 runs is 25.1063 ms.
Average over 10 runs is 25.1124 ms.
Average over 10 runs is 25.1224 ms.
Average over 10 runs is 25.1232 ms.
Average over 10 runs is 25.1098 ms.
Average over 10 runs is 25.098 ms.
Average over 10 runs is 25.1085 ms.
Average over 10 runs is 25.1032 ms.
Average over 10 runs is 25.1056 ms.
Average over 10 runs is 25.1097 ms.
Average over 10 runs is 25.1086 ms.
Average over 10 runs is 25.1098 ms.
Average over 10 runs is 25.1187 ms.
Average over 10 runs is 25.108 ms.
Average over 10 runs is 25.092 ms.
Average over 10 runs is 25.1126 ms.
Average over 10 runs is 25.1069 ms.
Average over 10 runs is 25.0928 ms.
From the 4th test on, it gets weird: the average starts around 36 ms, then drops to about 17.7 ms partway through the run.
nvidia@tegra-ubuntu:/usr/src/tensorrt/bin$ ./giexec --deploy=fcn_alexnet.deploy.prototxt --output=score_fr_21classes --iterations=20
deploy: fcn_alexnet.deploy.prototxt
output: score_fr_21classes
iterations: 20
Input "data": 3x720x1280
Output "score_fr_21classes": 21x17x34
name=data, bindingIndex=0, buffers.size()=2
name=score_fr_21classes, bindingIndex=1, buffers.size()=2
Average over 10 runs is 36.2181 ms.
Average over 10 runs is 36.2666 ms.
Average over 10 runs is 36.4036 ms.
Average over 10 runs is 36.5466 ms.
Average over 10 runs is 28.5801 ms.
Average over 10 runs is 17.7351 ms.
Average over 10 runs is 17.7327 ms.
Average over 10 runs is 17.7302 ms.
Average over 10 runs is 17.7099 ms.
Average over 10 runs is 17.7183 ms.
Average over 10 runs is 17.7139 ms.
Average over 10 runs is 17.7292 ms.
Average over 10 runs is 17.7235 ms.
Average over 10 runs is 17.7264 ms.
Average over 10 runs is 17.7215 ms.
Average over 10 runs is 17.7217 ms.
Average over 10 runs is 17.7211 ms.
Average over 10 runs is 17.7591 ms.
Average over 10 runs is 17.7206 ms.
Average over 10 runs is 17.7162 ms.
5th test
nvidia@tegra-ubuntu:/usr/src/tensorrt/bin$ ./giexec --deploy=fcn_alexnet.deploy.prototxt --output=score_fr_21classes --iterations=20
deploy: fcn_alexnet.deploy.prototxt
output: score_fr_21classes
iterations: 20
Input "data": 3x720x1280
Output "score_fr_21classes": 21x17x34
name=data, bindingIndex=0, buffers.size()=2
name=score_fr_21classes, bindingIndex=1, buffers.size()=2
Average over 10 runs is 29.4871 ms.
Average over 10 runs is 29.5042 ms.
Average over 10 runs is 29.6686 ms.
Average over 10 runs is 29.6803 ms.
Average over 10 runs is 29.6761 ms.
Average over 10 runs is 29.6828 ms.
Average over 10 runs is 29.691 ms.
Average over 10 runs is 29.6844 ms.
Average over 10 runs is 29.6635 ms.
Average over 10 runs is 29.6998 ms.
Average over 10 runs is 29.6705 ms.
Average over 10 runs is 29.6869 ms.
Average over 10 runs is 29.673 ms.
Average over 10 runs is 29.6667 ms.
Average over 10 runs is 29.665 ms.
Average over 10 runs is 29.6779 ms.
Average over 10 runs is 29.6836 ms.
Average over 10 runs is 29.7064 ms.
Average over 10 runs is 29.6848 ms.
Average over 10 runs is 29.6697 ms.
And the last one (6th test)
nvidia@tegra-ubuntu:/usr/src/tensorrt/bin$ ./giexec --deploy=fcn_alexnet.deploy.prototxt --output=score_fr_21classes --iterations=20
deploy: fcn_alexnet.deploy.prototxt
output: score_fr_21classes
iterations: 20
Input "data": 3x720x1280
Output "score_fr_21classes": 21x17x34
name=data, bindingIndex=0, buffers.size()=2
name=score_fr_21classes, bindingIndex=1, buffers.size()=2
Average over 10 runs is 36.2083 ms.
Average over 10 runs is 36.3211 ms.
Average over 10 runs is 36.3594 ms.
Average over 10 runs is 36.5478 ms.
Average over 10 runs is 36.6804 ms.
Average over 10 runs is 36.7456 ms.
Average over 10 runs is 36.76 ms.
Average over 10 runs is 36.6153 ms.
Average over 10 runs is 36.6486 ms.
Average over 10 runs is 36.6631 ms.
Average over 10 runs is 36.6494 ms.
Average over 10 runs is 36.6599 ms.
Average over 10 runs is 36.6613 ms.
Average over 10 runs is 36.6458 ms.
Average over 10 runs is 36.658 ms.
Average over 10 runs is 36.6578 ms.
Average over 10 runs is 36.6765 ms.
Average over 10 runs is 36.6538 ms.
Average over 10 runs is 36.6606 ms.
Average over 10 runs is 36.6612 ms.
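
In case it helps anyone compare their own runs against mine, here is a rough Python sketch for summarizing one of these logs. It only assumes the "Average over N runs is X ms." line format shown above; the function name `summarize` and the log file names are my own placeholders, not anything that ships with giexec.

```python
import re
import statistics
import sys

# Matches the per-batch timing lines printed in the logs above,
# e.g. "Average over 10 runs is 25.0843 ms."
AVG_LINE = re.compile(r"Average over \d+ runs is ([0-9.]+) ms\.")


def summarize(log_text):
    """Return (count, mean, stdev, min, max) of the averages found in one log."""
    values = [float(m.group(1)) for m in AVG_LINE.finditer(log_text)]
    if not values:
        raise ValueError("no 'Average over N runs' lines found")
    # stdev needs at least two samples; report 0.0 for a single line.
    spread = statistics.stdev(values) if len(values) > 1 else 0.0
    return len(values), statistics.mean(values), spread, min(values), max(values)


if __name__ == "__main__":
    # Each argument is a text file holding the output of one giexec launch.
    for path in sys.argv[1:]:
        with open(path) as f:
            n, mean, sd, lo, hi = summarize(f.read())
        print(f"{path}: n={n} mean={mean:.4f} ms stdev={sd:.4f} ms min={lo} max={hi}")
```

If each launch is saved to its own file (for example by piping the output through `tee test4.log`, a name I made up), running the script on all the files prints one line per test, which makes a jump like the one in the 4th test show up as a large stdev and a wide min/max gap.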