Object Detection with MobileNet-SSD slower than the quoted speed

Hi, I followed the guide in this project to setup caffe on nano.

I trained my own model and tried to detect objects using USB camera.
The speed only reaches 4.7 FPS, which is far below the speed quoted at the following link.
https://devblogs.nvidia.com/jetson-nano-ai-computing/

Can you help verify the performance of nano? Thanks.

Hi bobzeng, the inference was performed using TensorRT. You can try using the trtexec program to benchmark your model. Note that the model from the article is SSD-Mobilenet-V2.
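
For reference, a minimal trtexec run might look like the sketch below. The tensor names ("Input", "NMS") and the 3x300x300 input shape are assumptions that depend on how your model was exported, so adjust them to match your UFF file:

    $ cd /usr/src/tensorrt/bin
    $ # --uffInput takes <name>,C,H,W; the names/shape below are placeholders
    $ ./trtexec --uff=/path/to/your_model.uff --uffInput=Input,3,300,300 --output=NMS --fp16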

Dustin, how have you gotten SSD-Mobilenet-V2 to work in TensorRT? Do you have a sample somewhere?

Hi elias_mir, it was converted from a TensorFlow model to UFF. Here are the directions to run the sample:

  1. Download the ssd-mobilenet-v2 archive to the ~/Downloads folder on the Nano:
    $ cd ~/Downloads/
    $ wget --no-check-certificate 'https://nvidia.box.com/shared/static/8oqvmd79llr6lq1fr43s4fu1ph37v8nt.gz' -O ssd-mobilenet-v2.tar.gz
    $ tar -xvf ssd-mobilenet-v2.tar.gz
    $ cd ssd-mobilenet-v2
    $ sudo cp -R sampleUffSSD_rect /usr/src/tensorrt/samples
    $ sudo cp sample_unpruned_mobilenet_v2.uff /usr/src/tensorrt/data/ssd/
    $ sudo cp image1.ppm /usr/src/tensorrt/data/ssd/
    
  2. Compile the sample
    $ cd /usr/src/tensorrt/samples/sampleUffSSD_rect
    $ sudo make
    
  3. Run the sample to measure inference performance
    $ sudo jetson_clocks
    $ cd /usr/src/tensorrt/bin
    $ sudo ./sample_uff_ssd_rect
    

Thank you very much! Looking forward to trying it.


Where does sample_unpruned_mobilenet_v2.uff come from? I used ssd_inception_v2_coco_2017_11_17 and converted the model into a .uff file, but it is not the same as yours.

Hi,

I followed the guide in this link to accelerate MobileNet-SSD with TensorRT.

The speed can reach 20 FPS on the Jetson Nano. Note that the file main.cpp should be modified to fix a memory leak.

The related MobileNet-SSD model can be trained by following this link:

Thanks for your support.

How did you install Caffe on the Jetson Nano? Can I install it by following https://www.jetsonhacks.com/2017/03/24/caffe-deep-learning-framework-nvidia-jetson-tx2/, or is there another way? I have been struggling for three days to get the default model running on the Jetson Nano.

@cudanexus

Please follow the guide (Introduction of Caffe to environment with GPU) at the link to set up Caffe on the Jetson Nano.

In addition, you should add a line to the CUDA_ARCH setting in the file Makefile.config:
CUDA_ARCH := -gencode arch=compute_53,code=sm_53

See also: https://developer.nvidia.com/cuda-gpus
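
If you want to confirm the compute capability on your board, one option (assuming the CUDA samples were installed to the default JetPack location) is to build and run deviceQuery, which should report 5.3 on the Nano:

    $ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
    $ sudo make
    $ ./deviceQuery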

Dustin, where does this unpruned SSD-Mobilenet-V2 model come from? I would like to train it for my use case! Kind regards!

Hi mads.lorentzen, it was converted from the TensorFlow model to UFF format (and then imported into TensorRT) with this kind of procedure:
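
Roughly like the sketch below (not the exact commands used; the flags vary between TensorRT releases, and config.py here stands in for the graph-surgeon preprocessing script that ships with the sampleUffSSD sample):

    $ # frozen_inference_graph.pb: the frozen TensorFlow SSD-Mobilenet-V2 graph
    $ # config.py: preprocessing script that maps unsupported TF ops to TRT plugins
    $ convert-to-uff frozen_inference_graph.pb -o sample_unpruned_mobilenet_v2.uff -O NMS -p config.py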

dusty_nv Thank you for the answer! I can see on the GitHub profile that there are three different MobileNet models, but none of them is labeled as unpruned. Could you kindly clarify which of these models you used for this sample?

Thank you!

Mads

mads,
I think that there are 14 models from the link that dusty provided. You need to do some conversion, which isn’t that big of a deal, and only needs to be done once.

cudanexus,
I can get 4.7 FPS from the CPU alone (via a secondary stream on one of my RTSP security cameras), and 27+ FPS using TensorFlow in Python.

Unfortunately I don’t have a lot of time to play with this as my life is pretty busy at the moment, or I would post some code. Google or Bing is your friend. I just happened to see this post.

A cheap RTSP stream can be had for $25.99 with a WYZE Cam2 on Amazon; they just released RTSP firmware.

Regards.

I can also see that these MobileNet models are version 1, but the example uses a version 2 model? Do you have a folder with that model that I could train on my own use case?

Regards Mads

Also, I can see on GitHub that you used JetPack 3.2 for the TX2, but this is supposed to be used on the Jetson Nano. The Nano can't run JetPack 3.2 and therefore can't get CUDA 9.0, which is needed to make this work?

The 27+ FPS is for classification, right? Not object detection.

I also want to know about the 27 FPS use case.
Classification or object detection?
Which model?
Pure TensorRT with UFF?
Or TF-TRT?

For me, the best figure is 13 FPS:
TF-TRT with ssd_mobilenet_v1, in Python.

Hi All,

Could anyone please point me to samples_labels.txt, which is required to make sense of the detections?

I tried running the model on images in the TensorRT data folder, using the sample dusty shared. However, I am not able to get detections with good confidence scores. For “dog.ppm”, the maximum confidence score is 0.22 (22%), and for “bus.ppm”, it is 0.03 (3%).

Is there something I am missing? Please help me out.

Thanks

Hi,

May I know which sample you tested first?

For classification, you can find the label file here:
https://github.com/dusty-nv/jetson-inference/blob/master/data/networks/ilsvrc12_synset_words.txt

For detection, have you updated the detector for your problem?
If you want a dog detector, please use coco-dog rather than the default:
https://github.com/dusty-nv/jetson-inference/blob/master/docs/detectnet-console-2.md#locating-object-coordinates-using-detectnet
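
As a hedged example (the image file names here are placeholders, and the argument style may differ between jetson-inference versions; check the linked page for your version):

    $ # run the coco-dog detector on a test image and save the annotated result
    $ ./detectnet-console --network=coco-dog dog_0.jpg output_0.jpg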

Thanks.

Hey,

Has anyone successfully reached 20 FPS with MobileNet-SSD-V2 doing object detection on a webcam stream?

I can’t find anything like that; the best I find is 10 FPS max (like OpenVINO).