Maximize PX2 performance by setting CPU and GPU to maximum frequency

How can I maximize PX2 performance, the way jetson_clocks does on Jetson TX boards?

I run deep learning code with TensorFlow. On TX2, a network pass takes 80 ms; on PX2, it takes 50 ms. I expected PX2 to be about 5 times faster than TX2, because TX1 is rated at 1.5 TFLOPS while PX2 is rated at 8 TFLOPS.

Is there a way to maximize PX2 performance, or anything I need to set in order to use TensorFlow?
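
The expectation in the question can be checked with quick arithmetic. The sketch below uses only the figures quoted in the post (1.5 and 8 TFLOPS, 80 ms and 50 ms); the variable names are illustrative, and peak TFLOPS is only a rough upper bound on real inference speed.

```python
# Figures exactly as quoted in the question above.
tx_tflops = 1.5   # Jetson figure given in the post
px2_tflops = 8.0  # Drive PX2 figure given in the post
tx2_ms = 80.0     # measured inference time on TX2
px2_ms = 50.0     # measured inference time on PX2

expected_speedup = px2_tflops / tx_tflops  # ratio of peak throughput
observed_speedup = tx2_ms / px2_ms         # ratio of measured latency

print("expected: %.1fx, observed: %.1fx" % (expected_speedup, observed_speedup))
```

This prints an expected speedup of about 5.3x against an observed 1.6x, which is the gap the rest of the thread tries to explain.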

Hi basoc5002,
Are you doing inference on the integrated GPU or the discrete GPU?

I am not familiar with the GPU setup on PX2, so I don’t know which GPU it runs on.
When I run the inference code, TensorFlow creates two GPU devices.

2017-11-14 17:43:21.899355: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) → (device: 0, name: GP106, pci bus id: 0000:04:00.0)
2017-11-14 17:43:21.899427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:1) → (device: 1, name: GP10B, pci bus id: 0000:00:00.0)

Dear basoc5002,
Drive PX2 has an integrated GPU (iGPU), which is the same as TX2’s GPU, and a discrete GPU (dGPU) connected via PCIe.
Please set CUDA_VISIBLE_DEVICES to 0 or 1 to select the dGPU or iGPU respectively. You can also select a specific GPU in your TensorFlow code for inference.
By default, the Drive PX2 GPUs run at maximum frequency, unlike on Jetson boards.
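
As a minimal sketch of the CUDA_VISIBLE_DEVICES suggestion above: the variable can be set from Python, as long as this happens before TensorFlow is imported. The helper name `select_gpu` and the constants `DGPU` and `IGPU` are hypothetical; the index mapping is taken from the TensorFlow log earlier in this thread (device 0 is GP106, device 1 is GP10B) and may differ on other setups.

```python
import os

# Mapping assumed from the TensorFlow log in this thread:
# device 0 is GP106 (dGPU), device 1 is GP10B (iGPU).
DGPU = "0"
IGPU = "1"

def select_gpu(index):
    """Expose only one CUDA device to this process.

    Must be called before TensorFlow (or any other CUDA library) is imported,
    because the variable is read when the CUDA runtime initializes.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]

select_gpu(DGPU)
# import tensorflow as tf   # import only after the variable is set
```

Once a single device is exposed this way, CUDA renumbers it, so TensorFlow sees it as /gpu:0 regardless of its original index. The same effect can be had from the shell by prefixing the command with CUDA_VISIBLE_DEVICES=0.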

Dear basoc5002,

Please use the following command to check CPU/GPU usage.
$ sudo tegrastats

In the output below, GR3D_PCI is the dGPU and GR3D is the iGPU. Thanks.
RAM 2427/6660MB (lfb 185x4MB) cpu [0%@1997,0%@2035,0%@2035,0%@1997,0%@1997,0%@1997] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2607
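
For longer runs it can help to pull just the two GPU fields out of each tegrastats line. The following is a rough sketch, not an official parser: the function name `parse_gpu_load` is hypothetical, the regular expression is written against the sample line above only, and the number after `@` is assumed to be the clock in MHz.

```python
import re

# Sample line copied from the tegrastats output above.
SAMPLE = ("RAM 2427/6660MB (lfb 185x4MB) "
          "cpu [0%@1997,0%@2035,0%@2035,0%@1997,0%@1997,0%@1997] "
          "EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2607")

# Matches fields such as "GR3D 0%@1275" and "GR3D_PCI 0%@2607".
GPU_FIELD = re.compile(r"\b(GR3D(?:_PCI)?) (\d+)%@(\d+)")

def parse_gpu_load(line):
    """Return {field name: (load percent, value after '@')} for the GPU fields."""
    return {name: (int(load), int(freq))
            for name, load, freq in GPU_FIELD.findall(line)}

print(parse_gpu_load(SAMPLE))
```

Watching the GR3D_PCI load while inference runs shows whether the dGPU is actually being used.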

Thanks for the reply.

I tried setting CUDA_VISIBLE_DEVICES to 0 and to 1. When the device is ‘0’, performance is the same as before. But when the device is ‘0’, inference can’t run.

Hi basoc5002,
You see the same performance when the device is 1. Did you select device gpu0 in your TensorFlow code?
Can you check that?
The discrete GPU (gpu0) is about 4-5 times more powerful than the integrated GPU. Inference on the discrete GPU requires data transfer via PCIe.

Hi SteveNV, SivaRamaKrishna.

Sorry, I gave the wrong device numbers.
Device 0 gives the same performance as before, and device 1 can’t run inference.

Here is the tegrastats output while running inference on device 0.

RAM 1685/6660MB (lfb 1037x4MB) cpu [0%@1996,0%@2035,0%@2035,0%@1997,0%@1996,0%@1997] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2581
RAM 1685/6660MB (lfb 1037x4MB) cpu [0%@1953,97%@2001,3%@2002,4%@1955,0%@1953,2%@1956] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2581
RAM 1687/6660MB (lfb 1037x4MB) cpu [12%@1996,47%@2035,54%@2034,3%@1997,7%@1995,6%@1997] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2581
RAM 1738/6660MB (lfb 1022x4MB) cpu [3%@1997,49%@2035,49%@2035,2%@1996,6%@1996,2%@1996] EMC 0%@1600 GR3D 1%@1275 GR3D_PCI 0%@2581
RAM 1847/6660MB (lfb 986x4MB) cpu [3%@1997,0%@2035,53%@2034,8%@1995,5%@1996,22%@1996] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 2%@2581
RAM 2044/6660MB (lfb 922x4MB) cpu [8%@1955,33%@1997,15%@1996,8%@1954,7%@1954,20%@1956] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 8%@2581
RAM 2199/6660MB (lfb 874x4MB) cpu [6%@1978,22%@2025,60%@2026,5%@1972,7%@1977,2%@1975] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2581
RAM 2404/6660MB (lfb 811x4MB) cpu [4%@1975,22%@2029,56%@2030,2%@1971,2%@1974,1%@1973] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2581
RAM 2488/6660MB (lfb 784x4MB) cpu [19%@1980,45%@2031,11%@2032,14%@1976,20%@1975,19%@1973] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2581
RAM 2489/6660MB (lfb 777x4MB) cpu [21%@1977,44%@2029,1%@2032,24%@1983,22%@1978,25%@1975] EMC 0%@1600 GR3D 2%@1275 GR3D_PCI 0%@2581
RAM 2489/6660MB (lfb 770x4MB) cpu [29%@1996,28%@2034,21%@2035,26%@1997,27%@1997,25%@1996] EMC 0%@1600 GR3D 6%@1275 GR3D_PCI 0%@2581
RAM 2490/6660MB (lfb 763x4MB) cpu [23%@1971,46%@2029,2%@2027,23%@1973,22%@1973,25%@1971] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 4%@2581
RAM 2490/6660MB (lfb 757x4MB) cpu [23%@1977,47%@2028,5%@2028,24%@1972,24%@1976,22%@1973] EMC 0%@1600 GR3D 7%@1275 GR3D_PCI 0%@2581
RAM 2490/6660MB (lfb 751x4MB) cpu [24%@1970,23%@2030,22%@2031,26%@1975,25%@1969,29%@1970] EMC 0%@1600 GR3D 5%@1275 GR3D_PCI 99%@2574
RAM 2490/6660MB (lfb 743x4MB) cpu [23%@1996,14%@2035,33%@2035,22%@1996,25%@1997,24%@1997] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 5%@2581
RAM 2490/6660MB (lfb 737x4MB) cpu [24%@1977,43%@2031,6%@2032,27%@1969,25%@1978,21%@1974] EMC 0%@1600 GR3D 5%@1275 GR3D_PCI 99%@2581
RAM 2490/6660MB (lfb 731x4MB) cpu [23%@1996,31%@2035,19%@2035,27%@1997,23%@1997,23%@1997] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2581
RAM 2490/6660MB (lfb 724x4MB) cpu [27%@1996,22%@2034,24%@2035,22%@1995,26%@1996,25%@1995] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2581
RAM 2490/6660MB (lfb 718x4MB) cpu [23%@1997,42%@2035,5%@2035,25%@1997,24%@1997,20%@1996] EMC 0%@1600 GR3D 3%@1275 GR3D_PCI 5%@2581
RAM 2491/6660MB (lfb 712x4MB) cpu [22%@1974,11%@2031,39%@2033,25%@1973,22%@1974,21%@1974] EMC 0%@1600 GR3D 6%@1275 GR3D_PCI 25%@2581
RAM 2491/6660MB (lfb 704x4MB) cpu [26%@1970,44%@2029,4%@2027,17%@1973,27%@1976,26%@1976] EMC 0%@1600 GR3D 6%@1275 GR3D_PCI 0%@2581
RAM 2491/6660MB (lfb 698x4MB) cpu [29%@1997,25%@2034,19%@2037,27%@1997,25%@1996,26%@1997] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 3%@2581
RAM 2491/6660MB (lfb 691x4MB) cpu [29%@1975,10%@2032,35%@2029,25%@1974,24%@1973,22%@1971] EMC 0%@1600 GR3D 2%@1275 GR3D_PCI 0%@2581
RAM 2491/6660MB (lfb 684x4MB) cpu [23%@1997,4%@2034,41%@2034,27%@1996,27%@1995,27%@1995] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 2%@2581
RAM 2491/6660MB (lfb 677x4MB) cpu [26%@1995,2%@2034,44%@2034,23%@1996,23%@1997,24%@1996] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2580
RAM 2492/6660MB (lfb 670x4MB) cpu [27%@1966,1%@2033,42%@2030,29%@1972,22%@1974,29%@1980] EMC 0%@1600 GR3D 9%@1275 GR3D_PCI 22%@2581
RAM 2492/6660MB (lfb 663x4MB) cpu [21%@1966,2%@2029,45%@2029,27%@1970,21%@1972,27%@1969] EMC 0%@1600 GR3D 6%@1275 GR3D_PCI 49%@2579
RAM 2493/6660MB (lfb 656x4MB) cpu [25%@1976,21%@2030,26%@2031,20%@1976,23%@1980,28%@1975] EMC 0%@1600 GR3D 5%@1275 GR3D_PCI 0%@2581
RAM 2493/6660MB (lfb 648x4MB) cpu [22%@1997,8%@2035,38%@2034,30%@1997,25%@1997,23%@1997] EMC 0%@1600 GR3D 9%@1275 GR3D_PCI 92%@2573
RAM 2469/6660MB (lfb 643x4MB) cpu [29%@1969,14%@2025,31%@2027,28%@1978,28%@1977,27%@1976] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2581
RAM 2469/6660MB (lfb 641x4MB) cpu [30%@1973,33%@2028,9%@2029,27%@1971,29%@1962,26%@1965] EMC 0%@1600 GR3D 8%@1275 GR3D_PCI 0%@2581
RAM 2468/6660MB (lfb 634x4MB) cpu [26%@1975,19%@2030,24%@2029,27%@1970,26%@1974,29%@1978] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 0%@2581
RAM 2469/6660MB (lfb 626x4MB) cpu [34%@1958,35%@2005,5%@2001,36%@1956,32%@1956,36%@1958] EMC 0%@1600 GR3D 0%@1275 GR3D_PCI 24%@2580