Correct compute architecture for TX2 and OpenCV4Tegra compute architecture

mleonhardt · March 22, 2017, 8:42pm

I am attempting to use the GPU module in the OpenCV4Tegra build that was installed when I flashed my brand new TX2 with JetPack 3.0. I am getting a number of errors that seem to indicate that OpenCV4Tegra was not compiled for the correct GPU architecture. My understanding is that for the TX2 it should be compiled with compute_62 and sm_62.

Error:
OpenCV Error: Gpu API call (invalid device function) in splitC2_, file /hdd/buildbot/slave_jetson_tx_3/35-O4T-L4T-R24/opencv/modules/gpu/src/cuda/split_merge.cu, line 448
terminate called after throwing an instance of 'cv::Exception'
what(): /hdd/buildbot/slave_jetson_tx_3/35-O4T-L4T-R24/opencv/modules/gpu/src/cuda/split_merge.cu:448: error: (-217) invalid device function in function splitC2_

To check this, I wrote a simple program that just called the following:
std::cout << "OpenCV has 6.2: " << cv::gpu::TargetArchs::has(6, 2) << "\n";
std::cout << "OpenCV hasPTX 6.2: " << cv::gpu::TargetArchs::hasPtx(6, 2) << "\n";
std::cout << "OpenCV hasBin 6.2: " << cv::gpu::TargetArchs::hasBin(6, 2) << "\n";
std::cout << "OpenCV has 5.3: " << cv::gpu::TargetArchs::has(5, 3) << "\n";
std::cout << "OpenCV hasPTX 5.3: " << cv::gpu::TargetArchs::hasPtx(5, 3) << "\n";
std::cout << "OpenCV hasBin 5.3: " << cv::gpu::TargetArchs::hasBin(5, 3) << "\n";

This results in the following output:
OpenCV has 6.2: 0
OpenCV hasPTX 6.2: 0
OpenCV hasBin 6.2: 0
OpenCV has 5.3: 1
OpenCV hasPTX 5.3: 0
OpenCV hasBin 5.3: 1
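
To see why this output goes together with the "invalid device function" error, here is a minimal sketch of the general CUDA compatibility rule: a native binary built for X.y runs only on devices with the same major version and an equal or higher minor version, while PTX can be JIT-compiled for any equal or newer device. The BuildArchs struct and the function names are hypothetical and exist only for this illustration. They are not part of OpenCV, and the model is a simplification of the real rules.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical model of what a library build contains (not an OpenCV type).
struct BuildArchs {
    std::set<std::pair<int, int>> bin;  // native binaries, e.g. {5, 3} for sm_53
    std::set<std::pair<int, int>> ptx;  // PTX targets, e.g. {6, 2} for compute_62
};

// A native binary for X.y is usable on a device X.z only when z >= y.
bool hasCompatibleBin(const BuildArchs& b, int devMajor, int devMinor) {
    assert(devMajor >= 0 && devMinor >= 0);
    for (const auto& a : b.bin) {
        if (a.first == devMajor && a.second <= devMinor) return true;
    }
    return false;
}

// PTX can be JIT-compiled for any device whose capability is >= the PTX version.
bool hasCompatiblePtx(const BuildArchs& b, int devMajor, int devMinor) {
    assert(devMajor >= 0 && devMinor >= 0);
    for (const auto& a : b.ptx) {
        if (a <= std::make_pair(devMajor, devMinor)) return true;
    }
    return false;
}

// The build can run kernels on the device if either path is available.
bool canRunOn(const BuildArchs& b, int devMajor, int devMinor) {
    return hasCompatibleBin(b, devMajor, devMinor) ||
           hasCompatiblePtx(b, devMajor, devMinor);
}
```

With bin = {5.3} and no PTX, which is what the output above reports, canRunOn(build, 6, 2) returns false under this model: the only binary has a different major version and there is no PTX to fall back on.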

This leads me to three questions:

What is the correct compute architecture for the TX2?
How do I make sure that OpenCV4Tegra has the correct compute architecture for the TX2?
Why might JetPack 3.0 install the incorrect architecture (possibly the TX1 version, as this has compute_53)?

AastaLLL · March 23, 2017, 6:57am

Hi,

Thanks for your question.

We are sorry about this issue and are currently working on it.
We will update you later.

Sorry for the inconvenience.

AastaLLL · March 23, 2017, 8:26am

Hi,

A temporary workaround is to build OpenCV directly from source.

The following topic can give you some hints on compiling OpenCV:
https://devtalk.nvidia.com/default/topic/983098/jetson-tx1/opencv-3-1-with-usb-camera-support/

Thanks.

mleonhardt · March 23, 2017, 2:40pm

Do you have an estimated time frame for when NVIDIA might provide an updated OpenCV4Tegra package? I would rather not port all of my code to OpenCV 3.1 (due to the cv::gpu → cv::cuda namespace change) if I don't have to.

Also, does the public OpenCV 2.4 branch have the latest CUDA 8.0 bug fixes that are referenced in your link?

Finally, just to verify, the TX2 requires sm_62 and compute_62, correct?

Thanks,

Mark

AastaLLL · March 24, 2017, 2:05am

Hi Mark,

Thanks for your question. Yes, the TX2 requires the 62 architecture (compute_62 and sm_62).
We are verifying the other details internally and will update you soon.

Thanks for your patience, and sorry for the inconvenience.

AastaLLL · March 24, 2017, 2:30am

Hi Mark,

If you want to build OpenCV 2.4.13, the version that OpenCV4Tegra is based on, please follow this page:

OpenCV: Building OpenCV for Tegra with CUDA

Thanks.

andreaaa · June 29, 2017, 1:32pm

Hi,

I am using OpenCV4Tegra on a TX2 and I have a similar problem. Indeed when I try to use some simple cuda functions, like cv::gpu::add, I get this error:

OpenCV Error: Gpu API call (invalid device function) in call, file /hdd/buildbot/slave_jetson_tx_3/35-O4T-L4T-R24/opencv/modules/gpu/include/opencv2/gpu/device/detail/transform_detail.hpp, line 386
terminate called after throwing an instance of 'cv::Exception' what():  /hdd/buildbot/slave_jetson_tx_3/35-O4T-L4T-R24/opencv/modules/gpu/include/opencv2/gpu/device/detail/transform_detail.hpp:386: error: (-217) invalid device function in function call

If I read the topic I understand that I should compile opencv from source and set the variable -DCUDA_ARCH_BIN=6.2. Is this correct?
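
For reference, the different spellings of the same architecture used in this thread (6.2, compute_62, sm_62) can be related with a small sketch. The helper names below are hypothetical and only format strings. CUDA_ARCH_BIN is the OpenCV CMake variable mentioned above, and -gencode arch=...,code=... is the nvcc flag syntax. Whether a given build passes exactly these flags is an assumption.

```cpp
#include <cassert>
#include <string>

// Value as written for OpenCV's CUDA_ARCH_BIN CMake variable, e.g. "6.2".
std::string cmakeArch(int major, int minor) {
    assert(major > 0 && minor >= 0 && minor < 10);
    return std::to_string(major) + "." + std::to_string(minor);
}

// Virtual (PTX) architecture name as nvcc spells it, e.g. "compute_62".
std::string virtualArch(int major, int minor) {
    assert(major > 0 && minor >= 0 && minor < 10);
    return "compute_" + std::to_string(major) + std::to_string(minor);
}

// Real (binary) architecture name as nvcc spells it, e.g. "sm_62".
std::string realArch(int major, int minor) {
    assert(major > 0 && minor >= 0 && minor < 10);
    return "sm_" + std::to_string(major) + std::to_string(minor);
}

// nvcc flag that generates a native binary for the given architecture.
std::string gencodeFlag(int major, int minor) {
    return "-gencode arch=" + virtualArch(major, minor) +
           ",code=" + realArch(major, minor);
}
```

For the TX2, cmakeArch(6, 2) gives "6.2" and gencodeFlag(6, 2) gives "-gencode arch=compute_62,code=sm_62".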

Thanks in advance.

AastaLLL · June 30, 2017, 5:27am

Hi andreaaa,

Yes, that is correct: the GPU architecture in the current OpenCV4Tegra package is wrong.
For now, please build OpenCV from source to get the correct GPU architecture.

For OpenCV 3.x, please follow this page:
http://dev.t7.ai/jetson/opencv/

For OpenCV 2.4.13, here is a script:
https://devtalk.nvidia.com/default/topic/1000106/jetson-tx2/opencv-convertto-failure/post/5171055/#5171055

We are sorry for the inconvenience.

MilindDeore · July 22, 2017, 2:08am

I am also facing the same issue. Will this be fixed as part of the new JetPack 3.1 release planned for next week? Thanks for the information.

AastaLLL · July 24, 2017, 1:59am

Yes, we have fixed this issue in JetPack 3.1.
Sorry for keeping you all waiting.

AastaLLL · July 25, 2017, 1:48am

Hi,

The corrected OpenCV4Tegra is available now.
Please check the JetPack SDK page on NVIDIA Developer.

JaTxGPU · July 31, 2017, 7:30am

I had the same issue. Do I need to reflash my TX2, or can I just reinstall OpenCV4Tegra?
I now have the standard OpenCV 3 installed (I had OpenCV4Tegra before).

AastaLLL · August 1, 2017, 2:09am

Hi,

Please reflash the device. We built OpenCV4Tegra with the rel-28 branch.

Thanks.