slow performance of clGetPlatformIDs
Hello,

I previously used CUDA for some simulations with evolutionary algorithms, and it worked great. However, I now need the simulation to run on multiple devices (ATI and Nvidia), so I decided to rewrite the application in OpenCL.
So far so good: the application works fine, but it is unbearably slow to start. After some experimenting I identified the problem: the function [i]clGetPlatformIDs[/i] takes more than 13 seconds. This is on Linux (CentOS 6.2 64-bit, kernel 2.6.32-220.2.1.el6.x86_64); on Windows 7 64-bit it is much, much faster (60-200 ms).
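
For anyone who wants to reproduce the measurement, here is a minimal sketch of timing a single call with a monotonic clock, assuming a POSIX system with clock_gettime. The names elapsed_ms, time_call and query_platform_count are hypothetical helpers, not part of OpenCL, and query_platform_count is only a stand-in so the sketch compiles without the OpenCL headers; the comment shows where the real call would go.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Callback type for the call being timed: returns 0 on success
   (the same convention as CL_SUCCESS). */
typedef int (*init_fn)(void *user);

/* Elapsed wall-clock time in milliseconds between two timestamps. */
static double elapsed_ms(const struct timespec *start,
                         const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) * 1e3
         + (double)(end->tv_nsec - start->tv_nsec) / 1e6;
}

/* Run fn once, store the elapsed milliseconds in *ms_out,
   and return whatever status fn returned. */
static int time_call(init_fn fn, void *user, double *ms_out)
{
    struct timespec t0, t1;

    assert(fn != NULL && ms_out != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    int status = fn(user);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    *ms_out = elapsed_ms(&t0, &t1);
    return status;
}

/* Hypothetical stand-in for the platform query. With <CL/cl.h>
   included and the OpenCL library linked, the body would be:
       return clGetPlatformIDs(0, NULL, (cl_uint *)user);
   Here it just reports one platform so the sketch is self-contained. */
static int query_platform_count(void *user)
{
    *(unsigned *)user = 1u;
    return 0;
}
```

Calling time_call(query_platform_count, &count, &ms) from main and printing ms gives the initialization time on its own, separate from context creation and kernel compilation.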

The actual computation speed is not a problem, just the initialization. Has anybody run into this strange behavior? Any hint would be helpful.

Some outputs from [i]clGetPlatformInfo[/i] and [i]clGetDeviceInfo[/i]:

=== 1 OpenCL platform(s) found: ===
PROFILE = FULL_PROFILE
VERSION = OpenCL 1.1 CUDA 4.1.1
NAME = NVIDIA CUDA
VENDOR = NVIDIA Corporation
EXTENSIONS = cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
=====================================


=== 1 Device(s) found ===
Device name = Tesla C2075
Device version = OpenCL 1.1 CUDA
=====================================

M.

#1
Posted 05/03/2012 11:43 AM   