Links to CUDA development tools
Please refer to this page for a reasonably comprehensive list of development tools, libraries, and plugins for GPU computing using CUDA-enabled GPUs:

[url="http://www.nvidia.com/object/tesla_software.html"]http://www.nvidia.com/object/tesla_software.html[/url]

If we missed something, please post it on this thread.

#1
Posted 03/16/2009 06:38 PM   
I think it would be useful to link to MisterAnderson42's GPUWorker class for spawning host threads to more easily manage multi-GPU programs. Unfortunately, it seems to be referenced only in the forum and inside the source code for HOOMD.

Perhaps he could be persuaded to make a homepage for GPUWorker, and then you could link to that.
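
In the meantime, here is a minimal sketch of the general pattern such a class manages: one persistent host thread per GPU, each bound to its device with cudaSetDevice. Nothing below is taken from GPUWorker itself; the fill kernel, worker function, and WorkerArgs struct are invented for illustration.

[code]
// Hypothetical illustration only -- not the GPUWorker API.
// One persistent host thread is created per GPU; each thread calls
// cudaSetDevice() so its CUDA context stays bound to that device.
#include <cstdio>
#include <pthread.h>
#include <cuda_runtime.h>

__global__ void fill(float *data, float value, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

struct WorkerArgs { int device; };

static void *worker(void *p)
{
    WorkerArgs *args = static_cast<WorkerArgs *>(p);
    cudaSetDevice(args->device);              // bind this thread to its GPU

    const int n = 1 << 20;
    float *d_buf = 0;
    cudaMalloc((void **)&d_buf, n * sizeof(float));
    fill<<<(n + 255) / 256, 256>>>(d_buf, (float)args->device, n);
    cudaDeviceSynchronize();
    cudaFree(d_buf);
    printf("GPU %d finished\n", args->device);
    return 0;
}

int main()
{
    int count = 0;
    cudaGetDeviceCount(&count);
    if (count > 16) count = 16;

    pthread_t threads[16];
    WorkerArgs args[16];
    for (int i = 0; i < count; ++i) {
        args[i].device = i;
        pthread_create(&threads[i], 0, worker, &args[i]);
    }
    for (int i = 0; i < count; ++i)
        pthread_join(threads[i], 0);
    return 0;
}
[/code]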

#2
Posted 03/17/2009 01:27 AM   
[url="http://code.google.com/p/komrade/"]Komrade: a pretty neat C++ library for CUDA with a very silly name[/url]

#3
Posted 04/07/2009 08:36 PM   
There is this debugging tool from the University of Oxford for viewing and comparing the contents of host and device memory.

[url="http://www.oerc.ox.ac.uk/research/many-core-and-reconfigurable-supercomputing/memviewer"]http://www.oerc.ox.ac.uk/research/many-cor...uting/memviewer[/url][img]http://www.oerc.ox.ac.uk/personal-pages/daniel/MemView-Image-5.png[/img]

#4
Posted 06/04/2009 12:19 PM   
[quote name='worc1154' post='548755' date='Jun 4 2009, 08:19 AM']There is this debugging tool from the University of Oxford for viewing and comparing the contents of host and device memory.[/quote]

I couldn't get the link provided to work.

Here is an updated URL for the MemViewer tool: [url="http://www.oerc.ox.ac.uk/research/many-core-and-reconfigurable-supercomputing/memviewer"]http://www.oerc.ox.ac.uk/research/many-cor...uting/memviewer[/url]

#5
Posted 07/01/2009 02:47 PM   
Ocelot [url="http://code.google.com/p/gpuocelot/"]http://code.google.com/p/gpuocelot/[/url] is an alternative to deviceemu. It executes CUDA programs one instruction at a time as they would be on a GPU with a very large warp size.

It has built-in memory checking that will detect if you use a host pointer in device code or write to memory that has not been allocated.
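
As a made-up example of the first kind of defect it flags, the snippet below launches a kernel on a raw host array; the scale kernel and sizes are only for illustration.

[code]
// Hypothetical example of the kind of defect such a checker reports:
// the kernel is launched on a raw host pointer instead of device memory.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;   // invalid: data points to host memory
}

int main()
{
    const int n = 256;
    float host[256];
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    // Bug: host was never allocated with cudaMalloc or copied to the device.
    scale<<<1, n>>>(host, n);
    cudaError_t err = cudaDeviceSynchronize();
    printf("launch result: %s\n", cudaGetErrorString(err));

    // The fix would be cudaMalloc + cudaMemcpy and launching on the device
    // pointer; an emulator/checker points straight at the bad access.
    return 0;
}
[/code]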

#6
Posted 07/23/2009 09:12 PM   
[url="http://developer.nvidia.com/object/agperfmon_home.html"]AgPerfMon[/url] is a tool mostly for PhysX and graphics programmers, but it also reveals some low level CUDA kernel scheduling. It records timestamps, SM, and Warp IDs of running kernels and shows them on a timeline.

#7
Posted 02/02/2010 06:46 PM   
An Eclipse plugin for CUDA and/or Qt development and compilation:

[url="http://www.ai3.uni-bayreuth.de/software/eclipsecudaqt/index.php"]http://www.ai3.uni-bayreuth.de/software/ec...udaqt/index.php[/url]

#8
Posted 02/26/2010 12:08 PM   
There are a few more that you should add:

* Full support for [url="http://psilambda.com/2010/07/kappa-quick-start-guide-for-windows/"].Net[/url] (full CUDA driver API access and more) (C# and Visual Basic Examples)
* Full support for [url="http://psilambda.com/download/kappa-for-perl/"]Perl[/url] (full CUDA driver API access and more--see below)
* Full support for [url="http://psilambda.com/download/kappa-for-python/"]Python[/url] (full CUDA driver API access and more--see below)
* Full access for [url="http://psilambda.com/download/kappa-extras/"]Ruby[/url] to run CUDA via the CUDA driver API
* Full access for [url="http://psilambda.com/download/kappa-extras/"]Lua[/url] to run CUDA via the CUDA driver API

* [url="http://psilambda.com/download/"]Source code[/url] for all Kappa library language bindings and keywords are available using the Kappa library installers.

Performance is usually comparable to C++ even though this is a high-level interface, because most CUDA API operations, such as memory management and transfers, are performed by the Kappa C++ library. (Performance can be better than any single CUDA C/C++ SDK example, since all CUDA best practices, memory mapping, and concurrent kernel execution are the default when supported by the GPU hardware.) Full multi-GPU support and CUDA JIT compilation are available for all language bindings.

Since the Kappa library uses a producer/consumer data-flow scheduler, defaults to asynchronous CUDA kernel launches, and supports asynchronous CPU kernel and SQL operations, it can achieve full occupancy of both CPU and GPU. Kernels are launched so that, on GF100 GPUs, concurrent kernel execution is automatic and the usual mode, provided the GPU has occupancy available for that mixture of kernels. Whether CUDA kernels actually execute concurrently is therefore a (potentially nondeterministic) result of the runtime dynamics of the host and GPU code, but performance should always meet or exceed what is otherwise available.
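
For reference, this is a plain CUDA runtime sketch of the underlying technique being described (asynchronous launches into separate streams so kernels can overlap on Fermi-class hardware); it is not Kappa code or Kappa's API, and the busy kernel and sizes are arbitrary.

[code]
// Plain CUDA runtime sketch, not Kappa: asynchronous launches into separate
// streams so a GF100-class GPU can overlap kernels when occupancy allows.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void busy(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];
        for (int k = 0; k < 1000; ++k)
            x = x * 1.0000001f + 0.0000001f;   // burn some cycles
        data[i] = x;
    }
}

int main()
{
    const int kernels = 4;
    const int n = 1 << 16;

    cudaStream_t streams[kernels];
    float *buffers[kernels];
    for (int s = 0; s < kernels; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc((void **)&buffers[s], n * sizeof(float));
        cudaMemset(buffers[s], 0, n * sizeof(float));
    }

    // Each launch returns immediately; kernels in different non-default
    // streams are eligible to run concurrently on Fermi and later GPUs.
    for (int s = 0; s < kernels; ++s)
        busy<<<(n + 255) / 256, 256, 0, streams[s]>>>(buffers[s], n);

    cudaDeviceSynchronize();   // wait for every stream to drain

    for (int s = 0; s < kernels; ++s) {
        cudaFree(buffers[s]);
        cudaStreamDestroy(streams[s]);
    }
    printf("status: %s\n", cudaGetErrorString(cudaGetLastError()));
    return 0;
}
[/code]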

For .Net, you can create .Net subclass instances to tie to the Kappa IO keyword and to receive exception notifications. These subclasses execute on the host thread associated with the GPU context, so the full CUDA API is accessible for that GPU context.

For the Perl and Python bindings mentioned above, developers can combine CUDA C++ running on the GPU with C++ (including OpenMP), Perl, or Python running on the host as a single integrated processing task.

Additional language bindings (untested, with no examples) are available for invoking CUDA via the Kappa library from Java, R, PHP, Octave/MATLAB, Tcl, Allegro CL, Chicken, Guile, MzScheme, OCaml, and Pike.

The Kappa library is commercial, but the .Net, Perl, Python, Lua, Ruby, etc. modules/packages, examples, and keyword source code are available under the MIT License.

#9
Posted 06/12/2010 09:29 PM   
[b][url="http://www.cuvilib.com"]CUVI Lib v0.3[/url][/b] (Beta version) is a new library from [b][url="http://www.TunaCode.com"]TunaCode[/url][/b]. You can download a copy from:

[url="http://www.cuvilib.com/downloads"][b]http://www.cuvilib.com/downloads[/b][/url]

CUVI Lib (CUDA for Vision and Imaging Lib) is an add-on library for NPP (NVIDIA Performance Primitives) and includes several advanced computer vision and image processing functions presently not available in NPP.

In the current release of CUVI Lib you will find:

- Optical Flow (Horn & Schunck)
- Optical Flow (Lucas & Kanade)
- Discrete Wavelet Transform (Forward and Inverse)
- Hough Transform
- Hough Lines (Line Detector)
- Color Conversion (RGB-to-Gray and RGBA-to-Gray)

Several more advanced features will be added to CUVI Lib in upcoming releases. A detailed function reference can be downloaded from:
[url="http://www.cuvilib.com/cuvimanual.pdf"][b]www.cuvilib.com/cuvimanual.pdf[/b][/url]

We look forward to your feedback and guidance on our forums ([url="http://www.cuvilib.com/forums"][b]http://www.cuvilib.com/forums[/b][/url]) and to making CUVI Lib a single, complete source of computer vision and image processing functions implemented on the GPU.

#10
Posted 08/01/2010 02:18 PM   
[quote name='daniel.s' post='1008398' date='Feb 26 2010, 05:08 PM']A plugin for Eclipse for CUDA and/or QT development/compilation:

[url="http://www.ai3.uni-bayreuth.de/software/eclipsecudaqt/index.php"]http://www.ai3.uni-bayreuth.de/software/ec...udaqt/index.php[/url][/quote]

How does the binding work on it?

#11
Posted 08/01/2010 03:01 PM   
[quote name='sumitg' post='519037' date='Mar 16 2009, 02:38 PM']Please refer to this page for a reasonably comprehensive list of development tools, libraries, plugins for GPU computing using CUDA-enabled GPUs:

[url="http://www.nvidia.com/object/tesla_software.html"]http://www.nvidia.com/object/tesla_software.html[/url]

If we missed something, please post it on this thread.[/quote]

Links to the CUDA 32-bit and 64-bit toolkits do not work: the result is a nearly blank page with a "File Not Found" message.

#12
Posted 09/17/2010 09:59 PM   
[quote name='sumitg' date='17 March 2009 - 02:38 AM' timestamp='1237228734' post='519037']
Please refer to this page for a reasonably comprehensive list of development tools, libraries, plugins for GPU computing using CUDA-enabled GPUs:

[url="http://www.nvidia.com/object/tesla_software.html"]http://www.nvidia.com/object/tesla_software.html[/url]

If we missed something, please post it on this thread.
[/quote]
An open-source project, SGC Ruby CUDA, is available at http://github.com/xman/sgc-ruby-cuda and in the standard RubyGems repository.
It provides access to the CUDA API from Ruby programs.

#14
Posted 05/15/2011 12:46 AM   
CUDA Eclipse plugin: http://ydl.net/eclipse_cuda_plugin/
Yellow Dog Linux, tailored for CUDA development: http://ydl.net/products/ydl/

#15
Posted 09/23/2011 09:36 PM   