We all know that the most serious weakness of GPU processing is flow control. Thus, problems in graph theory are not delicious for CUDA. I raise here one of the most basic problems: marking connected nodes in a graph.

The problem is quite simple in the abstract: Given a graph of N nodes and E arcs, each of which connects two different nodes; let's mark all nodes that are directly/indirectly connected to a certain node.

People have tackled with such problems for classic parallel-processing systems. A good example is here: [url="http://computation.pa.msu.edu/NO/ConnCompPresentation.html"]http://computation.pa.msu.edu/NO/ConnCompPresentation.html[/url]. Now comes the CUDA turn.

It is very straightforward to generate testing data. Judging candidates is quite objective: the fastest takes all.

We all know that the most serious weakness of GPU processing is flow control. Thus, problems in graph theory are not delicious for CUDA. I raise here one of the most basic problems: marking connected nodes in a graph.

The problem is quite simple in the abstract: Given a graph of N nodes and E arcs, each of which connects two different nodes; let's mark all nodes that are directly/indirectly connected to a certain node.

The problem is quite simple in the abstract: Given a graph of N nodes and E arcs, each of which connects two different nodes; let's mark all nodes that are directly/indirectly connected to a certain node.

People have tackled with such problems for classic parallel-processing systems. A good example is here: [url="http://computation.pa.msu.edu/NO/ConnCompPresentation.html"]http://computation.pa.msu.edu/NO/ConnCompPresentation.html[/url]. Now comes the CUDA turn.

It is very straightforward to generate testing data. Judging candidates is quite objective: the fastest takes all.

The problem is quite simple in the abstract: Given a graph of N nodes and E arcs, each of which connects two different nodes; let's mark all nodes that are directly/indirectly connected to a certain node.

People have tackled with such problems for classic parallel-processing systems. A good example is here: http://computation.pa.msu.edu/NO/ConnCompPresentation.html. Now comes the CUDA turn.

It is very straightforward to generate testing data. Judging candidates is quite objective: the fastest takes all.