GPGPU vs. Grid Computing with CPUs: what advantages does GPGPU have?

I am trying to come up with a general (and probably imperfect but still useful) way of summarizing the advantages of each type of architecture.

I am more interested in small to medium sized grids, not supercomputer grids like the ones running on the Top 500 list.

I guess I am wondering why you would choose the GPU over a small cluster or grid of traditional general-purpose CPU/memory nodes.

I am thinking that part of the answer is that the cost of building enough nodes to match the performance of, say, a dual 8800 card solution would greatly exceed the cost of the two GPUs.

I am not aware of a single grid in the Top 500.
Are you talking about grids (in the sense of Globus, Condor) or about clusters?
They are quite different things.

You also need to consider that you could have GPUs in clusters or grids.

You can’t really compare the two, since a grid isn’t really an architecture, to be precise. A grid is just a framework for getting a number of compute nodes to collaborate on a problem. The nodes may even be very different architecturally. Communication cost is usually very high, so problems amenable to grids tend to be in the embarrassingly parallel class (parallelization is very coarse-grained, meaning that a node can compute independently for a long time before needing to communicate).
GPUs are amenable to parallelization at a much finer grain and are thus applicable to a wider range of problems. And, as Massimiliano pointed out, nodes within clusters or grids may very well use GPUs for computation.
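To make the granularity difference concrete, here is a minimal CUDA sketch of what I mean by fine grain: one thread per array element, so the unit of parallel work is a single element rather than a whole sub-problem handed off to a grid node. The kernel and its names are purely illustrative.

```cuda
#include <cuda_runtime.h>

// Illustrative kernel: each thread scales exactly one element of the array.
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)
        data[i] *= factor;                          // one element per thread
}

int main()
{
    const int n = 1 << 20;                          // ~1M elements
    float *d_data;
    cudaMalloc((void **)&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;      // ~4096 blocks of 256 threads
    scale<<<blocks, threads>>>(d_data, 2.0f, n);
    cudaDeviceSynchronize();

    cudaFree(d_data);
    return 0;
}
```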

Paulius

Currently the GPU computing frameworks do not support double-precision numbers, while grids built from any non-ancient CPUs do.

Apart from this, the way of parallelizing a problem is almost identical for grids and GPUs. Grids suffer a tremendous loss of performance on inter-CPU communication, while GPUs are not as versatile.

A grid begins with one CPU. Before adding one or more new CPUs to a grid, I would first add a GPU and try that. I noticed a performance increase of about 1000% using a GPU even for common problems like searching and sorting, and the CPU stays free for other things.
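For illustration only (this is not the code I actually ran), here is a minimal sketch of sorting on the GPU, using the Thrust library as one possible approach. The speedup you get depends heavily on data size and hardware, so treat the figure above as anecdotal.

```cuda
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <cstdlib>

int main()
{
    const int n = 1 << 22;                        // ~4M keys (arbitrary size)
    thrust::host_vector<int> h(n);
    for (int i = 0; i < n; ++i)
        h[i] = rand();                            // random input keys

    thrust::device_vector<int> d = h;             // copy keys to the GPU
    thrust::sort(d.begin(), d.end());             // sort entirely on the device
    thrust::copy(d.begin(), d.end(), h.begin());  // copy sorted result back

    return 0;
}
```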

When building up a huge grid, additional money has to be spent per CPU on the inter-CPU communication, probably no less than the cost of an 8800 per CPU.

I wonder, since the GPU has DMA access to memory, disks, and network cards, whether the GPU could handle the inter-CPU communication.

Note: I’m assuming you are asking about comparing clusters to GPUs. A cluster is a set of compute nodes connected by a high-bandwidth, low-latency interconnect like InfiniBand or Myrinet. These typically offer bandwidths of 10 Gbps at latencies in the microsecond range. Clusters are for running a single calculation at a speed much faster than a single node can manage.

Grids of embarrassingly parallel tasks such as SETI@home or Folding@home are a completely different topic. They compute many hundreds of independent calculations.

The short answer to the trade-offs between GPUs and clusters is that it depends on the application. Here is a breakdown for my application, molecular dynamics (MD).

On a cluster, MD breaks the problem up across a number of nodes. Each node updates a small set of particles and then communicates with its neighboring nodes. This inter-process communication occurs quite often (up to 700 times per second in my systems), so MD has a fairly high ratio of communication to processing time. As more nodes are used in a computation, the performance increases, but the communication overhead increases as well. For example, I typically run simulations on 64 or 128 processors and the communication overhead is 40%. If I use any more, the point of diminishing returns is reached and the overall performance does not increase. Even so, I’m still waiting days for sims to run.
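To make the pattern concrete, here is a bare-bones sketch of that neighbor exchange for a 1-D decomposition with made-up buffer sizes; a real MD code exchanges halo/particle data in 3-D and tries to overlap some of this with computation.

```cuda
#include <mpi.h>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int halo = 1024;                       // hypothetical halo size
    std::vector<double> send_left(halo), send_right(halo);
    std::vector<double> recv_left(halo), recv_right(halo);

    int left  = (rank - 1 + size) % size;        // periodic 1-D neighbors
    int right = (rank + 1) % size;

    for (int step = 0; step < 700; ++step) {     // "up to 700 exchanges" as above
        // ... update the local set of particles here ...

        // exchange boundary data with both neighbors
        MPI_Sendrecv(send_right.data(), halo, MPI_DOUBLE, right, 0,
                     recv_left.data(),  halo, MPI_DOUBLE, left,  0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(send_left.data(),  halo, MPI_DOUBLE, left,  1,
                     recv_right.data(), halo, MPI_DOUBLE, right, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```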

So, for my application, the biggest disadvantage of running on a cluster is the communication efficiency. To get a single job done in a reasonable time, I need to use almost twice as many compute nodes as would otherwise be needed. Other disadvantages of clusters include administration, cost, and downtime. The advantages are that clusters have been around long enough that there are plenty of places to buy them from, and most of the scientific software out there supports them, so one can get working software up and running quickly.

Running on a single GPU, the biggest advantage over the cluster would be the lack of communication overhead, which means more efficient calculations. That, and cost: a $500 GPU potentially has the performance of a 32-node cluster in my application.

Disadvantages of the GPU: 1) High development time; there isn’t a well-established set of software out there for doing calculations on the GPU yet. 2) Lack of double-precision math (though this will be changing in the near future). 3) Not every type of calculation is well suited to being done on the GPU.

Of course, one day I hope to take the good with the bad and build a cluster of GPUs for some insane performance :) Though I half expect the communication overhead to be so huge that it isn’t worthwhile. I’ll need to do some testing first.