GPU vs. CPU Comparison over the Last Six Years

We have prepared a comparison sheet of theoretical specifications (transistor count, peak performance, memory bandwidth) for CPUs and GPUs over the last six years.

Since it was quite an effort to collect all the data, we would like to share our results with you.

Maybe you have some comments to extend or correct it …

Year || GPU                       | Transistors | Peak Perf.  | Memory Bandw. || CPU                           | Transistors | Peak Perf. | Memory Bandw.
-----++---------------------------+-------------+-------------+---------------++-------------------------------+-------------+------------+--------------
2004 || NVIDIA GeForce 6800 Ultra |  222 M      |   53 GFLOPS |  35.2 GB/sec  || Intel Pentium 4 570 (3.8 GHz) |  125 M      |  15 GFLOPS |   6.4 GB/sec
2005 || NVIDIA GeForce 7800 GTX   |  302 M      |  165 GFLOPS |  51.2 GB/sec  || AMD Athlon 64 X2 4800+        |  233 M      |  19 GFLOPS |   6.4 GB/sec
2006 || NVIDIA GeForce 8800 GTX   |  681 M      |  518 GFLOPS |  86.4 GB/sec  || Intel Core 2 Extreme QX6700   |  582 M      |  85 GFLOPS |  12.8 GB/sec
2007 || NVIDIA GeForce 8800 Ultra |  681 M      |  576 GFLOPS | 103.7 GB/sec  || Intel Core 2 Extreme QX9650   |  820 M      |  96 GFLOPS |  21.3 GB/sec
2008 || NVIDIA GeForce GTX 280    | 1400 M      |  933 GFLOPS | 113.3 GB/sec  || Intel Core i7-965 XE          |  731 M      | 102 GFLOPS |  25.6 GB/sec
2009 || AMD Radeon HD 5870        | 2150 M      | 2720 GFLOPS | 153.6 GB/sec  || Intel Core i7-975 XE          |  731 M      | 106 GFLOPS |  25.6 GB/sec
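
If it is useful, here is a small sketch (my own, nothing official) that simply re-keys the table above and prints the GPU-to-CPU ratio of peak performance and memory bandwidth per year:

# Re-keys the comparison table above and prints GPU:CPU ratios per year.
# All numbers are copied from the table; nothing here is measured.
rows = [
    # (year, GPU GFLOPS, GPU GB/s, CPU GFLOPS, CPU GB/s)
    (2004,   53,  35.2,  15,  6.4),
    (2005,  165,  51.2,  19,  6.4),
    (2006,  518,  86.4,  85, 12.8),
    (2007,  576, 103.7,  96, 21.3),
    (2008,  933, 113.3, 102, 25.6),
    (2009, 2720, 153.6, 106, 25.6),
]

for year, g_flops, g_bw, c_flops, c_bw in rows:
    print(f"{year}: peak perf. {g_flops / c_flops:4.1f}x, bandwidth {g_bw / c_bw:4.1f}x")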

These kinds of lists are always useful! I often find myself comparing different cards, usually to see whether some cheapo old card can still run an algorithm fast enough or whether I need to tell a client to upgrade their GPU.

There are two more detailed lists here for NVIDIA and ATI cards.

Can you cite where you got those Intel CPU numbers? The 106 GFLOPS listed for the i7-975, for example, is twice the figure on the Intel site:

http://www.intel.com/support/processors/sb/cs-023143.htm
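
My own guess (just an assumption, not confirmed here): the table may quote single-precision peak while Intel's sheet lists double precision, which would account for the factor of two. A quick back-of-the-envelope check:

# Hypothetical sanity check: peak GFLOPS = cores * clock * FLOPs per cycle.
# Core i7-975 XE: 4 cores at 3.33 GHz; with SSE, roughly 8 single-precision
# (4-wide add + 4-wide mul) or 4 double-precision FLOPs per cycle per core.
cores = 4
clock_ghz = 3.33
peak_sp = cores * clock_ghz * 8   # ~106.6 GFLOPS, matches the table
peak_dp = cores * clock_ghz * 4   # ~53.3 GFLOPS, closer to Intel's figure
print(f"SP: {peak_sp:.1f} GFLOPS, DP: {peak_dp:.1f} GFLOPS")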

Hi,

Attached is a table of hardware from AMD/ATI, Intel, and NVIDIA; some numbers are still missing. There is also a graph showing peak GFLOPS development since 2000.

Most of the information can be found here:

[url=“List of Nvidia graphics processing units - Wikipedia”]http://en.wikipedia.org/wiki/Comparison_of...rocessing_units[/url]
[url=“Penryn (microprocessor) - Wikipedia”]http://en.wikipedia.org/wiki/Penryn_(microprocessor)[/url]
[url=“Support for Intel® Processors”]Support for Intel® Processors[/url]
[url=“Wolfdale (microprocessor) - Wikipedia”]http://en.wikipedia.org/wiki/Wolfdale_(microprocessor)[/url]
[url=“List of Intel Core i7 processors - Wikipedia”]http://en.wikipedia.org/wiki/List_of_Intel...microprocessors[/url]
[url=“http://www.hpcwire.com/specialfeatures/sc09/top/Intel-CTO-Tells-HPC-Crowd-to-Get-a-Second-Life-70347992.html”]http://www.hpcwire.com/specialfeatures/sc0...e-70347992.html[/url]. 800 GFLOPS was reached for matrix-matrix multiplication; for sparse matrices, only 8 GFLOPS was reached.
[url=“Cell (microprocessor) - Wikipedia”]http://en.wikipedia.org/wiki/Cell_(microprocessor)[/url]

BTW, the table isn’t complete yet, and the graph doesn’t include any dual-GPU cards (such as the GTX 295)…

Cheers!
j


tableHW.bmp (1.98 MB)

It surprised me how slow Intel CPUs are relative to GPUs. Another really surprising figure is the one for the PlayStation 3 CPU: 218 GFLOPS.

[url=“Playstation 3 (PS3) Release Date, Details, and Specs”]http://playstation.about.com/od/ps3/a/PS3SpecsDetails_3.htm[/url]

The GPU is also surprisingly powerful for a several-year-old design. I believe it was loosely based on a souped-up version of the NVIDIA GeForce 7800 GTX. There is some talk of the GTX 380 coming in at around 3 TFLOPS.

Thanks for this post…

I love this kind of data! ;)

FYI - your data is being picked up by others…

[url=“http://www.evga.com/forums/tm.aspx?m=113941”]http://www.evga.com/forums/tm.aspx?m=113941[/url]

I don’t believe the 1.8 TFLOPS figure for the GPU. It was clocked at 550 MHz and had 300 million transistors, 24 pixel shader units and 8 vertex shader units. Its closest relative on the PC, the 7900 GTX, had the same number of shaders and roughly as many transistors, was clocked higher (at 650 MHz), and achieved 280 GFLOPS. That would be a much closer estimate of its theoretical peak. It didn’t have unified shaders yet, BTW, so programmability was questionable.
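
As a rough sanity check (my own scaling estimate, assuming peak FLOPS scales roughly linearly with clock for the same shader configuration):

# Scale the quoted 7900 GTX peak down to the RSX clock.
gflops_7900gtx = 280.0   # peak quoted above for the GeForce 7900 GTX
clock_7900gtx = 650.0    # MHz
clock_rsx = 550.0        # MHz
rsx_estimate = gflops_7900gtx * clock_rsx / clock_7900gtx
print(f"RSX peak estimate: ~{rsx_estimate:.0f} GFLOPS")   # ~237 GFLOPS, nowhere near 1.8 TFLOPS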

Its CPU isn’t that strong either. Each SPU gives a theoretical peak of 25.6 GFLOPS, yielding about 150 GFLOPS in single precision (6 active SPUs in the PS3). The SPUs also rate at about 10.8 GFLOPS in double precision in total (6 × 1.8 GFLOPS). Additionally, the PPU gives 25.6 GFLOPS in single and 6.4 GFLOPS in double precision.
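
For reference, here is roughly where the per-SPU figure comes from (assuming each SPU runs at 3.2 GHz and issues one 4-wide single-precision fused multiply-add per cycle):

# Back-of-the-envelope Cell peak, single precision.
clock_ghz = 3.2
simd_width = 4
flops_per_fma = 2        # a fused multiply-add counts as two FLOPs
active_spus = 6          # the PS3 exposes 6 of the 8 SPUs to software
spu_peak = clock_ghz * simd_width * flops_per_fma   # 25.6 GFLOPS per SPU
cell_peak = spu_peak * active_spus                  # ~153.6 GFLOPS for 6 SPUs
print(f"Per SPU: {spu_peak} GFLOPS, 6 SPUs: {cell_peak} GFLOPS")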

Double-precision performance was increased considerably (to 50% of SP) in a newer revision, but I’m not sure whether PlayStations received it.

You’re referring to the PowerXCell 8i chip, which is only used in IBM blade servers and PCI-Express coprocessor boards (which are several thousand dollars each). Those chips do not go into PS3s.

What you say does seem to make sense, and I was rather hoping for an answer like this, because the PS3’s claimed performance never added up. If an NVIDIA-based GPU was capable of 1.8 TFLOPS in 2006, how come the latest GTX 295 struggles to reach that in 2009, at a cost higher than that of an entire PS3?

And looking at these, I’m struggling to see where the extra performance would come from. But how can they get away with a claim that’s so brazenly false all the same?

http://en.wikipedia.org/wiki/RSX_%27Reality_Synthesizer%27

http://en.wikipedia.org/wiki/GeForce_7_Ser…rce_7800_Series

It’s probably a case of someone being misquoted and the figure ending up all over the internet.