GTX 460
  1 / 4    
Any CUDA or OpenCL information or benchmarks on this new product?

From
[url="http://www.anandtech.com/show/3809/nvidias-geforce-gtx-460-the-200-king/3"]http://www.anandtech.com/show/3809/nvidias...-the-200-king/3[/url]

ECC is gone.
Double precision is 1/6th of the FP32 performance, which is better than the 1/8th performance on the GTX470/480.

How will the superscalar execution affect the CUDA and OpenCL compilers?

#1
Posted 07/12/2010 06:44 AM   
[url="http://en.wikipedia.org/wiki/GeForce_400_Series"]http://en.wikipedia.org/wiki/GeForce_400_Series[/url]

Wikipedia says its FP32 rate is 1361 GFLOPS. At 1/6 of that, FP64 is 226.83 GFLOPS. That's way higher than the 470's 136 GFLOPS.

So as a 470 owner, I think I was royally screwed. :shiver:

#2
Posted 07/12/2010 07:23 AM   
Looking good. Now I just need to find somewhere selling this 2GB variant from Sparkle:

[url="http://www.sparkle.com.tw/News/SP460/news_SP460_en.html"]http://www.sparkle.com.tw/News/SP460/news_SP460_en.html[/url]

#3
Posted 07/12/2010 08:53 AM   
[quote name='moozoo' post='1086568' date='Jul 12 2010, 02:44 PM']Any CUDA or OpenCL information or benchmarks on this new product?

From
[url="http://www.anandtech.com/show/3809/nvidias-geforce-gtx-460-the-200-king/3"]http://www.anandtech.com/show/3809/nvidias...-the-200-king/3[/url]

ECC is gone.
Double precision is 1/6th of the FP32 performance, which is better than the 1/8th performance on the GTX470/480.

How will the superscalar execution affect the CUDA and OpenCL compilers?[/quote]

[url="http://www.anandtech.com/show/3809/nvidias-geforce-gtx-460-the-200-king"]http://www.anandtech.com/show/3809/nvidias...60-the-200-king[/url]

Hmm, the first page says FP64 runs at 1/12 of FP32, which doesn't match this line on the 3rd page:

"but the effective execution rate of 1/6th FP32 performance will be enough to effectively program in FP64 and debug as necessary."

From
[url="http://www.legitreviews.com/article/1360/1/"]http://www.legitreviews.com/article/1360/1/[/url]

L2 cache size is cut to 384KB (768MB version) and 512KB (1GB version).
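
The gap between the 1/6 and 1/12 ratios is easy to put into numbers. Here is a minimal sketch that takes the 1361 GFLOPS FP32 figure quoted from Wikipedia earlier in the thread at face value (it is not independently verified, and the function name is only for illustration):

```python
# FP32 peak as quoted from Wikipedia earlier in the thread (unverified).
FP32_PEAK_GFLOPS = 1361.0


def fp64_peak(fp32_gflops, ratio_denominator):
    """Peak FP64 rate when FP64 runs at 1/ratio_denominator of the FP32 rate."""
    return fp32_gflops / ratio_denominator


# Compare the two ratios AnandTech gives on different pages.
for denom in (6, 12):
    rate = fp64_peak(FP32_PEAK_GFLOPS, denom)
    print(f"FP64 at 1/{denom} of FP32: {rate:.2f} GFLOPS")
```

At 1/6 this reproduces the 226.83 GFLOPS mentioned earlier in the thread. At 1/12 the result is about half of that, which would put it below the 136 GFLOPS quoted for the 470.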

#4
Posted 07/12/2010 10:23 AM   
No CUDA benchmarks online yet, but we can predict from SP count and MHz pretty well.

What gives me a BIG SMILE is that power use is only about 150 watts under load, and temps are roughly 65 degrees C, not 90C.
This is very, very exciting because it means that a card with two GF104 chips is within practical power and heat limits. I am now camping in line at the local NVIDIA store for the now-inevitable GTX495, please!
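
For anyone who wants to do the SP-count-and-MHz prediction by hand, here is a rough sketch. It assumes the usual peak formula of one FMA (2 flops) per SP per shader clock. The 336 SP count and 1350 MHz shader clock are assumed inputs, not numbers taken from the reviews, so swap in whatever your card actually reports:

```python
def peak_gflops(sp_count, shader_clock_mhz, flops_per_sp_per_clock=2):
    """Theoretical single-precision peak in GFLOPS.

    Assumes every SP retires flops_per_sp_per_clock flops on every shader
    clock, which real kernels rarely sustain.
    """
    return sp_count * shader_clock_mhz * flops_per_sp_per_clock / 1000.0


# Assumed inputs: 336 SPs at a 1350 MHz shader clock.
print(f"{peak_gflops(336, 1350):.1f} GFLOPS peak (FMA only)")

# Counting 3 flops per SP per clock instead lands near the 1361 GFLOPS
# Wikipedia figure quoted earlier; that the figure was derived this way
# is a guess, not something the thread confirms.
print(f"{peak_gflops(336, 1350, 3):.1f} GFLOPS peak (3 flops per clock)")
```

This only bounds what a kernel can reach. It says nothing about memory bandwidth or how often the scheduler can actually keep all the SPs busy.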

#5
Posted 07/12/2010 02:52 PM   
Take a look at this article. [url="http://www.anandtech.com/show/3809/nvidias-geforce-gtx-460-the-200-king"]http://www.anandtech.com/show/3809/nvidias...60-the-200-king[/url]

Seems like they added the ability to issue multiple instructions per cycle from the same warp, and also added an extra set of functional units per SM.

#6
Posted 07/12/2010 04:35 PM   
Tech Report, AnandTech, FiringSquad, Guru3D, Hardware Canucks, Hardware Heaven, [H]ard|OCP,
Hexus.net, HotHardware, and PC Perspective all have GTX460 reviews.

None of them do a single CUDA test, not even Folding@Home. Sigh.

We'll just have to do our own experiments!

#7
Posted 07/12/2010 06:38 PM   
[quote name='SPWorley' post='1086854' date='Jul 12 2010, 12:38 PM']Tech Report, AnandTech, FiringSquad, Guru3D, Hardware Canucks, Hardware Heaven, [H]ard|OCP,
Hexus.net, HotHardware, and PC Perspective all have GTX460 reviews.

None of them do a single CUDA test, not even Folding@Home. Sigh.

We'll just have to do our own experiments![/quote]

A prepackaged CUDA benchmark suite for Windows would be nice to push toward these reviewers. Any options?

#8
Posted 07/12/2010 07:05 PM   
So with the 50% extra cores, we can expect it to perform like a GF100 with 336 SPs sometimes, when the ILP is good? And in the worst-case scenario it would perform as if it had 224 SPs?

I guess that's hard to say, but it's also noteworthy that they didn't increase the on-chip memory resources accordingly either.
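
To make the best-case/worst-case range concrete, here is a toy model. It assumes throughput scales linearly with the fraction of cycles in which the scheduler finds a second independent instruction to issue. That linear model is purely an illustration, not how the hardware is documented to behave:

```python
BASE_SPS = 224   # worst case from the question above: the extra cores sit idle
EXTRA_SPS = 112  # the 50% extra cores (336 - 224)


def effective_sps(dual_issue_fraction):
    """Effective SP count for a given fraction (0..1) of dual-issued cycles."""
    if not 0.0 <= dual_issue_fraction <= 1.0:
        raise ValueError("dual_issue_fraction must be between 0 and 1")
    return BASE_SPS + EXTRA_SPS * dual_issue_fraction


# Sweep from no ILP to perfect ILP.
for fraction in (0.0, 0.5, 1.0):
    sps = effective_sps(fraction)
    print(f"ILP hit rate {fraction:.0%}: ~{sps:.0f} effective SPs")
```

Where a real kernel falls in that range depends on how many independent instructions the compiler can line up within each warp, which only measurement will tell.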

#9
Posted 07/12/2010 09:55 PM   
[quote name='Jimmy Pettersson' post='1086996' date='Jul 12 2010, 03:55 PM']So with the 50% extra cores, we can expect it to perform like a GF100 with 336 SPs sometimes, when the ILP is good? And in the worst-case scenario it would perform as if it had 224 SPs?[/quote]

Plus the doubling of special function units, which will improve code that depends on those heavily.

#10
Posted 07/12/2010 10:27 PM   
It is more marketing there, I bet. Let's see.

#11
Posted 07/13/2010 08:19 AM   
And another thing: double precision (64-bit floats) is only available on one of every three blocks of cores. Does this impact programming (that is, do you have to include code to allow for this), or does the CUDA runtime automatically take care of it?

Together with the artificial crippling of the 64-bit engine, anything needing 64-bit floating-point math will probably run better on the CPU - or will have to be converted to fixed-point math - assuming one doesn't have a Tesla unit lurking out of earshot.
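
For readers unfamiliar with the fixed-point route, here is a minimal sketch of what such a conversion looks like. The Q32.32 layout and helper names are hypothetical choices for illustration. Python integers never overflow, so this hides the fact that on a GPU the multiply would need a 128-bit intermediate built from narrower integer operations:

```python
FRAC_BITS = 32         # hypothetical Q32.32 layout: 32 fractional bits
ONE = 1 << FRAC_BITS   # fixed-point representation of 1.0


def to_fixed(x):
    """Convert a float to fixed point, rounding to the nearest step."""
    return int(round(x * ONE))


def to_float(q):
    """Convert a fixed-point value back to a float for display."""
    return q / ONE


def fixed_mul(a, b):
    """Multiply two fixed-point values.

    The raw product carries 2 * FRAC_BITS fractional bits, so shift it back
    down. The shift rounds toward negative infinity.
    """
    return (a * b) >> FRAC_BITS


a = to_fixed(1.5)
b = to_fixed(-2.25)
print(to_float(a + b))            # addition is plain integer addition
print(to_float(fixed_mul(a, b)))  # multiplication needs the rescaling shift
```

The trade-off is a fixed range and precision instead of a floating exponent, so whether this beats slow FP64 depends entirely on the algorithm.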

#12
Posted 07/15/2010 11:11 AM   
If anyone else in this thread has trouble getting a GTX 460 to work under Windows XP Professional 64-bit, please let me know.

#13
Posted 07/15/2010 12:39 PM   
[quote name='pootle1' post='1088372' date='Jul 15 2010, 04:11 AM']And another thing: double precision (64-bit floats) is only available on one of every three blocks of cores. Does this impact programming (that is, do you have to include code to allow for this), or does the CUDA runtime automatically take care of it?

Together with the artificial crippling of the 64-bit engine, anything needing 64-bit floating-point math will probably run better on the CPU - or will have to be converted to fixed-point math - assuming one doesn't have a Tesla unit lurking out of earshot.[/quote]
Isn't every GF100 100+ GFlops peak in DP? And maybe GF104 is 100+ GFlops too? That's higher than any CPU you can get, isn't it?

#14
Posted 07/15/2010 05:36 PM   
[quote name='tmurray' post='1088533' date='Jul 15 2010, 11:36 AM']Isn't every GF100 100+ GFlops peak in DP? And maybe GF104 is 100+ GFlops too? That's higher than any CPU you can get, isn't it?[/quote]

Probably, since the imaginary Wikipedia specs for the as yet unreleased Sandy Bridge CPUs put the DP rate for all cores combined at 128 GFLOPS if you use Intel's new AVX instructions.

#15
Posted 07/15/2010 07:15 PM   