TITAN X

CUDA developers who wanted more memory bandwidth than the 256-bit GTX 980 offers can now consider the 384-bit TITAN X. Full details to be released at GTC 2015.

Wonder if GM200 is Compute Capability 5.2 like GM204/GM206, or something higher.

Glad to see that the rumors appear to be true.

I wish there was a pre-order list, because judging by the release of the GTX 980, these will probably be hard to get for a month or so. There were ‘speculators’ on NewEgg who would buy batches at price x, then sell on Amazon at 1.25x.

The big mystery will be the double-precision (FP64) performance: will it maintain the same 32:64 ratio as its predecessors?

The DP ratio won’t be the same as Kepler’s 1/3, since the Maxwell architecture is laid out in twin rows of ALUs as opposed to triples; the natural FP64 ratio for Maxwell would therefore be 1/2, if NVidia wanted to concentrate on a DP-performant part. But the rumors point to a 1/32 ratio like GM204, allowing more FP32/graphics horsepower. This rumor is subtly supported by the recent release of the GPGPU-only GK210 in the K80, which is designed to be a CUDA workhorse: if an FP64 Maxwell beast were coming so soon, why make the K80 at all?

Titan X will probably not fly off the shelves like the GTX 970 and GTX 980 (which were, and still are, a huge bang for the buck in both graphics and compute). The Titan line has always been a prestige part with prestige pricing: $1350 is a more likely price point, or even $1500. If it followed previous release history, there would probably be a $650 6 GB consumer version (a “GTX 1080”, perhaps) in the fall.

Another worst-kept secret, but glad to see it all ‘confirmed’. Hopefully it is a single-precision monster with all that tasty, tasty memory bandwidth.

Some gaming-related performance specs:

These specs would indicate this GPU is going to clock in at 6.156 TFLOPS in single precision. I wonder what the double-precision performance will be like?
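For reference, that figure falls straight out of the usual peak-FLOPS arithmetic, assuming the rumored 3072 CUDA cores at a 1002 MHz base clock (both assumptions, since nothing is official yet):

3072 cores × 2 FLOPs per FMA × 1.002 GHz ≈ 6.156 TFLOPS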

At the 1/32 throughput ratio of other Maxwell GPUs, double-single computation looks attractive compared to native double precision, as even a high-quality implementation of double-single should start to make sense from a performance perspective around the 1/24 mark. That said, moving to double-single computation strikes me as going backward in time (and there is the issue of limited exponent range, barely enough to hold Planck’s constant).

Norbert, would the cost-effective transition point for using double-single depend on the use case?
I.e., is there a difference in the number of FP FMAs needed to compute A += B*C as opposed to a simpler A += B or A -= B?

I could imagine some important applications that would benefit from the latter, summation-only use. For example, a particle simulation (N-body, or atomic force simulation) might store particle positions in double-single. To compute accurate forces between particles, it may be sufficient to calculate the difference between positions (A-B) using a Kahan-style double-single subtraction, and from that point use the difference as an argument to a single-precision force computation.
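A minimal sketch of that idea (ds_diff is an illustrative name; positions are assumed to be stored as float2 hi/lo pairs, and this is just the textbook two-sum construction, not code from any particular library):

[code]
// Difference of two double-single positions, returned as a plain float.
// Assumes each position is stored as float2 (x = high part, y = low part).
__device__ float ds_diff(float2 a, float2 b)
{
    // Knuth two-sum of the high parts: s + e == a.x - b.x exactly
    float s = a.x - b.x;
    float v = s - a.x;
    float e = (a.x - (s - v)) - (b.x + v);
    // fold in the low-order parts; for nearby particles the difference
    // fits comfortably in single precision
    return s + (e + (a.y - b.y));
}
[/code]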

For high-quality double-single arithmetic based on FMA, addition/subtraction is actually more expensive than multiplication. A bunch of double-single code out there takes shortcuts in addition/subtraction, leading to low-accuracy results when the operands are close in magnitude but of opposite sign, i.e. when subtractive cancellation occurs. In other words, those shortcuts fail the programmer precisely in the situations where improved accuracy is needed.

If memory serves, with FMA support a double-single multiplication is only about 8 instructions, while addition/subtraction is about 20. I do not recall offhand the cost of division, sqrt, or rsqrt, and I have not worked through the details of any double-single operations beyond that.
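For the curious, here is roughly what those operation counts correspond to. This is a minimal sketch of the textbook two-sum/FMA construction (ds_add and ds_mul are illustrative names, not code from any particular library):

[code]
// Accurate double-single addition (~20 ops). A value is represented as
// float2 with x = high part and y = low part, |y| <= 0.5 ulp of x.
__device__ float2 ds_add(float2 a, float2 b)
{
    // two-sum of the high parts: s + e == a.x + b.x exactly
    float s = a.x + b.x;
    float v = s - a.x;
    float e = (a.x - (s - v)) + (b.x - v);
    // two-sum of the low parts
    float t = a.y + b.y;
    float w = t - a.y;
    float f = (a.y - (t - w)) + (b.y - w);
    // combine and renormalize (two quick-two-sums)
    e += t;
    float hi = s + e;
    float lo = e - (hi - s);
    lo += f;
    s = hi + lo;
    e = lo - (s - hi);
    return make_float2(s, e);
}

// Double-single multiplication using FMA (~8 ops).
__device__ float2 ds_mul(float2 a, float2 b)
{
    float p = a.x * b.x;
    float e = fmaf(a.x, b.x, -p);   // exact rounding error of the product
    e = fmaf(a.x, b.y, e);          // cross terms
    e = fmaf(a.y, b.x, e);          // (a.y * b.y is negligible here)
    float hi = p + e;               // renormalize
    float lo = e - (hi - p);
    return make_float2(hi, lo);
}
[/code]

The shortcut versions mentioned above typically skip the second two-sum in the addition, which is exactly what loses accuracy under subtractive cancellation.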

Why did I state that the switchover versus native double sits at about a 1/24 ratio? Because the code bloat caused by double-single also has some negative impact on performance: more registers are used, the instruction cache hit rate may decline, divergence becomes possible, and the code is more difficult for the compiler to optimize.

As you point out, compensated operations (sums, products, dot products, polynomials) can often provide some (or even most) of the benefit of double-single implementations at an attractive fraction of the cost. See the recent thread on compensated operations for literature references:

[url]https://devtalk.nvidia.com/default/topic/815711/cuda-programming-and-performance/observations-related-to-32-bit-floating-point-ops-and-64-bit-floating-point-ops/[/url]

But compensated algorithms apply only to some primitives, and require analysis as to where they need to be used.
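As a concrete example of the simplest compensated primitive, a Kahan summation in single precision looks like this (kahan_sum is an illustrative name; note it must be compiled without value-unsafe optimizations such as reassociation, or the compensation may be optimized away):

[code]
// Compensated (Kahan) summation: the running compensation c captures
// the low-order bits lost by each addition and re-injects them.
__device__ float kahan_sum(const float *x, int n)
{
    float sum = 0.0f;
    float c   = 0.0f;           // running compensation
    for (int i = 0; i < n; i++) {
        float y = x[i] - c;     // re-inject previously lost bits
        float t = sum + y;      // low-order bits of y may be lost here...
        c = (t - sum) - y;      // ...and are recovered into c
        sum = t;
    }
    return sum;
}
[/code]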

I was surprised to see rumors of a GTX 980 Ti model:

http://vr-zone.com/articles/benchmarks-upcoming-nvidia-gtx-980-ti-titan-x-amd-radeon-r9-390x-gpus-surface/88776.html

Still have no idea about the actual FP64 capability of either the Titan X or the GTX 980 Ti. The single-precision FLOPS, memory bandwidth, and TDP seem to be known.

Maybe this week at GTC there will be more information.

I doubt we will see a full GM200 GTX until late Q2 at the earliest (unless nvidia need to reply to the 390 with something drastic). It doesn’t make much sense to undermine Titan X sales with a card that will generally be just as powerful (potential DP and memory aside).

The notion of a cut-down GM200 GTX relatively soon, however, is appealing. Nvidia could do with something to compete against the 390 when it appears, and a card that can make use of chips with a couple of SMs broken is an easy way of upping ‘yield’. In a couple of months’ time it can be assumed that the 390 will overtake the 980, so having a counter that doesn’t require a full Titan X is very appealing. It also is not unusual: nvidia did the same thing with the 680 -> 780 -> 780 Ti on Kepler, so while the timescales are different (due to the timing of the Titan X more than anything), the principle is there. Personally I think they will not announce a new GTX at GTC (too many similar three-letter names…) but will wait to see what the 390 is like. If it beats the 980, then suddenly the 985 will be announced; if not, then maybe it will be a few months until it gets announced.

The other thought-provoking possibility is that if they DO release a 980 Ti (full GM200 GTX) at the same time as the Titan X, it means they have something else in the pipeline that might make an appearance late in the year. My reasoning is that they aren’t going to release the only remaining top-end Maxwell card early and then sit with nothing to release until Pascal is ready.

As SPWorley already alluded to, the rebuttal to that argument might be that, since its entry into GPGPU (or simply HPC), nvidia now has to follow and counter more competitors on more fronts.

It has to counter offerings related to graphics and to GPGPU/HPC, and it has to counter offerings by AMD as well as Intel.

Hence, it may no longer be sufficient to merely look at AMD to forecast what nvidia’s next move may be…

Perhaps.

OpenCL (for some reason) benchmarks for Titan X posted:

and this set of more general benchmarks:

Anyone know if the TITAN X is an sm_52 device or sm_53?

The official list at [url]https://developer.nvidia.com/cuda-gpus[/url] has not been updated yet, and I have not been able to find this information in the reviews up so far (which are gaming oriented, so not a surprise that compute capability isn’t mentioned).

OK, thanks. I’ll try to find a TITAN X on the exhibit floor. :)

It is 5.2

Watching the webcast, I saw him announce the price and specs, but no dates. Is there any information at the event about when the Titan X will be available?

BTW are there any titan owners around? Geforce Titan, Titan Z, Titan X anyone?

Christian

We have about ten Kepler Titans (a mixture of Titans and Titan Blacks) here at work. We looked briefly at the Titan Z but decided against it; the primary reason was the lack of higher-bandwidth communication between the two chips. Had there been GDDR5-speed transfers between the two chips’ memory banks, it would have been a different story. We plan on R&D testing the new Titan X when we can get hold of one, to check performance on our specific tasks.

The GTX Titan X is available on the Nvidia website starting now (in the US). According to AnandTech, retailers will be selling the card in a couple of weeks.

I’ve got an original Titan at work. Why do you ask, any plans?