Thar she blows... CUDA 8 RC available

allanmac · May 27, 2016, 6:36pm

https://developer.nvidia.com/cuda-release-candidate-download

The CUDA 8.0 RC version is “8.0.27” while the EA release was “8.0.21”.

Gregory_Diamos · May 27, 2016, 6:40pm

Thanks @allanmac, that picture made my day.

allanmac · May 27, 2016, 6:59pm

You’re probably one of the people in the boat manning the bug report harpoons! ^_^

New arch types listed by ptxas: ‘compute_60’, ‘compute_61’, ‘compute_62’, ‘sm_60’, ‘sm_61’, ‘sm_62’

New PTX instructions in sm_61: dp4a (“Four-way byte dot product-accumulate”) and dp2a (“Two-way dot product-accumulate”)
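For anyone wondering what the four-way byte dot product actually computes, here is a rough host-side C++ sketch of the signed variant as I understand it from the PTX description. The names `dp4a_ref` and `pack4` are made up for illustration and are not part of any toolkit API; on the device you would use the real instruction instead.

```cpp
#include <cstdint>

// Hypothetical helper: pack four int8 lanes into one 32-bit word,
// lane 0 in the least significant byte.
constexpr uint32_t pack4(int8_t x0, int8_t x1, int8_t x2, int8_t x3) {
    return  static_cast<uint32_t>(static_cast<uint8_t>(x0))
         | (static_cast<uint32_t>(static_cast<uint8_t>(x1)) << 8)
         | (static_cast<uint32_t>(static_cast<uint8_t>(x2)) << 16)
         | (static_cast<uint32_t>(static_cast<uint8_t>(x3)) << 24);
}

// Hypothetical reference for signed dp4a: treat a and b as four packed
// int8 lanes each, multiply lane by lane, and add the four products to
// the 32-bit accumulator c.
constexpr int32_t dp4a_ref(uint32_t a, uint32_t b, int32_t c) {
    int32_t acc = c;
    for (int i = 0; i < 4; ++i) {
        const int8_t ai = static_cast<int8_t>((a >> (8 * i)) & 0xFF);
        const int8_t bi = static_cast<int8_t>((b >> (8 * i)) & 0xFF);
        acc += static_cast<int32_t>(ai) * static_cast<int32_t>(bi);
    }
    return acc;
}

// 1*5 + (-2)*6 + 3*(-7) + (-4)*8 = -60, plus the accumulator 10 gives -50.
static_assert(dp4a_ref(pack4(1, -2, 3, -4), pack4(5, 6, -7, 8), 10) == -50,
              "signed four-way byte dot product with accumulate");
```

The functions are `constexpr` only so the example can be checked at compile time; the unsigned and mixed-sign variants would differ only in how each byte is extended.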

Global atom and red PTX instructions also get scope modifiers: { .cta, .gpu, .sys }.

Release Notes here.

BulatZiganshin · May 27, 2016, 8:56pm

for me, it finally supports msvc2015 (hey, c++14) and moderngpu 2.0

sm 6.2 is probably tegra. i believe that all desktop gpus from gp102 to gp106 will be sm 6.1

BulatZiganshin · May 27, 2016, 9:20pm

The Release Notes say: “Windows Server 2008 R2 Support. Support for Windows Server 2008 R2 is now deprecated and will be removed in a future version of the CUDA toolkit.”

Does this apply to the compilation environment or the runtime environment?

Robert_Crovella · May 27, 2016, 9:47pm

It pretty much means both.

Future toolkits may not provide the support for compiling with this as a target environment.
Future toolkits may not provide the necessary support (e.g. libraries) to run codes with this as a target environment.

CudaaduC · May 27, 2016, 10:30pm

Built some of my main benchmark projects against CUDA 8 and find that it seems to be modestly faster than CUDA 7.5 for my small sample.

The verbose compilation output looks about the same as CUDA 7.5 in terms of registers used etc.

Was not able to get my hands on a GTX 1080, so those tests will wait until that GPU is available.

So far so good. Look forward to using the graph library.

scottgray · May 28, 2016, 1:11am

If the 1080 is indeed an sm_61 part and has the dp4a and dp2a instructions I’ll be pretty happy. I was actually more interested in those than fp16 support. And I didn’t even know about the dp2a instruction. Could come in handy if 8 bits isn’t quite enough for one of the operands.
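To make the "8 bits isn't quite enough for one of the operands" case concrete, here is a rough host-side C++ sketch of the signed two-way variant as I read the PTX description: one operand carries two 16-bit values, the other carries four bytes, and a lo/hi mode picks which pair of bytes is used. `dp2a_ref` and its `hi` flag are hypothetical names for illustration, not a toolkit API.

```cpp
#include <cstdint>

// Hypothetical reference for signed dp2a: a holds two int16 lanes,
// b holds four int8 lanes. With hi == false the two 16-bit lanes are
// paired with bytes 0 and 1 of b; with hi == true, with bytes 2 and 3.
// The two products are added to the 32-bit accumulator c.
constexpr int32_t dp2a_ref(uint32_t a, uint32_t b, int32_t c, bool hi) {
    int32_t acc = c;
    const int base = hi ? 2 : 0;
    for (int i = 0; i < 2; ++i) {
        const int16_t ai = static_cast<int16_t>((a >> (16 * i)) & 0xFFFF);
        const int8_t  bi = static_cast<int8_t>((b >> (8 * (base + i))) & 0xFF);
        acc += static_cast<int32_t>(ai) * static_cast<int32_t>(bi);
    }
    return acc;
}

// a = {1000, -300} as int16 lanes (0x03E8 low, 0xFED4 high).
// b = {2, 3, -4, 5} as int8 lanes (bytes 0x02, 0x03, 0xFC, 0x05).
static_assert(dp2a_ref(0xFED403E8u, 0x05FC0302u, 0, false) == 1100,
              "lo: 1000*2 + (-300)*3");
static_assert(dp2a_ref(0xFED403E8u, 0x05FC0302u, 0, true) == -5500,
              "hi: 1000*(-4) + (-300)*5");
```

Under that reading, a full dot product of two 16-bit values against four bytes would not be one instruction; you would issue the lo and hi forms with different 16-bit operands and chain the accumulator.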

ldaddr · May 28, 2016, 1:52am

Yummy ^_^. So many dev tool updates out this month… Going to be a fun weekend :-D

BulatZiganshin · May 28, 2016, 5:10am

The 1080 is definitely Compute Capability 6.1; it was mentioned on the NVIDIA site.

Lewei · May 28, 2016, 1:05pm

@scottgray

Hi Scott,
Got any plans for making a Pascal assembler?

scottgray · May 28, 2016, 8:36pm

@Lewei

maxas requires no changes to support Pascal, though I do need to add the new opcodes. Many of the fp16 opcodes were already added for sm_53.

ChuckSommer · May 29, 2016, 3:13am

I looked at the 8.0 “Programming Guide” documentation and was disappointed not to see Pascal documented in Tables 2 and 12:
Table 2 in section 5.4.1, Arithmetic Instructions, only covers Fermi, Kepler and Maxwell.
Table 12, Feature Support per Compute Capability, in section G, Compute Capabilities, only covers Fermi, Kepler and Maxwell.
I would also like to see the first two figures updated to reflect progress since August 2013 (Maxwell 5.0).