I developed a random forest trainer on a GTX 980 with CUDA 7.5, on Windows 8.1 with VS 2013.
It was working well (thanks to the GTX 980).
Then I bought a GTX 1070 and installed CUDA 8.0, expecting my program to work just as well.
However, I got an error from the cudaMalloc function: cudaErrorUnknown :(
I don’t know how to handle this problem.
My environment is now Windows 10, VS 2013, and CUDA 8.0.
compute_35 is a virtual architecture, though, and the resulting PTX should JIT-compile for both the GTX 980 and the GTX 1070. Note that I do not recommend relying on JIT compilation unless it is necessary, and it is not necessary here. Simply build a fat binary that incorporates the machine code for all relevant GPU architectures (sm_52 for the GTX 980, sm_61 for the GTX 1070).
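For what it's worth, here is a minimal sketch of what a fat-binary build plus basic error checking might look like. It is not the OP's code: the file name check_device.cu, the CUDA_CHECK macro, and the fill kernel are all made up for illustration, and the build flags in the comment assume the nvcc from CUDA 8.0. It prints the compute capability the runtime actually sees and reports the error name at the exact call that fails, which may help narrow down where cudaErrorUnknown comes from. It will not by itself explain that error.

```cuda
/*
 * Assumed build command (join onto one line; the file name is hypothetical):
 *   nvcc -gencode arch=compute_52,code=sm_52
 *        -gencode arch=compute_61,code=sm_61
 *        -gencode arch=compute_61,code=compute_61
 *        -o check_device check_device.cu
 * The first two flags embed machine code for the GTX 980 and GTX 1070;
 * the last one also embeds PTX as a JIT fallback for newer GPUs.
 */
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: stop with a readable message on any CUDA runtime error.
#define CUDA_CHECK(call)                                                   \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            std::fprintf(stderr, "%s failed at %s:%d: %s (%s)\n", #call,   \
                         __FILE__, __LINE__, cudaGetErrorName(err_),       \
                         cudaGetErrorString(err_));                        \
            std::exit(EXIT_FAILURE);                                       \
        }                                                                  \
    } while (0)

// Trivial kernel, only here to confirm that device code for this GPU loads.
__global__ void fill(int *data, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

int main()
{
    int device = 0;
    cudaDeviceProp prop;
    CUDA_CHECK(cudaGetDevice(&device));
    CUDA_CHECK(cudaGetDeviceProperties(&prop, device));
    std::printf("Device %d: %s, compute capability %d.%d\n",
                device, prop.name, prop.major, prop.minor);

    const int n = 1 << 20;
    int *d_data = nullptr;
    CUDA_CHECK(cudaMalloc(&d_data, n * sizeof(int)));

    fill<<<(n + 255) / 256, 256>>>(d_data, n, 42);
    CUDA_CHECK(cudaGetLastError());        // catches launch failures
    CUDA_CHECK(cudaDeviceSynchronize());   // catches errors during execution

    int first = 0;
    CUDA_CHECK(cudaMemcpy(&first, d_data, sizeof(int), cudaMemcpyDeviceToHost));
    std::printf("first element = %d\n", first);

    CUDA_CHECK(cudaFree(d_data));
    return 0;
}
```

If you build from Visual Studio rather than the command line, I believe the equivalent setting is the "Code Generation" field under the project's CUDA C/C++ > Device properties, e.g. compute_52,sm_52;compute_61,sm_61, but please double-check that against your own project.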
I think we are very much on the same page. My comment was meant to express a point of clarification, not disagreement. I would encourage OP to read the CUDA documentation on how to compile for virtual and physical architectures.