I wrote an app that uses the runtime API (cuda* functions), but I would very much like to have control over the context parameters and to be able to push and pop contexts. Can I add cuCtx* calls to my cuda* code without breaking anything? Or should I migrate everything to work with the low-level driver API directly? According to the docs, the two APIs are mutually exclusive, so I should not use them together. Thoughts?
As far as I know, it isn’t possible; you will get a conflict. I ended up with two versions of the host code because of this. But I, for one, prefer the driver code.
I have written a small library that reimplements the CUDA runtime calls on top of the CUDA driver API. It lets you use the driver API and, at the same time, compile the code with nvcc using the kernel_name<<< >>>() syntax.
I will make it public soon. If you are interested, just ask me and I can send you the code.