When will cuSolver move the sparse host-only functions to device implementations?

In particular, I am interested in the sparse LU factorization (not incomplete) and the cusolverSpCcsrlsvlu() function.

Will this be part of CUDA 8.0? I find it odd that there are only single-threaded host versions of these.