I wrote a program that uses MKL and CUBLAS functions.
The MKL functions used are the geqrf and the larft functions.
The problem is as follows:
When I compile with icc, the geqrf call takes 4062 ms, whereas with nvcc it takes 61959 ms, about 15x longer.
For the larft function, it takes 3522 ms with icc and 8104 ms with nvcc.
I need to use this function. I know there is a CULA version of geqrf, but only in single precision.
I would like to test my code in double precision, and so use dgeqrf from MKL.
Maybe MKL's functions aren't optimized when the code is compiled with nvcc?
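The post does not say how the timings were taken. For anyone trying to reproduce the comparison, here is a minimal sketch of timing a single call in wall-clock milliseconds, assuming a C11 compiler with timespec_get. The names elapsed_ms and time_call_ms are hypothetical helpers, not part of MKL or CUDA.

```c
#include <time.h>

/* Milliseconds between two timestamps. */
double elapsed_ms(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec) * 1e3
         + (double)(end.tv_nsec - start.tv_nsec) / 1e6;
}

/* Hypothetical helper: run fn(arg) once and return the wall-clock time
 * in milliseconds. TIME_UTC is the system clock, not a monotonic one,
 * so a clock adjustment during the call would skew the result. */
double time_call_ms(void (*fn)(void *), void *arg)
{
    struct timespec t0, t1;
    timespec_get(&t0, TIME_UTC);
    fn(arg);
    timespec_get(&t1, TIME_UTC);
    return elapsed_ms(t0, t1);
}
```

The dgeqrf call would go inside a small function with the void (*)(void *) signature, so the same harness can be built once with icc and once with nvcc.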
The host C code in your CUDA file is compiled by gcc by default, so it may not be optimized the way you want.
If you want it to be compiled with icc, you have to pass the “-ccbin=icc” option to nvcc.
If you haven't applied the patch to Intel's math.h, you will probably encounter compilation errors.
And if you use double-complex CUBLAS functions, you will get errors, because gcc-based code (CUBLAS itself is compiled with gcc) and icc-based code interpret 16-byte-aligned pointers differently.
/usr/local/cuda/bin/…/include/host_config.h(108): catastrophic error: #error directive: – unsupported ICC configuration! Only ICC 11.1 on Linux x86_64 is supported!
#error – unsupported ICC configuration! Only ICC 11.1 on Linux x86_64 is supported!
^
This error is quite explicit: your version of icc is not supported, most likely because it is too old.
Another workaround I didn't mention is to compile everything with icc.
The nvcc compiler is mandatory only for kernel definitions and kernel launches.
If you only have CUDA API and CUBLAS function calls, you can compile without nvcc.
You will have to include “cuda_runtime.h” and “cublas.h” in your C file, specify the include and library directories, and link with -lcublas -lcudart -lcuda.
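To make that concrete, here is a minimal sketch of a host-only C file that calls CUBLAS and needs no nvcc. It assumes the legacy cublas.h API (cublasInit, cublasAlloc, cublasDscal and friends). The helper scale_vector and the USE_CUBLAS macro are hypothetical names for this sketch, and the library directory in the comment is an assumption about the install. It includes only cublas.h because it makes no runtime API calls.

```c
/* scale.c: host-only code with no kernels, so nvcc is not needed.
 * Possible build line (the lib dir is an assumption, adjust as needed):
 *   icc -DUSE_CUBLAS scale.c -I/usr/local/cuda/include \
 *       -L/usr/local/cuda/lib64 -lcublas -lcudart -lcuda
 * Without -DUSE_CUBLAS a plain CPU loop stands in, so the sketch
 * still builds on a machine that has no CUDA toolkit. */
#include <stddef.h>
#ifdef USE_CUBLAS
#include <cublas.h>   /* legacy CUBLAS API */
#endif

/* Hypothetical helper: x[i] *= alpha for i in [0, n).
 * Returns 0 on success, -1 on any CUBLAS failure. */
int scale_vector(double *x, int n, double alpha)
{
#ifdef USE_CUBLAS
    double *d_x = NULL;
    int rc = -1;

    if (cublasInit() != CUBLAS_STATUS_SUCCESS)
        return -1;
    if (cublasAlloc(n, sizeof(double), (void **)&d_x) == CUBLAS_STATUS_SUCCESS) {
        /* Copy to the device, scale there, copy the result back. */
        if (cublasSetVector(n, sizeof(double), x, 1, d_x, 1) == CUBLAS_STATUS_SUCCESS) {
            cublasDscal(n, alpha, d_x, 1);
            if (cublasGetError() == CUBLAS_STATUS_SUCCESS &&
                cublasGetVector(n, sizeof(double), d_x, 1, x, 1) == CUBLAS_STATUS_SUCCESS)
                rc = 0;
        }
        cublasFree(d_x);
    }
    cublasShutdown();
    return rc;
#else
    /* CPU stand-in with the same observable result. */
    for (int i = 0; i < n; ++i)
        x[i] *= alpha;
    return 0;
#endif
}
```

MKL calls such as dgeqrf can sit in the same file, since the whole translation unit goes through icc.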
If you are using CUBLAS and MKL, why are you compiling with nvcc at all? nvcc is not required to use CUBLAS.
If you have actual device code which needs to be compiled, put it in a separate .cu file containing a C/C++ wrapper function to access the code, and compile that with nvcc, then link the resulting object file with icc. People have been using MKL and CUBLAS together forever without a problem (all those TOP500 Linpack results, for example).
So take the kernel out of the compilation unit it shares with the MKL and CUBLAS calls, compile the CUDA code separately with nvcc, then link the two afterwards. Problem solved.
/usr/local/cuda/include/host_config.h(108): catastrophic error: #error directive: – unsupported ICC configuration! Only ICC 11.1 on Linux x86_64 is supported!
#error – unsupported ICC configuration! Only ICC 11.1 on Linux x86_64 is supported!
As has been said twice already, don't use icc with nvcc. You have an unsupported version of icc, but that doesn't matter. Just compile the device code with nvcc+gcc, and the rest of your code with icc. Link your device code with the icc output, MKL and CUBLAS, and you are done.
The error message clearly says you are trying to compile CUDA code with icc. It is generated by a macro inside a CUDA system header. So what have you included in that code that is bringing CUDA headers into the compilation? To use CUBLAS you need to include cublas.h and nothing else.
Then, in your icc-compiled code, call callkernel to launch the kernel, and link the object file produced by nvcc with the icc code. That is all there is to it.
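The icc side of that split could look roughly like the sketch below. The signature of callkernel is an assumption (the thread never shows it), run_step and LINK_NVCC_OBJECT are hypothetical names, and the body of callkernel here is only a CPU stand-in so the file builds without a GPU. In a real build the definition would live in the .cu file, declared extern "C" there so this C declaration links against it.

```c
/* host.c: built with icc. It includes no CUDA headers, so icc never
 * sees host_config.h and its compiler-version check. */

/* Wrapper defined in a separate .cu file compiled by nvcc.
 * The signature is an assumption made for this sketch. */
void callkernel(double *data, int n);

#ifndef LINK_NVCC_OBJECT
/* CPU stand-in: adds 1.0 to each element. Define LINK_NVCC_OBJECT and
 * link the nvcc-built object file to use the real wrapper instead. */
void callkernel(double *data, int n)
{
    for (int i = 0; i < n; ++i)
        data[i] += 1.0;
}
#endif

/* Host-side routine: the MKL and CUBLAS calls would sit here, next to
 * the wrapper call, all compiled by icc. */
void run_step(double *data, int n)
{
    callkernel(data, n);
}
```

Only a plain C function crosses the boundary between the two compilers, so the kernel launch syntax and the CUDA headers stay inside the nvcc-compiled file.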