Intel compiler support for front-end CUDA compilation

I’ve read various threads in this forum, but none of them seem to actually point to a solution, so here’s my take on it.

I use the Intel C/C++ compiler for our CPU code, and nvcc for the GPU code. The project is cross-platform (Linux/Windows/Mac). I don’t build with an IDE, so I just need to know what command-line args to pass to things.

Why do I need to do this? Why not just mix cl.exe and icl.exe-compiled CPU code? On Windows, with VS2005, if I use (for instance) the thrust template lib, it uses std::string in a few places on the host side in a .cu file, and math ops such as ceil() and floor(). That host-side code gets compiled with cl.exe (Microsoft) rather than icl.exe (Intel), which causes later link errors because the Intel compiler (or its libs, such as libmmds.lib) has its own implementations of various things like _ceil and _floor.

Basically I just need a way to tell nvcc to use icl.exe rather than cl.exe (they take the same args so this should work fine). But all I can find is a way to set the compiler bin dir, not the actual name of the compiler! What am I missing? I don’t think this issue is Windows-specific, since nvcc on Linux takes the same option (--compiler-bindir), with no apparent way to set the name of the host compiler. (I found a file nvcc.profile, but as it’s undocumented I couldn’t figure out how to make it use a different host compiler.)

As far as I’ve ever heard, there are only two supported (by nVidia) compilers: Visual Studio (on Windows) and gcc (on Linux). It might be a prudent idea for nVidia to add support for Intel’s compilers though, since plenty of folks in the HPC/scientific community use them.

Yes… I agree they are very efficient and among the best compilers out there. Plus you can get a free copy of them for personal use if you are using Linux.

I hope Nvidia also does the same for the Intel Fortran compiler… as right now we have to buy the Fortran CUDA compiler from the PGI group.

This prompted me to write my own CUDA wrapper library for the host code using the Fortran 2003 ISO C binding feature, but it would still be great to have Fortran support from Nvidia itself. But I’m almost sure this isn’t going to happen anytime soon…

Maybe they can make the PGI Fortran compiler free for Linux for personal use? My wishlist…

Have you tried making a copy of icl.exe and renaming it to be cl.exe? (And on Linux the corresponding approach would be to create a link.) This is an ugly hack for sure, but if it works, you could find out if the Intel compiler is unsupported because NVIDIA hasn’t tested it, or if it is unsupported because NVIDIA relies on some behavior that is specific to the Microsoft compiler.

Jeremy Furtek

Well, I think that the PGI Fortran compiler will be a paid solution for the foreseeable future, since that lets you compile FORTRAN code directly to CUDA (instead of writing your kernels in C and calling them from FORTRAN via the CUDA API). However, if you’re writing code in C/C++, then nvcc handles the device code part of things, so I don’t see why nVidia couldn’t add support for the Intel compilers (on Windows and Linux) for the host-side code sometime in the future. Maybe not anytime soon, but eventually…if tmurray starts running out of ideas ;)

EDIT: Almost forgot…if you really, really need to use the Intel compilers now, you could always use nvcc to compile your kernels into PTX or cubin files, then use the driver API to call them. Not as easy as the normal approach, but it’ll work.

Support for ICC is on our radar.

Actually, I do this now (on a platform where I’m locked to a particular CUDA version that only supports gcc 3.4) with the runtime API by exposing a plain C interface to the rest of my code. When I compile the .cu file to an object file, I use nvcc -ccbin=/usr/bin to force CUDA to use the system gcc 3.4 for the host side, even though I have gcc 4.2 as the default compiler in my path.

Obviously, this makes the most sense for largish applications where you are compiling many source files. Simple apps in a single .cu file can’t benefit from this technique.
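A minimal sketch of the plain C interface arrangement described above. The names `scale_array` and `scale_kernel` are hypothetical and not from this thread. The idea is that the rest of the application only ever sees a C-linkage declaration with built-in types, while the kernel and its launch stay in the file that nvcc compiles. The `#else` branch is only a CPU stand-in so the sketch also builds without nvcc; it is not part of the technique.

```cpp
// Sketch only: scale_array and scale_kernel are hypothetical names.
#include <cassert>
#include <cstddef>

// ---- Interface (would live in a small header shared by both sides) ----
// Plain C linkage and built-in types only: code built with the other host
// compiler includes this and never sees CUDA headers or nvcc-generated code.
extern "C" int scale_array(float* data, std::size_t n, float factor);

#ifdef __CUDACC__
// ---- Implementation as compiled by nvcc (the .cu side) ----
#include <cuda_runtime.h>

__global__ void scale_kernel(float* d, std::size_t n, float f) {
    std::size_t i = blockIdx.x * static_cast<std::size_t>(blockDim.x) + threadIdx.x;
    if (i < n) d[i] *= f;
}

// Returns 0 on success, 1 on any CUDA error. No exceptions, no STL.
extern "C" int scale_array(float* data, std::size_t n, float factor) {
    assert(data != nullptr || n == 0);
    if (n == 0) return 0;

    float* dev = nullptr;
    const std::size_t bytes = n * sizeof(float);
    if (cudaMalloc(reinterpret_cast<void**>(&dev), bytes) != cudaSuccess) return 1;

    cudaError_t err = cudaMemcpy(dev, data, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        const unsigned threads = 256;
        const unsigned blocks = static_cast<unsigned>((n + threads - 1) / threads);
        scale_kernel<<<blocks, threads>>>(dev, n, factor);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) {
        err = cudaMemcpy(data, dev, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(dev);
    return err == cudaSuccess ? 0 : 1;
}
#else
// ---- CPU stand-in so the sketch builds without nvcc ----
extern "C" int scale_array(float* data, std::size_t n, float factor) {
    assert(data != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) data[i] *= factor;
    return 0;
}
#endif
```

The calling code just does `scale_array(buffer, count, 2.0f)` and checks the return value. Whether the resulting objects link cleanly still depends on the two host compilers' runtime libraries being compatible, which is the problem this thread is about.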

Right – and even if you have a large app (as I do) and segment out the CUDA stuff, once you start using thrust or other CUDA libs that generate significant host-side code, that code gets compiled by the compiler selected by nvcc no matter what. I may try the ‘renaming icl to cl’ trick at some point, but in a production software environment that’s a bit questionable :-).

I have been using Intel’s compiler on both Windows and Linux for quite a while with the Runtime API without a problem. The trick is to only keep the code calling the kernel in .cu files, to prevent exactly the sort of problems that you are getting.

Anyway, for a fast workaround, try taking as much code out of the .cu files as you can. In fact, if you can only keep the kernels and the code that invokes the kernel (<<<grid, block>>> stuff), then do so.

I just converted an app from the runtime API to the driver API. It’s not as difficult as it seems at first, and you get the advantage of being able to dynamically link to the driver library. I think most developers should use the driver API.

I agree – unfortunately something as simple as using the thrust template lib or raising an exception with a string causes incompatible host-side code to get compiled into the .cu file.

And as for switching to the driver API, maybe if I didn’t have hundreds of kernels to convert… :-)
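For the exception case mentioned above, one possible arrangement is to have the nvcc-compiled side report failures as plain integer codes and to build the string-carrying exception in a file compiled by the regular host compiler. This is only a sketch: every name here (`run_step`, `run_step_or_throw`, `GpuStatus`) is hypothetical, and `run_step` has a CPU stand-in body so the example builds on its own. It does nothing for the thrust case, where the host-side code is generated inside the .cu file itself.

```cpp
// Sketch only: all names here are hypothetical.
#include <cassert>
#include <stdexcept>
#include <string>

// Status codes shared by both sides of the boundary (plain ints, no STL).
enum GpuStatus { GPU_OK = 0, GPU_BAD_SIZE = 1, GPU_LAUNCH_FAILED = 2 };

// In a real build this would be defined in a .cu file and compiled by nvcc.
// It reports failure through its return value and never throws.
// The body below is a CPU stand-in so the sketch builds on its own.
extern "C" int run_step(int n) {
    return n > 0 ? GPU_OK : GPU_BAD_SIZE;
}

// Everything below would live in a .cpp file built by the usual host compiler.
static const char* status_text(int status) {
    switch (status) {
        case GPU_OK:            return "ok";
        case GPU_BAD_SIZE:      return "invalid size";
        case GPU_LAUNCH_FAILED: return "kernel launch failed";
        default:                return "unknown error";
    }
}

// Turns a status code into an exception. The std::string is constructed
// here, outside any file that nvcc compiles.
void run_step_or_throw(int n) {
    const int status = run_step(n);
    if (status != GPU_OK) {
        const char* text = status_text(status);
        assert(text != nullptr);
        throw std::runtime_error(std::string("run_step failed: ") + text);
    }
}
```

Callers use `run_step_or_throw(n)` and catch `std::runtime_error` as usual; the .cu side stays free of `std::string` and exception machinery.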

Here’s another thought. Use nvcc to generate C source files, then feed those to the intel compiler.

How? I used the -keep option to keep all the generated files, but there are a lot of .c, .cpp, .h, .gpu, and other types of files. I don’t know which files I should feed to icl. What should I do now? I was working in Visual Studio and now I need to use icl, but I don’t know how to connect all these generated files from the command line.

Thanks