Intel compiler support for front-end CUDA compilation
I've read various threads in this forum, but none of them seem to actually point to a solution, so here's my take on it.

I use the Intel C/C++ compiler for our CPU code, and nvcc for the GPU code. It's cross-platform: Linux/Windows/Mac. I don't build with an IDE, so I just need to know what command-line args to pass to things.

Why do I need to do this? Why not just mix cl.exe and icl.exe-compiled CPU code? On Windows, with VS2005, if I use (for instance) the thrust template lib, it uses std::string in a few places on the host side in a .cu file, and math ops such as ceil() and floor(). That host-side code gets compiled with cl.exe (Microsoft) rather than icl.exe (Intel), which causes later link errors because the Intel compiler (or its libs, such as libmmds.lib) has its own implementations of various things like _ceil and _floor.

Basically I just need a way to tell nvcc to use icl.exe rather than cl.exe (they take the same args so this should work fine). But all I can find is a way to set the compiler bin *dir*, not the actual *name* of the compiler! What am I missing? I don't think this issue is Windows-specific, since nvcc on Linux takes the same option (--compiler-bindir), with no apparent way to set the name of the host compiler. (I found a file nvcc.profile, but as it's undocumented I couldn't figure out how to make it use a different host compiler.)
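For concreteness, this is the extent of host-compiler control I can find in nvcc today (file names and install paths below are just illustrative):

```shell
# Linux: you can pick the *directory* nvcc searches for the host compiler...
nvcc -c gpu_code.cu --compiler-bindir /usr/bin

# Windows: same option, but the binary nvcc looks for is still named cl.exe,
# so pointing at Intel's bin directory doesn't help - there's no cl.exe there.
nvcc -c gpu_code.cu --compiler-bindir "C:\Program Files\Intel\Compiler\11.1\bin\ia32"
```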

#1
Posted 12/22/2009 02:12 PM   
As far as I've ever heard, there are only two supported (by nVidia) compilers: Visual Studio (on Windows) and gcc (on Linux). It might be a prudent idea for nVidia to add support for Intel's compilers though, since plenty of folks in the HPC/scientific community use them.

GPU.NET: Write your GPU code in 100% pure C#.

Learn more at tidepowerd.com, and download a free 30-day trial of GPU.NET. Follow @tidepowerd for release updates.



GPU.NET example projects

#2
Posted 12/22/2009 07:51 PM   
[quote name='profquail' post='969031' date='Dec 22 2009, 02:51 PM']As far as I've ever heard, there are only two supported (by nVidia) compilers: Visual Studio (on Windows) and gcc (on Linux). It might be a prudent idea for nVidia to add support for Intel's compilers though, since plenty of folks in the HPC/scientific community use them.[/quote]

Yes.. I agree they are very efficient and among the best compilers out there. Plus you can get a free copy for personal use if you're using Linux.

I hope NVIDIA also does the same for the Intel Fortran compiler.. as right now we have to [i][b]buy[/b][/i] the Fortran CUDA compiler from PGI.

This prompted me to write my own wrapper library to CUDA for the host code using the Fortran 2003 ISO C binding feature, but it would still be great to have NVIDIA's own Fortran support. But I'm almost sure this isn't going to happen anytime soon... :confused:

Maybe they can make the PGI Fortran compiler free for Linux for personal use? :teehee: my wishlist....

Lead.. the world will follow..!

#3
Posted 12/22/2009 11:42 PM   
[quote name='Gary O' post='968886' date='Dec 22 2009, 06:12 AM']Basically I just need a way to tell nvcc to use icl.exe rather than cl.exe (they take the same args so this should work fine). But all I can find is a way to set the compiler bin *dir*, not the actual *name* of the compiler! What am I missing? I don't think this issue is Windows-specific, since nvcc on Linux takes the same option (--compiler-bindir), with no apparent way to set the name of the host compiler. (I found a file nvcc.profile, but as it's undocumented I couldn't figure out how to make it use a different host compiler.)[/quote]

Have you tried making a copy of icl.exe and renaming it to be cl.exe? (And on Linux the corresponding approach would be to create a link.) This is an ugly hack for sure, but if it works, you could find out if the Intel compiler is unsupported because NVIDIA hasn't tested it, or if it is unsupported because NVIDIA relies on some behavior that is specific to the Microsoft compiler.
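An untested sketch of that hack on each platform (the directory names are made up for illustration):

```shell
# Linux: make a directory where "gcc" is really the Intel compiler,
# then point nvcc's --compiler-bindir at it
mkdir -p ~/fake-bindir
ln -s "$(which icc)" ~/fake-bindir/gcc
nvcc -c gpu_code.cu --compiler-bindir ~/fake-bindir

# Windows (cmd.exe): copy icl.exe into a directory under the name cl.exe
#   mkdir C:\fake-bindir
#   copy "C:\path\to\icl.exe" C:\fake-bindir\cl.exe
#   nvcc -c gpu_code.cu --compiler-bindir C:\fake-bindir
```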

Jeremy Furtek

#4
Posted 12/23/2009 02:53 PM   
[quote name='nitin.life' post='969162' date='Dec 22 2009, 05:42 PM']Yes.. I agree they are very efficient and among the best compilers out there. Plus you can get a free copy for personal use if you're using Linux.

I hope NVIDIA also does the same for the Intel Fortran compiler.. as right now we have to [i][b]buy[/b][/i] the Fortran CUDA compiler from PGI.

This prompted me to write my own wrapper library to CUDA for the host code using the Fortran 2003 ISO C binding feature, but it would still be great to have NVIDIA's own Fortran support. But I'm almost sure this isn't going to happen anytime soon... :confused:

Maybe they can make the PGI Fortran compiler free for Linux for personal use? :teehee: my wishlist....[/quote]

Well, I think that the PGI Fortran compiler will be a paid solution for the foreseeable future, since that lets you compile FORTRAN code directly to CUDA (instead of writing your kernels in C and calling them from FORTRAN via the CUDA API). However, if you're writing code in C/C++, then nvcc handles the device code part of things, so I don't see why nVidia couldn't add support for the Intel compilers (on Windows and Linux) for the host-side code sometime in the future. Maybe not anytime soon, but eventually...if tmurray starts running out of ideas ;)

EDIT: Almost forgot...if you really, really need to use the Intel compilers now, you could always use nvcc to compile your kernels into PTX or cubin files, then use the driver API to call them. Not as easy as the normal approach, but it'll work.
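A rough sketch of that workflow (file and library names hypothetical, Windows-style commands shown):

```shell
# Compile only the device code to PTX - no host compiler is involved here
nvcc -ptx kernels.cu -o kernels.ptx

# Compile and link *all* host code with the Intel compiler
icl /c host.cpp
icl host.obj cuda.lib /Femyapp.exe

# At runtime, host.cpp loads kernels.ptx with the driver API
# (cuModuleLoad / cuModuleGetFunction) and launches the kernels from there.
```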


#5
Posted 12/23/2009 04:31 PM   
Support for ICC is on our radar.

#6
Posted 12/23/2009 05:13 PM   
[quote name='profquail' post='969483' date='Dec 23 2009, 10:31 AM']EDIT: Almost forgot...if you really, really need to use the Intel compilers now, you could always use nvcc to compile your kernels into PTX or cubin files, then use the driver API to call them. Not as easy as the normal approach, but it'll work.[/quote]

Actually, I do this now (on a platform where I'm locked to a particular CUDA version that only supports gcc 3.4) with the runtime API by exposing a plain C interface to the rest of my code. When I compile the .cu file to an object file, I use nvcc -ccbin=/usr/bin to force CUDA to use the system gcc 3.4 for the host side, even though I have gcc 4.2 as the default compiler in my path.

Obviously, this makes the most sense for largish applications where you are compiling many source files. Simple apps in a single .cu file can't benefit from this technique.
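In other words, something along these lines (file names are hypothetical; the C interface keeps the two compilers' object files link-compatible):

```shell
# Host side of the .cu is compiled by whatever gcc lives in /usr/bin
# (3.4 here), even though a newer gcc is first in $PATH
nvcc -c wrappers.cu -ccbin=/usr/bin -o wrappers.o

# Everything else is built with the default compiler, then linked together
g++ -c main.cpp -o main.o
g++ main.o wrappers.o -L/usr/local/cuda/lib -lcudart -o app
```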

#7
Posted 12/23/2009 05:20 PM   
[quote name='seibert' post='969513' date='Dec 23 2009, 12:20 PM']Actually, I do this now (on a platform where I'm locked to a particular CUDA version that only supports gcc 3.4) with the runtime API by exposing a plain C interface to the rest of my code. When I compile the .cu file to an object file, I use nvcc -ccbin=/usr/bin to force CUDA to use the system gcc 3.4 for the host side, even though I have gcc 4.2 as the default compiler in my path.

Obviously, this makes the most sense for largish applications where you are compiling many source files. Simple apps in a single .cu file can't benefit from this technique.[/quote]

Right -- and even if you have a large app (as I do) and segment out the CUDA stuff, once you start using thrust or other CUDA libs that generate significant host-side code, that code gets compiled by the compiler selected by nvcc no matter what. I may try the 'renaming icl to cl' trick at some point, but in a production software environment that's a bit questionable :-).

#8
Posted 12/23/2009 05:29 PM   
I have been using Intel's compiler on both Windows and Linux for quite a while with the Runtime API without a problem. The trick is to only keep the code calling the kernel in .cu files, to prevent exactly the sort of problems that you are getting.

Anyway, for a fast workaround, try taking as much code out of the .cu files as you can. In fact, if you can only keep the kernels and the code that invokes the kernel (<<<grid, block>>> stuff), then do so.

[quote name='profquail' post='969483' date='Dec 23 2009, 11:31 AM']EDIT: Almost forgot...if you really, really need to use the Intel compilers now, you could always use nvcc to compile your kernels into PTX or cubin files, then use the driver API to call them. Not as easy as the normal approach, but it'll work.[/quote]

I just converted an app from the runtime to driver APIs. It's not as difficult as it seems at first, and you get the advantage of being able to dynamically link to the driver library. I think most developers should use the driver API.
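The split described above might look like this on Linux (file names hypothetical):

```shell
# kernels.cu contains only __global__ kernels plus thin extern "C"
# wrappers that do the <<<grid, block>>> launches - nothing else
nvcc -c kernels.cu -o kernels.o

# Every other source file is ordinary C++ built by the Intel compiler
icpc -c main.cpp -o main.o

# Link with the Intel compiler, pulling in the CUDA runtime
icpc main.o kernels.o -L/usr/local/cuda/lib -lcudart -o app
```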

#9
Posted 12/25/2009 11:09 PM   
[quote name='Mr_Nuke' post='970539' date='Dec 25 2009, 06:09 PM']I have been using Intel's compiler on both Windows and Linux for quite a while with the Runtime API without a problem. The trick is to only keep the code calling the kernel in .cu files, to prevent exactly the sort of problems that you are getting.
Anyway, for a fast workaround, try taking as much code out of the .cu files as you can. In fact, if you can only keep the kernels and the code that invokes the kernel (<<<grid, block>>> stuff), then do so.[/quote]

I agree -- unfortunately something as simple as using the thrust template lib or raising an exception with a string causes incompatible host-side code to get compiled into the .cu file.

And as for switching to the driver API, maybe if I didn't have hundreds of kernels to convert... :-)

#10
Posted 12/28/2009 01:47 PM   
[quote name='Gary O' post='971553' date='Dec 28 2009, 08:47 AM']I agree -- unfortunately something as simple as using the thrust template lib or raising an exception with a string causes incompatible host-side code to get compiled into the .cu file.

And as for switching to the driver API, maybe if I didn't have hundreds of kernels to convert... :-)[/quote]

Here's another thought. Use nvcc to generate C source files, then feed those to the intel compiler.
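If I'm reading the nvcc docs right, the --cuda phase option does this; treat the exact flag and output file name as things to verify against your CUDA version:

```shell
# Ask nvcc to stop after translating the .cu into a single host C++ file
# (the compiled device code ends up embedded in it as data)
nvcc --cuda kernels.cu -o kernels.cu.cpp

# Then compile that file with the Intel compiler like any other source
icl /c kernels.cu.cpp
```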

#11
Posted 12/29/2009 04:49 PM   
[quote name='Mr_Nuke' post='972159' date='Dec 29 2009, 07:49 PM']Here's another thought. Use nvcc to generate C source files, then feed those to the intel compiler.[/quote]


How? I used the -keep option to keep all the generated files, but there are a lot of .c, .cpp, .h, .gpu, and other types of files, and I don't know which ones I should feed to icl. What should I do now? I was working in Visual Studio and now I need to use icl, but I don't know how to connect all these generated files from the command line.


Thanks

#12
Posted 01/27/2010 06:57 AM   