bbudge
April 4, 2007, 8:05pm
1
Hi all -
We’re all used to compilers providing defines based on the environment, e.g. whether we are building on Linux, win32, or win64, whether we are using gcc (and which version), or whether we are using vc++, etc.
For example, we know we are in win32 if
#ifdef _WIN32
and we know that the GCC version is 4 if
#if (__GNUC__ == 4).
Does NVCC provide anything like this?
I would like to have a function which is like
inline __host__ __device__ blah() {
#ifdef NVCC
//do some code
#else
//do some SSE code
#endif
}
Of course, I could define my own macros when nvcc is used, but it would be nice/neat/clean if NVCC had something like this.
Brian
You will need to split it into a __device__ function and a __host__ function (with different names).
There is a __CUDACC__ define, but it cannot be used to do what you want.
Massimiliano
bbudge
April 5, 2007, 1:38am
4
Thanks! This is what I was looking for.
Brian
It will not work for what you are trying to do. You will need to split the function.
bbudge
April 6, 2007, 1:54am
6
It works for me. It will not work in general, but for my problem it is sufficient.
bbudge
April 6, 2007, 5:28am
7
To avoid confusion for any future readers of the thread: I am only using the __CUDACC__ macro to decide whether nvcc is compiling the code or whether I am using a plain C/C++ compiler. If you have code that should run differently on the CPU and the GPU in CUDA code (e.g. SSE optimizations on the CPU), you should use two separate functions, one __host__ and one __device__.
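A minimal sketch of that arrangement, assuming a header shared between .cu files and ordinary C++ files. The function names scale_host and scale_device are made up for illustration; only __CUDACC__, __device__ and __fmul_rn() come from CUDA itself.

```cpp
// Hypothetical shared header; scale_host and scale_device are illustrative names.

// CPU version: visible to every compiler. This is where SSE or other
// host-only code would go.
inline float scale_host(float x, float s) {
    return x * s;
}

#ifdef __CUDACC__
// GPU version: only nvcc ever sees this, so the __device__ qualifier and
// the device-only intrinsic never reach a plain C/C++ compiler.
__device__ inline float scale_device(float x, float s) {
    return __fmul_rn(x, s);
}
#endif
```

Host code calls scale_host() and kernels call scale_device(). The macro only hides the CUDA-specific part from non-CUDA compilers; it does not pick between CPU and GPU code inside a single function.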
Sigh, this thread is more than 2 years old and the problem is still current.
I wrote a class which is supposed to work on both the host and the device.
Unfortunately I have an overloaded multiplication operator *, which happens to call __fmul_rn(). Therefore it won’t compile on the host.
There is apparently no way to overload the * operator separately for the host, to provide a “host-safe” implementation.
Can this be fixed in the compiler to allow for separate host and device operator overloads?
Christian
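For readers on toolkits that define __CUDA_ARCH__ (set by nvcc only while it compiles the device pass), one possible workaround is a single __host__ __device__ operator whose body is selected by that macro. This is a sketch under that assumption, not something confirmed in this thread; the Real type and the HD macro are hypothetical names.

```cpp
// HD and Real are hypothetical names used only for this illustration.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD  // plain C/C++ compiler: the qualifiers do not exist
#endif

struct Real {
    float v;
};

HD inline Real operator*(Real a, Real b) {
    Real r;
#ifdef __CUDA_ARCH__
    // Device compilation pass: the intrinsic is available here.
    r.v = __fmul_rn(a.v, b.v);
#else
    // Host pass (nvcc or any other compiler): ordinary multiply.
    r.v = a.v * b.v;
#endif
    return r;
}
```

Note that the two branches are not guaranteed to be bit-identical in larger expressions: __fmul_rn() is never fused into a multiply-add, while the host compiler is free to apply its own floating-point contraction rules.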