defines automatically defined by NVCC?
Hi all -

We're all used to compilers providing defines based on the environment, e.g. whether we are running on Linux, Win32, or Win64, whether we are using GCC (and which version), or whether we are using VC++, etc.

For example, we know we are in win32 if

#ifdef _WIN32

and we know that the GCC version is 4 if

#if (__GNUC__ == 4).

Does NVCC provide anything like this?

I would like to have a function which is like

__inline__ __host__ __device__ void blah() {
#ifdef NVCC
  // do some code
#else
  // do some SSE code
#endif
}

Of course, I could define my own macros when nvcc is used, but it would be nice/neat/clean if NVCC had something like this.

Brian

#1
Posted 04/04/2007 08:05 PM   
One macro is __CUDACC__: [url="http://forums.nvidia.com/index.php?showtopic=30371"]conditional compilation for nvcc/c++ compiler[/url]

Paulius
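
A minimal sketch of the kind of check `__CUDACC__` allows: one source file that builds under nvcc and under a plain C++ compiler. The macro name `HOST_DEVICE` and the function `scale` are hypothetical, chosen only for illustration. nvcc defines `__CUDACC__` when it compiles CUDA source, so the qualifiers appear only there.

```cpp
// Sketch, assuming only that nvcc defines __CUDACC__ for CUDA source files.
// HOST_DEVICE is a hypothetical helper macro, not something nvcc provides.
#ifdef __CUDACC__
  #define HOST_DEVICE __host__ __device__   // nvcc: callable from CPU and GPU
#else
  #define HOST_DEVICE                       // plain C/C++ compiler: no qualifiers
#endif

// The same definition is accepted by nvcc and by a regular C++ compiler.
HOST_DEVICE inline float scale(float x, float factor) {
    return x * factor;
}
```

Note that this only tells you which compiler is processing the file. It does not tell you whether a given function body is being compiled for the host or for the device, which is the limitation discussed in the replies below.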

#2
Posted 04/04/2007 11:00 PM   
You will need to split it into a device function and a host function (with different names).
There is a __CUDACC__ define, but it cannot be used to do what you want.

Massimiliano

[quote name='bbudge' date='Apr 4 2007, 01:05 PM']Hi all -

We're all used to compilers providing defines based on the environment, i.e. whether we are running in linux, win32, win64, whether we are using gcc, and which version, or whether we are using vc++, etc...

For example, we know we are in win32 if

#ifdef _WIN32

and we know that the GCC version is 4 if

#if (__GNUC__ == 4).

Does NVCC provide anything like this?

I would like to have a function which is like

__inline__ __host__ __device__ blah() {
#ifdef NVCC
  //do some code
#else
  //do some SSE code
#endif
}

Of course, I could define my own macros when nvcc is used, but it would be nice/neat/clean if NVCC had something like this.

  Brian
[/quote]

#3
Posted 04/04/2007 11:11 PM   
[quote name='paulius' date='Apr 4 2007, 03:00 PM']One macro is __CUDACC__: [url="http://forums.nvidia.com/index.php?showtopic=30371"]conditional compilation for nvcc/c++ compiler[/url]

Paulius
[/quote]

Thanks! This is what I was looking for.

Brian

#4
Posted 04/05/2007 01:38 AM   
It will not work for what you are trying to do. You will need to split the function.

#5
Posted 04/05/2007 04:03 AM   
[quote name='mfatica' date='Apr 4 2007, 08:03 PM']It will not work for what you are trying to do. You will need to split the function.
[/quote]

It works for me. It will not work in general, but for my problem it is sufficient.

#6
Posted 04/06/2007 01:54 AM   
To avoid confusion for any future readers of the thread: I am only using the __CUDACC__ macro to decide whether nvcc is compiling the code or whether I am using a regular C/C++ compiler. If you have code that should run differently on the CPU and the GPU within CUDA code (e.g. SSE optimizations on the CPU), you should use two separate functions, one __host__ and one __device__.
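
The split into two functions could look roughly like this. It is a sketch only: the names `add4_host` and `add4_device` are hypothetical, and the SSE path assumes an x86 compiler that defines `__SSE__` and ships `<xmmintrin.h>` (a scalar fallback is used otherwise).

```cpp
#if defined(__SSE__)
#include <xmmintrin.h>   // SSE intrinsics, host only
#endif

// Host version (hypothetical name): adds four floats, using SSE when available.
inline void add4_host(const float* a, const float* b, float* out) {
#if defined(__SSE__)
    __m128 va = _mm_loadu_ps(a);              // unaligned load of 4 floats
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));   // out[i] = a[i] + b[i]
#else
    for (int i = 0; i < 4; ++i) out[i] = a[i] + b[i];
#endif
}

#ifdef __CUDACC__
// Device version (hypothetical name): only compiled when nvcc processes the file.
__device__ inline void add4_device(const float* a, const float* b, float* out) {
    for (int i = 0; i < 4; ++i) out[i] = a[i] + b[i];
}
#endif
```

Host code calls `add4_host` and kernels call `add4_device`. Here `__CUDACC__` is used only to hide the device function from a non-CUDA compiler, not to choose between CPU and GPU code inside one function.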

#7
Posted 04/06/2007 05:28 AM   
[quote name='mfatica' post='179915' date='Apr 5 2007, 01:11 AM']You will need to split it in a device function and an host function (with different names).
There is a __CUDACC__ define, but it could not be used to do what you want.[/quote]

Sigh, this thread is more than 2 years old and the problem is still current.

I wrote a class which is supposed to work on both the host and the device.

Unfortunately I have an overloaded multiplication operator *, which happens to call into __fmul_rn(). Therefore it won't compile on the host.

There is apparently no way to overload the * operator separately for the __host__, to provide a "host-safe" implementation.

Can this be fixed in the compiler to allow for separate __host__ and __device__ operator overloads?

Christian
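
One possible workaround, not mentioned in this thread, is to branch on `__CUDA_ARCH__` inside a single `__host__ __device__` operator. This assumes a CUDA toolkit that defines `__CUDA_ARCH__` only during the device compilation pass. The type `Real` and the macro `HOST_DEVICE` are hypothetical names used for illustration.

```cpp
// Sketch, assuming __CUDA_ARCH__ is defined only while nvcc compiles device code.
#ifdef __CUDACC__
  #define HOST_DEVICE __host__ __device__   // hypothetical helper macro
#else
  #define HOST_DEVICE
#endif

// Hypothetical wrapper type standing in for the class described above.
struct Real {
    float v;
};

HOST_DEVICE inline Real operator*(Real a, Real b) {
#ifdef __CUDA_ARCH__
    // Device pass: the intrinsic is available here.
    return Real{__fmul_rn(a.v, b.v)};
#else
    // Host pass (nvcc or a plain C++ compiler): ordinary multiplication.
    return Real{a.v * b.v};
#endif
}
```

Only the function body depends on the macro, so the operator keeps the same signature in both passes. Whether the host result matches the device rounding bit for bit is not guaranteed by this sketch.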

#8
Posted 10/18/2009 08:11 PM   