Calling a class from a CUDA kernel
Hello,

I am taking my first steps with CUDA. My goal is to use both the CPU and the GPU for computation in my project. The first kernel compiled successfully and runs perfectly. I then 'improved' that kernel by using a simple class, and that is where the problem begins: only inline methods can be called from the kernel. I used the following code:

[codebox]#include <stdio.h>
#include <cuda.h>
#include "math/TestClass.h"

__global__ void square_array(float *a, int N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    if (idx < N)
    {
        math::TestClass test(39.0f);
        a[idx] = test.doSomething();
    }
}[/codebox]

[codebox]#ifndef _TESTCLASS_H_
#define _TESTCLASS_H_

#include <cuda.h>

#ifdef GPU_COMPILATION
#  ifndef GPU_DEVICE
#    define GPU_DEVICE __device__ __host__
#  endif
#else
#  ifndef GPU_DEVICE
#    define GPU_DEVICE
#  endif
#endif

namespace math
{

class TestClass
{
private:
    float myVar;

public:
    TestClass(float var)
        : myVar(var)
    {
    }

    GPU_DEVICE float doSomething();

    GPU_DEVICE float doSomethingInline()
    {
        return myVar;
    }
};

}

#endif[/codebox]

[codebox]#include "TestClass.h"

namespace math
{

float TestClass::doSomething()
{
    return myVar;
}

}[/codebox]

I can use doSomethingInline() without any problems, but doSomething() produces the following error:
[codebox]warning: function "math::TestClass::doSomething" was referenced but not defined
Error: External calls are not supported (found non-inlined call to _ZN4math9TestClass11doSomethingEv)[/codebox]

Of course, I could write everything as inline methods, but my project already contains a lot of complete classes, split into *.cpp and *.h files.
Can anyone tell me how I can use classes (without inheritance, etc.) in a CUDA kernel? :-)

#1
Posted 07/29/2009 10:49 AM   
Ha, I'm surprised inline methods work :)

CUDA supports only the C subset of C++; classes are technically illegal in kernels. Templates are the only C++-ish feature that's allowed in device code.

#2
Posted 07/29/2009 03:48 PM   
[quote name='_Big_Mac_' post='571468' date='Jul 29 2009, 11:48 AM']Ha, I'm surprised inline methods work :)

CUDA supports only the C subset of C++, classes are technically illegal in kernels. Templates are the only C++-ish part that's allowed in device code.[/quote]

Ok, substitute "struct" for "class" in the OP's code.

Does anyone know how to go about doing this? I have the exact same question, as I have a number of classes that I can easily convert to structs, but they are defined in sets of .h files and .cpp files. I am unable to get it to work with the struct definition in a separate .cu file from the struct declaration (when only including the header file). Thanks to anyone who can shed some light on this,

-Jeff

#3
Posted 09/10/2009 03:06 PM   
[quote name='Jeff Gullett' post='586763' date='Sep 10 2009, 05:06 PM']Ok, substitute "struct" for "class" in the OP's code.

Does anyone know how to go about doing this? I have the exact same question, as I have a number of classes that I can easily convert to structs, but they are defined in sets of .h files and .cpp files. I am unable to get it to work with the struct definition in a separate .cu file from the struct declaration (when only including the header file). Thanks to anyone who can shed some light on this,

-Jeff[/quote]
My simple but effective trick: include the .cpp at the end of the header via the preprocessor.

[codebox]#ifndef _TESTCLASS_H_
#define _TESTCLASS_H_

#include "../gpu/GPUDefines.h"

namespace math
{

class TestClass
{
private:
    float myVar;

public:
    TestClass(float var)
        : myVar(var)
    {
    }

    GPU_HOST_AND_DEVICE float doSomething();

    GPU_HOST_AND_DEVICE float doSomethingInline()
    {
        return myVar;
    }
};

}

#ifdef COMPILE_FOR_GPU
#include "TestClass.cpp"
#endif

#endif[/codebox]

[codebox]#ifndef COMPILE_FOR_GPU
#include "TestClass.h"
#endif

namespace math
{

float TestClass::doSomething()
{
    return myVar * 10;
}

}[/codebox]

[codebox]#ifndef _GPUDEFINES_H_
#define _GPUDEFINES_H_

#ifdef COMPILE_FOR_GPU
#  include <cuda.h>
#  ifndef GPU_DEVICE
#    define GPU_DEVICE __device__
#  endif
#  ifndef GPU_HOST
#    define GPU_HOST __host__
#  endif
#  ifndef GPU_HOST_AND_DEVICE
#    define GPU_HOST_AND_DEVICE __device__ __host__
#  endif
#else
#  ifndef GPU_DEVICE
#    define GPU_DEVICE
#  endif
#  ifndef GPU_HOST
#    define GPU_HOST
#  endif
#  ifndef GPU_HOST_AND_DEVICE
#    define GPU_HOST_AND_DEVICE
#  endif
#endif

#endif[/codebox]
Works perfectly with our classes. Anything that is not CUDA-compatible (functions using the STL, ...) can be 'removed' for nvcc with #ifndef COMPILE_FOR_GPU, so all code can be used both on the CPU and in CUDA.

#4
Posted 09/10/2009 04:20 PM   
Unfortunately, like most "features" of CUDA, this is not the answer I was hoping for. Including the *.cu files using the pre-processor is what I have currently implemented. This technique has the unfortunate side-effect of preventing incremental builds, but at least it works. Thanks,

-Jeff

#5
Posted 09/10/2009 04:32 PM   
[quote name='Jeff Gullett' post='586786' date='Sep 10 2009, 05:32 PM']Unfortunately, like most "features" of CUDA, this is not the answer I was hoping for. Including the *.cu files using the pre-processor is what I have currently implemented. This technique has the unfortunate side-effect of preventing incremental builds, but at least it works. Thanks,

-Jeff[/quote]

Hi Jeff and others,
Recently, I also ran into the same problem. I have many structs and standalone functions that I would like to compile separately and then link against the kernel, but I get the "External calls are not supported" error while compiling (not linking) the kernel. It's very frustrating! If you have figured out a way to achieve incremental compilation, please share.

Thanks

#6
Posted 02/28/2010 01:06 AM   
[quote name='aamir' post='1009242' date='Feb 27 2010, 09:06 PM']Hi Jeff and others,
Recently, I also ran into the same problem. I have many structs and standalone functions that I like to compile separately and then link to the kernel, but I was getting the "External calls are not supported" error while compiling (not linking) the kernel. It's very frustrating!! If you have figured out a way to achieve incremental compilation, please share.

Thanks[/quote]

Nope, no other solution from my end... although it sounds like the new Fermi architecture being released later this month may solve many of the coding problems developers faced in the past when using CUDA, since it allegedly supports the full C++ standard (from what I have heard... I leave it to you to find a reliable source for this information.)

#7
Posted 03/01/2010 03:01 AM   