Calling CUDA functions from a C file

Say my company has a huge system consisting of many C files. I'm trying to call CUDA functions from one of those files to speed up some of its functions. Converting all the C files into .cu files is not an option, as I have no control over them. Is there an efficient way to call CUDA functions from within a C file (not a .cu file) and access their outputs? How is it done?

So far I have seen that a similar idea already exists for MATLAB: CUDA functions can be called directly from the MATLAB environment and their outputs are accessible. Is this possible for ordinary C files?

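In your .c file:

/* prototype for the wrapper defined in the .cu file */
void kernel_wrapper(int a, int b, int c);

kernel_wrapper(a, b, c);

In your .cu file:

__global__ void kernel(int a, int b, int c) {
    // some calculations
}

// extern "C" keeps the name unmangled so the C file can link against it
extern "C" void kernel_wrapper(int a, int b, int c) {
    dim3 dimGrid(1), dimBlock(1);   // placeholder launch configuration
    kernel<<<dimGrid, dimBlock>>>(a, b, c);
}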

That is all, and it is the simplest way to do it. You can also write more wrappers to copy your memory back, but this can be done from the .c file as well, because CUDA runtime functions such as cudaMemcpy are callable from plain C.

Note: to use them you need to include cuda.h or cuda_runtime.h in your .c files, and you also need to make a .h file for your .cu file.
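A minimal sketch of the header idea, seen from the C side. It assumes a hypothetical wrapper called scale_wrapper (not from this thread) whose prototype uses only plain C types and returns a status code, so the calling .c file would not need any CUDA headers for this call. The body is a CPU stand-in so the snippet builds with gcc alone; in a real .cu file the marked lines would be the cudaMalloc, cudaMemcpy and kernel launch steps.

```c
#include <assert.h>
#include <stddef.h>

/* ---- what would live in the .h shared by the .c and .cu files ---- */
/* Hypothetical prototype: plain C types only, so gcc needs no CUDA headers. */
int scale_wrapper(float *data, size_t n, float factor);

/* ---- CPU stand-in for the definition that would live in the .cu file ---- */
/* Returns 0 on success and 1 on a NULL buffer. A real wrapper could report
   failed CUDA calls through the same return value. */
int scale_wrapper(float *data, size_t n, float factor)
{
    size_t i;

    if (data == NULL)
        return 1;

    /* real version: cudaMalloc + cudaMemcpy (host to device) would go here */
    for (i = 0; i < n; ++i)      /* real version: the kernel launch */
        data[i] *= factor;
    /* real version: cudaMemcpy (device to host) + cudaFree would go here */

    return 0;
}

/* ---- what the calling .c file would do ---- */
float scale_demo(void)
{
    float v[3] = {1.0f, 2.0f, 3.0f};
    int status = scale_wrapper(v, 3, 2.0f);

    assert(status == 0);         /* a real caller would handle the error */
    return v[0] + v[1] + v[2];   /* 2 + 4 + 6 */
}
```

Keeping CUDA types out of the prototype is a design choice rather than a requirement; it simply means the existing C files only ever see an ordinary C function.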

Hope this helps.

OR

Create shortcuts named “CU” for all the C files and then compile the shortcuts :-))

Bah… That's the stupid-ass Windows way, right? :P

I love Windows or Mac for an office machine, but for experimentation Linux is the way to go.

Sorry, I don't quite get how it works.

OK, so I'm a Linux user. First, how do you compile your .c and .cu files?

Do you compile them into the same executable?

Here is the simplest program I can think of:

a.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <cuda.h>

extern void kernel_wrapper(int *a, int *b);

int main(int argc, char *argv[])
{
    int a = 2;
    int b = 3;

    kernel_wrapper(&a, &b);

    return 0;
}

b.cu

__global__ void kernel(int *a, int *b)
{
    int tx = threadIdx.x;

    switch( tx )
    {
    case 0:
        *a = *a + 10;
        break;
    case 1:
        *b = *b + 3;
        break;
    default:
        break;
    }
}

void kernel_wrapper(int *a, int *b)
{
    int *d_1, *d_2;
    dim3 threads( 2, 1 );
    dim3 blocks( 1, 1 );

    cudaMalloc( (void **)&d_1, sizeof(int) );
    cudaMalloc( (void **)&d_2, sizeof(int) );

    cudaMemcpy( d_1, a, sizeof(int), cudaMemcpyHostToDevice );
    cudaMemcpy( d_2, b, sizeof(int), cudaMemcpyHostToDevice );

    // launch with the device pointers; the host pointers a and b are not valid on the GPU
    kernel<<< blocks, threads >>>( d_1, d_2 );

    cudaMemcpy( a, d_1, sizeof(int), cudaMemcpyDeviceToHost );
    cudaMemcpy( b, d_2, sizeof(int), cudaMemcpyDeviceToHost );

    cudaFree(d_1);
    cudaFree(d_2);
}

I am not able to make kernel_wrapper callable from the C file. What can I do to make it work?

Make a b.h like this:

#ifndef __B__
#define __B__

#include "cuda.h"
#include "cuda_runtime.h"

extern "C" void kernel_wrapper(int *a, int *b);

#endif

I’m not sure about the includes here, but you can keep them for now.

In your a.c, also include “b.h”.

You can also use cudaMalloc and cudaMemcpy in your a.c if you want.

I think it will work if you do it like this.

I didn’t test it, but try it.

And about compiling it:

I compile my .cu files into a library (shared or static) and link it with the .c files.

But you can also compile them to .o files and make sure to include those in the link step when using gcc or g++.

Hello guys, :)

I’m following this topic because I also have some problems calling .cu functions from a .c file. I’m working in VS2005.

I followed the steps described in this topic.

I made a header file for the .cu function.
The header file is included in both the .c and the .cu file.
The additional libraries/headers are also included in the .c file:
cuda.h, not “cutil.h”.

The C and CUDA functions are very simple.

But this error always occurs:

error LNK2019: unresolved external symbol _Cudafunction in function _main

file:
line:

fatal error LNK1120: 1 unresolved external

file: template.exe
line: 1

I started from the template project from the NVIDIA CUDA site. I think there are still some libraries that need to be included, but I don’t know which ones.

Do you guys have an idea for a solution?

“Shortcut” does NOT necessarily imply Windows. You could do the same in Linux with hard or soft links. In Linux, it just takes a shell script to set the links right and you can keep going…

I don't understand why you find this method stupid. I don't see anything stupid in it.

Are you an ex-Microsoft employee? :-)

By the way,

can someone tell me whether VS2005 can compile files that are actually shortcuts?

Hello, me again.

I found a sort of solution for the .c-calls-.cu issue.

I use a C++ wrapper.

.c => c++ => .cu

If anyone has a better solution please tell me :)

Hi, I haven’t been able to make this work yet. I’m not good at compiling shared libraries, so I’ll need more specific instructions to do it lol

As for compiling them into .o files: I tried including b.h in my a.c file, and the compilation failed because gcc cannot recognize the line "extern "C" void … ". When I link with "nvcc -o test a.o b.o", it also fails because it cannot find the function kernel_wrapper(). I’m pretty stuck and confused here.

Would you mind stating more explicitly how you did it? I hope you don’t mind putting in a simple example, please. Thank you.
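Both errors above are consistent with a linkage mismatch. gcc compiles a.c as C and does not accept the extern "C" syntax, while nvcc compiles b.cu as C++ and, without extern "C", emits a mangled name that the C caller cannot find. One common way around this is to guard the linkage specifier with __cplusplus, so each compiler sees the form it understands. The sketch below is an illustration rather than code from this thread: the guard name B_H is an arbitrary choice, and the wrapper body is a CPU stand-in (the real one would stay in b.cu and launch the kernel) so the snippet builds with gcc alone.

```c
#include <assert.h>

/* ---- b.h: included by both a.c (gcc) and b.cu (nvcc) ---- */
#ifndef B_H
#define B_H

#ifdef __cplusplus      /* defined when compiling as C++ (nvcc, g++), not as C */
extern "C" {
#endif

/* one unmangled symbol name that both compilers agree on */
void kernel_wrapper(int *a, int *b);

#ifdef __cplusplus
}
#endif

#endif /* B_H */

/* ---- CPU stand-in for the wrapper that would live in b.cu ---- */
/* It mimics the two-thread kernel posted earlier in the thread. */
void kernel_wrapper(int *a, int *b)
{
    assert(a != 0 && b != 0);
    *a = *a + 10;   /* what thread 0 does in the kernel */
    *b = *b + 3;    /* what thread 1 does in the kernel */
}
```

With a header like this, a.c and b.cu can include the same file, and a .cpp caller can include it too.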

No prob, :)

You need two .cu files, one .cc or .cpp file and one .c file that contains the “main” function. I started from the cppIntegration project in the CUDA SDK example projects.

The .cu file (host functions):

// includes, system
#include <stdlib.h>

// includes, project
#include <cutil.h>

// includes, kernels
#include <cuda_kernel_A.cu>

// host function, exported with C linkage
extern "C" void HostfuncA(const int argc, const char** argv)
{
    CUT_DEVICE_INIT();

    // kernel call ....
}

The .cpp file:

// includes, system
#include <iostream>
#include "cutil.h"

// declaration of the .cu function (add any further arguments it takes)
extern "C" void HostfuncA(const int argc, const char** argv);

// C-linkage wrapper that the .c file calls
extern "C" void WrapperA(int argc, char** argv)
{
    // char** does not convert to const char** implicitly in C++
    HostfuncA( argc, (const char**)argv );
    CUT_EXIT( argc, argv );
}

The .c file:

// includes, system
#include <stdlib.h>
#include <stdio.h>

extern void WrapperA(int argc, char** argv);

////////////////////////////////////////////////////////////////////////////////
// Program main
////////////////////////////////////////////////////////////////////////////////
int main(int argc, char** argv)
{
    WrapperA( argc, argv );

    return 0;
}

This is what I did in vs2005 for calling a .cu function from a .c file.

The following is based on work from three years ago using W2K and XP.

Windows shortcuts are implemented at the Win32 GUI layer, or maybe the kernel layer, far above the file system. They are small files interpreted above the file system layer. So most software will work OK with them; GUI software certainly should, because it uses the standard Windows dialog boxes, which do the right thing.

Once you get down to the command line in cmd.exe, you can run into problems where your program thinks you want the file implementing the shortcut, not the thing the shortcut points to. Also, the ANSI C and C++ libraries will definitely choke on shortcuts.

UNIX / Linux links are a function of the file system: an app gets what the file system tells it, and must specifically ask “Is this really a link?”

Hi

Since your C++ file contains only C code, I changed it so that you don’t have to go through a C++ wrapper. My full code (with Makefile) is listed below.

b.h

#ifndef __B_H_
#define __B_H_

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <cuda.h>
#include <cuda_runtime.h>

#endif

b.cu

#include "b.h"

extern "C" void kernel_wrapper(int *a);

__global__ void kernel(int *a)
{
    int tx = threadIdx.x;

    switch( tx )
    {
    case 0:
        a[tx] = a[tx] + 2;
        break;
    case 1:
        a[tx] = a[tx] + 3;
        break;
    }
}

void kernel_wrapper(int *a)
{
    int *d_a;
    dim3 threads( 2, 1 );
    dim3 blocks( 1, 1 );

    cudaMalloc( (void **)&d_a, sizeof(int) * 2 );
    cudaMemcpy( d_a, a, sizeof(int) * 2, cudaMemcpyHostToDevice );

    kernel<<< blocks, threads >>>( d_a );

    cudaMemcpy( a, d_a, sizeof(int) * 2, cudaMemcpyDeviceToHost );
    printf( "Finish kernel wrapper\n" );

    cudaFree(d_a);
}

a.c

#include "b.h"

extern void kernel_wrapper(int *a);

int main(int argc, char *argv[])
{
    int *a = (int *)malloc(sizeof(int) * 2);
    a[0] = 2;
    a[1] = 3;

    printf( "a[0]: %d, a[1]: %d\n", a[0], a[1] );
    kernel_wrapper(a);
    printf( "a[0]: %d, a[1]: %d\n", a[0], a[1] );

    free(a);

    return 0;
}

Makefile

# Link step: list the objects before -lcudart (use lib64 instead of lib on 64-bit installs)
run: a.o b.o
	gcc -o run a.o b.o -L /usr/local/cuda/lib -lcudart

a.o: a.c b.h
	gcc -I /usr/local/cuda/include -c -o a.o a.c

b.o: b.cu b.h
	nvcc -c -o b.o b.cu

In fact, the main problem I faced before was compilation. Once I included the correct library paths when compiling, the CUDA code could be called correctly from the C file. :)

Thanks for the solution :)

It works well in VS2005.

And what if the files are .cpp files instead of .c files? Should it work?
Let’s say we have main.cpp, a.cpp, b.cpp, c.cpp, a.h and b.h, and we would like to introduce a kernel like the one in b.cu.

How would we do it?

Hi,

I did everything exactly as shinkee explained, but it still did not work. It gives an error: can’t include “b.h”.

Are the files in the same directory? Alternatively, you can include the absolute path to the .h file you’re trying to include, but I wouldn’t recommend that as a long-term solution. It should be very simple: you’re really just trying to invoke an extern function.

Hello guys,
Is there any way to call a host function from the device (I mean inside device code)?

(1) Host code can call __global__ functions
(2) __global__ functions can call __device__ functions
(3) Device code cannot call host functions