Getting CUDA and LAPACK svd working together in Visual Studio 2013

I’m trying to incorporate the LAPACK svd function into a Cuda project in Visual Studio 2013. I start by creating a new Cuda 6.5 project in Visual Studio. I then adjust my project properties to point to the appropriate Cuda and LAPACK include and library files. Then I implement the following file (I’ll break it up into the Cuda parts and the Lapack parts below for discussion purposes).

Cuda Parts:

#include "cuda.h"
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>

__global__ void add(int a, int b, int *c)
{
	*c = a + b;
}

int main(void) // From Cuda By Example (Section 3.2.3)
{
	int c;
	int *dev_c;
	cudaMalloc((void **)&dev_c, sizeof(int));
	add<<<1, 1>>>(2, 7, dev_c);
	cudaMemcpy(&c, dev_c, sizeof(int), cudaMemcpyDeviceToHost);
	printf("2 + 7 = %d\n", c);
	cudaFree(dev_c);
	return 0;
}

Lapack Parts:

	double AB[9] = { 0.1362, 4.1016, 0.7223, 4.1024, 125.7590, 22.3711, 0.7224, 22.3707, 4.0545 }; // Column major format...
	int m = 3; int n = 3; int lda = 3;
	double *s = (double *)malloc(3 * sizeof s[0]);
	double *u = (double *)malloc(m*m * sizeof u[0]);
	double *vt = (double *)malloc(n*n * sizeof vt[0]);
	////////////////////////////////////////////////////////
	// Workspace and status variables:
	double workSize;
	double *work = &workSize;
	int lwork = -1;
	int *iwork = (int *)malloc(8 * 3 * sizeof iwork[0]);
	int info = 0;
	dgesdd_("A", &m, &n, AB, &lda, s, u, &m, vt, &n, work, &lwork, iwork, &info); // Call dgesdd_ with lwork = -1 to query optimal workspace size:
	lwork = (int)workSize;
	work = (double *)malloc(lwork * sizeof work[0]);
	//////////////////////////////////////////////////////////////////
	dgesdd_("A", &m, &n, AB, &lda, s, u, &m, vt, &n, work, &lwork, iwork, &info); // Perform actual svd calc
	printf("s(0) = %6.4f\n", s[0]); // Just check you're getting about 129.xxx
	// Cleanup workspace:
	free(work);
	free(iwork);
	free(s);
	free(u);
	free(vt);
	return 0;
}

Here’s my problem. If I include just the ‘Cuda Parts’ in a file that ends with ‘.cu’, the nvcc compiler compiles the code just fine and the program prints 2 + 7 = 9. If I include just the ‘Lapack Parts’ in the same file but change the extension to ‘.c’, the nvcc compiler will compile that too and the svd calc works. However, if I try to combine the Cuda & Lapack parts into a single main function and compile the file as a ‘.cu’ file, I get the error:

error : identifier "dgesdd_" is undefined

(If I try to compile the file as a ‘.c’ file, I get syntax errors at the add<<<1,1>>> kernel call.)

I have also tried to move the Lapack parts to a *.c file in a function called mysvd() and keep the Cuda parts in the main function w/in the *.cu file. When I try to compile I get the error:

error LNK2019: unresolved external symbol "int __cdecl mysvd(void)" (?mysvd@@YAHXZ) referenced in function _main

And when I try to put the main function in my *.c file with the Lapack code and call the Cuda related code in a function called mycud(), I get the link error:

error LNK2019: unresolved external symbol _mycud referenced in function _main

It seems to me that I’m having issues with the project properties I’m using in Visual Studio. The compiler is treating the Lapack (or Cuda) code differently based on the file extension. And perhaps as a separate issue, my linker is not finding the functions in the neighboring files w/in the project (I have included the proper function prototypes at the top of each calling function - so I would think the linker would look in these files at link time…) Can anyone see what I’m doing wrong? Maybe it’s a linker setting…

I don’t see any include statements that would relate to lapack, so I’m not surprised the compiler throws the first error that dgesdd_ is undefined.

@txbob: Thanks for the reply - but that’s part of the confusion. When I compile just the LAPACK parts, I don’t require any lapack headers. (not even a prototype for the dgesdd_ function). And yet it compiles and runs just fine.

I have followed the directions at LAPACK for Windows for incorporating LAPACK as dynamic libraries (since I don’t have an Intel compiler). Furthermore, the only headers that are available for download there are related to lapacke - but I’m trying to implement all this w/out using this higher level interface…
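A likely explanation for the difference: the Visual Studio C compiler accepts a call to an undeclared function (it only issues warning C4013 and assumes an int return type), whereas a *.cu file is compiled as C++, where every function must be declared before use. One way to stay off lapacke is to write the declaration by hand. The sketch below assumes the usual Fortran calling convention for DGESDD (every argument passed by pointer, 32-bit integers, trailing underscore in the symbol name); check it against the LAPACK build you actually link.

```cpp
// Hand-written declaration of the Fortran LAPACK routine DGESDD for a C++
// translation unit such as a .cu file. Assumptions: 32-bit Fortran INTEGER
// and a trailing-underscore symbol name, as the calls in this thread suggest.
// extern "C" keeps the C++ compiler from mangling the name, so the linker
// looks for plain dgesdd_ in the LAPACK library.
extern "C" void dgesdd_(const char *jobz,               // "A", "S", "O" or "N"
                        const int *m, const int *n,     // rows and columns of a
                        double *a, const int *lda,      // input matrix, overwritten
                        double *s,                      // singular values
                        double *u, const int *ldu,      // left singular vectors
                        double *vt, const int *ldvt,    // right singular vectors, transposed
                        double *work, const int *lwork, // workspace; lwork = -1 queries its size
                        int *iwork, int *info);         // integer workspace, status code
```

With a declaration like this near the top of the *.cu file, the two dgesdd_ calls from the original post should compile unchanged, since a string literal such as "A" converts to const char * in C++.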

Did some more work on this and found a way forward. I ended up separating the LAPACK and CUDA parts into a *.c and a *.cu file, respectively. Then I declared the C function that wraps the LAPACK calls in the Cuda file, prefixing the declaration with extern "C". So I added the line:

extern "C" int mysvd(void);

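to my *.cu file. I had originally tried just ‘extern int mysvd(void);’, but that didn’t work. The "C" is needed because nvcc compiles *.cu files as C++: without it the linker searches for the C++-mangled name (the ?mysvd@@YAHXZ in the LNK2019 error above), while the *.c file exports the plain C symbol.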
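For anyone who wants to try the linkage fix in isolation, here is a minimal sketch in plain C++. The body of mysvd() is a hypothetical stand-in so the snippet links on its own; in the real project the definition lives in the separate *.c file and contains the dgesdd_ calls.

```cpp
#include <cstdio>

// Declaration as it would appear in the .cu file. extern "C" makes the C++
// compiler refer to the unmangled C symbol instead of a mangled name such
// as ?mysvd@@YAHXZ.
extern "C" int mysvd(void);

// Stand-in for the definition that would normally sit in the .c file.
// Hypothetical body: the real one would run the SVD and return a status.
extern "C" int mysvd(void)
{
    std::printf("mysvd() reached through its C symbol name\n");
    return 0;
}
```

Calling mysvd() from main in the *.cu file then resolves at link time, because both sides agree on the symbol name.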

Just wanted to document this here in case anyone else runs into a similar issue.
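The reverse arrangement from the original question (main in the *.c file calling mycud() in the *.cu file) presumably fails for the same reason: the *.cu file exports a C++-mangled name while the C code asks for _mycud. A sketch of the matching fix, assuming a signature of int mycud(void), which the thread does not show, and using a host-only stand-in for the kernel launch:

```cpp
// Definition as it would appear in the .cu file. Marking the definition
// extern "C" makes the compiler export it under the plain C symbol name
// that the .c file is asking for.
extern "C" int mycud(void)
{
    // Host-only stand-in for the kernel launch and cudaMemcpy from the
    // original post, so that this sketch builds without the CUDA toolkit.
    int c = 2 + 7;
    return c;
}
```

On the C side an ordinary prototype, int mycud(void);, is enough, since C names are never mangled.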