MATLAB MEX CUDA: final solution, a tutorial

Hello all,
I have been stuck on this problem for three days.
I have tried to compile various examples from various places.
The furthest I have managed to get is:

test.obj : fatal error LNK1112: module machine type 'X86' conflicts with target machine type 'x64'

I have tried compile scripts from various sources and then edited them myself. I have modified mex.pl and various versions of nvmex.pl, and
NOTHING!

Can someone finally explain what the hell that means?

My software is:

Windows XP 64bit professional
Visual Studio C++ 2008 Express edition
Matlab 2009a 64bit (standard C to mex works)
Matlab 2010a 64bit (standard C to mex works)
MS Windows SDK v6.1 as described in "How can I set up Microsoft Visual Studio 2008 Express Edition for use with MATLAB 7.7 (R2008b) on 64-bit Windows?" (MATLAB Answers, MATLAB Central): http://www.mathworks.com/support/solutions...lution=1-6IJJ3L
CUDA 3.0 64bit

I am close to giving up. Please help.
I have my kernel (an ultrasonic beam simulation) already written and verified as an m-script and in plain C; the C version works as a mexw64. It is already optimised for localised memory access (constant memory).
All I need now is a wrapper.

I will provide detailed error messages if necessary.

Please help...

Hi All,

I have installed the gpu-mat toolbox from http://www.gp-you.org./index.php and, interestingly, it contains a compiler that turns .m files into GPU-enabled mex files. Yes, an m-script compiler!
And it works on my configuration (described in the previous post).

Upon closer examination, the “.m” compiler simply packs a linear list of operations into a .cpp file and then compiles it using plain mex. Not bad, but that’s still a series of operations instead of a custom optimised kernel.

I am not sure how it works yet, but it must hold the answer. Stay tuned for updates.

OK, an update.

The library above doesn’t compile CUDA at all; it merely links a list of precompiled CUDA kernels. Nice, but this is not what I need.

In the end I installed a 32-bit version of MATLAB, 32-bit libraries etc., and they work with no problems.

I still don’t know how to make the 64-bit versions work...

My ultrasonic beam calculator achieves a 270x speedup over MATLAB, and still a 130x speedup over the compiled C version (Gt9500 over i7-720 @ 2.8GHz). Refreshing. Now I have a good excuse to buy the Gf470 :-)

That sounds like you’re linking against the 32-bit CUDA libraries instead of the 64-bit ones.

Does your mex line look something like

mex('-largeArrayDims', 'YOURFILENAMEHERE', '-LC:\CUDA\lib64', '-lcudart');
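To rule out a mismatch, it can help to print which architecture MATLAB itself targets and pick the library folder from that. This is only a minimal sketch: it assumes CUDA is installed under C:\CUDA and that the 32-bit libraries sit in a folder named lib next to lib64 (both are assumptions, adjust to your install). Note that the .obj produced by nvcc has to match the same architecture as well.

```matlab
% Sketch: choose the CUDA library folder that matches this MATLAB's architecture.
% Assumption: CUDA lives under C:\CUDA with 'lib' (32-bit) and 'lib64' (64-bit).
arch = computer('arch');   % 'win64' on 64-bit MATLAB for Windows, 'win32' on 32-bit
fprintf('MATLAB architecture: %s, MEX extension: %s\n', arch, mexext);
if strcmp(arch, 'win64')
    cudaLib = 'C:\CUDA\lib64';
else
    cudaLib = 'C:\CUDA\lib';
end
% Same mex call as above, with the library folder chosen at run time.
mex('-largeArrayDims', 'YOURFILENAMEHERE', ['-L' cudaLib], '-lcudart');
```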

I wasn’t using ‘largeArrayDims’, but regarding lib64, I am pretty sure I had ONLY the lib64 folder at that time, so it couldn’t have linked to anything else. I tried putting that path in various places and it didn’t work.

Still, now that I have written my own version of a ‘compiling script’, I might try it again with 64-bit MATLAB. I’ll report back if I do.

OK, here is my final solution. It works fine on WinXP 64-bit with 32-bit MATLAB, and also on Windows 7 64-bit with 64-bit MATLAB.

It didn’t work on WinXP 64-bit with 64-bit MATLAB; I don’t know why.

% nvc(filename) compiles a CUDA source file into a mex file via an intermediate .obj file
% the '.cu' extension is appended automatically, so pass the name without it
% if nvcc complains about missing .h or .lib files, simply add them to
% the "nvcc.profile" file where they belong
% regarding the options, simply comment or uncomment the relevant line as
% needed
% 2010-07 Jerzy Dziewierz, CUE, Strathclyde University
% Public domain, but please keep this short comment
function nvc(filename)
% make the CUDA .obj file first
options='-gencode=arch=compute_10,code=sm_10 -gencode=arch=compute_10,code=compute_10 -gencode=arch=compute_20,code=sm_20 -gencode=arch=compute_20,code=compute_20';
options=[options ' --use_fast_math'];
%options=[options ' -keep'];
% backslashes are doubled because sprintf treats a single backslash as the start of an escape sequence
txt=sprintf('c:\\MATLAB2010a\\CUDA\\bin64\\nvcc %s.cu %s -c -lcufft -lcudart -lcuda --ptxas-options=-v -Ic:\\MATLAB2010a\\extern\\include\\',filename,options);
system(txt)
%mex_options='-g'; % to include debug info
mex_options='-O'; % enable optimisation
% link the .obj against the CUDA libraries found in the folder named by CUDA_LIB_PATH
n=getenv('CUDA_LIB_PATH');
mex(['-L' n],mex_options,'-lcudart','-lcufft','-lcuda',sprintf('%s.obj',filename));
% remove the intermediate object file
delete(sprintf('%s.obj',filename));

Usage (assuming that you have example.cu):

nvc example

No need to be in your code folder; it works from anywhere within MATLAB’s path.
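If the LNK1112 machine-type conflict from the first post shows up with this script, one thing worth checking is whether nvcc is producing a 32-bit object for a 64-bit MATLAB. The following is a minimal sketch, not a confirmed fix for the WinXP 64-bit case: it derives nvcc's --machine option from the architecture of the running MATLAB. The helper name nvcc_machine_flag is hypothetical and not part of the script above.

```matlab
function flag = nvcc_machine_flag()
% Hypothetical helper: returns the nvcc --machine option that matches
% the architecture of the running MATLAB.
% computer('arch') returns names such as 'win64', 'glnxa64' or 'maci64'
% for 64-bit MATLAB; anything else is treated as 32-bit here.
if any(strcmp(computer('arch'), {'win64', 'glnxa64', 'maci64'}))
    flag = '--machine 64';
else
    flag = '--machine 32';
end
```

Inside nvc it could be appended like the other options, for example options=[options ' ' nvcc_machine_flag()]; placed before the sprintf call.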

Hey all, now that MATLAB supports CUDA kernels directly, many of you have probably given up on using the nvmex compiler.

However, for portability reasons, I still think it is best to write your own stand-alone mex file with CUDA code.

I have been using nvmex for my research and have the most up-to-date nvmex with setup instructions at www.nlsemagic.com.