MATLAB MEX CUDA final solution: a tutorial
Hello All,
I have been stuck on this problem for three days now.
I have tried to compile various examples from various places.
The furthest I have managed to get is:

test.obj : fatal error LNK1112: module machine type 'X86' conflicts with target machine type 'x64'

I have tried compile scripts from various sources, then edited them myself; I have modified mex.pl and various nvmex.pl versions, and
NOTHING!

Can someone finally explain what the hell that means?

My software is:

Windows XP Professional 64-bit
Visual Studio C++ 2008 Express Edition
MATLAB 2009a 64-bit (standard C to MEX works)
MATLAB 2010a 64-bit (standard C to MEX works)
MS Windows SDK v6.1 as described in [url="http://www.mathworks.com/support/solutions/en/data/1-6IJJ3L/index.html?solution=1-6IJJ3L"]http://www.mathworks.com/support/solutions...lution=1-6IJJ3L[/url]
CUDA 3.0 64-bit

I am close to giving up. Please help.
I have my kernel (ultrasonic beam simulation) already written and verified in m-script and plain C code, and it works as a mexw64. It is already optimised for localized memory access (const memory).
All I need now is a wrapper.

I will provide detailed error messages if necessary.

Please help...

#1
Posted 06/23/2010 10:31 PM   
Hi All,

I have installed the gpu-mat toolbox from [url="http://www.gp-you.org./index.php"]http://www.gp-you.org./index.php[/url] and, interestingly, it contains an .m to GPU-enabled MEX compiler. Yes, an m-script compiler!
And it works on my configuration (described in the previous post).

Upon closer examination, the ".m" compiler simply packs a linear list of operations into a .cpp file and then compiles it using plain mex. Not bad, but that's still a series of operations rather than a custom optimised kernel.

I am not sure how it works yet, but it must hold the answer. Stay tuned for updates.

#2
Posted 06/24/2010 06:06 AM   
[quote name='vigilant' post='1077271' date='Jun 24 2010, 07:06 AM']Hi All,

I have installed the gpu-mat toolbox from [url="http://www.gp-you.org./index.php"]http://www.gp-you.org./index.php[/url] and, interestingly, it contains an .m to GPU-enabled MEX compiler. Yes, an m-script compiler!
And it works on my configuration (described in the previous post).

Upon closer examination, the ".m" compiler simply packs a linear list of operations into a .cpp file and then compiles it using plain mex. Not bad, but that's still a series of operations rather than a custom optimised kernel.

I am not sure how it works yet, but it must hold the answer. Stay tuned for updates.[/quote]

OK, an update.
The library above doesn't compile CUDA at all; it merely links a list of precompiled CUDA kernels. Nice, but this is not what I need.
In the end I installed a 32-bit version of MATLAB, 32-bit libraries, etc., and they work without problems.
I still don't know how to make the 64-bit versions work...

My ultrasonic beam calculator achieves a 270x speedup over MATLAB, and still a 130x speedup over the compiled C version (Gt9500 over i7-720 @ 2.8GHz). Refreshing. Now I have a good excuse to buy the Gf470 :-)

#3
Posted 06/29/2010 12:57 AM   
That sounds like you're linking with the 32-bit CUDA instead of the 64-bit.

Does your mex line look something like

mex('-largeArrayDims', 'YOURFILENAMEHERE', '-LC:\CUDA\lib64', '-lcudart');
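
For reference, LNK1112 says that test.obj itself was built as a 32-bit (x86) object while the linker is targeting x64, so the nvcc compile step has to match MATLAB's bitness, not only the library folder. Below is a minimal sketch of choosing both from MATLAB. It assumes a CUDA 3.0 style layout with lib and lib64 folders under the toolkit root; the helper name cuda_arch_settings and the C:\CUDA root are illustrative only, not part of any toolbox.

```matlab
function [nvccFlag, libDir] = cuda_arch_settings(cudaRoot, arch)
% CUDA_ARCH_SETTINGS  Pick the nvcc machine flag and CUDA library folder
% that match the bitness of the running MATLAB (hypothetical helper).
%   cudaRoot : CUDA toolkit root, e.g. 'C:\CUDA' (assumed layout: lib, lib64)
%   arch     : optional override; defaults to computer('arch'), e.g. 'win64'
if nargin < 2
    arch = computer('arch');
end
if ~isempty(strfind(arch, '64'))
    nvccFlag = '-m64';                      % ask nvcc for a 64-bit object
    libDir   = fullfile(cudaRoot, 'lib64');
else
    nvccFlag = '-m32';                      % ask nvcc for a 32-bit object
    libDir   = fullfile(cudaRoot, 'lib');
end
end
```

The returned flag would go on the nvcc command line and ['-L' libDir] into the mex call. Note that -m64 only helps if nvcc can also find a 64-bit host compiler, which may be the missing piece on some installations.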

#6
Posted 07/14/2010 07:54 PM   
[quote name='Dittoaway' post='1088020' date='Jul 14 2010, 08:54 PM']That sounds like you're linking with the 32-bit CUDA instead of the 64-bit.

Does your mex line look something like

mex('-largeArrayDims', 'YOURFILENAMEHERE', '-LC:\CUDA\lib64', '-lcudart');[/quote]
I wasn't using 'largeArrayDims', but regarding lib64, I am pretty sure I had ONLY the lib64 folder at that time, so it couldn't link to anything else. I tried putting that path in various places and it didn't work.
Still, now that I have written my own version of the 'compiling script', I might try it again with 64-bit MATLAB. I'll report back if I do.

#7
Posted 07/14/2010 11:57 PM   
OK, here is my final solution. It works fine on 64-bit Windows XP with 32-bit MATLAB, and also on 64-bit Windows 7 with 64-bit MATLAB.
It didn't work on 64-bit Windows XP with 64-bit MATLAB; I don't know why.


[code]% nvc(filename) compiles a CUDA source file into a MEX file via an intermediate .obj file.
% Pass the file name without the '.cu' extension; it is appended automatically.
% If nvcc complains about missing .h or .lib files, add their locations to
% the "nvcc.profile" file where they belong.
% To change the options, simply comment or uncomment the relevant lines
% below.

% 2010-07 Jerzy Dziewierz, CUE, Strathclyde University
% Public domain, but please keep this short comment

function nvc(filename)
% make the CUDA .obj file first
options='-gencode=arch=compute_10,code=sm_10 -gencode=arch=compute_10,code=compute_10 -gencode=arch=compute_20,code=sm_20 -gencode=arch=compute_20,code=compute_20';
options=[options ' --use_fast_math'];
%options=[options ' -keep']; % keep nvcc's intermediate files
txt=sprintf('c:\\MATLAB2010a\\CUDA\\bin64\\nvcc %s.cu %s -c -lcufft -lcudart -lcuda --ptxas-options=-v -Ic:\\MATLAB2010a\\extern\\include\\',filename,options);
status=system(txt);
if status~=0
    % stop here so that mex is not run on a missing or stale .obj file
    error('nvc:nvccFailed','nvcc failed with exit status %d',status);
end

% link the .obj file into a MEX file
%mex_options='-g'; % to include debug info
mex_options='-O'; % enable optimisation
n=getenv('CUDA_LIB_PATH'); % CUDA library folder, read from the environment
mex(['-L' n],mex_options,'-lcudart','-lcufft','-lcuda',sprintf('%s.obj',filename));
delete(sprintf('%s.obj',filename));[/code]

Usage (assuming that you have example.cu):

>> nvc example

No need to be in your code folder; it works from anywhere within MATLAB's path.
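
The hard-coded c:\MATLAB2010a paths in the script above tie it to one machine. One possible variation, sketched under assumptions, builds the nvcc command line from matlabroot and an nvcc location supplied by the caller. The helper name build_nvcc_command and its arguments are made up for illustration, and the executable path is assumed to contain no spaces.

```matlab
function cmd = build_nvcc_command(filename, nvccPath, extraFlags, mlRoot)
% BUILD_NVCC_COMMAND  Assemble (but do not run) an nvcc compile command.
%   filename   : CUDA source name without the '.cu' extension
%   nvccPath   : nvcc executable, e.g. 'nvcc' if it is on the system PATH
%                (assumption: no spaces in this path)
%   extraFlags : optional string of extra nvcc flags, e.g. '--use_fast_math'
%   mlRoot     : optional MATLAB root folder; defaults to matlabroot
if nargin < 3
    extraFlags = '';
end
if nargin < 4
    mlRoot = matlabroot;
end
incDir = fullfile(mlRoot, 'extern', 'include');   % folder that holds mex.h
% Quote the source file and include folder so that spaces in them survive.
cmd = sprintf('%s %s -c "%s.cu" -I"%s"', nvccPath, extraFlags, filename, incDir);
end
```

A call such as status = system(build_nvcc_command('example', 'nvcc', '--use_fast_math')) would then run the compile step; a nonzero status means nvcc failed and the mex step should be skipped.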

#8
Posted 08/01/2010 08:50 PM   
Hey all, now that MATLAB supports CUDA kernels directly, many of you have probably given up on using the nvmex compiler.

However, for portability reasons, I still think it is best to write your own stand-alone MEX file with CUDA code.

I have been using nvmex for my research and have the most up-to-date nvmex with setup instructions at www.nlsemagic.com.

#9
Posted 11/10/2011 11:31 PM   