Batched solver code available
The source code for an efficient solver and matrix inversion for small matrices, using partial pivoting, is now available to all registered developers.

On sm_2x GPUs (48KB shared memory), the maximum dimensions for the matrices are:
       DP           solver, non-batched matrix inverse:  2...76
       DP complex   solver, non-batched matrix inverse:  2...53

       DP           batched matrix inverse:              2...77
       DP complex   batched matrix inverse:              2...55


The code has been released under a BSD license.

It is available from the CUDA registered developer web site:
https://nvdeveloper.nvidia.com

#1
Posted 09/21/2011 02:29 PM   
Thank you for making this code available.

Is there some simple way to make this code run using single precision complex numbers?

I'm currently working on a batched solver myself for small (dim 2-32) positive definite systems using Cholesky decomposition.
Are there any plans to extend this code to support this kind of system?

#2
Posted 09/28/2011 09:42 AM   
The code uses templates, so it should be very easy to change to single precision complex numbers.
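
To illustrate why templated code makes the precision switch straightforward, here is a minimal CPU-side sketch of a Gauss-Jordan inverse with partial pivoting, templated on the scalar type. It is not taken from the released solver; the name `invertInPlace` and the row-major layout are assumptions made for this example only.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the released code): invert a dense n x n
// row-major matrix in place by Gauss-Jordan elimination with partial pivoting.
// T may be float, double, or std::complex of either.
// Returns false if an exactly zero pivot is encountered.
template <typename T>
bool invertInPlace(std::vector<T>& a, std::size_t n) {
    std::vector<T> inv(n * n, T(0));
    for (std::size_t i = 0; i < n; ++i) inv[i * n + i] = T(1);

    for (std::size_t col = 0; col < n; ++col) {
        // Partial pivoting: choose the row with the largest magnitude in this column.
        std::size_t piv = col;
        for (std::size_t r = col + 1; r < n; ++r)
            if (std::abs(a[r * n + col]) > std::abs(a[piv * n + col])) piv = r;
        if (std::abs(a[piv * n + col]) == 0) return false;

        if (piv != col) {
            for (std::size_t c = 0; c < n; ++c) {
                std::swap(a[col * n + c], a[piv * n + c]);
                std::swap(inv[col * n + c], inv[piv * n + c]);
            }
        }

        // Scale the pivot row so the pivot becomes 1.
        const T d = T(1) / a[col * n + col];
        for (std::size_t c = 0; c < n; ++c) {
            a[col * n + c] *= d;
            inv[col * n + c] *= d;
        }

        // Eliminate this column from every other row.
        for (std::size_t r = 0; r < n; ++r) {
            if (r == col) continue;
            const T f = a[r * n + col];
            for (std::size_t c = 0; c < n; ++c) {
                a[r * n + c] -= f * a[col * n + c];
                inv[r * n + c] -= f * inv[col * n + c];
            }
        }
    }
    a.swap(inv);
    return true;
}
```

The same function body serves `invertInPlace<double>` and `invertInPlace<std::complex<float>>`; only the template argument changes, because `std::abs` gives the pivot magnitude for both real and complex types.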

#3
Posted 09/28/2011 02:28 PM   
Thanks, but I was thinking more about all the config optimization.
Complex float would have the same config as double, so I can just copy-paste that one.
Will the same config be optimal for non-complex float as well? With float you can fit more into shared memory and therefore support larger matrices, so the launch config would need some changes.
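
A rough way to see how much larger a float matrix could be is to compute the largest square matrix that fits in shared memory for a given element size. The sketch below assumes the matrix is the only thing stored in shared memory, which is an assumption for illustration and not a statement about how the released code lays out its data; `maxSquareDim` is a hypothetical helper.

```cpp
#include <cstddef>

// Hypothetical helper: largest n such that an n x n matrix of elements of
// elemBytes bytes fits in sharedBytes bytes. Assumes the matrix is the only
// occupant of shared memory (no pivot arrays, padding, or right-hand sides).
std::size_t maxSquareDim(std::size_t sharedBytes, std::size_t elemBytes) {
    std::size_t n = 0;
    while ((n + 1) * (n + 1) * elemBytes <= sharedBytes) ++n;
    return n;
}
```

With 48 KB (49152 bytes), this bound is 110 for float (4 bytes), 78 for double and for complex float (8 bytes each), and 55 for double complex (16 bytes). The batched inverse limits quoted in the first post (77 and 55) sit at or just under this bound, and the 8-byte case shows why complex float and double share the same size constraint.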

#4
Posted 09/29/2011 11:18 AM   
Nice, that's very useful stuff for many applications.

#5
Posted 10/07/2011 11:43 AM   
[quote name='mfatica' date='21 September 2011 - 10:29 PM' timestamp='1316615386' post='1296427']
The source code for an efficient solver and matrix inversion for small matrices, using partial pivoting, is now available to all registered developers.

On sm_2x GPUs (48KB shared memory), the maximum dimensions for the matrices are:
       DP           solver, non-batched matrix inverse:  2...76
       DP complex   solver, non-batched matrix inverse:  2...53

       DP           batched matrix inverse:              2...77
       DP complex   batched matrix inverse:              2...55


The code has been released under a BSD license.

It is available from the CUDA registered developer web site:
https://nvdeveloper.nvidia.com
[/quote]
Hi!
I'm a Fortran programmer. I call the batched solver through the Fortran interface, and I think it works perfectly. But I wonder if you could provide a batched solver for least squares problems. That would be helpful to me. I would also like you to add single precision support; that would be perfect!

#6
Posted 12/02/2011 10:36 AM   
I can't view the code. I'm already registered on the NVIDIA Developer Zone, but I can't sign in at the link you provided. Where can I register?

#7
Posted 03/07/2012 01:42 AM   
Here is the process I followed when I signed up recently. It sounds like you already completed the first step.

1) Get a DevZone ID
a. Go to http://developer.nvidia.com and, at the top right, either create a new account or, if you already have a DevZone account, just log in.

2) Once logged in, go to “My Account” at the top right
a. Complete the Basic Profile, and then complete the CUDA registered developer form.

3) You will receive email confirmations and will usually be approved within one business day

4) Once approved, the CUDA registered developer program home page is accessible via “My Account”: the program name will be green and is a hyperlink.

#8
Posted 03/07/2012 03:05 AM   
[quote name='mfatica' date='22 September 2011 - 01:29 AM' timestamp='1316615386' post='1296427']
The source code for an efficient solver and matrix inversion for small matrices, using partial pivoting, is now available to all registered developers.

On sm_2x GPUs (48KB shared memory), the maximum dimensions for the matrices are:
       DP           solver, non-batched matrix inverse:  2...76
       DP complex   solver, non-batched matrix inverse:  2...53

       DP           batched matrix inverse:              2...77
       DP complex   batched matrix inverse:              2...55


The code has been released under a BSD license.

It is available from the CUDA registered developer web site:
https://nvdeveloper.nvidia.com
[/quote]
Hi, I have registered as a Basic Registered Developer, filled in the registration form, and applied for the CUDA/GPU Computing Registered Developer Program. When I click on the link given above, https://nvdeveloper.nvidia.com, I get to a login screen, but it does not recognise my login or password. After several attempts it appears I am being blocked: I can no longer see the login page and now get "The connection has timed out. The server at nvdeveloper.nvidia.com is taking too long to respond."

#9
Posted 04/20/2012 11:07 PM   
There are two registered developer websites, the old site and the new site. As far as I am aware, they do not share login information, so the problem may simply be a mismatch between login information and website.

To access the code via the new registered developer website:

(1) Go to http://developer.nvidia.com/
(2) On right hand side, click on green link "Registered Developers Website"
(3) Log in or create new account as needed
(4) Click on green link "CUDA/GPU Computing Registered Developer Program"
(5) Scroll down to section "CUDA Batch Solver"
(6) Click on green link "follow this link"
(7) Click green "I ACCEPT" at usage agreement
(8) Your download should start

To access the code via the old registered developer website:

(1) Go to http://partners.nvidia.com
(2) Sign in with email address and password
(3) There is a menu on the right hand side titled "Newest Downloads"
(4) Click on link "Batched Solver"
(5) Click on link "download"
(6) Click "Accept" button below the usage agreement
(7) Your download should start

#10
Posted 04/21/2012 12:35 AM   
Has anyone succeeded in converting this to single precision (float)? The main problem seems to be the config class, which I'm not sure how to define for float. Thanks.

#11
Posted 03/28/2013 09:26 AM   
[quote="rm9"]Has anyone succeeded in converting this to single precision (float)? The main problem seems to be the config class, which I'm not sure how to define for float. Thanks.[/quote]
If someone has a float/complex version, I'm also very interested!

#12
Posted 05/17/2013 02:29 PM   
Do you need the float/complex solver, or the float/complex matrix inverse? By the way, there are no dark secrets to finding the data needed for the config class, but it takes time to run the experiments necessary to find the best configuration.
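
The kind of experiment described above can be sketched as a small search over candidate configurations that keeps the fastest one. Everything here is hypothetical: `LaunchConfig` and its fields do not mirror the actual config class in the released code, and `timeMs` uses a host wall clock, so a real GPU measurement would need to synchronize the device (or use CUDA events) before the clock is stopped.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical launch configuration; the fields are illustrative only.
struct LaunchConfig {
    int threadsPerBlock;
    int matricesPerBlock;
};

// Time one call of `work` in milliseconds using a host wall clock.
double timeMs(const std::function<void()>& work) {
    const auto t0 = std::chrono::steady_clock::now();
    work();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Return the index of the candidate with the lowest cost, taking the best of
// `repeats` measurements per candidate to reduce noise.
std::size_t pickBest(const std::vector<LaunchConfig>& candidates,
                     const std::function<double(const LaunchConfig&)>& cost,
                     int repeats = 3) {
    std::size_t best = 0;
    double bestCost = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        double c = std::numeric_limits<double>::infinity();
        for (int r = 0; r < repeats; ++r) c = std::min(c, cost(candidates[i]));
        if (c < bestCost) {
            bestCost = c;
            best = i;
        }
    }
    return best;
}
```

In use, `cost` would wrap `timeMs` around a solver run with the given configuration, and the search would be repeated for each matrix dimension and data type of interest.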

#13
Posted 05/17/2013 03:39 PM   
There is a version of matrix inversion that I use for convex optimization problems. I have tested it on dense matrices as large as 4,000 by 4,000, and it works well. Libraries like CULA or MAGMA also let you solve this. I'm not sure about a complex version, though.

#14
Posted 05/17/2013 06:10 PM   
For batches of small matrices, CUBLAS offers getrfBatched to compute the LU decomposition and trsmBatched to solve triangular systems.
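
As a reference for what that two-step approach computes, here is a plain CPU sketch: an LU factorization with partial pivoting followed by triangular solves, applied to each matrix in a batch. It does not call CUBLAS and does not reproduce its interface; the function names `luFactor`, `luSolve`, and `solveBatch`, and the row-major contiguous batch layout, are assumptions for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// In-place LU factorization with partial pivoting of one n x n row-major matrix.
// On return, a holds L (unit diagonal, stored below the diagonal) and U (on and
// above the diagonal); piv[k] is the row swapped with row k at step k.
// Returns false if an exactly zero pivot is encountered.
bool luFactor(double* a, std::size_t n, std::size_t* piv) {
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t p = k;
        for (std::size_t r = k + 1; r < n; ++r)
            if (std::fabs(a[r * n + k]) > std::fabs(a[p * n + k])) p = r;
        piv[k] = p;
        if (a[p * n + k] == 0.0) return false;
        if (p != k)
            for (std::size_t c = 0; c < n; ++c) std::swap(a[k * n + c], a[p * n + c]);
        for (std::size_t r = k + 1; r < n; ++r) {
            a[r * n + k] /= a[k * n + k];  // multiplier, stored in the L part
            for (std::size_t c = k + 1; c < n; ++c)
                a[r * n + c] -= a[r * n + k] * a[k * n + c];
        }
    }
    return true;
}

// Solve A x = b using the factors from luFactor; b is overwritten with x.
void luSolve(const double* lu, const std::size_t* piv, std::size_t n, double* b) {
    for (std::size_t k = 0; k < n; ++k)  // apply the row swaps in order
        if (piv[k] != k) std::swap(b[k], b[piv[k]]);
    for (std::size_t i = 1; i < n; ++i)  // forward substitution with unit L
        for (std::size_t j = 0; j < i; ++j) b[i] -= lu[i * n + j] * b[j];
    for (std::size_t i = n; i-- > 0;) {  // back substitution with U
        for (std::size_t j = i + 1; j < n; ++j) b[i] -= lu[i * n + j] * b[j];
        b[i] /= lu[i * n + i];
    }
}

// Factor and solve every system in the batch. mats holds batch matrices of
// n*n entries each and rhs holds batch vectors of n entries each; both are
// overwritten. Returns the number of systems skipped because of a zero pivot.
std::size_t solveBatch(std::vector<double>& mats, std::vector<double>& rhs,
                       std::size_t n, std::size_t batch) {
    std::vector<std::size_t> piv(n);
    std::size_t singular = 0;
    for (std::size_t i = 0; i < batch; ++i) {
        double* a = mats.data() + i * n * n;
        if (luFactor(a, n, piv.data()))
            luSolve(a, piv.data(), n, rhs.data() + i * n);
        else
            ++singular;
    }
    return singular;
}
```

Keeping the factorization separate from the solve means one factorization can be reused for several right-hand sides.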

#15
Posted 05/17/2013 06:44 PM   