Example of Matrix Multiplication with Shared memory
Hi! I'm totally new to CUDA. I've read the CUDA C Programming Guide (CUDA 4.0) and found a section (3.2.3) that describes shared memory through matrix multiplication. However, I don't understand how to use the stride efficiently.
This was the struct used:

// M(row, col) = *(M.elements + row * M.stride + col)
typedef struct {
    int width;
    int height;
    int stride;
    float* elements;
} Matrix;

Is there a main function somewhere that uses this struct and the kernels proposed in the guide? The provided samples don't use this struct. I'm looking for something not too difficult to understand (without safeCall...) that uses the CUDA runtime; I'm still a student.

I don't know how to choose the parameters (width, height, SIZE_BLOCK...) well enough to get good performance from the GPU. My GPU has compute capability 2.1. If you need further information, please ask!

#1
Posted 06/21/2011 03:19 PM   
[quote name='Frstdies' date='21 June 2011 - 10:19 AM' timestamp='1308669580' post='1254783']
Hi! I'm totally new to CUDA. I've read the CUDA C Programming Guide (CUDA 4.0) and found a section (3.2.3) that describes shared memory through matrix multiplication. However, I don't understand how to use the stride efficiently.
This was the struct used:

// M(row, col) = *(M.elements + row * M.stride + col)
typedef struct {
    int width;
    int height;
    int stride;
    float* elements;
} Matrix;

Is there a main function somewhere that uses this struct and the kernels proposed in the guide? The provided samples don't use this struct. I'm looking for something not too difficult to understand (without safeCall...) that uses the CUDA runtime; I'm still a student.

I don't know how to choose the parameters (width, height, SIZE_BLOCK...) well enough to get good performance from the GPU. My GPU has compute capability 2.1. If you need further information, please ask!
[/quote]


I was trying to understand that example just two weeks ago, and I wrote a main function for it. This code is not a perfect sample for showing CUDA's performance, but it can help you understand the example. Good luck!
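On the stride question specifically, here is a small sketch of what the stride buys you. It assumes the indexing rule from the struct comment, M(row, col) = *(M.elements + row * M.stride + col), and helpers named GetElement and GetSubMatrix in the spirit of the guide. The extra `size` parameter and the StrideDemo function are my own additions for illustration, not something from the guide. The point is that a tile can be described as a Matrix that points into the parent's storage and keeps the parent's stride, so no data is copied.

```cuda
#include <cstdio>

// Same layout as in the question:
// M(row, col) = *(M.elements + row * M.stride + col)
typedef struct {
    int width;
    int height;
    int stride;
    float* elements;
} Matrix;

// Read one element; callable from both host and device code.
__host__ __device__ float GetElement(const Matrix A, int row, int col)
{
    return A.elements[row * A.stride + col];
}

// View of the size x size tile at tile-row `row`, tile-column `col` of A.
// Nothing is copied: the view points into A's storage and keeps A's stride,
// so moving one row down inside the tile still skips a full parent row.
// (The `size` parameter is an assumption of this sketch.)
__host__ __device__ Matrix GetSubMatrix(Matrix A, int row, int col, int size)
{
    Matrix sub;
    sub.width = size;
    sub.height = size;
    sub.stride = A.stride;
    sub.elements = &A.elements[A.stride * size * row + size * col];
    return sub;
}

// Hypothetical helper for this example: fills a 4x4 matrix with 0..15 and
// returns element (1, 0) of its lower-right 2x2 tile.
float StrideDemo()
{
    float data[16];
    for (int i = 0; i < 16; ++i) data[i] = (float)i;
    Matrix A = {4, 4, 4, data};              // full matrix: stride == width
    Matrix tile = GetSubMatrix(A, 1, 1, 2);  // tile starts at parent (2, 2)
    return GetElement(tile, 1, 0);           // same storage as parent (3, 2)
}

int main()
{
    // Parent element (3, 2) lives at offset 3 * 4 + 2 = 14.
    printf("tile(1,0) = %.0f\n", StrideDemo());
    return 0;
}
```

For a full matrix the stride simply equals the width. It only differs from the width for views like `tile`, where width is 2 but one row step is still 4 floats.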

[attachment=21480:matrix_shared.cu]
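The attachment itself is not reproduced in this thread, so what follows is not its content. It is a minimal sketch of what such a main could look like, assuming the tiled shared-memory kernel from section 3.2.3 of the guide and square matrices whose size is a multiple of BLOCK_SIZE. ToDevice and RunMatMul are hypothetical helper names introduced here. As requested there are no safeCall macros; only the final copy's return code is checked.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <cuda_runtime.h>

#define BLOCK_SIZE 16   // 16 x 16 = 256 threads per block

typedef struct {
    int width;
    int height;
    int stride;
    float* elements;
} Matrix;

__device__ float GetElement(const Matrix A, int row, int col)
{
    return A.elements[row * A.stride + col];
}

__device__ void SetElement(Matrix A, int row, int col, float value)
{
    A.elements[row * A.stride + col] = value;
}

// BLOCK_SIZE x BLOCK_SIZE view into A; shares A's storage and stride.
__device__ Matrix GetSubMatrix(Matrix A, int row, int col)
{
    Matrix sub;
    sub.width = BLOCK_SIZE;
    sub.height = BLOCK_SIZE;
    sub.stride = A.stride;
    sub.elements = &A.elements[A.stride * BLOCK_SIZE * row + BLOCK_SIZE * col];
    return sub;
}

// Each thread block computes one tile of C; each thread one element of it.
__global__ void MatMulKernel(Matrix A, Matrix B, Matrix C)
{
    int row = threadIdx.y, col = threadIdx.x;
    Matrix Csub = GetSubMatrix(C, blockIdx.y, blockIdx.x);
    float acc = 0.0f;
    for (int m = 0; m < A.width / BLOCK_SIZE; ++m) {
        Matrix Asub = GetSubMatrix(A, blockIdx.y, m);
        Matrix Bsub = GetSubMatrix(B, m, blockIdx.x);
        __shared__ float As[BLOCK_SIZE][BLOCK_SIZE];
        __shared__ float Bs[BLOCK_SIZE][BLOCK_SIZE];
        As[row][col] = GetElement(Asub, row, col);  // one load per thread
        Bs[row][col] = GetElement(Bsub, row, col);
        __syncthreads();                            // tiles fully loaded
        for (int e = 0; e < BLOCK_SIZE; ++e)
            acc += As[row][e] * Bs[e][col];
        __syncthreads();                            // done before next load
    }
    SetElement(Csub, row, col, acc);
}

// Hypothetical helper: device copy of h (same shape), optionally with data.
static Matrix ToDevice(const Matrix h, bool copy)
{
    Matrix d = h;
    size_t bytes = (size_t)h.width * h.height * sizeof(float);
    cudaMalloc((void**)&d.elements, bytes);
    if (copy) cudaMemcpy(d.elements, h.elements, bytes, cudaMemcpyHostToDevice);
    return d;
}

// Hypothetical helper: multiplies two n x n matrices on the GPU and returns
// the largest absolute difference from a CPU reference, or -1 on a CUDA error.
// n must be a multiple of BLOCK_SIZE.
float RunMatMul(int n)
{
    size_t count = (size_t)n * n, bytes = count * sizeof(float);
    Matrix A = {n, n, n, (float*)malloc(bytes)};
    Matrix B = {n, n, n, (float*)malloc(bytes)};
    Matrix C = {n, n, n, (float*)malloc(bytes)};
    for (size_t i = 0; i < count; ++i) {
        A.elements[i] = (float)(i % 7);
        B.elements[i] = (float)(i % 5);
    }
    Matrix dA = ToDevice(A, true), dB = ToDevice(B, true), dC = ToDevice(C, false);

    dim3 dimBlock(BLOCK_SIZE, BLOCK_SIZE);
    dim3 dimGrid(B.width / dimBlock.x, A.height / dimBlock.y);
    MatMulKernel<<<dimGrid, dimBlock>>>(dA, dB, dC);
    cudaError_t err = cudaMemcpy(C.elements, dC.elements, bytes, cudaMemcpyDeviceToHost);

    float maxDiff = (err == cudaSuccess) ? 0.0f : -1.0f;
    for (int r = 0; r < n && err == cudaSuccess; ++r)
        for (int c = 0; c < n; ++c) {
            float ref = 0.0f;
            for (int k = 0; k < n; ++k)
                ref += A.elements[r * A.stride + k] * B.elements[k * B.stride + c];
            maxDiff = fmaxf(maxDiff, fabsf(ref - C.elements[r * C.stride + c]));
        }
    cudaFree(dA.elements); cudaFree(dB.elements); cudaFree(dC.elements);
    free(A.elements); free(B.elements); free(C.elements);
    return maxDiff;
}

int main()
{
    float diff = RunMatMul(64);   // 64 is a multiple of BLOCK_SIZE
    printf("max |GPU - CPU| = %g\n", diff);
    return diff < 0.0f ? 1 : 0;
}
```

On choosing parameters: with BLOCK_SIZE 16 each block has 256 threads and uses 2 KB of shared memory for the two tiles. Compute capability 2.x allows up to 1024 threads per block, so BLOCK_SIZE 32 is also legal there. Which one is faster is something to measure on your own card rather than assume.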

#2
Posted 06/21/2011 04:10 PM   
Thanks! That helps me a lot!

#3
Posted 06/22/2011 06:44 AM   