No, because the size must be known at compile time.
What you can do is use the dynamic shared memory allocation feature. Declare a_d as follows:
[font="Courier New"]extern __shared__ float a_d[];[/font]
and pass the required size in bytes as the third configuration parameter of the kernel invocation:
[font="Courier New"]my_kernel<<<gridsize, blocksize, blocksize.x*blocksize.y*blocksize.z*sizeof(float)>>>();[/font]
Note that this only works for a single variable-size array.
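To make that concrete, here is a minimal sketch of the pattern (the kernel name reverse_kernel, the array size n, and the reversal operation are just illustrative assumptions, not anything from the original question):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Size is NOT given here; it comes from the third launch parameter.
extern __shared__ float a_d[];

// Illustrative kernel: reverse one block's worth of floats
// by staging them in dynamically allocated shared memory.
__global__ void reverse_kernel(float *data, int n)
{
    int t = threadIdx.x;
    a_d[t] = data[t];          // stage input in shared memory
    __syncthreads();           // wait until all elements are staged
    data[t] = a_d[n - 1 - t];  // write back in reversed order
}

int main()
{
    const int n = 64;
    float h[n], *d;
    for (int i = 0; i < n; ++i) h[i] = (float)i;

    cudaMalloc(&d, n * sizeof(float));
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);

    // Third configuration parameter = bytes of dynamic shared memory.
    reverse_kernel<<<1, n, n * sizeof(float)>>>(d, n);

    cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("h[0] = %f, h[%d] = %f\n", h[0], n - 1, h[n - 1]);
    cudaFree(d);
    return 0;
}
```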
Hmmmm… continuation of the equation. Something is still not right.
So, I set in my main code:

dim3 dimGrid2(129,129), dimBlock2(1,1,33);
Kernel<<<dimGrid2, dimBlock2, 33*sizeof(REAL)>>>
…
__global__ void Kernel(REAL *COEF, REAL *P, REAL *PN, REAL *RHS,
                       int imax, int jmax, int kmax, REAL *cost)
{
    extern __shared__ REAL COEFs0[];
    extern __shared__ REAL COEFs1[];
    extern __shared__ REAL COEFs2[];
}
so that I can dynamically allocate arrays of the same size in the kernel.
Oh. I didn’t find the third parameter’s usage in the book “Programming Massively Parallel Processors”, but I looked it up again in NVIDIA’s documentation and now I know there is even a fourth parameter you can set… Thanks very much!
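One caution about the three-array snippet above: all extern __shared__ declarations in a kernel alias the same starting address, so COEFs0, COEFs1, and COEFs2 would overlap. The usual workaround is to request one buffer big enough for all three and carve it up with pointer offsets. A sketch under the assumptions of the code above (three arrays of 33 REALs each; the name smem and the kernel signature are illustrative):

```cuda
typedef float REAL;

// One extern array per kernel; its size comes from the launch.
extern __shared__ REAL smem[];

__global__ void Kernel(int n)  // n = 33 in the example above
{
    // Partition the single buffer into three non-overlapping views.
    REAL *COEFs0 = smem;          // elements [0,   n)
    REAL *COEFs1 = smem + n;      // elements [n,  2n)
    REAL *COEFs2 = smem + 2 * n;  // elements [2n, 3n)

    // ... use COEFs0/COEFs1/COEFs2 as independent arrays ...
    (void)COEFs0; (void)COEFs1; (void)COEFs2;
}

// Launch must then reserve room for all three arrays:
// Kernel<<<dimGrid2, dimBlock2, 3 * 33 * sizeof(REAL)>>>(33);
```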