Creation of dynamic arrays in shared memory

Hello,
I have a query regarding the creation and usage of dynamically allocated arrays in shared memory for CUDA.
I have two arrays, xmap and ymap, of size gradCol + 1 and gradRow + 1 respectively, so I am invoking the CUDA kernel with the following syntax:
Computegradientcuda<<<tblocks, threadsperblock, (gradRow + gradCol + 2) * sizeof(int)>>>(
    dev_mag_ptr, dev_ang_ptr, dev_src_ptr,
    colSrc, rowSrc, ChnSrc,
    gradCol, gradRow, gradChn,
    scale, bins, prows, pcols,
    xmap, ymap);
Inside the kernel, I need to copy these two arrays (xmap and ymap) into shared memory.
The kernel code is as follows:
extern __shared__ int s[];
int *x_map = s;
int *y_map = (int *)&x_map[gradCol + 1];

if (threadIdx.x == 0)
{
    for (int x = -1; x < gradCol + 1; x++)
    {
        x_map[x] = xmap[x];
        printf("x_map[-1] = %d\n", x_map[-1]);
    }
}

for (int y = -1; y < gradRow + 1; y++)
{
    y_map[y] = ymap[y];
}

__syncthreads();

My code hangs at this point. Can anyone tell me what I am doing wrong here?
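
For comparison, here is a minimal self-contained sketch of the same copy using zero-based indices. In the code above both loops start at index -1, so x_map[-1] refers to the element before the start of the shared array, and each loop touches gradCol + 2 (or gradRow + 2) elements rather than the gradCol + 1 (or gradRow + 1) that were allocated. The sketch assumes that xmap and ymap point to device memory (for example, allocated with cudaMalloc). The names copyMapsKernel and copyMapsDemo, the test values, and the 2 x 64 launch shape are invented for illustration and are not part of the original program. Whether this is the only cause of the hang is not something the sketch can confirm.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: copies two lookup tables from global memory into one
// dynamically sized shared-memory buffer, then echoes them back to global
// memory so the host can check the copy.
__global__ void copyMapsKernel(const int *xmap, const int *ymap,
                               int nx, int ny, int *out)
{
    extern __shared__ int s[];   // size comes from the third launch parameter
    int *x_map = s;              // first nx ints
    int *y_map = s + nx;         // next ny ints

    // Zero-based, block-stride loops: every thread of the block helps copy.
    for (int i = threadIdx.x; i < nx; i += blockDim.x) x_map[i] = xmap[i];
    for (int i = threadIdx.x; i < ny; i += blockDim.x) y_map[i] = ymap[i];

    // Reached unconditionally by all threads of the block.
    __syncthreads();

    // Block 0 writes the shared copy back out for verification.
    if (blockIdx.x == 0)
        for (int i = threadIdx.x; i < nx + ny; i += blockDim.x) out[i] = s[i];
}

// Hypothetical host helper: returns the number of mismatched elements,
// or -1 if any CUDA call reported an error.
int copyMapsDemo(int gradCol, int gradRow)
{
    const int nx = gradCol + 1, ny = gradRow + 1;
    const size_t bytes = (nx + ny) * sizeof(int);
    std::vector<int> hx(nx), hy(ny), hout(nx + ny, -1);
    for (int i = 0; i < nx; ++i) hx[i] = 10 * i;      // arbitrary test data
    for (int i = 0; i < ny; ++i) hy[i] = 7 * i + 1;

    int *dx = nullptr, *dy = nullptr, *dout = nullptr;
    bool ok = cudaMalloc((void **)&dx, nx * sizeof(int)) == cudaSuccess
           && cudaMalloc((void **)&dy, ny * sizeof(int)) == cudaSuccess
           && cudaMalloc((void **)&dout, bytes) == cudaSuccess
           && cudaMemcpy(dx, hx.data(), nx * sizeof(int),
                         cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dy, hy.data(), ny * sizeof(int),
                         cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        // Third launch parameter: bytes of dynamic shared memory per block.
        copyMapsKernel<<<2, 64, bytes>>>(dx, dy, nx, ny, dout);
        ok = cudaGetLastError() == cudaSuccess
          && cudaDeviceSynchronize() == cudaSuccess
          && cudaMemcpy(hout.data(), dout, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dx);
    cudaFree(dy);
    cudaFree(dout);
    if (!ok) return -1;

    int mismatches = 0;
    for (int i = 0; i < nx; ++i) mismatches += (hout[i] != hx[i]);
    for (int i = 0; i < ny; ++i) mismatches += (hout[nx + i] != hy[i]);
    return mismatches;
}
```

Calling copyMapsDemo(5, 3) from a main function and checking that it returns 0 exercises the copy. Checking cudaGetLastError() and the return value of cudaDeviceSynchronize() after the launch, as the helper does, is also a quick way to see whether a kernel failed rather than hung.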