Problem with nppiFilter_8u: how to use nppiFilter_8u

Hi everyone,

I need to use nppiFilter_8u in my application. As I understand it, this function applies a square filter to an image. In practice, however, when I use a square filter, the output of the function is just noise.
I changed the dimensions of the filter and found that a 1-dimensional filter works correctly, but only on square images. On a non-square image the output is shifted up, and the part of the image that was cut off appears at the bottom of the filtered output.

The image, output, and filter buffers are allocated on the GPU with “nppiMalloc_8u_C1” from the NPP library, and the data (image and filter) are copied to the GPU with “cudaMemcpy2D” from CUDA.

The filter is 3x3:
[1,2,1
0,0,0
-1,-2,-1]

The image is 256x256, char type (uint8).

The output is 256x256, char type (uint8).

Please help me.
Thanks.

The kernels do not automatically handle border data expansion. Thus your ROI needs to be reduced according to kernel size.