D3D11 With NVDEC_VideoDecoder

I am studying Media SDK and NVDEC in order to write a video player using D3D11.
I'm glad to report that, although Media SDK does not work with D3D11 on Windows 7, NVDEC works perfectly there.
When I moved to Windows 10, however, I came up with the idea of creating an ID3D11Texture2D in NV12 format.
With that, Media SDK reaches 1000 fps (render mode, 1080p), which is faster than NVDEC at 500 fps.
I suspect the "nv12torgba" kernel function is to blame.
So I want to know how to create an NV12 ID3D11Texture2D and register it as a CUDA resource,
and how to copy the decoded data into that CUDA resource.
The width and pitch of the NV12 format are hard to describe because there are 2 planes.
Can you write a demo?
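
To make the two-plane problem concrete, here is a minimal sketch of how I understand the NV12 layout in a pitched decode buffer. It only computes the per-plane copy parameters and calls no CUDA or D3D11 API. The names PlaneCopy, Nv12Layout and describeNv12 are hypothetical, and it assumes the interleaved UV plane starts at pitch * surfaceHeight, where surfaceHeight is the allocated (possibly aligned) luma height; please correct me if the decoder lays it out differently.

```cpp
#include <cassert>
#include <cstddef>

// Copy parameters for one plane of an NV12 frame held in a single pitched buffer.
struct PlaneCopy {
    std::size_t srcOffset;     // byte offset of the plane from the frame base
    std::size_t widthInBytes;  // bytes to copy per row
    std::size_t rows;          // number of rows in the plane
};

// NV12 has two planes: full-resolution Y, then half-height interleaved UV.
struct Nv12Layout {
    PlaneCopy luma;
    PlaneCopy chroma;
};

// Hypothetical helper. Assumption: the chroma plane begins at
// pitch * surfaceHeight and both planes share the same pitch.
Nv12Layout describeNv12(std::size_t width, std::size_t height,
                        std::size_t pitch, std::size_t surfaceHeight) {
    assert(pitch >= width);           // a row must fit inside the pitch
    assert(surfaceHeight >= height);  // allocation covers the visible rows

    Nv12Layout layout;

    // Y plane: one byte per pixel, `height` rows.
    layout.luma = {0, width, height};

    // UV plane: one U byte and one V byte per 2x2 pixel block, so each row
    // holds ceil(width / 2) pairs of 2 bytes and there are ceil(height / 2) rows.
    std::size_t chromaRowBytes = ((width + 1) / 2) * 2;
    layout.chroma = {pitch * surfaceHeight, chromaRowBytes, (height + 1) / 2};

    return layout;
}
```

If this is right, each PlaneCopy would correspond to one 2D copy with the shared pitch as the source pitch (for example the srcPitch, WidthInBytes and Height fields of CUDA_MEMCPY2D), so a frame needs two copies rather than one.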

Another question:
“cuGraphicsSubResourceGetMappedArray” works well and I can get a CUarray,
but “cuGraphicsResourceGetMappedPointer” fails and returns CUDA_ERROR_INVALID_HANDLE.
