Hi,
I’m trying to understand the pipeline by which a decoded video frame is displayed.
My understanding is:
1. cuvidMapVideoFrame uses the pictureIndex of the decoded frame to fetch a device pointer (pDecodedFrame[active_field]) and pitch.
2. A CUDA kernel is launched to convert the NV12 pDecodedFrame[active_field] into the pre-allocated g_pRgba device array.
3. cuMemcpy2D is used to copy g_pRgba to g_backBufferArray, which is mapped to pTexture_[active_field] (a 2D texture).
4. context->CopyResource is used to copy from pTexture_[active_field] to pBackBuffer (which is buffer 0 of the swap chain).
5. The swap chain then presents the next buffer.
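To make the question concrete, here's roughly how I think those steps boil down to API calls. This is a simplified sketch with error checking omitted; names like g_pRgba, g_backBufferArray, pTexture_ and NV12ToARGB are from my own app, and the grid/block dimensions are assumed.

```cuda
// Step 1: map the decoded frame -> device pointer + pitch.
CUdeviceptr pDecodedFrame = 0;
unsigned int nDecodedPitch = 0;
CUVIDPROCPARAMS procParams = {0};
procParams.progressive_frame = 1;
cuvidMapVideoFrame(hDecoder, pictureIndex, &pDecodedFrame,
                   &nDecodedPitch, &procParams);

// Step 2: color-convert NV12 -> RGBA into the pre-allocated device buffer.
NV12ToARGB<<<grid, block>>>((unsigned char *)pDecodedFrame, nDecodedPitch,
                            g_pRgba, rgbaPitch, width, height);

// Step 3: copy the linear RGBA buffer into the CUDA array that is mapped
// to the D3D11 texture (g_backBufferArray came from
// cuGraphicsSubResourceGetMappedArray while the resource was mapped).
CUDA_MEMCPY2D cpy = {0};
cpy.srcMemoryType = CU_MEMORYTYPE_DEVICE;
cpy.srcDevice     = (CUdeviceptr)g_pRgba;
cpy.srcPitch      = rgbaPitch;
cpy.dstMemoryType = CU_MEMORYTYPE_ARRAY;
cpy.dstArray      = g_backBufferArray;
cpy.WidthInBytes  = width * 4;   // RGBA, 4 bytes per pixel
cpy.Height        = height;
cuMemcpy2D(&cpy);

cuvidUnmapVideoFrame(hDecoder, pDecodedFrame);

// Steps 4 & 5, on the D3D11 side: copy the texture into the back buffer,
// then present.
context->CopyResource(pBackBuffer, pTexture_[active_field]);
swapChain->Present(1, 0);
```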
I don’t understand why steps 3 & 4 are necessary. Why can’t the CUDA kernel write directly to the target back buffer/texture? It seems like unnecessary copying. I’m sure there’s a good reason; I’m just new to CUDA and Direct3D.
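For reference, here's what I imagined "writing directly" would look like: registering the texture for surface load/store and having the conversion kernel write through a surface object, skipping the intermediate g_pRgba buffer and the cuMemcpy2D. Everything here is speculative on my part (the NV12ToSurface kernel is hypothetical, and I may be missing a reason this doesn't work with a swap-chain back buffer):

```cuda
// Speculative sketch: let the kernel write straight into the D3D11 texture.
// Register the texture once, with surface load/store enabled.
cudaGraphicsResource_t res;
cudaGraphicsD3D11RegisterResource(&res, pTexture_[active_field],
                                  cudaGraphicsRegisterFlagsSurfaceLoadStore);

// Per frame: map the resource and wrap its array in a surface object.
cudaGraphicsMapResources(1, &res, 0);
cudaArray_t arr;
cudaGraphicsSubResourceGetMappedArray(&arr, res, 0, 0);

cudaResourceDesc rd = {};
rd.resType = cudaResourceTypeArray;
rd.res.array.array = arr;
cudaSurfaceObject_t surf;
cudaCreateSurfaceObject(&surf, &rd);

// Hypothetical kernel that does the NV12->RGBA conversion and writes each
// pixel with surf2Dwrite(rgba, surf, x * 4, y) -- no g_pRgba, no cuMemcpy2D.
NV12ToSurface<<<grid, block>>>((unsigned char *)pDecodedFrame, nDecodedPitch,
                               surf, width, height);

cudaDestroySurfaceObject(surf);
cudaGraphicsUnmapResources(1, &res, 0);
```

If something like this is possible, I'd love to know why the sample goes through the two extra copies instead.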
It’d be great if there were a bare-bones C video-decode D3D11 example. All the C++ object orientation makes it difficult to see the essential sequence and flow of data between the key API calls.
Cheers, Wayne.