to transfer the YV12 pixels to an unsigned char* buffer, which I then pass to the EncodeFrame function with
something like this:
efparams.picBuf = buf;
HRESULT hr = NVEncodeFrame(pCudaEncoder, &efparams, 0, NULL);
Since the data is already on the GPU anyway, I thought I would get a speedup by setting the NVVE_DEVICE_MEMORY_INPUT flag
and passing d_buf directly to NVEncodeFrame as pData, as described in the docs:
HRESULT hr = NVEncodeFrame(pCudaEncoder, &efparams, 0, d_buf);
But I only get rubbish then. Any ideas?
It seems that I haven’t set the context right. Somewhere buried deep in the docs I found this:
“Device Context Lock parameter must also be set if device memory input is enabled. Context lock should be created from cuvidCtxLockCreate API available in NVCUVID.”
I am not familiar with the driver API. How do I get the current context when using the runtime API, so that I can pass it correctly to the encoder?
This crashes. When I comment out the line with SetParamValue(encoder->pCudaEncoder, NVVE_DEVICE_CTX_LOCK, g_CtxLock), it doesn’t crash, but then the memory passed to the encoder is not the memory that I filled.
cuvidCtxLockCreate() takes a “CUcontext” as its 2nd argument. You are not passing it correctly. Did you not get a compiler warning about it?
The SetParamValue() function also has a problem. It is implemented only in the sample app; it is not a library call.
Just disable the “printing” done in this function. It has a bug: it indexes past the end of an array while printing. Just fix it (hope someone from NV is listening).
It overflows for DEVICE_MEM_INPUT (44) and for the 45th one. NV needs to fix it. Well, technically it will work for their app… up to them.
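If you would rather keep the printing than disable it, the usual fix is to bounds-check the index before looking up the name. The following is only a minimal sketch of that idea: the table contents and the names kParamNames, param_name and format_param are made up for illustration and are not the SDK sample's actual code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical name table. The real sample keeps its own, longer list;
   these entries are placeholders only. */
static const char *const kParamNames[] = {
    "PARAM_A", "PARAM_B", "PARAM_C"
};

#define PARAM_NAME_COUNT (sizeof(kParamNames) / sizeof(kParamNames[0]))

/* Return a printable name for a parameter index without ever reading
   past the end of the table. */
const char *param_name(size_t index)
{
    return (index < PARAM_NAME_COUNT) ? kParamNames[index]
                                      : "<unnamed parameter>";
}

/* Format "index (name)" into out; returns the value of snprintf. */
int format_param(char *out, size_t out_size, size_t index)
{
    return snprintf(out, out_size, "%lu (%s)",
                    (unsigned long)index, param_name(index));
}
```

With a guard like this, an index that has no entry in the table (such as 44 in the case above) prints a placeholder instead of reading whatever lies behind the array.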
cuInit() is required for “NVCreateHWEncoder()” to succeed. Otherwise, creating the context actually fails.
One needs to lock the context before cudaMalloc and cudaMemcpy, and unlock it afterwards.
It is better to pop the created context (as you had suggested) to make it floating. We think this does not really matter, because “ctxLock()” locks and pushes the related context automatically, but it is better to have it in place.
After all this, the API calls succeed, but we still get “black” frames. However, even the original code (which does NOT use device memory input) gives us only “black” frames.
I think I got the CUcontext stuff right now; at least there are no crashes. (I also changed the printing code you mentioned.) I have set the NVVE_DEVICE_CTX_LOCK option successfully, without any crashes so far. Encoding several videos in parallel in different threads also works. With host memory, everything works as expected here. However, when NVVE_DEVICE_MEMORY_INPUT is set, I get black frames. Maybe the doc isn’t right about passing the pixels as the last argument, as in NVEncodeFrame(pCudaEncoder, &efparams, 0, d_buf)?
Or maybe it is not expecting YV12 but RGB in this case?
Oh! I am surprised to see that the original SDK code worked fine for you,
because I am getting only black frames here… even without DEVICE_MEM_INPUT.
I also see that the binary for “cudaEncode” is not shipped in the “bin” directory of the SDK. I thought they did this on purpose.
I am using the 3.2RC SDK code. Which one are you using? I would be interested to know your configuration.
This is what I am using:
VIDEO_SOURCE_FILE “plush_480p_60fr.yuv”
VIDEO_CONFIG_FILE “704x480-h264.cfg”
VIDEO_OUTPUT_FILE “plush_480p_60fr.264”
Format is YV12
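For comparing buffer sizes and plane offsets between the host and device paths, here is a small sketch of how a tightly packed YV12 frame is laid out: a full-resolution Y plane, followed by the V plane and then the U plane, each at half resolution in both directions. It assumes even dimensions and no row padding (pitch equal to width). The 704x480 size in the usage note is only inferred from the config file name above, and Yv12Layout and yv12_layout are names invented for this example. Whether the encoder expects a pitched layout for device memory input is not something this sketch answers.

```c
#include <stddef.h>

/* Byte sizes and offsets of the three planes in a packed YV12 frame. */
typedef struct {
    size_t y_size;    /* bytes in the Y (luma) plane             */
    size_t c_size;    /* bytes in each chroma plane (V and U)    */
    size_t v_offset;  /* V comes directly after Y in YV12        */
    size_t u_offset;  /* U comes after V                         */
    size_t total;     /* bytes in the whole frame                */
} Yv12Layout;

/* Compute the layout for a frame with even width and height and
   no padding between rows. */
Yv12Layout yv12_layout(size_t width, size_t height)
{
    Yv12Layout l;
    l.y_size   = width * height;
    l.c_size   = (width / 2) * (height / 2);
    l.v_offset = l.y_size;
    l.u_offset = l.y_size + l.c_size;
    l.total    = l.y_size + 2 * l.c_size;
    return l;
}
```

For a 704x480 frame, yv12_layout(704, 480) gives a total of 506880 bytes, with V starting at byte 337920 and U at byte 422400. If the buffer passed to the encoder is smaller than that, or the chroma planes are in the other order (I420 puts U before V), the output will not match the input.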