CUDA and texture arrays: cannot get my GL_TEXTURE_2D_ARRAY to work with CUDA

heyhey,

this really ruined my weekend and I’m probably missing the next deadline…

I wanted to store all my shadow maps in a texture array. First problem: CUDA does not accept GL_DEPTH… textures, so I copied the whole array layer by layer to an R32F target.

Next, I need to access the texture in CUDA, and I can't get that to work. I register it like this:

cutilSafeCall( cudaGraphicsGLRegisterImage( &m_pCudaResource, GetOpenGLID(), GL_TEXTURE_2D_ARRAY, cudaGraphicsRegisterFlagsNone ) ); // works

I read in the CUDA C Programming Guide, Chapter 3 “Layered Textures”: “A layered texture can only be bound to a CUDA array created by calling cudaMalloc3DArray() with the cudaArrayLayered flag (and a height of zero for one-dimensional layered texture)”.

At first I ignored that and did:

cutilSafeCall( cudaGraphicsMapResources( 1, &m_pCudaResource, 0 ) );
cutilSafeCall( cudaGraphicsSubResourceGetMappedArray( &pMappedPtr, m_pCudaResource, 0, 0 ) );

But in the CUDA kernel I only ever got the first layer when reading with “tex2DLayered”, regardless of the layer index passed as the 4th parameter…

So I needed another copy of my depth textures :(

// some code left out for clarity...

cutilSafeCall( cudaMalloc3DArray( &m_pCudaMemory, &channelDesc, make_cudaExtent( GetWidth(), GetHeight(), GetNumLayers() ), cudaArrayLayered ) );
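For reference, here is a minimal sketch of how an array allocated with cudaArrayLayered can be filled and then read per layer with tex2DLayered. It is not the code from this post: it assumes the texture object API (cudaCreateTextureObject) rather than texture references, fills the layers from host memory instead of OpenGL, and the names readLayers and layeredRoundTrip are made up for illustration.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Reads texel (0, 0) of every layer into out[layer].
__global__ void readLayers(cudaTextureObject_t tex, float* out, int numLayers)
{
    int layer = blockIdx.x * blockDim.x + threadIdx.x;
    if (layer < numLayers)
        out[layer] = tex2DLayered<float>(tex, 0.5f, 0.5f, layer);
}

// Hypothetical helper: fills layer i with the value i, reads one texel per
// layer back through a texture object, and reports whether they match.
// numLayers is assumed to fit in a single thread block.
bool layeredRoundTrip(int width, int height, int numLayers)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t layered = nullptr;
    if (cudaMalloc3DArray(&layered, &desc,
                          make_cudaExtent(width, height, numLayers),
                          cudaArrayLayered) != cudaSuccess)
        return false;

    // Host data laid out layer after layer.
    std::vector<float> host(size_t(width) * height * numLayers);
    for (int l = 0; l < numLayers; ++l)
        for (int i = 0; i < width * height; ++i)
            host[size_t(l) * width * height + i] = float(l);

    // Upload all layers in one 3D copy (extent is in elements for arrays).
    cudaMemcpy3DParms p = {};
    p.srcPtr   = make_cudaPitchedPtr(host.data(), width * sizeof(float), width, height);
    p.dstArray = layered;
    p.extent   = make_cudaExtent(width, height, numLayers);
    p.kind     = cudaMemcpyHostToDevice;
    bool ok = cudaMemcpy3D(&p) == cudaSuccess;

    // Describe the layered array as a texture with unnormalized coordinates.
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = layered;
    cudaTextureDesc td = {};
    td.addressMode[0] = cudaAddressModeClamp;
    td.addressMode[1] = cudaAddressModeClamp;
    td.filterMode = cudaFilterModePoint;
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 0;
    cudaTextureObject_t tex = 0;
    ok = ok && cudaCreateTextureObject(&tex, &res, &td, nullptr) == cudaSuccess;

    float* dOut = nullptr;
    ok = ok && cudaMalloc(&dOut, numLayers * sizeof(float)) == cudaSuccess;
    std::vector<float> out(numLayers, -1.0f);
    if (ok) {
        readLayers<<<1, numLayers>>>(tex, dOut, numLayers);
        ok = cudaMemcpy(out.data(), dOut, numLayers * sizeof(float),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    for (int l = 0; ok && l < numLayers; ++l)
        ok = (out[l] == float(l));

    if (tex) cudaDestroyTextureObject(tex);
    cudaFree(dOut);
    cudaFreeArray(layered);
    return ok;
}
```

If each layer reads back its own value here, the layered allocation and the tex2DLayered lookup are fine on their own, which would point at the interop mapping or the copy as the failing step.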

Now, each frame, I need to copy my dummy OpenGL texture array to the CUDA-allocated memory:

// mapping:
cutilSafeCall( cudaGraphicsMapResources( 1, &m_pCudaResource, 0 ) );
cudaArray* pMappedPtr = 0;
cutilSafeCall( cudaGraphicsSubResourceGetMappedArray( &pMappedPtr, m_pCudaResource, 0, 0 ) );

// copying:
cudaMemcpy3DParms copyParams = { 0 };
copyParams.srcPos = make_cudaPos( 0, 0, 0 );
copyParams.dstPos = make_cudaPos( 0, 0, 0 );
copyParams.srcArray = pMappedPtr;
copyParams.dstArray = m_pCudaMemory;
copyParams.extent = make_cudaExtent( GetWidth(), GetHeight(), GetNumLayers() );
copyParams.kind = cudaMemcpyDeviceToDevice;
cutilSafeCall( cudaMemcpy3D( &copyParams ) ); // <--- invalid argument here

Unfortunately, this only results in an “invalid argument”.

I could not find any decent example of a device-to-device copy for mapped GL_TEXTURE_2D_ARRAY textures…

Can anybody help me out? Please, I’m desperate.

I suspect that cudaGraphicsSubResourceGetMappedArray maps exactly one layer, so a slice-by-slice copy snippet would be appreciated, too (although a single 3D copy of the whole array would be better).

Thanks in advance

zqueezy

that’s so typical… once you ask a question that goes beyond “why is my malloc not working”, you won’t get an answer.
In my opinion this “new feature” of layered textures is completely useless if it’s incompatible with GL_TEXTURE_2D_ARRAY.
Thanks, NVIDIA, you made my day.

Actually, once a question involves both OpenGL and CUDA at the same time, it narrows the number of experts who could answer it rather significantly.
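On the slice-by-slice copy asked about above, a rough sketch could look like the following. It is untested against OpenGL interop: the sources here are ordinary 2D CUDA arrays standing in for the mapped layers, and copyLayers is a made-up helper name. The idea is to copy one slice per call, with an extent depth of 1 and the destination z position selecting the layer.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: copies one 2D source array per layer into a layered
// destination array. Each srcLayers[i] is assumed to be a 2D cudaArray with
// the same element type and dimensions as one layer of dstLayered.
cudaError_t copyLayers(const std::vector<cudaArray_t>& srcLayers,
                       cudaArray_t dstLayered, size_t width, size_t height)
{
    for (size_t layer = 0; layer < srcLayers.size(); ++layer) {
        cudaMemcpy3DParms p = {};
        p.srcArray = srcLayers[layer];
        p.srcPos   = make_cudaPos(0, 0, 0);
        p.dstArray = dstLayered;
        p.dstPos   = make_cudaPos(0, 0, layer);          // z selects the destination layer
        p.extent   = make_cudaExtent(width, height, 1);  // one slice per copy
        p.kind     = cudaMemcpyDeviceToDevice;
        cudaError_t err = cudaMemcpy3D(&p);
        if (err != cudaSuccess)
            return err;
    }
    return cudaSuccess;
}
```

In the interop case, each source array would presumably come from cudaGraphicsSubResourceGetMappedArray called with the layer number as its arrayIndex argument while the resource is mapped. That reading of arrayIndex is an assumption based on the parameter's documented purpose, not something verified here.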