Hi,
I am testing a collider using OptiX and noticed a huge bottleneck caused by
several under-the-hood deallocations/allocations and host-device
transfers.
This happens every time I launch the tracer, and OptiX seems to be changing contexts.
Here, it is pointed out that an OptiX context and a CUDA context cannot share
memory on the actual GPU hardware, which leads to the copying.
The author of that post mentions a workaround using an OpenGL buffer, but is this the only way?
This is how I am creating the buffers:
optix::Context context;
optix::Buffer buffer1, buffer2, buffer3;

void init()
{
    context = optix::Context::create();
    //...
    //...
    buffer1 = context->createBuffer(RT_BUFFER_INPUT, RT_FORMAT_UNSIGNED_INT);
    context["buffer1"]->setBuffer(buffer1);

    buffer2 = context->createBuffer(RT_BUFFER_OUTPUT, RT_FORMAT_FLOAT4);
    context["buffer2"]->setBuffer(buffer2);
    buffer2->setSize(size2);

    buffer3 = context->createBuffer(RT_BUFFER_INPUT_OUTPUT, RT_FORMAT_FLOAT);
    context["buffer3"]->setBuffer(buffer3);
    buffer3->setSize(size3);
}
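// And this is how I get/set the data:
void update(unsigned int *usrData1, float *userData3, unsigned int size3)
{
    // Point buffer1 at memory that already lives on device 0.
    buffer1->setDevicePointer(0, usrData1);
    // Copy the new input into buffer3 (device to device).
    cudaMemcpyAsync(buffer3->getDevicePointer(0), userData3,
                    sizeof(float) * size3, cudaMemcpyDeviceToDevice);
}

float *result()
{
    // Device pointer to the output buffer on device 0.
    return (float *)buffer2->getDevicePointer(0);
}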
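In case it helps to reproduce the overhead, here is a minimal sketch of how the per-launch cost could be measured on the host. The helper name timeMs is hypothetical and is not part of OptiX or CUDA; it only uses the standard <chrono> clock, so it measures whatever has completed by the time the wrapped call returns.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not part of OptiX or CUDA): runs `work` once and
// returns the elapsed host wall-clock time in milliseconds.
double timeMs(const std::function<void()>& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

For example, wrapping the update() call and the tracer launch in the lambda passed to timeMs would give the host-side cost of one iteration. Any GPU work still in flight when the lambda returns would need a cudaDeviceSynchronize() inside the lambda to be counted.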
Thank you.