Hi all,
I’m encountering excessive setup times when baking per-vertex AO.
Basically, I have a camera program that reads from a vertex buffer and a normal buffer to get the position and normal of each vertex.
The output is a 1D buffer with an entry per vertex.
Setup time is measured by setting up the pipeline and then calling launch with a zero-size output.
As expected, the first render has a large setup time (187 s).
However, a subsequent render still takes about 42 s to set up.
Here’s the output:
Dummy launch took 187.112717s
Launch took 0.118060s for 270 vertices
Dummy launch took 41.918648s
Launch took 0.095409s for 270 vertices
Notice that the actual render is really fast, which is the reason I’m using OptiX for this instead of a “cube map per vertex” style AO render, but those setup times are killing me.
Here’s the code:
void StaticLightOptix::RenderVertexAO(CMeshRenderEntity* pMeshInst)
{
    FlushD3D();

    // grab the mesh ptr
    CMesh const * pMesh = pMeshInst->GetMeshPtr(0);
    uint32 numV = pMesh->GetMeshBuffers().mVertexCount;
    uint32 totalV = pMeshInst->CountTotalVertices();

    // create and bind an output buffer (one float per vertex)
    optix::Buffer buffer = mOptixContext->createBuffer(RT_BUFFER_OUTPUT, RT_FORMAT_FLOAT, totalV);
    mOptixContext["outAOBuffer"]->setBuffer(buffer);

    {
        // bind mesh parameters
        optix::Geometry geo = AddMesh(pMesh);
        mOptixContext["vertex_buffer"]->setBuffer(geo["vertex_buffer"]->getBuffer());
        mOptixContext["normal_buffer"]->setBuffer(geo["normal_buffer"]->getBuffer());
    }

    {
        // setup parameters
        mOptixContext["sqrt_occlusion_samples"]->setInt(6);
        mOptixContext["occlusion_distance"]->setFloat(4.f);

        // grab the transform for this mesh
        const CMatrix34* pTransform = pMeshInst->GetTransformPtr();
        mOptixContext["matrix_row_0"]->setFloat(pTransform->Get00(), pTransform->Get10(), pTransform->Get20(), pTransform->Get30());
        mOptixContext["matrix_row_1"]->setFloat(pTransform->Get01(), pTransform->Get11(), pTransform->Get21(), pTransform->Get31());
        mOptixContext["matrix_row_2"]->setFloat(pTransform->Get02(), pTransform->Get12(), pTransform->Get22(), pTransform->Get32());
    }

    try
    {
        {
            bpe::StopWatch elapsedTime;
            // dummy zero-size launch so that only the setup cost is measured
            mOptixContext->launch(kCamProg_VertexRelative, 0);
            // named msg so it does not shadow the optix::Buffer above
            char msg[256];
            sprintf(msg, "Dummy launch took %fs\r\n", elapsedTime.GetElapsedSeconds());
            OutputDebugStringA(msg);
        }

        {
            bpe::StopWatch elapsedTime;
            mOptixContext->launch(kCamProg_VertexRelative, numV);
            char msg[256];
            // %u because numV is unsigned (assuming uint32 is a 32-bit unsigned typedef)
            sprintf(msg, "Launch took %fs for %u vertices\r\n", elapsedTime.GetElapsedSeconds(), numV);
            OutputDebugStringA(msg);
        }

        // now copy the output buffer into the mesh AO buffer
        {
            float* pData = (float*)buffer->map();
            // invert!!
            for (uint32 i = 0; i < totalV; i++)
            {
                pData[i] = 1.0f - pData[i];
            }
            pMeshInst->CreatePerVertexData(EMeshInstanceVertexDataType::kBakedAO, CBuffer::EFormat::Float, totalV, pData);
            buffer->unmap();
        }
    }
    catch (optix::Exception& e)
    {
        CheckError(e.getErrorCode());
    }

    // unbind stuff
    mOptixContext->removeVariable(mOptixContext["vertex_buffer"]);
    mOptixContext->removeVariable(mOptixContext["normal_buffer"]);
    mOptixContext->removeVariable(mOptixContext["outAOBuffer"]);
}
Note that FlushD3D merely ensures that the D3D immediate context is flushed.
I call RenderVertexAO sequentially in a loop with no other functions between calls.
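The bpe::StopWatch class used for the timings is not shown in this post. In case it matters for interpreting the numbers, here is a minimal stand-in, assuming it does nothing more than record a monotonic clock value on construction and report the difference on request. The class below is a hypothetical sketch built on std::chrono, not the actual bpe implementation:

```cpp
#include <chrono>

// Hypothetical stand-in for bpe::StopWatch (the real class is not shown).
// Timing starts when the object is constructed.
class StopWatch
{
    // steady_clock is monotonic, so wall-clock adjustments cannot skew results
    using Clock = std::chrono::steady_clock;

public:
    StopWatch() : mStart(Clock::now()) {}

    // Seconds elapsed since construction
    double GetElapsedSeconds() const
    {
        return std::chrono::duration<double>(Clock::now() - mStart).count();
    }

private:
    Clock::time_point mStart;
};
```

Declaring one of these at the top of a scope, as the code above does for each launch, restarts the measurement for every call being timed.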
Any help would be appreciated.
– Martin