I’m porting an application from OptiX 5.1.0 to OptiX 6.0.0 and running into stack overflow errors with setMaxTraceDepth. If I disable RTX and use the old setStackSize function, it works. When I enable RTX and use setMaxTraceDepth/setMaxCallableProgramDepth, I keep getting stack overflows. At the moment I’m debugging with a minimal test case that casts a single ray per thread and may then cast a single shadow ray from the closest hit program (the code to iteratively cast more rays is still present). The code does not currently use callable programs.
I’ve tried numerous values from 0 to 31 for setMaxTraceDepth, and several combinations with setMaxCallableProgramDepth, but I keep getting stack overflow exceptions. If I use 32 for setMaxTraceDepth, I get the following error: “Encountered a rtcore error: m_exports->rtcPipelineCreate( context, pipelineOptions, compileOptions, modules, moduleCount, pipeline ) returned (1): Invalid value)”.
Looking at the usage report, I see the direct and continuation stack sizes for the various OptiX programs. I would expect the total stack size to be at least rayGenContinuationStack + closestHitContinuationStack * MaxTraceDepth, which would make the total stack larger than the stack size I specify directly in non-RTX mode.
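The estimate above can be written out as a small sketch. It only restates my guess at a lower bound; it is not a formula taken from the OptiX documentation, and the struct, field, and function names are hypothetical. The byte values in the usage note are made up for illustration, not read from a real usage report.

```cpp
#include <cstddef>

// Hypothetical container for per-program continuation stack sizes in bytes,
// as one might copy them by hand from the usage report.
struct StackSizes {
    std::size_t rayGenContinuation;      // ray generation program
    std::size_t closestHitContinuation;  // closest hit program
};

// Guessed lower bound from the post:
//   rayGenContinuationStack + closestHitContinuationStack * maxTraceDepth
// This is an assumption about how the total might be composed, not a
// documented OptiX computation.
constexpr std::size_t estimateContinuationStack(StackSizes s, unsigned maxTraceDepth) {
    return s.rayGenContinuation + s.closestHitContinuation * maxTraceDepth;
}
```

For example, with made-up sizes of 1024 bytes for ray generation and 512 bytes for closest hit, estimateContinuationStack(StackSizes{1024, 512}, 2) evaluates to 2048 bytes.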
Attempting to troubleshoot this has raised some questions:
1.) What conditions can generate a stack overflow exception? Is it only thrown when the program’s stack usage exceeds the allocated stack size, or could other conditions like requesting too much stack generate the exception?
2.) Are the per-program stack sizes combined with the max trace depth to produce a single total stack size that can be meaningfully compared with the legacy stack size?
3.) What does the rtcPipelineCreate error thrown when calling setMaxTraceDepth(32) signify?
4.) Does OptiX code running in RTX mode use the stack the same way as code running in non-RTX mode?
5.) Do you have any idea why the stack overflow exception keeps occurring regardless of the max trace depth? This code uses large ray payloads and requires a substantial stack, but it’s an iterative path tracer with no real recursion, and in non-RTX mode it stays within the stack bounds and runs fine.
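To illustrate what "no real recursion" means for the test case, here is a minimal host-side sketch (C++14 or later) that counts how deeply trace calls nest. It assumes the max trace depth limit bounds nested trace calls; that is my reading, not something confirmed here. Nothing in it is OptiX API: DepthTracker, traceShadowRay, traceRadianceRay, and maxNestingDepth are hypothetical stand-ins for the ray generation and closest hit programs.

```cpp
// Hypothetical counter for how deeply trace calls are nested.
struct DepthTracker {
    unsigned current = 0;  // nesting level of the trace call in flight
    unsigned maxSeen = 0;  // deepest nesting level observed so far
    constexpr void enter() { ++current; if (current > maxSeen) maxSeen = current; }
    constexpr void leave() { --current; }
};

// Stand-in for a shadow ray: traced from inside the closest hit program,
// and it traces nothing further itself.
constexpr void traceShadowRay(DepthTracker& t) {
    t.enter();
    t.leave();
}

// Stand-in for a radiance ray launched from ray generation; its closest hit
// program may cast one shadow ray before returning.
constexpr void traceRadianceRay(DepthTracker& t, bool castShadowRay) {
    t.enter();
    if (castShadowRay) traceShadowRay(t);
    t.leave();
}

// Iterative path tracer: ray generation loops over the bounces, so every
// bounce starts again at nesting level 1 instead of going one level deeper.
constexpr unsigned maxNestingDepth(unsigned bounces, bool castShadowRay) {
    DepthTracker t{};
    for (unsigned i = 0; i < bounces; ++i) traceRadianceRay(t, castShadowRay);
    return t.maxSeen;
}
```

In this model the deepest nesting is 2 (radiance ray plus shadow ray) no matter how many bounces the loop runs, which is why I would not expect the required trace depth to grow with path length.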
Thanks
CUDA 10.0, OptiX 6.0.0
Quadro P2000 (laptop), driver version 419.17
Windows 10, Visual Studio 2015