Problems with setMaxTraceDepth

I’m porting an application from OptiX 5.1.0 to OptiX 6.0.0 and running into stack overflow errors with setMaxTraceDepth. If I disable RTX and use the old setStackSize function, it works. When I enable RTX and use setMaxTraceDepth/setMaxCallableProgramDepth, I keep getting stack overflows. At the moment I’m debugging with a minimal test case that casts a single ray per thread, and then may cast a single shadow ray from the closest hit program (though code to iteratively cast more rays is still present). The code does not currently use callable programs.
I’ve tried numerous values from 0 to 31 for setMaxTraceDepth, and several combinations with setMaxCallableProgramDepth, but I keep getting stack overflow exceptions. If I use 32 for setMaxTraceDepth, I get the following error: “Encountered a rtcore error: m_exports->rtcPipelineCreate( context, pipelineOptions, compileOptions, modules, moduleCount, pipeline ) returned (1): Invalid value)”.
Looking at the usage report, I see the direct and continuation stack sizes for the various optix programs. I would think that the total stack size should be at least rayGenContinuationStack + closestHitContinuationStack * MaxTraceDepth, which would make the total stack larger than my directly-specified non-RTX stack.
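For reference, the lower bound guessed at above can be written out as a small sketch. The function name and the byte counts in the usage note are hypothetical, and this is only the poster's back-of-the-envelope estimate, not the formula OptiX actually uses to size the stack in RTX mode.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of OptiX): the estimate from the post,
//   rayGenContinuationStack + closestHitContinuationStack * MaxTraceDepth.
// Inputs are the per-program continuation stack sizes in bytes, as read
// from the usage report, plus the configured max trace depth.
constexpr std::size_t estimateContinuationStack(std::size_t rayGenBytes,
                                                std::size_t closestHitBytes,
                                                unsigned maxTraceDepth)
{
    return rayGenBytes + closestHitBytes * maxTraceDepth;
}
```

With made-up sizes of 1024 bytes for the ray generation program and 512 bytes for the closest hit program, estimateContinuationStack(1024, 512, 2) gives 2048 bytes, which is the kind of number one would compare against the legacy setStackSize value.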
Attempting to troubleshoot this has raised some questions:
1.) What conditions can generate a stack overflow exception? Is it only thrown when the program’s stack usage exceeds the allocated stack size, or could other conditions like requesting too much stack generate the exception?
2.) Are the various stack sizes computed from the optix programs used with the max trace depth to create a single total stack size which can be meaningfully compared with the legacy stack size?
3.) What does the rtcPipelineCreate error thrown when calling setMaxTraceDepth(32) signify?
4.) Does optix code running in RTX mode use the stack the same way code running in non-RTX mode does?
5.) Do you have any idea why the stack overflow exception keeps occurring regardless of the max trace depth? This code uses large ray payloads and requires a substantial stack, but it’s an iterative path tracer with no real recursion, and in non-RTX mode it stays within stack bounds and runs fine.

Thanks

CUDA 10.0, OptiX 6.0.0
Quadro P2000 (laptop), driver version 419.17
Windows 10, Visual Studio 2015

31 seems to be the maximum allowed value. See:
https://devtalk.nvidia.com/default/topic/1047924/optix/-solved-rtcore-link-error-after-45-animation-frames/post/5317965/#5317965

@bdr,

m1 is right; the maximum allowed value for setMaxTraceDepth() is 31. Same goes for setMaxCallableProgramDepth(). The default depth is 5 for both types. The invalid value error you’re getting is referring to 32 being out of bounds, but I think we could probably improve that error message.

It sounds like the default values should not be causing a stack overflow exception for you. Unless you have some hidden recursion or someone snuck secret callables into your code, based on your description, you should be able to turn down the values for both depths from the default. Would you be able to easily strip your code down to something minimal that causes the problem and share it with us?
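As a rough illustration of the limits described above, here is a minimal sketch of a host-side check that could run before the depth values are passed to setMaxTraceDepth() and setMaxCallableProgramDepth(). The constants 31 and 5 come from this thread; the helper names are hypothetical, and the way trace nesting is counted (a primary ray from ray generation is depth 1, a shadow ray traced from its closest hit program is depth 2) is an assumption rather than something confirmed here.

```cpp
#include <cassert>

// Limits as stated in this thread for OptiX 6.0 in RTX mode.
constexpr unsigned kMaxDepthLimit = 31; // maximum for both depth settings
constexpr unsigned kDefaultDepth  = 5;  // default for both depth settings

// Hypothetical helper: true if a depth value is within the allowed range.
constexpr bool isValidDepth(unsigned depth)
{
    return depth <= kMaxDepthLimit;
}

// Hypothetical helper: nesting depth needed by an iterative path tracer.
// Assumption: the primary ray counts as depth 1, and a shadow ray traced
// from the closest hit program adds one more level.
constexpr unsigned requiredTraceDepth(bool closestHitTracesShadowRay)
{
    return closestHitTracesShadowRay ? 2u : 1u;
}

// Intended use with the OptiX 6 C++ wrapper (not compiled here):
//   unsigned depth = requiredTraceDepth(true);
//   assert(isValidDepth(depth));
//   context->setMaxTraceDepth(depth);
//   context->setMaxCallableProgramDepth(0); // no callable programs in use
```

Under those assumptions, the test case from the first post would need a trace depth of 2, well below the default of 5, and a value of 32 would be rejected by the check before it reaches the pipeline.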

To answer your other questions:

1- Requesting too much stack should not throw an exception; it will just be disallowed with the cryptic error message you’re already getting.

2- Yes, the depth values in RTX mode are used to create an internal stack size in bytes that could be compared with the old stack size. We were trying to make it more intuitive and simpler, since it’s often hard to know how big your stack frame needs to be in bytes.

3- Answered above.

4- No, RTX mode stack usage is not the same as in the old code.

5- I don’t know yet why you’re getting the exception. It could be a bug in stack size computation on our end, and a very large payload could be a problem. If you can help us reproduce the problem, we will track it down.


David.

I was able to reproduce the problem by modifying the SDK’s optixPathTracer example to increase the stack usage. Other than enabling RTX and adding some error printing, the main modification is adding a 2048-element float array to PerRayData_pathtrace.
I can send the zipped source if you’d like.
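To give a sense of scale for that modification, here is a minimal sketch of how much the extra array adds to the payload. The struct below is a hypothetical stand-in, not the SDK's actual PerRayData_pathtrace layout, and the byte count assumes a 4-byte float.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a per-ray payload with the extra array added.
// The real PerRayData_pathtrace in the SDK has different members.
struct LargePayloadSketch
{
    float result[3];    // placeholder for the original payload fields
    int   depth;        // placeholder for the original payload fields
    float extra[2048];  // the added array described in the post
};

// Bytes contributed by the added array alone.
constexpr std::size_t kExtraBytes = sizeof(float) * 2048;

// The whole payload is at least as large as the added array.
static_assert(sizeof(LargePayloadSketch) >= kExtraBytes,
              "payload must contain the full extra array");
```

With 4-byte floats the array alone is 8192 bytes, so each program that keeps such a payload as a local variable needs at least that much stack for it.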

Awesome. Yes, please. I’ll try the modification manually, but it’d be great to have your copy for reference anyway.


David.