Nested rtTrace does not return?

I want to compute rays that penetrate several transparent materials. In my program, a launched ray hits a series of transparent walls, and a new ray is launched at the back of each wall. (I use a closest hit program.) To create the new ray, the material program calls ‘rtTrace’, so the calls are nested recursively.

When the number of transparent walls is greater than 8, ‘rtTrace’ does not return.

How can I fix this problem?

This might have run into a stack overflow exception.

To debug issues like these you can do the following:
Enable all exceptions:
m_context->setExceptionEnabled(RT_EXCEPTION_ALL, true);
Enable printing on all launch indices:
m_context->setPrintEnabled(true);
(When you know which launch index is failing, e.g. (x,y), you can limit the printout to that launch index for individual printf debugging:
m_context->setPrintLaunchIndex(x, y);)

Only enable the application code above when debugging! It affects the emitted PTX code, so do not leave it enabled when benchmarking!

Provide an exception program per entry-point which decodes and prints the exception code. I use this:

rtDeclareVariable(uint2, launchIndex, rtLaunchIndex, );

RT_PROGRAM void exception()
{
  const unsigned int code = rtGetExceptionCode();
  rtPrintf("Exception code 0x%X at (%d, %d)\n", code, launchIndex.x, launchIndex.y);
}

You could also output a special color to your output buffer in this case to highlight the error visually.

Then run your application from a command prompt and see if there are exceptions printed by this exception program.
You’ll find the enum values for the predefined exceptions in the optix_declarations.h header, for example RT_EXCEPTION_STACK_OVERFLOW = 0x3FC.

If that’s it, increase the stack size with m_context->setStackSize(size); until the exception no longer occurs,
or reduce your per ray payload size,
or change your algorithm from being recursive to iterative.

Thank you Detlef for your nice comment.

I tried displaying the exception, and it turned out that the problem was caused by a stack overflow!

I would like to know more about your third idea (the iterative algorithm). Should I save the payload, the intersection point and the other required variables into a buffer, and execute rtTrace outside of the original rtTrace?

Is there a good algorithm for multiple transmissions or reflections? (I also need to calculate multiple reflections.)

If you program a Whitted-style ray tracer which needs to resolve the reflections and transmissions on transparent materials in one single frame, the recursive approach is the most natural, but its memory and runtime requirements grow exponentially with the recursion depth.
With N bounces through purely reflection/transmission events you already have 2^(N+1) - 1 rtTrace calls, e.g. 511 for 8 bounces, per launch index in the worst case.
Mind that if you’re running under Windows, any launch which takes longer than 2 seconds on the GPU which drives the display (WDDM driver, not a Tesla in TCC driver mode) will trigger a Windows Timeout Detection and Recovery (TDR), which stops and restarts the display driver.
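To make the 2^(N+1) - 1 figure above concrete, here is a minimal host-side C++ sketch. It is not OptiX code, and the function names are made up for illustration. It assumes the worst case described above, where every hit spawns both a reflection and a transmission ray.

```cpp
#include <cassert>
#include <cstdint>

// Worst-case number of rtTrace calls per launch index when every hit spawns
// both a reflection and a transmission ray (a full binary tree of rays).
// maxBounces is the N in 2^(N+1) - 1; level 0 is the primary ray.
std::uint64_t worstCaseTraceCalls(unsigned maxBounces)
{
    assert(maxBounces < 63); // keep the shift inside 64 bits
    return (std::uint64_t{1} << (maxBounces + 1)) - 1;
}

// The same count obtained by summing the rays on each level: 1 + 2 + 4 + ...
std::uint64_t worstCaseTraceCallsBySum(unsigned maxBounces)
{
    std::uint64_t total = 0;
    std::uint64_t raysOnLevel = 1;
    for (unsigned level = 0; level <= maxBounces; ++level) {
        total += raysOnLevel;
        raysOnLevel *= 2; // each ray spawns two rays on the next level
    }
    return total;
}
```

Both functions give 511 for 8 bounces, matching the number quoted above.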

You could resolve the recursion by implementing your own stack of rays inside the per ray payload.
That wouldn’t help much though. The memory requirements and runtime behavior will be similar because the number of rays shot remains the same. It would just move the rtTrace calls which sample the BSDFs from the closest hit program to the integrator inside the ray generation program.
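As a rough illustration of that idea, the following host-side C++ sketch replaces the recursion with an explicit stack of pending rays. It is a toy model under stated assumptions, not OptiX device code: PendingRay, TraceStats, kMaxStack and traceWithExplicitStack are hypothetical names, a counter stands in for each rtTrace call, and every hit is assumed to spawn both a reflection and a transmission ray.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record for a ray that still has to be traced. A real payload
// would also carry origin, direction and throughput.
struct PendingRay { unsigned depth; };

struct TraceStats {
    std::size_t traceCalls;    // how many rtTrace calls would have been made
    std::size_t peakStackSize; // largest number of pending rays at once
};

constexpr std::size_t kMaxStack = 64; // assumed fixed capacity of the ray stack

// Integrator loop as it could look in the ray generation program.
TraceStats traceWithExplicitStack(unsigned maxBounces)
{
    assert(maxBounces + 1 <= kMaxStack); // depth-first order needs N + 1 slots
    PendingRay stack[kMaxStack];
    std::size_t size = 0;
    TraceStats stats{0, 1};

    stack[size++] = PendingRay{0}; // primary ray
    while (size > 0) {
        const PendingRay ray = stack[--size];
        ++stats.traceCalls; // stands in for one non-nested rtTrace call
        if (ray.depth < maxBounces) {
            stack[size++] = PendingRay{ray.depth + 1}; // reflection ray
            stack[size++] = PendingRay{ray.depth + 1}; // transmission ray
        }
        if (size > stats.peakStackSize) stats.peakStackSize = size;
    }
    return stats;
}
```

In this toy model the loop still performs 511 traces for 8 bounces, so the amount of work is unchanged. Only the place where the calls are issued moves.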

Now, if you could render your image with a progressive refinement algorithm, that would allow you to implement an iterative path tracer, which is much more memory efficient but requires a number of launches to arrive at a final image. As a side effect, you would get global illumination along the way.
Since that would only follow one path per launch stochastically, the image would refine over multiple frames and there would never be more rtTrace calls per launch index than your path length limit allows. That means the individual launches would be more interactive, but it takes time to converge to the final result.
You’ll get the typical high-frequency noise from the variance induced by the stochastic sampling and, if you use it, Russian Roulette path termination.
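To answer the earlier question about calling rtTrace outside of the original rtTrace: the structure of such an iterative tracer can be sketched roughly as below. This is a host-side C++ toy under stated assumptions, not OptiX device code. Payload, traceOnce, tracePath and Accumulator are hypothetical names, the scene is a row of identical transparent walls, and each wall only transmits, so there is a single continuation per hit. A real path tracer would instead pick reflection or transmission stochastically at each hit.

```cpp
#include <cassert>

// Hypothetical per-ray payload. The closest hit program would fill this in
// instead of calling rtTrace itself.
struct Payload {
    int   wallsPassed; // toy stand-in for the origin of the next ray
    float throughput;  // fraction of energy still carried by the path
    bool  done;        // set when the path leaves the scene
};

// Toy stand-in for one rtTrace call plus its closest hit or miss program.
// The scene is wallCount parallel walls, each transmitting a fixed fraction.
void traceOnce(Payload& p, int wallCount, float transmittance)
{
    if (p.wallsPassed == wallCount) { // nothing left to hit: miss
        p.done = true;
        return;
    }
    ++p.wallsPassed;               // continue from the back of this wall
    p.throughput *= transmittance; // attenuate, but do not recurse
}

// What the ray generation program would do: a flat loop, so no rtTrace call
// is ever nested inside another one, however many walls the path crosses.
int tracePath(Payload& p, int wallCount, float transmittance, int maxPathLength)
{
    assert(maxPathLength > 0);
    int traceCalls = 0;
    while (!p.done && traceCalls < maxPathLength) {
        traceOnce(p, wallCount, transmittance);
        ++traceCalls;
    }
    return traceCalls;
}

// Progressive refinement: running average of one sample per launch, which is
// one common way to let the noise average out over multiple frames.
struct Accumulator {
    float mean = 0.0f;
    int   launches = 0;
    void add(float sample)
    {
        ++launches;
        mean += (sample - mean) / static_cast<float>(launches);
    }
};
```

With 20 walls and a path length limit of 64, the loop makes 21 trace calls (20 hits plus the final miss) without any nesting. With a limit of 8, it stops after 8 calls. The Accumulator would be applied per pixel to the result of each launch.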