Context recompilation & callable programs
Hi,
While trying to optimize my program, I have narrowed my bottleneck down to context recompilations.
In the specific case in question, I call many successive launches, while switching callable programs in between calls.
I seem to recall in some thread and/or the programming guide that the whole point of such callable programs is that their switching does not incur recompilation, so I guess I am using them wrong somewhere.

My flow is as follows:
1. Load all programs beforehand using
optixu::Program switchableProg1 = m_Context->createProgramFromPTXFile( sFilePath, sProgName1 );
optixu::Program switchableProg2 = m_Context->createProgramFromPTXFile( sFilePath, sProgName2 );
optixu::Program switchableProg3 = m_Context->createProgramFromPTXFile( sFilePath, sProgName3 );


2. Create (only once):
optixu::Variable pCallableProg = m_Context->declareVariable( "MyFunc" );


3. And to toggle in between launches (this line is what triggers my recompiles)
pCallableProg->set( switchableProg );


Since I preloaded the programs into the context and am just attaching them to the variable, does anyone have any insight into why I am getting these expensive recompiles all the time?

Thanks!

#1
Posted 01/03/2018 08:58 AM   
Changing a single bound callable program at context scope requires a recompile because the resulting mega-kernel is different. Only the programs which can actually be reached from an entry point will be compiled into the kernel at launch. It's not enough that you created the program objects.

Instantaneous switching between context-wide callable programs can be achieved with buffers of bindless callable program IDs, which build a function table you can index into to select the current function.
That way the programs are all present inside the kernel and you only need to change the function table index variable between launches.
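
To make the function table idea concrete, here is a minimal host-side C++ analogy. It is not OptiX code: the names `shadeA`, `shadeB`, `shadeC`, and `FunctionTable` are hypothetical stand-ins. The assumption is that it mirrors the described setup, where the table would be a buffer of program IDs and the index an ordinary variable that is changed between launches.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-ins for three callable programs (hypothetical names). In OptiX
// these would be device functions compiled from PTX, not host functions.
float shadeA(float x) { return x * 2.0f; }
float shadeB(float x) { return x + 1.0f; }
float shadeC(float x) { return -x; }

using Callable = float (*)(float);

// Analogy for a buffer of bindless callable program IDs: every entry is
// present up front, so selecting another one changes only an index.
// (Assumption: in OptiX the entries would be IDs from Program::getId()
// stored in a buffer of format RT_FORMAT_PROGRAM_ID.)
struct FunctionTable {
    std::array<Callable, 3> entries{{shadeA, shadeB, shadeC}};
    std::size_t current = 0;  // plays the role of the index variable

    // Switching programs: no rebuild of the table, just a new index.
    void select(std::size_t index) {
        assert(index < entries.size());
        current = index;
    }

    // What a launch would do: look up the current entry and call it.
    float call(float x) const { return entries[current](x); }
};
```

Calling `select(1)` between two `call()` invocations switches from `shadeA` to `shadeB` without touching the table itself, which corresponds to changing only the index variable between launches.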

#2
Posted 01/03/2018 02:16 PM   
Thanks Detlef.
That makes perfect sense.
If I understand correctly, the downside of this approach would be that the programs lose the scope of their caller? And I can manually overcome this by passing any needed attribute as a parameter?

#3
Posted 01/03/2018 02:30 PM   
Correct. Bindless callable programs only have the context and the program itself as scope.
You also cannot call rtTrace or rtTransform functions in them, because those need the transform hierarchy, which does not exist in the context and program scopes.
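
The scope restriction can be pictured with a small plain C++ sketch. It is only an analogy, not OptiX code: `HitState`, `evalDiffuse`, and `closestHit` are hypothetical names, and the single-float "normal" is a deliberate simplification. The point it illustrates is that the caller gathers whatever the callee needs and hands it over as parameters.

```cpp
#include <cassert>

// Hypothetical per-hit data that a bound program could read through its
// caller's scope (attributes, material variables).
struct HitState {
    float normalZ;  // stand-in for a shading-normal attribute
    float albedo;   // stand-in for a material variable
};

// Bindless-style callable: it sees no caller scope, so everything it
// needs arrives explicitly as a parameter.
float evalDiffuse(const HitState& state, float lightZ) {
    const float cosTheta = state.normalZ * lightZ;
    return cosTheta > 0.0f ? state.albedo * cosTheta : 0.0f;
}

using MaterialEval = float (*)(const HitState&, float);

// Stand-in for a closest hit program: it collects the state it has
// access to and passes it on to the selected callable.
float closestHit(MaterialEval eval, float normalZ, float albedo, float lightZ) {
    const HitState state{normalZ, albedo};
    return eval(state, lightZ);
}
```

Bundling the values into one struct keeps the callable signature stable when more per-hit data is needed later, which matters if many programs share one table and must agree on a signature.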

I'm heavily using bindless callable programs in my ray tracers because they actually help to reduce the kernel size by reusing fixed function code. I can handle arbitrary lens shaders with a single ray generation program and only need a single closest hit and any hit program because all materials and lights are configured, sampled, and evaluated via bindless callable programs.

#4
Posted 01/03/2018 02:43 PM   
Thanks a lot.
I'll give bindless a try then.

#5
Posted 01/03/2018 02:46 PM   
Reporting back:
Converting my callable programs (~10) to bindless was a real pain, but the results are definitely worth it! By cutting the recompilations, I've cut down my flow (which includes 4 separate tracings) from 2.5s to 400ms!

#6
Posted 01/07/2018 09:23 AM   