Feature suggestion: Optimal class compilation

Now that I’m finally starting to grasp the intended use of the SDK, I think this is what class compilation should look like (in feature-complete form). Maybe I’m just guessing the obvious and it’s what’s in development anyway. The following brain dump is meant to be read in sequential order:

Class compilation, status quo of instantiation logic:

MDL instances translate to

  • code (can be identical across instances and then be shared), and
  • data (one target argument block per instance).
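
To illustrate that split, here is a minimal Python sketch. It is not the MDL SDK API: `CodeCache`, `compile_instance` and the tuple used as an argument block are names I made up for this post. It only shows that two instances with the same expression structure can share one piece of generated code while each keeps its own argument block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompiledInstance:
    code_id: int      # index of the (possibly shared) generated code
    arg_block: tuple  # per-instance data, one block per instance


class CodeCache:
    """Hypothetical cache: one code entry per distinct expression structure."""

    def __init__(self):
        self._by_structure = {}

    def compile_instance(self, structure, args):
        # Reuse the code if this structure was seen before, else register it.
        code_id = self._by_structure.setdefault(structure, len(self._by_structure))
        return CompiledInstance(code_id, tuple(args))

    def code_count(self):
        return len(self._by_structure)


cache = CodeCache()
red = cache.compile_instance("diffuse(tint)", [(1.0, 0.0, 0.0)])
green = cache.compile_instance("diffuse(tint)", [(0.0, 1.0, 0.0)])
glossy = cache.compile_instance("glossy(tint, roughness)", [(1.0, 1.0, 1.0), 0.2])
```

Here `red` and `green` end up with the same `code_id` but different `arg_block` values, while `glossy` gets its own code entry.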

An MDL instance consists of

  • material definition,
  • argument expression graph structure,
  • constant parametrization of the argument expressions, and
  • built-in uniform state.

Stages of parametrization:

Definition → Graph → Graph + Consts → Graph + Consts + Uniforms → Shader Code

‘→’ denotes “gets embedded in” and a potential 1:N relationship.

Code separation:

Based on the previous section, we can derive an optimal factorization of the code into

  1. subexpressions depending on consts (run once per parametrized graph),
  2. subexpressions depending on (precalculated subexpressions based on consts and) uniforms (run once per object), and
  3. expressions depending on (precalculated subexpressions and) varying state (run per shading sample, i.e. at brutal complexity).
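
The three steps above can be sketched as a dependency classification over an expression graph. This is only a toy model under my own assumptions: `Rate`, `Node`, `rate_of` and `factorize` are hypothetical names, and the example graph is invented. Each node is assigned to the earliest stage in which all of its inputs are available.

```python
from dataclasses import dataclass
from enum import IntEnum


class Rate(IntEnum):
    CONST = 1    # depends only on constants: once per parametrized graph
    UNIFORM = 2  # also depends on uniform state: once per object
    VARYING = 3  # depends on varying state: every evaluation


@dataclass(frozen=True)
class Node:
    name: str
    inputs: tuple = ()
    rate: Rate = Rate.CONST  # intrinsic rate of the node itself


def rate_of(node):
    """A node runs at the highest rate among itself and its inputs."""
    return max([node.rate] + [rate_of(i) for i in node.inputs])


def factorize(root):
    """Group every node reachable from root under the stage it can run in."""
    stages = {rate: [] for rate in Rate}
    seen = set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for child in node.inputs:
            visit(child)  # inputs first, so each stage list is in evaluation order
        stages[rate_of(node)].append(node.name)

    visit(root)
    return stages


# Invented example graph.
roughness = Node("roughness")
alpha = Node("roughness^2", (roughness,))
object_id = Node("object_id", rate=Rate.UNIFORM)
tint = Node("tint_for_object", (object_id, alpha))
uv = Node("texture_coordinate", rate=Rate.VARYING)
stages = factorize(Node("shade", (tint, uv)))
```

In this example only `texture_coordinate` and `shade` land in the varying stage; everything else can be precalculated in steps 1 and 2.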

A full-blown material boils down to up to four shaders for geometry, surface, emission and volume. Each of those will need a certain set of material expressions, or even multiple subsets (arguments for the individual DFs). However, only step 3 needs that granularity. Steps 1 and 2 take a holistic view of the material and might as well run on the host (interpreted or JIT-compiled). While it is theoretically optimal to separate these two stages, we can’t expect much benefit from that in practice, so both should be grouped into some kind of “material constructor”.

Suggested implementation:

  • Allow code generation of a material constructor that transforms a target argument block and built-in uniform state into an opaque uniform block, running all subexpressions that do not depend on varying state.
  • Introduce an expression class for varying inputs that are gathered into a varying block, and allow querying its layout so that the client can link with an external shader graph.
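
A rough sketch of how the two suggestions could fit together, again in Python and purely hypothetical: `make_material_constructor`, `varying_block_layout` and `shade` do not exist in the SDK, the blocks are plain dictionaries, and the layout ignores alignment. The constructor runs every subexpression that does not depend on varying state once, and the shader then only combines the resulting uniform block with the varying block.

```python
def make_material_constructor(uniform_exprs):
    """Return a function that evaluates all non-varying subexpressions once."""

    def construct(arg_block, uniform_state):
        # The result stands in for the opaque uniform block.
        return {name: fn(arg_block, uniform_state) for name, fn in uniform_exprs.items()}

    return construct


def varying_block_layout(varying_inputs):
    """Assign byte offsets to (name, size) pairs; returns (offsets, total size)."""
    offsets, offset = {}, 0
    for name, size in varying_inputs:
        offsets[name] = offset
        offset += size  # simplification: no alignment or padding
    return offsets, offset


def shade(uniform_block, varying_block):
    """Step 3: only the part that really depends on varying inputs."""
    return uniform_block["scaled_tint"] * varying_block["mask"]


constructor = make_material_constructor({
    "scaled_tint": lambda args, state: args["tint"] * state["object_scale"],
    "alpha": lambda args, state: args["roughness"] ** 2,
})
uniform_block = constructor({"tint": 0.5, "roughness": 0.5}, {"object_scale": 2.0})
layout, block_size = varying_block_layout([("mask", 4), ("uv", 8)])
result = shade(uniform_block, {"mask": 0.5})
```

The client would call the constructor once per object, query the layout once to wire up its own shader graph, and call the shader as often as needed.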

Consequences:

  • Class compilation as fast as instance compilation (considering instruction caching, control flow coherency and initialization time), maybe even faster!
  • Provides a natural habitat for the built-in uniform state in class compilation mode.
  • No changes to existing API are required.

The suggested implementation has been edited in-place to reflect my latest insights - the first version was based on the assumption that some misunderstood functionality was actually broken.