Use in a multi-threaded environment

If multiple threads want to compile independent materials at the same time, do they
1.a) each want to use an own instance of the SDK,
1.b) want to use separate instances of the database and the compiler,
1.c) want to use their own compiler instances,
1.d) want to use separate databases,
1.e) none of the above?

Which variant is advisable, avoiding excessive locking?

If multiple threads want to translate expressions of the same material into separate programs (say surface, displacement, volume shaders) at the same time, is that possible?

If so, do they
2.a) want to use separate database instances,
2.b) want to use separate compiler instances,
2.c) want to use separate compiled material instances,
2.d) want to use separate backend instances,
2.e) only need to use separate link units?

Which combination is advisable, avoiding excessive locking?

Knowing there’s LLVM under the hood, I guess that cooperative compilation/translation of a single program is not possible or at least not advisable (because it would only result in mutual exclusion)?

So the implied questions here are “which entity owns the ‘LLVMContext’?”, and “which requests can be processed concurrently and where are they serialized automatically?”

The MDL SDK supports multi-threaded compilation of materials, so 1.e) and 2.e) are the correct answers.
No explicit locking is required from your side (unless you want to call methods from separate threads on the same ILink_unit).

This is also true when translating different expressions of the same material into separate programs. Note that this may lead to quite some duplication of code and data (for example, big noise tables and the functions that use them would be included in each program). With link units you avoid this duplication, but then you can only compile with a single thread per link unit, of course.

As for LLVM, each IMdl_backend::translate_* call and each link unit uses its own LLVMContext (and thus an own LLVM Module), so no concurrency problems arise from calling those methods from different threads.

Thanks for the clear answer.

The SDK documentation states that the MDL SDK currently supports only one transaction at a time.

This must also mean that multiple threads can happily share the same transaction and safely do (non-conflicting) database operations. I think that’s what got me confused, because I guessed the transaction to be some thread-context-like construct to avoid locking.

[…] translating different expressions on the same material into separate programs […] may lead to quite some duplication of code and data (for example, used big noise tables and functions would be included in each program). With link units, you avoid this duplication, but then you can only compile with a single thread per link unit, of course.

That’s a very interesting point:

The same problem also exists when compiling many materials. Using multiple compilations for a single material only increases the scale, really. Link units seem to be very useful, but unfortunately they still seem to have many limitations in the current release (and that’s a topic for another thread).

Out of curiosity, I tried loading a ‘base’ builtin module that forward-declares ‘perlin_noise_texture’ to external code via ‘[[native()]]’, but then all other stuff in ‘base’ was missing when compiling my material code. I tried forward-exporting definitions from my builtin module via

export using base import *;

inspired by ‘core_definitions.mdl’, but that gave duplicate definition errors trying to load the builtin module.

Kind of closing the loop to my first post: it’d be really nice if the linking facilities provided by the SDK became a little more powerful and flexible. You could provide reference implementations for DFs (or at least the “kernel” code for those) and noise functions directly in MDL and allow the client to use them or alternatively wire them up to something existing. Ultimately there would be an SDK example that renders a simple procedural scene like a Cornell box, putting any loadable MDL material onto an object, for a native x86, a JIT-ed CUDA / LLVM, and a GLSL path tracer. Based on those, one could discuss architectural alternatives the user may encounter in a real-world integration. This way, the OptiX SDK could easily provide a full MDL integration too. That’d be so awesome… :-)

The simple evaluation samples are in the works (except GLSL, where the sample uses a different approach).

‘base’ builtin module that forward declares ‘perlin_noise_texture’

Hmmm, ::base is not forward-declaring; it’s a regular MDL module. It’s just not on disk, but it’s handled like any other MDL module.

…and, yes, we see that link units have issues in the current release. We are working on fixes.

Yes, I tried hiding the (truly) built-in ‘base’ module with an (own) “built-in” module for ‘base’ with the same contents, except for replacing those functions that pull in bulky data tables with forward declarations (using ‘native’-annotated functions for that). If it worked, it would allow adding the left-out parts later, when finally linking the kernel(s), instead of letting the SDK translate them into (potentially many) redundant copies. (This is especially relevant if the materials use the compiled-in uniform state and there’s one material per object, but that’s a different story, and the SDK docs suggest it might be a temporary solution. In any case, there probably shouldn’t be assumptions about scene complexity.)

I guess it might not be too hard to actually make it work from what is there right now (that’s why I bother elaborating).

EDIT: Looks like the “enable_ro_segment” option for PTX / LLVM backends already provides a solution.
EDIT2: Well, that was too quick a post. Not really, since in the cases discussed above the data would still be collected redundantly.

Great to hear! Are there (approximate) release dates for the respective upcoming features and fixes yet? I’d be interested in early access to whatever info you have (it would allow me to focus my coding efforts around the SDK on what won’t be provided/fixed by the next update anyway).

It’s always hard to publicly comment on release schedules since there could always be unforeseen issues. It should be a low number of weeks. Ideally it will be pretty soon.

I didn’t really see any sort of confirmation/denial to this observation:

This must also mean that multiple threads can happily share the same transaction and safely do (non-conflicting) database operations. I think that’s what got me confused, because I guessed the transaction to be some thread-context-like construct to avoid locking.

I’m similarly confused/uncertain about transactions and “there can only be one” and best practices (both in a multi-threaded environment, but even in a single-threaded one). Is (as was suggested by @tschwinger) the expectation that a transaction is shared between threads? Should we create a transaction and keep it around for as long as possible before a commit, or should we create & commit as frequently as possible? The code in most of the examples seems to follow the former model, where a transaction is passed into helper methods (rather than helper methods creating a transaction, doing some work, and then committing).

Hi,

You are right, the current examples use the transaction in the most basic sense. For more advanced applications, you definitely want to know the possibilities and limitations upfront in order to use it correctly:

  • There is no explicit restriction regarding single and/or multi-threading support. That means you are free to use all objects (including the transaction) from as many threads as you like. However, manipulating these objects concurrently will result in undefined behavior. But in most common workflows, the transaction and maybe link units are the only types of objects that are accessed from different threads simultaneously and therefore need some locking by the application.
  • Transactions are not a thread context but more like a way to bundle database operations that belong together - like constructing a material from expressions and loading resources for it. You are right, there is only one transaction at a time in the MDL SDK. Before you can create a new one, you need to commit and free the current one. Hence, you need to avoid concurrent operations when accessing it from multiple threads simultaneously. An std::mutex, for instance, is fine for that. As a rule of thumb: every function that involves a transaction should be called under the lock, e.g., IMdl_compiler::load_module, IMdl_compiler::get_module_db_name, ITransaction::access, ITransaction::create, ITransaction::store, ...
  • Once you have an IMaterial_definition or IFunction_definition, you can continue on separate threads to create instances, compile them, add them to link units, generate code, and so on. One exception: when you want to add multiple compiled materials to the same link unit from different threads, you also need to lock the ILink_unit::add_* calls.
  • Your second question, how long a transaction should live, depends on your application. It is fine to keep a transaction around for the entire lifetime of the application. If you plan to load materials and resources dynamically, however, you can run into memory issues with this approach.

    It is possible to unload a module by using ITransaction::remove() with the module db name as argument.
    This call will flag the module for removal. The actual removal is triggered by committing the transaction. Then, the module and all its definitions will be removed. After that, you will need to create a new transaction for further interaction with the database (but it is not necessary to create a new scope or call garbage collection directly).

    Please note that a module is only removed if it is no longer referenced by any other database element (e.g. material instances or function calls created by you). Also, all open handles to the module or any of its entities have to be released prior to calling transaction->commit().

    This means it makes sense to commit transactions as soon as possible. But especially the last point, ‘all open handles have to be released’, is critical for your multi-threaded architecture. I suggest using a transaction for a medium duration, e.g. when loading a model into the scene and, with that, all required materials in parallel. When the loading has finished, release all handles and commit the transaction. If you need to keep constructed entities for later, store them in the database (before freeing the handles). But in many applications it’s enough to keep your generated code (PTX, HLSL, …).
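    The recommended pattern (lock every transaction call, synchronize the workers, release handles, then commit) can be sketched like this. The types are hypothetical stubs, not the real MDL SDK interfaces; the structure is what matters: a single shared transaction guarded by a mutex, committed only after all loader threads have finished.

    ```cpp
    #include <cassert>
    #include <mutex>
    #include <string>
    #include <thread>
    #include <vector>

    // Hypothetical stub standing in for ITransaction.
    struct Transaction {
        std::mutex m;
        int open_handles = 0;
        bool committed = false;
        void load_module(const std::string&) { /* DB write; caller must hold the lock */ }
        void commit() { assert(open_handles == 0); committed = true; }
    };

    int main() {
        Transaction tx;
        std::vector<std::thread> loaders;
        for (const char* mod : { "::base", "::df", "::my_materials" })
            loaders.emplace_back([&tx, mod] {
                std::lock_guard<std::mutex> lock(tx.m);  // every transaction call locked
                tx.load_module(mod);
            });
        for (auto& t : loaders) t.join();  // sync point: loading finished,
        tx.commit();                       // all handles released -> safe to commit
        assert(tx.committed);
    }
    ```

    The join before `commit()` is the synchronization point mentioned above: committing while another thread still uses the transaction (or still holds a handle) would be an error.
    
    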

    Thanks, that’s quite useful information. So, to summarise, you’d recommend:

    {
      create_transaction()
      {
        load_module(transaction,...)
      }
      {
        create_material_instance(transaction,...)
      }
      {
        generate_llvm_ir(transaction,...)
      }
      transaction->commit()
    }
    

    Instead of:

    {
      create_transaction()
      load_module(transaction,...)
      transaction->commit()
    }
    {
      create_transaction()
      create_material_instance(transaction,...)
      transaction->commit()
    }
    {
      create_transaction()
      generate_llvm_ir(transaction,...)
      transaction->commit()
    }
    

    ?

    Also a very useful comment about releasing handles and what does/doesn’t need to be stored in the DB. As I’ve been going through the example files (and somewhat blindly concatenating them into an uber-example), I’ve been trying to decide where it makes sense to keep handles around, and where it makes sense to put things in the DB. It’s good to know that handles can’t cross “transaction borders” (my made-up term).

    EDIT: Actually, let me ask for clarification on that last point. So an IMaterial_definition handle can’t be kept outside of a transaction? (I’m assuming “yes that’s correct”). What about an IMaterial_instance handle? (I’m assuming … actually, I really don’t know).

    And (sorry for not even waiting for an answer to the earlier question), a related question: are there any best-practice recommendations for the scope/lifetime/sharing of API component handles?

    If I look at the examples, in example_shared.h, the configure() method only takes an INeuray* input and creates a local IMdl_compiler handle. But if I look at example_compilation.cpp (for example), all the methods that need an IMdl_compiler take it as a function parameter (rather than taking an INeuray* parameter and creating a local IMdl_compiler handle), with a global/shared one created in main().

    Okay, I’ll start with the earlier post. First, both approaches are absolutely valid.

    In a multi-threaded environment, I believe the first one is easier when you do this on multiple threads simultaneously: synchronizing the threads in order to commit the transaction can be difficult, and you might end up waiting long for the sync (which is required because you need to release open handles before being able to commit).

    “transaction boundary” is fine. Don’t keep any handles when committing. No modules, definitions, instances, expressions, values, resources, …

    When it comes to what to store in the database and what not, it depends on the effort to build your construct from scratch. To get an IMaterial_definition for instance, you just call ITransaction::access and you have it. No need to keep it long term. Also no need to store it, because it’s already in the DB.
    IMaterial_instances are a bit different. When you create your instance with a complex expression graph for the parameters you might want to store it if required after the next commit. But you don’t have to store anything if you don’t want to (except for resources in your parameter expressions).
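    The “don’t keep handles across the transaction border” rule can be illustrated with a small sketch. Again these are hypothetical stubs (a `shared_ptr` with a counting deleter stands in for `mi::base::Handle`, `Transaction` for `ITransaction`); the point is that the handle’s scope ends before `commit()` is called.

    ```cpp
    #include <cassert>
    #include <memory>
    #include <string>

    struct Definition { std::string name; };

    // Hypothetical stub: tracks open handles like the DB tracks references.
    struct Transaction {
        int open_handles = 0;
        bool committed = false;
        std::shared_ptr<Definition> access(const std::string& n) {
            ++open_handles;
            // The deleter decrements the count when the handle is released.
            return std::shared_ptr<Definition>(new Definition{n},
                [this](Definition* d) { --open_handles; delete d; });
        }
        void commit() { assert(open_handles == 0); committed = true; }
    };

    int main() {
        Transaction tx;
        {
            auto def = tx.access("mdl::my_mod::my_material");  // cheap: already in DB
            assert(def->name == "mdl::my_mod::my_material");
        }               // handle released at end of scope, before the commit
        tx.commit();    // legal: no open handles cross the transaction border
        assert(tx.committed);
    }
    ```

    Re-requesting the definition via `access` after the next transaction is created is cheap, which is why there is no need to keep the handle long-term.
    
    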

    The second part is easier. How to handle Neuray API components (compiler and co.) is up to you. You can keep them and probably save a few cycles by skipping the internal getters, but you can also request them every time you need them. In the latter case you only have to pass around one pointer.
    At least at this point, all components are unique. You will get the same one every time you request them.

    Okay, thanks. I have mixed feelings about “both approaches are valid” and “is up to you”. It’s nice to know whichever way I do it, it’s not fundamentally wrong, but part of me wishes there was an obvious “you really should be doing it this way” approach :)

    I’ll try to let my test program evolve into something a bit bigger and see if there’s a natural emergence of a design that “feels right”.