About performance

Dear developer,

We have a new requirement: hundreds of model files are added to the system.

At present, we build the scene tree as described below:
1. Each file is loaded as a Group node under the root Group; the Group acceleration structure is Trbvh.
2. Each geometric object in a file is added to that Group node through the following levels:
   Transform → GeometryGroup → GeometryInstance → Geometry
3. After loading completes, memory usage is about 2 GB.
4. On a GTX 1060, rendering at 1920*1080 resolution, we get about 5 frames per second.

Is this scene construction feasible, or would it be better to build the scene using an octree, with the octree depth limited to a suitable maximum?

Do you have any better advice? Thank you.

I’m not sure what exactly you mean by the statement under 1.
If that means your scene structure becomes
Group (root) → Group (per file) → Transform → GeometryGroup (file contents) → GeometryInstance → Geometry
then the second-level Group node is not necessary, and leaving it out will speed up runtime performance due to faster traversal through a shallower OptiX scene hierarchy.

The optimal layout, if you want to move whole models (one per file) individually, would be this:
Group (root) → Transform → GeometryGroup (file contents) → GeometryInstance → Geometry
The same applies if you want to move each of multiple objects inside a file individually; you would just need many more Transforms.

This means you should try to make your OptiX scene representation as shallow as possible. You normally never need more than a two-level BVH, which this already is: Group (root) → Transform → GeometryGroup, because all OptiX nodes with “Group” in their name hold an acceleration structure.
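
To make that concrete, here is a minimal host-side sketch of the shallow two-level layout, assuming the OptiX 6.x C++ wrapper API; loadGeometry(), loadMaterial(), files and fileMatrices are placeholders for your own loader, and “top_object” is just the conventional variable name a ray generation program would use to reach the root:

[code]
// Minimal sketch: Group (root) -> Transform -> GeometryGroup -> GeometryInstance -> Geometry,
// one Transform per movable model, no extra Group level per file.
optix::Context context = optix::Context::create();

optix::Group root = context->createGroup();
root->setAcceleration(context->createAcceleration("Trbvh"));
context["top_object"]->set(root); // the ray generation program traces against this

for (size_t i = 0; i < files.size(); ++i)
{
  optix::Geometry geometry = loadGeometry(context, files[i]); // placeholder
  optix::Material material = loadMaterial(context, files[i]); // placeholder

  optix::GeometryInstance gi = context->createGeometryInstance();
  gi->setGeometry(geometry);
  gi->setMaterialCount(1);
  gi->setMaterial(0, material);

  optix::GeometryGroup gg = context->createGeometryGroup(); // second acceleration structure level
  gg->setAcceleration(context->createAcceleration("Trbvh"));
  gg->addChild(gi);

  optix::Transform xform = context->createTransform(); // one Transform per movable model
  xform->setMatrix(false, fileMatrices[i].getData(), nullptr); // row-major 4x4
  xform->setChild(gg);

  root->addChild(xform); // directly under the root Group, no intermediate Group
}
root->getAcceleration()->markDirty(); // rebuild the top-level acceleration after changes
[/code]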

For example, if your application scene graph has a deep transform hierarchy, like a kinematic skeleton structure, the application should flatten the transforms to get such a two-level BVH.
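
As a rough sketch of that flattening (AppNode and Mesh are hypothetical application-side types, not OptiX types), assuming each node stores a local matrix, its children, and the meshes attached to it:

[code]
#include <optixu/optixu_matrix_namespace.h>
#include <utility>
#include <vector>

struct Mesh; // hypothetical application mesh type

// Hypothetical application-side scene graph node.
struct AppNode
{
  optix::Matrix4x4      local;    // transform relative to the parent node
  std::vector<AppNode*> children;
  std::vector<Mesh*>    meshes;   // geometry attached at this node
};

// Concatenate all parent transforms so that each mesh ends up with a single
// world matrix, i.e. exactly one OptiX Transform node instead of a deep chain.
void flattenTransforms(const AppNode* node,
                       const optix::Matrix4x4& parentWorld,
                       std::vector<std::pair<Mesh*, optix::Matrix4x4>>& out)
{
  const optix::Matrix4x4 world = parentWorld * node->local; // concatenate

  for (Mesh* mesh : node->meshes)
    out.push_back(std::make_pair(mesh, world)); // one flat entry per mesh

  for (const AppNode* child : node->children)
    flattenTransforms(child, world, out);
}

// Usage: flattenTransforms(sceneRoot, optix::Matrix4x4::identity(), flatList);
// Each flatList entry then becomes one Transform -> GeometryGroup under the root Group.
[/code]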

(Actually the same is true for a rasterizer, and that’s why our own scene graph implementation in the nvpro-pipeline (open source on GitHub) goes to great lengths to concatenate the application-side transformations before generating the matrices for the render backend scene representation, plus some clever tricks to speed up validation and update of these concatenated matrices on any application-side scene changes.)

If you watch my GTC 2018 OptiX Introduction talk listed here:
[url]https://devtalk.nvidia.com/default/topic/998546/optix/optix-advanced-samples-on-github/[/url]
it explains the minimal and most efficient OptiX scene graph layouts on slides 7, 8, and 10.

Thank you for your reply.

I should try to make my OptiX scene representation as shallow as possible.

Now, the scenes are very large: most of them contain millions of geometric components and tens of millions of triangles.

I have a requirement that when the scene changes, I filter the rendered nodes and rebuild the rendering scene graph.

How can I get the camera’s visible region in OptiX, or how can I filter primitives in OptiX?

>>Now, the scenes are very large: most of them contain millions of geometric components and tens of millions of triangles.<<

Then a GTX 1060 is possibly too small. Acceleration structures are quite big. This sounds more like a use case for our professional workstation boards with 24 GB of VRAM or more, and maybe even multiple NVLINK-capable ones connected and running in a driver mode that supports peer-to-peer access to increase the working set OptiX can address.

If you have millions of geometric components, does that include instances of the same geometry as well?
In that case you could share the acceleration structure among the identical geometry, greatly reducing the amount of necessary VRAM.
That’s also explained on the same slides I mentioned above.
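
Roughly sketched (again with the OptiX 6.x C++ wrapper API, and instanceMatrices as a placeholder for your per-instance transforms), sharing means putting the same GeometryGroup, and therefore the same acceleration structure, under multiple Transforms:

[code]
// Sketch: instancing by sharing one GeometryGroup (and its Trbvh acceleration)
// under many Transforms. The geometry data and the acceleration structure are
// built and stored in VRAM only once; only the matrices differ per instance.
optix::GeometryGroup sharedGG = context->createGeometryGroup();
sharedGG->setAcceleration(context->createAcceleration("Trbvh"));
sharedGG->addChild(geometryInstance); // the unique geometry, created once

for (const optix::Matrix4x4& m : instanceMatrices) // placeholder container
{
  optix::Transform xform = context->createTransform();
  xform->setMatrix(false, m.getData(), nullptr);
  xform->setChild(sharedGG); // the same child under every Transform
  root->addChild(xform);
}
root->getAcceleration()->markDirty();
[/code]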

If you mean that you want to only load the currently visible set of geometry, that makes little sense in a ray tracer where the whole scene could potentially be reached by rays (e.g. reflections or shadows of objects outside the view). Are you only shooting primary rays?

In any case, doing view frustum culling to find only the visible scene elements before uploading them into the OptiX scene graph would be your responsibility.
Our nvpro-pipeline [url]https://developer.nvidia.com/nvidia-pro-pipeline[/url] contains some fast routines which do that on the CPU or the GPU in its pipeline/dp/culling folder.
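
If you want to prototype something simple yourself first, a minimal CPU frustum culling sketch could test each model’s world-space bounding box against the six planes extracted from the view-projection matrix. This is not the nvpro-pipeline implementation, and the matrix convention (row-major storage, column-vector multiplication, OpenGL-style clip space) is an assumption:

[code]
struct Plane { float a, b, c, d; };   // a*x + b*y + c*z + d >= 0 means "inside"
struct Aabb  { float mn[3], mx[3]; }; // world-space axis-aligned bounding box

// Gribb/Hartmann plane extraction from a row-major view-projection matrix.
void extractFrustumPlanes(const float m[16], Plane p[6])
{
  const float* r0 = m;      // row 1
  const float* r1 = m + 4;  // row 2
  const float* r2 = m + 8;  // row 3
  const float* r3 = m + 12; // row 4

  p[0] = { r3[0] + r0[0], r3[1] + r0[1], r3[2] + r0[2], r3[3] + r0[3] }; // left
  p[1] = { r3[0] - r0[0], r3[1] - r0[1], r3[2] - r0[2], r3[3] - r0[3] }; // right
  p[2] = { r3[0] + r1[0], r3[1] + r1[1], r3[2] + r1[2], r3[3] + r1[3] }; // bottom
  p[3] = { r3[0] - r1[0], r3[1] - r1[1], r3[2] - r1[2], r3[3] - r1[3] }; // top
  p[4] = { r3[0] + r2[0], r3[1] + r2[1], r3[2] + r2[2], r3[3] + r2[3] }; // near
  p[5] = { r3[0] - r2[0], r3[1] - r2[1], r3[2] - r2[2], r3[3] - r2[3] }; // far
}

// Returns true if the box is at least partially inside the frustum.
bool aabbVisible(const Aabb& box, const Plane p[6])
{
  for (int i = 0; i < 6; ++i)
  {
    // Test the box corner farthest along the plane normal (the "positive vertex").
    const float x = (p[i].a >= 0.0f) ? box.mx[0] : box.mn[0];
    const float y = (p[i].b >= 0.0f) ? box.mx[1] : box.mn[1];
    const float z = (p[i].c >= 0.0f) ? box.mx[2] : box.mn[2];
    if (p[i].a * x + p[i].b * y + p[i].c * z + p[i].d < 0.0f)
      return false; // the whole box is outside this plane
  }
  return true;
}
[/code]

Models whose boxes pass the test get uploaded (or their Transforms kept) in the OptiX scene graph; everything else can be skipped, with the caveat above about secondary rays.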