Driver crash when using too many OptiX objects

Hi, I get a driver crash (the screen goes black and the driver reloads itself) if I use too many OptiX objects (see below).

The issue is independent of the CUDA version (tried with CUDA 6.5 and 7.5).

The issue is independent of the GPU type (same behaviour on a Quadro K2100M and a GeForce 460M).

The issue is inside the 32-bit OptiX library (OptiX 3.5.1 works; 3.6.3, 3.7.0 and 3.8.0 crash). I have not tried version 3.6.0 because the 32-bit SDK contains a 64-bit library…

I’m using Windows 7 64-bit on both machines. I have been using the dynamically loaded library from Delphi (32-bit) since OptiX 2.0, so I cannot give you a Visual Studio code sample (sorry).

I use the 32-bit version of OptiX and I cannot use or try the 64-bit version.

The issue happens deterministically if I use more than about 400 objects (mixed) of these types:

Selectors, Groups, GeometryGroups, Geometry, etc.

The exact limit is near 400 but depends on the combination of the objects above.

For example I get NO crash with

19 x Selectors
41 x Groups
85 x Transforms
85 x Geometry Groups
85 x Geometry instances
85 x Geometry
Sum = 400

For example I get a crash with

19 x Selectors
42 x Groups
86 x Transforms
86 x Geometry Groups
86 x Geometry instances
86 x Geometry
Sum = 405

For example I get NO crash with

17 x Selectors
39 x Groups
86 x Transforms
86 x Geometry Groups
86 x Geometry instances
86 x Geometry
Sum = 400

For example I get a crash with

18 x Selectors
41 x Groups
92 x Transforms
92 x Geometry Groups
92 x Geometry instances
92 x Geometry
Sum = 427

The limit is near the value 400 (Sum=399,400,401)
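
As a quick check of the arithmetic, the four configurations above can be tallied in a few lines. This is only a sketch in Python (not the Delphi code used in the application); the names `configs` and `total_objects` are made up for illustration, and the counts are copied from the lists above.

```python
# Object counts copied from the four configurations reported above.
# "crash" records the observed result of launching the render.
configs = [
    {"crash": False, "counts": {"Selectors": 19, "Groups": 41, "Transforms": 85,
                                "GeometryGroups": 85, "GeometryInstances": 85, "Geometry": 85}},
    {"crash": True,  "counts": {"Selectors": 19, "Groups": 42, "Transforms": 86,
                                "GeometryGroups": 86, "GeometryInstances": 86, "Geometry": 86}},
    {"crash": False, "counts": {"Selectors": 17, "Groups": 39, "Transforms": 86,
                                "GeometryGroups": 86, "GeometryInstances": 86, "Geometry": 86}},
    {"crash": True,  "counts": {"Selectors": 18, "Groups": 41, "Transforms": 92,
                                "GeometryGroups": 92, "GeometryInstances": 92, "Geometry": 92}},
]


def total_objects(config):
    """Sum all object counts of one configuration."""
    return sum(config["counts"].values())


totals = [total_objects(c) for c in configs]

# Largest total that still worked and smallest total that crashed.
largest_ok = max(t for t, c in zip(totals, configs) if not c["crash"])
smallest_crash = min(t for t, c in zip(totals, configs) if c["crash"])

for config, total in zip(configs, totals):
    print(total, "crash" if config["crash"] else "no crash")
print("largest working total:", largest_ok)
print("smallest crashing total:", smallest_crash)
```

With these four samples the largest working total is 400 and the smallest crashing total is 405, which is consistent with a limit close to 400.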

From my point of view, there is a “short buffer” somewhere in the OptiX library (32-bit).

The crash happens when I launch the render (OptiX gets a “CUDA exception” and the driver restarts within 1-2 seconds).

The issue is independent of the acceleration structure types that I use. It happens with “NoAccel” and with other combinations.

This is my analysis.

Please help, because OptiX 3.5.1, which always works (NO CRASHES) with at least twice as many objects (> 900), does not work with my brand new Quadro (Details: Function “_rtContextLaunch2D” caught exception: Encountered a CUDA error: cuGLGetDevices() returned (304): Unknown, |3801520|)

Thank you

RD

It sounds like you are experiencing a Windows Timeout Detection and Recovery (TDR) error. See this post for details.

Tomorrow morning (it is 19:30 here in Italy) I will set the environment variable “OPTIX_API_CAPTURE” and try to create the minimum set of calls that triggers the crash.

One question: once the “dump” is ready, which e-mail address should I use to send you the data?

THX

Hi, I have created the minimum set of “boxes” that crashes OptiX (>3.6.X)

It’s a set of simple boxes made of 12 triangles each (the issue is independent of the mesh complexity)
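
For readers who want to build a similar test scene, a 12-triangle box can be sketched as below. This is plain Python, not the Delphi code from the application; the function name `make_box`, its parameters, the vertex ordering and the outward-facing winding are all assumptions made for illustration, and nothing here is an OptiX call.

```python
def make_box(center=(0.0, 0.0, 0.0), size=(1.0, 1.0, 1.0)):
    """Return (vertices, triangles) for an axis-aligned box.

    vertices:  8 (x, y, z) tuples
    triangles: 12 (i0, i1, i2) index tuples, 2 per face
    """
    cx, cy, cz = center
    hx, hy, hz = (s / 2.0 for s in size)

    # Vertex index = xbit + 2*ybit + 4*zbit, where a set bit means the
    # positive side of the box along that axis.
    vertices = [
        (cx + sx * hx, cy + sy * hy, cz + sz * hz)
        for sz in (-1, 1)
        for sy in (-1, 1)
        for sx in (-1, 1)
    ]

    # One quad per face, corners listed counter-clockwise when seen
    # from outside the box (assumed convention).
    quads = [
        (4, 5, 7, 6),  # +z
        (0, 2, 3, 1),  # -z
        (1, 3, 7, 5),  # +x
        (0, 4, 6, 2),  # -x
        (2, 6, 7, 3),  # +y
        (0, 1, 5, 4),  # -y
    ]

    # Split each quad (a, b, c, d) into triangles (a, b, c) and (a, c, d).
    triangles = []
    for a, b, c, d in quads:
        triangles.append((a, b, c))
        triangles.append((a, c, d))
    return vertices, triangles


vertices, triangles = make_box()
print(len(vertices), len(triangles))  # prints "8 12"
```

Each box of this kind gets its own Geometry, Geometry instance, Geometry Group and Transform in the scene, so the object count grows by several objects per box.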

Here is what I get if I add one more box to the image:

Details: Function “_rtContextLaunch2D” caught exception: Encountered a CUDA error: result returned (700): Unknown, |6619204|

All details (image and traces) sent to optix-help@nvidia.com

Bye

Unfortunately that mail got stuck and has possibly been wiped in the meantime by our e-mail spam servers, because it had a *.zip attachment and those are blocked by default.
Could you please resend the trace, but rename the attachment extension to *.zi_ so that it passes through?
Thanks and sorry for the inconvenience.

OK, filename changed to *.zi_ two times (recursive)

THX

Hi, were you able to reproduce the issue with my data?

THX