I noticed strange behaviour in OptiX when exceptions are enabled:
struct StructIntersection &intersection = ray_attr.hitInformation[first_index];
rtPrintf("COMPARE THIS...: %f %f\n", intersection.t_hit, old_t_hit);
float3 bhp = ray.origin + ray.direction*intersection.t_hit;//rtTransformPoint(RT_OBJECT_TO_WORLD, back_hit_point);
float3 normal = normalize(rtTransformNormal(RT_OBJECT_TO_WORLD, intersection.normal));
const float active_t_hit = intersection.t_hit - old_t_hit;
// Copy intersection time and mark first_hit as invalid...
old_t_hit = intersection.t_hit;
intersection.t_hit = MAXVALUE;
//HERE
// intersection vectors
const float3 i = ray.direction; // incident direction
const float n_dot_i = dot(normal,i);
// fillMedium is identical with firstMedium
// firstMedium and fillMedium are from now on always the highest medium :-)
int firstMedium = ray_attr.materialStorage[ray_attr.mediumArrayIndex];
int secondMedium = intersection.mediumIndex;
ray_attr.type = NONE;
int result = intersection_depth_corrections(firstMedium, secondMedium, n_dot_i);
rtPrintf("...TO THIS: %f %f\n", intersection.t_hit, old_t_hit);
With exceptions disabled, the output is:
COMPARE THIS...: 81.031250 0.000000
...TO THIS: 9999999.000000 81.031250
That is fine, but with exceptions enabled I get:
COMPARE THIS...: 81.031250 0.000000
...TO THIS: 9999999.000000 9999999.000000
Which is not fine at all.
If I move the second rtPrintf statement directly above the HERE comment, everything works fine.
MAXVALUE is a const float with the value 9999999.0.
Any ideas what could be the problem?
My configuration is Ubuntu 12.04, gcc 4.6.4, OptiX 3.0, CUDA 4.2.9, GTX 590.
I will try to put together a simple example project that I can post here.
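For reference, the copy-then-invalidate sequence from the snippet above can be sketched in plain host C++. This is only an illustration of the intended semantics, not OptiX code: `StructIntersection` is reduced to a single field and `consume_hit` is a hypothetical helper name that does not appear in the original program.

```cpp
#include <cstdio>

// Hypothetical, reduced stand-in for the struct used in the device code;
// only the field that matters here is modelled.
struct StructIntersection {
    float t_hit;
};

const float MAXVALUE = 9999999.0f;

// Copies the current hit distance into old_t_hit and then invalidates the
// entry through the reference, in the same order as the device code.
void consume_hit(StructIntersection &intersection, float &old_t_hit) {
    old_t_hit = intersection.t_hit;  // copy first ...
    intersection.t_hit = MAXVALUE;   // ... then mark the entry as invalid
}

// Prints the two values in the same format as the second rtPrintf.
void print_state(const StructIntersection &intersection, float old_t_hit) {
    std::printf("...TO THIS: %f %f\n", intersection.t_hit, old_t_hit);
}
```

Calling `consume_hit` on an entry with `t_hit = 81.03125f` and `old_t_hit = 0.0f` leaves `old_t_hit` at 81.03125 and `t_hit` at MAXVALUE, which is the behaviour seen with exceptions disabled.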
The output now is:
“Output should be 0: 1000000000.000000”
It appears that reading from or writing to the ray payload is not allowed.
Or the program has a problem with arrays…
However, uncommenting the first rtPrintf delivers the correct result.
I tested the same example under 64-bit Windows with VS2010.
Everything worked and the output was correct.
I also tested on another system with a different GPU but otherwise the same configuration, and saw the same problem.
It seems to be a Linux-only problem.
I was able to reproduce your bug on our Linux machine. I’m looking at the generated code from OptiX, and it seems OK.
// Here's where you store the 0 in the first element in test_value
mov.f32 %_Z20closest_hit_radiancev_0_f1, 0F00000000; // 0
cvt.u32.s32 $prd_ptr_u32t5, $current_prd;
add.u32 $prd_ptr_u32t5, $prd_ptr_u32t5, $stack;
st.local.f32 [$prd_ptr_u32t5+20], %_Z20closest_hit_radiancev_0_f1;
...
// Here's where we read it back out and store it in %_Z20closest_hit_radiancev_0_f3
cvt.u32.s32 $prd_ptr_u32t7, $current_prd;
add.u32 $prd_ptr_u32t7, $prd_ptr_u32t7, $stack;
ld.local.f32 %_Z20closest_hit_radiancev_0_f3, [$prd_ptr_u32t7+20];
...
// It gets moved into another PTX register
mov.b32 $copy_u32t31, %_Z20closest_hit_radiancev_0_f3;
...
// Then it gets stored in the print buffer
st.global.u32 [$target_u64t2], $copy_u32t31;
There doesn’t seem to be anything amiss here. I’m going to start inspecting the device generated SASS and see if it fares any better.
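As a rough host-side model of what the PTX above does, the following C++ sketch stores a float at byte offset 20 from a payload base, loads it back, and reinterprets the bits as a 32-bit integer. The buffer size, the base offset, and the name `round_trip_bits` are assumptions made for illustration; only the `+20` offset is taken from the listing.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Byte offset of the payload field, taken from the "+20" in the PTX listing.
const std::size_t kFieldOffset = 20;

// Stores `value` at payload+20, loads it back, and returns the raw 32-bit
// pattern that would end up in the print buffer.
std::uint32_t round_trip_bits(float value) {
    unsigned char stack[64] = {0};   // stand-in for the local stack
    unsigned char *prd = stack + 8;  // hypothetical payload base inside the stack

    std::memcpy(prd + kFieldOffset, &value, sizeof value);    // st.local.f32

    float loaded;
    std::memcpy(&loaded, prd + kFieldOffset, sizeof loaded);  // ld.local.f32

    std::uint32_t bits;
    std::memcpy(&bits, &loaded, sizeof bits);                 // mov.b32
    return bits;                                              // value for st.global.u32
}
```

For an input of `0.0f` the returned pattern is all zero bits, matching the `0F00000000` literal in the listing, so a non-zero printed value would have to come from somewhere other than this sequence.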
One thing to note is that when I changed the device target from the default (sm_10) to sm_20 (by adding OPTIONS -arch sm_20 to the OPTIX_add_sample_executable call), the output was 0 as expected. You might try switching your builds to sm_20 and see if the problem goes away while we look into this more.
Indeed the little test example works with sm_20.
But my real project still has the same bug with sm_20, so it is just by chance that it works in the simplified example.
I just tested my previous versions of the test example to see at what point sm_20 makes the difference between working and not working.
Using sm_20 and this code
The program even terminates:
There should be no output: 10000000000.000000
OptiX Error: Invalid value (Details: Function “RTresult _rtContextLaunch2D(RTcontext_api*, unsigned int, RTsize, RTsize)” caught exception: Error in rtPrintf format string: “”, [7995631])
I tried this as well, and it also reproduced the problem you are seeing. I haven’t isolated what is wrong, but the print buffer appears to indicate that rtPrintf was called twice, with the second entry empty (hence the error you see). I’m not sure what would cause this to happen at the moment.
One thing you could do is to use regular old ‘printf’. This seemed to work just fine. You just need to compile to sm_20+ and #include <stdio.h> in your .cu file. You will have to limit the output to a specific launch index, but for some debugging purposes I think it should get you going.
if(fabs(prd_radiance.test_value[index]) > 1e-3 && launch_index == make_uint2(255,255))
printf("There should be no output: %f \n", tempValue);
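The idea of restricting output to one launch index can be tried out on the host with a small C++ sketch. The nested loop only stands in for an OptiX launch, and `LaunchIndex`, `is_debug_index`, and `run_launch` are hypothetical names, not part of the OptiX API.

```cpp
#include <cmath>
#include <cstdio>

// Hypothetical stand-in for CUDA's uint2 launch index.
struct LaunchIndex {
    unsigned x, y;
};

// True only for the single launch index we want debug output from.
bool is_debug_index(LaunchIndex idx) {
    return idx.x == 255u && idx.y == 255u;
}

// Walks a width x height grid, prints only at the debug index when the value
// exceeds the threshold, and returns the number of lines printed.
int run_launch(unsigned width, unsigned height, float value) {
    int printed = 0;
    for (unsigned y = 0; y < height; ++y) {
        for (unsigned x = 0; x < width; ++x) {
            LaunchIndex idx = {x, y};
            if (std::fabs(value) > 1e-3f && is_debug_index(idx)) {
                std::printf("There should be no output: %f \n", value);
                ++printed;
            }
        }
    }
    return printed;
}
```

Without the index check, a 512 x 512 launch would print one line per index; with it, at most one line is produced, and none at all if the launch is smaller than 256 x 256.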
The error without rtPrintfs is still there.
I used the rtPrintfs only to show the error in a simple test example.
In my big project without any rtPrintfs the error still appears.
It appears only when exceptions are enabled, but it also occurs under Windows.
Yes, this is purely a Windows problem.
Yes, it cannot be fixed!
Yes, NVIDIA does not support OptiX under Windows correctly.
Yes, only a crazy person could debug it, and even that person can’t give an answer.