Thanks for your reply David,
Here are the answers:
Are you using OptiX SDK version 6? Yes, OptiX 6 and CUDA 10.
How big is your scene? It has Vertices = 4166, Triangles = 12174
What kind of primitives in your scenes? Triangles or custom geometry? Triangles only.
Is the scene contained in a single acceleration structure?
m_rootAcceleration is assigned to m_rootGroup. A single Geometry object holds all the triangles. Its GeometryInstance is a child of a GeometryGroup, which has its own acceleration structure, and that GeometryGroup is in turn a child of m_rootGroup.
So basically two acceleration structures, two groups, one geometry.
What are calc points?
Calc points are RoomVertex structures: basically a float3 for the calc point, plus a start index and a count that locate the rectangle vertices of its windows in a separate array:
struct RoomVertex
{
    optix::float3 point;
    unsigned int winsStartIndex;
    unsigned int winsCount;
};

struct RoomVertex : public Vec3f
{
    int _winCount;
    int _winStartIndex;

    __host__ __device__ RoomVertex(float x, float y, float z)
        : Vec3f(x, y, z)
    {
    }
};
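Because the OptiX buffer further down is created with RT_FORMAT_USER and sizeof(RoomVertex), the host and device must agree on the struct's byte layout. A minimal sketch of a compile-time check for the first definition above; Float3Like and RoomVertexLayout are hypothetical stand-ins (not OptiX types), and the expected numbers assume 4-byte float and unsigned int:

```cpp
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_trivially_copyable

// Hypothetical stand-in for optix::float3: three tightly packed floats.
struct Float3Like { float x, y, z; };

// Hypothetical mirror of the first RoomVertex definition above.
struct RoomVertexLayout
{
    Float3Like   point;
    unsigned int winsStartIndex;
    unsigned int winsCount;
};

// A user-format buffer is filled by raw copies, so the type should be
// trivially copyable and have a predictable size and member offsets.
static_assert(std::is_trivially_copyable<RoomVertexLayout>::value,
              "RoomVertexLayout must be trivially copyable");
static_assert(sizeof(RoomVertexLayout) == 20,
              "expected 12 bytes of float3 plus two 4-byte integers");
static_assert(offsetof(RoomVertexLayout, winsStartIndex) == 12,
              "winsStartIndex should directly follow the point");
static_assert(offsetof(RoomVertexLayout, winsCount) == 16,
              "winsCount should directly follow winsStartIndex");
```

Note that the second definition above declares _winCount before _winStartIndex, the reverse of the first one. That only matters if raw bytes are ever shared between the two projects.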
How many rays are you actually sending, and have you verified it’s the same number of rays in both cases? Are you recording the same number of hits as well?
The number of rays depends on the solid angle subtended by the window as seen from the calculation point, so it gets smaller for points deeper in the room. The solid angle is subdivided horizontally (1 degree steps) and vertically (0.15 degree steps). I have not verified that the same number of rays is shot in both cases, but the code was copied and pasted between the two projects.
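One way to cross-check the ray counts between the two projects is to compute the expected count on the host from each window's angular range. This is a rough sketch, not the logic in the uploaded .cu file: stepsToCover and expectedRays are hypothetical names, and it assumes the loops cover half-open ranges [min, max) with the 1 degree and 0.15 degree steps mentioned above.

```cpp
#include <cassert>
#include <cmath>

// Number of fixed-size steps needed to cover [minDeg, maxDeg).
// Returns 0 for an empty or inverted range.
long stepsToCover(double minDeg, double maxDeg, double stepDeg)
{
    assert(stepDeg > 0.0);
    if (maxDeg <= minDeg) {
        return 0;
    }
    return static_cast<long>(std::ceil((maxDeg - minDeg) / stepDeg));
}

// Expected rays for one window seen from one calc point, given its
// azimuth and elevation range in degrees (assumed to be known already).
long expectedRays(double azMinDeg, double azMaxDeg,
                  double elMinDeg, double elMaxDeg)
{
    const double kHorizontalStepDeg = 1.0;   // from the description above
    const double kVerticalStepDeg   = 0.15;  // from the description above
    return stepsToCover(azMinDeg, azMaxDeg, kHorizontalStepDeg) *
           stepsToCover(elMinDeg, elMaxDeg, kVerticalStepDeg);
}
```

Summing expectedRays over all points and windows gives a number that can be compared against a per-launch ray counter in both the OptiX and the CUDA version.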
What is your launch size?
Overall there are about 18000 calc points.
In OptiX I simply call m_context->launch(0, m_RoomPointsCount); with m_RoomPointsCount = 18000.
In CUDA:
    int threadsPerBlock = 256;
    int blocksPerGrid = (m_RoomPointsCount + threadsPerBlock - 1) / threadsPerBlock;
    CalcDirectSky<<<blocksPerGrid, threadsPerBlock>>>(roompoints, windowsBB, m_RoomPointsCount);
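For reference, the rounding-up division in the CUDA launch can be checked on the host. A small sketch with hypothetical helper names (blocksFor, idleThreads), using the numbers from this post:

```cpp
#include <cassert>

// Ceiling division: smallest number of blocks whose threads cover all items.
int blocksFor(int itemCount, int threadsPerBlock)
{
    assert(itemCount >= 0 && threadsPerBlock > 0);
    return (itemCount + threadsPerBlock - 1) / threadsPerBlock;
}

// Threads in the last block that have no item to process.
int idleThreads(int itemCount, int threadsPerBlock)
{
    return blocksFor(itemCount, threadsPerBlock) * threadsPerBlock - itemCount;
}
```

With 18000 points and 256 threads per block this gives 71 blocks and 176 surplus threads, so the kernel needs an index check against the point count before it touches the point array. The OptiX launch, by contrast, is sized to exactly 18000.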
What is your ray payload size in bytes?
I am using the PerRayData_shadow structure already defined in the samples, which has only one boolean member, “visible”.
How is the room points & windows data structure you mentioned used & where is it accessed from?
In OptiX they are assigned to the context this way:
    m_roomsPoints_buffer = m_context->createBuffer(RT_BUFFER_INPUT_OUTPUT, RT_FORMAT_USER);
    m_roomsPoints_buffer->setElementSize(sizeof(RoomVertex));
    m_roomsPoints_buffer->setSize(m_RoomPointsCount);
    RoomVertex* roomPoints_data = static_cast<RoomVertex*>(m_roomsPoints_buffer->map(0, RT_BUFFER_MAP_WRITE_DISCARD));
    for (unsigned int k = 0; k < m_RoomPointsCount; ++k) {
        roomPoints_data[k] = m_RoomPoints[k];
    }
    m_roomsPoints_buffer->unmap();
    m_context["sysRoomPoints"]->set(m_roomsPoints_buffer);

    // 4 vertices per window, one for each corner
    int winBBNo = m_WindowsCount * 4;
    m_WindowsBB_buffer = m_context->createBuffer(RT_BUFFER_INPUT_OUTPUT, RT_FORMAT_USER);
    m_WindowsBB_buffer->setElementSize(sizeof(optix::float3));
    m_WindowsBB_buffer->setSize(winBBNo);
    optix::float3* winsBB_data = static_cast<optix::float3*>(m_WindowsBB_buffer->map(0, RT_BUFFER_MAP_WRITE_DISCARD));
    for (int k = 0; k < winBBNo; ++k) {  // int index to match winBBNo
        winsBB_data[k] = m_WindowsAABB[k];
    }
    m_WindowsBB_buffer->unmap();
    m_context["sysWinsBB"]->set(m_WindowsBB_buffer);
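To illustrate how a calc point might address its window corners in the flat corner array, here is a sketch of the index arithmetic. It assumes that winsStartIndex counts vertices (not windows) and that each window stores its 4 corners contiguously; cornerIndex is a hypothetical helper and the actual lookup in the .cu file may differ.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kCornersPerWindow = 4;  // 4 corners per rectangular window

// Flat index of one corner of the window-th window belonging to a calc point.
// winsStartIndex: position of the point's first window corner in the array
//                 (assumed to be in units of vertices).
// window:         0 .. winsCount - 1
// corner:         0 .. 3
std::size_t cornerIndex(unsigned int winsStartIndex,
                        unsigned int window,
                        unsigned int corner)
{
    assert(corner < kCornersPerWindow);
    return static_cast<std::size_t>(winsStartIndex) +
           window * kCornersPerWindow + corner;
}
```

For example, a point whose windows start at vertex 8 would find corner 2 of its second window at index 14.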
How long are your raygen and anyhit programs?
I have uploaded the main .cu file here:
and the ptx file:
I didn’t change the anyhit program. It only sets the ray payload’s visible flag to false and terminates the ray:
RT_PROGRAM void anyhit_shadow()
{
    //rtPrintf("anyhit_shadow");
    thePrdShadow.visible = false;
    rtTerminateRay();
}
Are they doing anything particularly branchy or mathy?
Not much. The only math is calculating the min and max angles of a rectangular window relative to the calc point, based on its four corners. It is all in the rol.cu file, if you could have a look.
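For readers without access to the file, the min/max angle step could look roughly like the sketch below. It is not the code from rol.cu: Vec3, AngularBounds and angularBounds are hypothetical names, and it assumes z is up, azimuth is measured in the x-y plane, and elevation is measured from that plane.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

struct AngularBounds { double azMinDeg, azMaxDeg, elMinDeg, elMaxDeg; };

// Min/max azimuth and elevation (degrees) of a set of corners as seen
// from point p. Assumes z is up; does not handle azimuth wrap-around.
AngularBounds angularBounds(const Vec3& p, const Vec3* corners, int count)
{
    assert(count > 0);
    const double kRadToDeg = 180.0 / std::acos(-1.0);
    AngularBounds b = { 1e30, -1e30, 1e30, -1e30 };
    for (int i = 0; i < count; ++i) {
        const double dx = corners[i].x - p.x;
        const double dy = corners[i].y - p.y;
        const double dz = corners[i].z - p.z;
        const double az = std::atan2(dy, dx) * kRadToDeg;
        const double el = std::atan2(dz, std::sqrt(dx * dx + dy * dy)) * kRadToDeg;
        b.azMinDeg = std::min(b.azMinDeg, az);
        b.azMaxDeg = std::max(b.azMaxDeg, az);
        b.elMinDeg = std::min(b.elMinDeg, el);
        b.elMaxDeg = std::max(b.elMaxDeg, el);
    }
    return b;
}
```

Two caveats with a corner-only approach like this: a window that straddles the +/-180 degree azimuth seam needs special handling, and the highest elevation along an edge can exceed the elevation of either of its corners.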
Are you launching once or multiple times in OptiX? If multiple, are you updating any variables between launches?
Not at the moment; I launch just once for all points.