When I try to capture a frame, the debugger generates an OOM exception:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
My (open-source) application is 32-bit, so address space is limited, but 2 GB is still available to dump the data.
I noticed the crash always happens when glFlushMappedBufferRange is called. Everything works if I use the old/slow glBufferSubData function to upload vertex data instead, so the high memory consumption is related to glFlushMappedBufferRange.
My application uses large persistently mapped buffers (around 8 MB) for vertex streaming.
I suspect that the debugger copies the full buffer instead of the few useful bytes, or that there is a memory leak.
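For comparison, the upload path that works under the debugger is roughly the classic one (just a sketch; buffer creation/binding omitted, offset and dummy as in the repro below):
// The glBufferSubData path that does NOT trigger the issue:
// the driver copies the data itself, no mapping or explicit flush involved.
int dummy = rand();
glBufferSubData(GL_ARRAY_BUFFER, offset, sizeof(dummy), &dummy);
glDrawArrays(GL_POINTS, offset / 4, 1);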
By the way, I didn't test it, but I think the following piece of code should be enough to reproduce the issue. It is basically the rendering code I use to upload vertex data.
// Setup of the vbo buffer
GLuint m_vbo = 0;
glGenBuffers(1, &m_vbo);
GLenum m_target = GL_ARRAY_BUFFER;
glBindBuffer(m_target, m_vbo);
GLbitfield common_flags = GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT;
GLbitfield map_flags = common_flags | GL_MAP_FLUSH_EXPLICIT_BIT;
GLbitfield create_flags = common_flags | GL_CLIENT_STORAGE_BIT;
size_t size = 16 * 1024 * 1024; // Play with the size to see the memory impact.
glBufferStorage(m_target, size, NULL, create_flags);
uint8_t* m_buffer_ptr = (uint8_t*) glMapBufferRange(m_target, 0, size, map_flags);

// Rendering loop
while (1) {
    // Upload the vbo in small chunks
    for (size_t offset = 0; offset < size; offset += 4) {
        int dummy = rand();
        size_t length = sizeof(dummy); // 4 bytes per chunk
        memcpy(m_buffer_ptr + offset, &dummy, length); // write at the flushed offset
        glFlushMappedBufferRange(m_target, offset, length);
        glDrawArrays(GL_POINTS, offset / 4, 1); // Basic draw to be sure the previous flush is consumed (4-byte vertices)
    }
    Vsync(); // End of frame
}
Sorry for the late reply. I modified a small test project and used your code to reproduce the out-of-memory issue. Yes, I do see huge memory usage during pause & capture.
We will do some investigation and let you know as soon as we have news.
We found that the loop count is so large that it makes LGD consume too much memory. Even though you flush only 4 bytes each time, LGD tracks and saves additional information for every flush, and that bookkeeping is more than 4 bytes per call. Multiplied across a 16 MB buffer flushed 4 bytes at a time (~4M calls per frame), it adds up to a big value.
We just confirmed that your sample code works fine with LGD when the loop count is reduced. It would also be better to optimize your code regardless, since millions of draw calls per frame is a big hot spot.
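For example (just a sketch reusing the variables from your repro, with an arbitrary 64 KB chunk size), filling a bigger chunk before each flush cuts the call count from millions to a few hundred:
// Batch writes so each flush/draw covers a 64 KB chunk instead of 4 bytes
const size_t chunk = 64 * 1024; // arbitrary; tune to your needs
for (size_t offset = 0; offset < size; offset += chunk) {
    for (size_t i = 0; i < chunk; i += sizeof(int)) {
        int dummy = rand();
        memcpy(m_buffer_ptr + offset + i, &dummy, sizeof(dummy));
    }
    glFlushMappedBufferRange(m_target, offset, chunk); // 256 flushes total for 16 MB
    glDrawArrays(GL_POINTS, offset / 4, chunk / 4);    // one draw per chunk
}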
What parameters impact the size of the overhead? Does it depend on the size of the buffer? Is it a constant value per call? Does it depend on the size of the flushed data?
The above test was just to highlight the issue.
My application issues between 1K and 10K draw calls per frame. However, I have several (big) persistently mapped buffers, so I do 2-3 flushes per draw call.
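If the overhead is paid per call, I could at least merge the 2-3 ranges I flush per draw into a single glFlushMappedBufferRange over the combined span. An untested sketch of what I have in mind (DirtyRange, mark_dirty and flush_before_draw are hypothetical helpers, not from my code; assumes a GL header is included):
#include <algorithm> // std::min / std::max
#include <cstddef>
#include <cstdint>   // SIZE_MAX

// Track one merged dirty range per buffer, flush it once right before the draw
struct DirtyRange { size_t begin = SIZE_MAX; size_t end = 0; };

void mark_dirty(DirtyRange& r, size_t offset, size_t length) {
    r.begin = std::min(r.begin, offset);
    r.end   = std::max(r.end, offset + length);
}

void flush_before_draw(GLenum target, DirtyRange& r) {
    if (r.begin < r.end) {
        glFlushMappedBufferRange(target, r.begin, r.end - r.begin); // one call instead of 2-3
        r = DirtyRange{}; // reset for the next draw
    }
}
The merged span may include a few untouched bytes between the written ranges, but flushing slightly more than was written should be harmless compared to the per-call bookkeeping.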