The ultimate remedy for z-fighting requires an extension!

Abstract

The rapid loss of z-buffer resolution with distance is a classic problem that every game developer has had to fight. Roughly half of the depth range is spent on the thin slab between the near clipping plane and twice its distance from the eye. Because the near plane is usually hundreds or thousands of times closer to the camera than the far clipping plane, the z-buffer is used suboptimally in most cases. Even a reversed-range floating-point z-buffer is not a remedy, because the preceding computations introduce considerable errors before the final value reaches the buffer.
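The "half of the range by twice the near distance" claim can be checked numerically. The sketch below is not part of the proposal; it assumes the conventional OpenGL perspective projection and the default depth range [0, 1], and the function name is made up for illustration.

```python
def window_depth(distance, z_near, z_far):
    """Window-space depth in [0, 1] for a conventional perspective projection.

    Assumption: standard OpenGL projection matrix and default glDepthRange(0, 1).
    `distance` is the positive eye-space distance from the camera.
    """
    # NDC depth after the usual divide by w: -1 at z_near, +1 at z_far.
    z_ndc = (z_far + z_near) / (z_far - z_near) \
        - 2.0 * z_far * z_near / ((z_far - z_near) * distance)
    # Default viewport depth transform from [-1, 1] to [0, 1].
    return 0.5 * z_ndc + 0.5


if __name__ == "__main__":
    near, far = 0.1, 1000.0
    for d in (near, 2 * near, 10 * near, far):
        print(f"distance {d:8.1f} -> depth {window_depth(d, near, far):.6f}")
```

With zNear = 0.1 and zFar = 1000 the depth at distance 0.2 already sits very close to 0.5, so the remaining half of the buffer has to cover everything from 0.2 out to 1000.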

The usual workaround is to push the near clipping plane as far out as possible or to introduce additional cameras. Both options require extra per-frame work: either the maximum usable distance for the near clipping plane has to be determined, or the geometry has to be sorted between the cameras or even rendered by more than one of them. Besides, dynamic adjustment of the near clipping plane helps to render large scenes without artifacts only when no nearby object forces the near clipping plane to stay close to the camera.

The W-buffer available to DirectX users is a very good alternative, but it is still unavailable in OpenGL. To emulate it, one can pass gl_Position.w out of the vertex shader, interpolate it across the primitive and write it to gl_FragDepth in the fragment shader. However, writing gl_FragDepth disables the early z-test, a very powerful feature that makes a considerable difference to the frame rate.

Moreover, even though a linear z-buffer allows scenes with a much larger zFar/zNear ratio to be drawn (zNear could be almost zero), it is still not a real remedy. It works well with a floating-point z-buffer, but with an integer buffer the uniform resolution is too low up close to distinguish tiny details of objects, and unreasonably high in the distance, where large objects are drawn coarsely at high LODs.
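To put rough numbers on this trade-off, the sketch below compares the smallest resolvable distance difference of a linear integer buffer with that of a conventional perspective buffer. It is an illustration only: the function names are invented, the perspective step is a first-order estimate from the slope of the depth curve, and the default depth range is assumed.

```python
def linear_depth_step(z_near, z_far, bits=24):
    """Distance covered by one depth level in a linear integer z-buffer."""
    levels = 2 ** bits - 1
    return (z_far - z_near) / levels


def perspective_depth_step(distance, z_near, z_far, bits=24):
    """Approximate distance covered by one depth level at `distance`
    for a conventional perspective projection (first-order estimate)."""
    levels = 2 ** bits - 1
    # Slope of window depth with respect to distance: f*n / ((f - n) * d^2).
    slope = z_far * z_near / ((z_far - z_near) * distance ** 2)
    return 1.0 / (levels * slope)


if __name__ == "__main__":
    near, far = 0.1, 10000.0
    print("linear step everywhere :", linear_depth_step(near, far))
    print("perspective step at near:", perspective_depth_step(near, near, far))
    print("perspective step at far :", perspective_depth_step(far, near, far))
```

The linear step is the same at every distance, which is exactly the problem described above: it is much coarser than the perspective buffer near the camera and much finer than necessary near zFar.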

The best option is a z-buffer with higher resolution up close and lower resolution in the distance, so the default perspective division seems to be the right approach in general. But because all three components are divided by the same value (gl_Position.w), the distribution of z-values is bound to the position of zNear. Moving zNear closer to the camera or widening the perspective angle therefore destroys the depth precision at greater distances.

Objective

The proposed extension gives the user more control over clipping and perspective division (which directly affects z-buffer utilization) and makes it possible to draw infinitely large scenes without introducing additional cameras, pushing out the near clipping plane or oversizing the z-buffer. Close emulation of a real camera's properties (zFar, zNear->0) becomes possible even with standard 24-bit depth buffers.

Solution

According to the current specification the clipping volume is defined by:

-Wc <= Xc <= Wc
-Wc <= Yc <= Wc
Zmin <= Zc <= Wc

where Xc, Yc, Zc, Wc are the clip coordinates produced by the vertex shader as the components of gl_Position, and Zmin is either -Wc or 0 depending on the depth mode set by the glClipControl function. After clipping, {Xc,Yc,Zc} is divided by {Wc,Wc,Wc}, producing the normalized device coordinates (Xndc,Yndc,Zndc).

Because the upper clipping bound (which is also the division vector) is assembled from three identical values {Wc,Wc,Wc}, all three coordinates {Xc,Yc,Zc} are divided by the same value. This way lines in clip space transform into lines in perspective space. As the projection is made onto the XY-plane, we DO want to divide Xc and Yc by the same value, but does Zc have to fall under the same restriction?

Technically, I see no reason to require the division vector to be assembled from equal values {Wc,Wc,Wc}. Letting the vertex shader output the division vector explicitly would enable additional functionality (examples are given below).

Extension

The proposed extension modifies the shading language by adding an output vector to the per-vertex output block:

out gl_PerVertex
{
    vec4  gl_Position;
    vec4  gl_PositionDiv;      // proposed addition
    float gl_PointSize;
    float gl_ClipDistance[];
    float gl_CullDistance[];
};

The values of gl_PositionDiv along with gl_Position are used at the fixed-function vertex post-processing stage. The view volume is defined by:

-gl_PositionDiv.x <= gl_Position.x <= gl_PositionDiv.x
-gl_PositionDiv.y <= gl_Position.y <= gl_PositionDiv.y
Zmin <= gl_Position.z <= gl_PositionDiv.z

where Zmin is either -gl_PositionDiv.z or 0, depending on the depth mode set by the glClipControl function.

The normalized device coordinates are calculated in the following way:

Xndc = gl_Position.x / gl_PositionDiv.x
Yndc = f*gl_Position.y / gl_PositionDiv.y
Zndc = gl_Position.z / gl_PositionDiv.z

where f is 1 when the clip control origin is GL_LOWER_LEFT and -1 when the origin is GL_UPPER_LEFT, as set by the glClipControl function.

This way the clipping and perspective division of each of the Xc,Yc,Zc coordinates could be accomplished using distinct values and still yield the normalized device coordinates in range [-1…1] (or [0…1] for Zndc if the clip control depth mode is GL_ZERO_TO_ONE).

The fourth component of the gl_PositionDiv vector is used instead of gl_Position.w in the interpolation process, in the same way as the specification describes for gl_Position.w. Therefore the fourth component of the gl_FragCoord vector available in the fragment shader holds the value:
gl_FragCoord.w = 1 / gl_PositionDiv.w

The compiler must check whether any of the vertex-processing shader stages (vertex, tessellation control, tessellation evaluation or geometry) has a static write to the gl_PositionDiv variable. If none of them does, gl_PositionDiv must be constructed automatically from the fourth component of the gl_Position vector just before the primitive is consumed by the rasterization process:

gl_PositionDiv = gl_Position.wwww

This produces the same results as would be obtained without the proposed extension. Writing to gl_PositionDiv is therefore optional (just as with gl_FragDepth in fragment shaders). However, if the compiler determines that any of the shader stages has a static write to gl_PositionDiv, the shaders are responsible for writing that output variable in all cases; otherwise the results of rasterization are undefined.
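The clip test, the per-component division and the fallback described in this section can be modelled on the CPU for a single point. This is only a sketch of the proposed behaviour, not driver code: the function names are invented, real clipping operates on primitives rather than points, and treating non-positive divisors as clipped is a simplification made here to avoid dividing by zero.

```python
def default_position_div(position):
    """Fallback when no stage writes gl_PositionDiv: gl_Position.wwww."""
    w = position[3]
    return (w, w, w, w)


def clip_and_divide(position, position_div, zero_to_one=False, upper_left=False):
    """Point-wise model of the proposed clip test and division.

    position     -- (x, y, z, w) as written to gl_Position
    position_div -- (x, y, z, w) as written to the proposed gl_PositionDiv
    zero_to_one  -- True models the GL_ZERO_TO_ONE depth mode
    upper_left   -- True models the GL_UPPER_LEFT clip origin
    Returns (inside, ndc); ndc is None when the point is clipped.
    """
    x, y, z, _ = position
    dx, dy, dz, _ = position_div
    # Simplification of this sketch: non-positive divisors count as clipped.
    if min(dx, dy, dz) <= 0:
        return False, None
    z_min = 0.0 if zero_to_one else -dz
    inside = (-dx <= x <= dx) and (-dy <= y <= dy) and (z_min <= z <= dz)
    if not inside:
        return False, None
    f = -1.0 if upper_left else 1.0
    return True, (x / dx, f * y / dy, z / dz)


if __name__ == "__main__":
    p = (1.0, 2.0, 3.0, 4.0)
    print(clip_and_divide(p, default_position_div(p)))
```

With the fallback divisor the model reduces to the ordinary divide by Wc, which is the backward-compatibility property claimed above.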

Examples

E1. Linear z-buffer requires no perspective division of Zc, therefore the perspective division vector should be assembled like:

gl_Position = ...;
gl_PositionDiv = vec4(gl_Position.ww, 1.0, gl_Position.w);
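Example E1 leaves the computation of gl_Position open. As one possible illustration, the sketch below assumes that the shader writes a clip-space z that already maps [zNear, zFar] linearly onto [-1, 1]; this mapping, the x/y scale factors and the function names are assumptions of this sketch, not part of the proposal.

```python
def linear_clip_z(distance, z_near, z_far):
    """Assumed mapping: distance in [z_near, z_far] -> clip z in [-1, 1], linearly."""
    return 2.0 * (distance - z_near) / (z_far - z_near) - 1.0


def e1_outputs(view_pos, x_scale, y_scale, z_near, z_far):
    """Model of the E1 vertex outputs for an eye-space point (x, y, z), z < 0.

    x_scale and y_scale stand in for the projection's x/y factors
    (hypothetical parameters of this sketch).
    """
    x, y, z = view_pos
    w = -z  # the usual perspective divisor: distance along the view axis
    position = (x * x_scale, y * y_scale, linear_clip_z(-z, z_near, z_far), w)
    position_div = (w, w, 1.0, w)  # vec4(gl_Position.ww, 1.0, gl_Position.w)
    return position, position_div


if __name__ == "__main__":
    pos, div = e1_outputs((0.0, 0.0, -50.5), 1.0, 1.0, 1.0, 100.0)
    print("Zndc =", pos[2] / div[2])
```

Because the z divisor is 1.0, Zndc equals the clip-space z unchanged, so the depth stays linear in distance while x and y still receive the normal perspective divide.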

E2. The z-buffer with controlled resolution requires a few parameters to be defined:

angle - perspective angle (measured vertically);
aspect - the viewport’s ratio of x/y;
zNear - distance to the near clipping plane (corresponds to gl_FragDepth==0.0);
zHalf - distance to the point at which the half of the depth buffer’s range is utilized (gl_FragDepth==0.5);

The effect of perspective can be added with the following vertex shader code (not optimized, but clear for demonstration purposes):

vec4 ViewPos = gl_ModelViewMatrix * gl_Vertex;
float cotHalf = 1.0 / tan(angle / 2.0);  // GLSL has no ctan(); cot(a) = 1/tan(a)
gl_Position = vec4( ViewPos.x * cotHalf / aspect,
                    ViewPos.y * cotHalf,
                   -ViewPos.z - zHalf,
                    1.0 );
gl_PositionDiv  = vec4(-ViewPos.z,
                       -ViewPos.z,
                       -ViewPos.z + zHalf - 2.0*zNear,
                       -ViewPos.z );

The underlying math is as follows. Let “z” be the z-coordinate of the point after the model-view transformation (ViewPos.z in the code above). Then Zndc, in the range [-1…1), is:

Zndc = (-z - zHalf) / (-z + zHalf - 2*zNear);

The result is:
Zndc == -1 for points lying on the projection plane (z == -zNear);
Zndc == 0 for points lying on the control plane (z == -zHalf);
Zndc → 1 as z approaches -inf, which means there is no limit on the view distance (no zFar);
abs(Zndc) > 1 for points behind the projection plane (z > -zNear);

With the default depth range, the window depth is 0.5*Zndc + 0.5, which gives the values 0.0 at zNear and 0.5 at zHalf quoted in the parameter list.

To increase the resolution of the z-buffer at further distances one simply moves zHalf further away; the resolution up-close degrades accordingly. The best setting would be to place zHalf at 1/3 of the distance to the most distant object in the scene.
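The properties of the E2 depth formula and the effect of moving zHalf can be checked numerically. The sketch below is an illustration only: the function names are invented, distances are positive (distance = -z), the step size is a first-order estimate from the slope of the depth curve, and the default depth range is assumed.

```python
def e2_z_ndc(distance, z_near, z_half):
    """NDC depth of example E2 for a positive eye-space distance."""
    return (distance - z_half) / (distance + z_half - 2.0 * z_near)


def e2_depth_step(distance, z_near, z_half, bits=24):
    """Approximate distance covered by one level of an integer depth buffer."""
    levels = 2 ** bits - 1
    denom = distance + z_half - 2.0 * z_near
    # Slope of window depth (0.5 * Zndc + 0.5) with respect to distance.
    slope = (z_half - z_near) / denom ** 2
    return 1.0 / (levels * slope)


if __name__ == "__main__":
    near = 1.0
    for half in (100.0, 1000.0):
        print(f"zHalf = {half}")
        for d in (near, half, 10000.0):
            print(f"  distance {d:8.1f}: Zndc = {e2_z_ndc(d, near, half):+.6f}, "
                  f"step ~ {e2_depth_step(d, near, half):.3e}")
```

Moving zHalf from 100 to 1000 makes the step smaller at a distance of 10000 and larger at the near plane, which matches the trade-off described above.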

Questions

How hard would it be to implement such an extension? Is it possible to run it on existing hardware? How long could it take for the extension to become available on nVidia cards?
