I am trying to do some double-precision calculations on an NVIDIA GTX 570 card with DirectCompute under DirectX 11. I use the following HLSL statement to calculate the square root of 2:

BufferOut[0] = sqrt((double)2);

("BufferOut" was created with type "R32G32_Float"). This results in

1.4142135381698608

as opposed to the same expression evaluated on the CPU:

1.4142135623730951

The two values differ from the 8th decimal place onward, so it seems the square root was calculated only in single precision after all.

My question: why does the procedure described above not yield the correct double-precision result?

The result you are getting from the GPU is the correct single-precision result. When I compute sqrt(2) to double precision on the CPU, then assign the result to a single-precision variable, the result is 1.4142135381698608e+000.
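For anyone who wants to reproduce that check on the CPU, here is a minimal C++ sketch. The function names are made up for illustration; the only thing it relies on is that a double-precision square root gets rounded when it is stored in a 32-bit float.

```cpp
#include <cmath>
#include <cstdio>

// Hypothetical helper: compute sqrt in double precision, then round the
// result to single precision, as happens when it lands in a 32-bit float slot.
float sqrt_stored_as_float(double x) {
    return static_cast<float>(std::sqrt(x));
}

// Print the double result and the float-rounded result with 16 decimals.
void print_comparison(double x) {
    std::printf("double: %.16f\n", std::sqrt(x));
    std::printf("float : %.16f\n",
                static_cast<double>(sqrt_stored_as_float(x)));
}
```

Calling print_comparison(2.0) should show the two values quoted in the question, which differ by roughly 2.4e-8.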

I know nothing about DirectX 11, but I have two questions. First, does HLSL support a double-precision square root operation? The HLSL documentation I am looking at (http://msdn.microsoft.com/en-us/library/bb509662(v=VS.85).aspx) does not mention one, but maybe I am looking at the wrong docs?

Second, is R32G32_Float actually a double-precision type? The name suggests it is a 2-vector of 32-bit floats, that is, a pair of single-precision numbers.

With DirectX 11.0, the D3D11_FEATURE_DATA_DOUBLES.DoublePrecisionFloatShaderOps value (queried with D3D11_FEATURE_DOUBLES) being set to TRUE indicates the hardware supports the following double-precision shader instructions: dadd, dmul, deq, dge, dlt, dne, dmin, dmax, dmov, dmovc, dtof, ftod. That's addition, multiplication, comparison, min, max, moves, and double-precision<->single-precision conversions.

With DirectX 11.1, the D3D11_FEATURE_DATA_D3D11_OPTIONS.ExtendedDoublesShaderInstructions value being set to TRUE indicates the hardware supports extended double-precision shader instructions: ddiv, drcp, and dfma. That's division, reciprocal, and fused multiply-add.

There's no hardware shader instruction defined for DirectX that computes a double-precision sqrt.
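One common workaround, which is not something the DirectX documentation prescribes, is to start from a single-precision square root and refine it in double precision with Newton (Heron) steps. These need only addition, multiplication and division. Here is a rough CPU-side C++ sketch of the idea; the name sqrt_refined is hypothetical, and it assumes the input is positive, finite and within float range. On the GPU the division would need the extended (11.1) instructions mentioned above.

```cpp
#include <cmath>

// Hypothetical helper: refine a single-precision sqrt estimate to double
// precision using Newton (Heron) steps  y <- 0.5 * (y + x / y).
// Assumption: x is positive, finite, and representable as a normal float.
double sqrt_refined(double x) {
    // Seed with roughly 24 correct bits from the single-precision sqrt.
    double y = static_cast<double>(std::sqrt(static_cast<float>(x)));
    // Each step roughly doubles the number of correct bits, so two steps
    // are enough to reach the 53-bit double mantissa.
    for (int i = 0; i < 2; ++i) {
        y = 0.5 * (y + x / y);
    }
    return y;
}
```

For x = 2 this should agree with the CPU value from the question to within about one unit in the last place. It is only meant to show the shape of the approach, not to be a drop-in shader routine.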

There are no DXGI formats that contain 'double-precision' data. You can assemble a double-precision value from two 32-bit values, do your computations, and write out the result as two 32-bit values.
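To make the two-halves idea concrete, here is a small CPU-side C++ sketch that splits a double into a low and a high 32-bit word and joins them again. The struct and function names are made up for illustration, and it assumes the usual 64-bit IEEE 754 double layout. In HLSL, Shader Model 5 has asdouble(lowbits, highbits) and an asuint overload that does the reverse, which as far as I know play the same role inside the shader.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical container for the two 32-bit halves of a 64-bit double.
struct Words {
    std::uint32_t low;
    std::uint32_t high;
};

// Split a double into its low and high 32-bit words (bit copy, no conversion).
Words split_double(double value) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return { static_cast<std::uint32_t>(bits & 0xFFFFFFFFu),
             static_cast<std::uint32_t>(bits >> 32) };
}

// Reassemble a double from its low and high 32-bit words.
double join_double(std::uint32_t low, std::uint32_t high) {
    std::uint64_t bits = (static_cast<std::uint64_t>(high) << 32) | low;
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Reading the output buffer back as two raw 32-bit words and passing them to join_double would then recover the full double, provided the shader wrote the halves in the same low/high order.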
