CUDA 8.0 vs CUDA 9.2: return type of __float2half_rn()

In CUDA 9.2, the return type of __float2half_rn() has changed from unsigned short (in CUDA 8.0) to __half.
Is there a specific reason for this change?

Given that, what is the difference in CUDA 9.2 between the APIs __float2half_rn() and __float2half()?
Both now return the same data type, whereas in CUDA 8.0 one returned unsigned short and the other returned __half.
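
To make the question concrete, here is a minimal sketch of how the two intrinsics could be compared on the raw 16-bit result. It assumes a CUDA 9.x toolkit, where __half_as_ushort() is available in cuda_fp16.h; the kernel name compare_conversions and the helper conversions_match are made up for illustration and error checking is omitted.

```cuda
#include <cstdio>
#include <cuda_fp16.h>

// Converts one float with both intrinsics and stores the raw 16-bit patterns.
// (Illustrative kernel; the name is hypothetical.)
__global__ void compare_conversions(float x, unsigned short *out)
{
    __half a = __float2half(x);     // returns __half in both CUDA 8.0 and 9.x
    __half b = __float2half_rn(x);  // returns __half in CUDA 9.x
    out[0] = __half_as_ushort(a);   // reinterpret the bits, no numeric conversion
    out[1] = __half_as_ushort(b);
}

// Host helper (hypothetical): runs the kernel for one value and reports
// whether both intrinsics produced the same bit pattern.
bool conversions_match(float x, unsigned short bits[2])
{
    unsigned short *d_out = nullptr;
    cudaMalloc(&d_out, 2 * sizeof(unsigned short));
    compare_conversions<<<1, 1>>>(x, d_out);
    cudaMemcpy(bits, d_out, 2 * sizeof(unsigned short), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return bits[0] == bits[1];
}

int main()
{
    unsigned short bits[2] = {0, 0};
    bool same = conversions_match(0.1f, bits);
    printf("__float2half:    0x%04x\n", bits[0]);
    printf("__float2half_rn: 0x%04x\n", bits[1]);
    printf("identical bits:  %s\n", same ? "yes" : "no");
    return 0;
}
```

This only checks a single input value, so it cannot by itself show whether the two functions are equivalent for every float.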

Kindly clarify.