NVENC HEVC encoding of 3840x2160 results in coded size of 3840x2176

We’re using the NVENC HEVC encoder of the NVIDIA Tesla M60 GPU (g3.8xlarge instance in AWS) to create a 4K HEVC 3840x2160 video stream (YUV 4:2:0 8-bit, Main profile, HQ preset, 50 fps).

This results in a coded size of 3840x2176 (the SPS contains pic_height_in_luma_samples of 2176) and a
conformance cropping window of 16 pixels (conf_win_bottom_offset = 8). So NVENC generates 16 extra lines at the bottom which need to be cropped off by the decoder.

The problem is that one of our customers has a decoder (a UHD HEVC IRD) which doesn’t support more than 2160 coded lines. The customer says the NVENC HEVC video output does not conform to the ETSI TS 101 154 V2.4.1 specification (table 21 on page 130), which only allows a maximum coded resolution of 3840x2160. ETSI TS 101 154 V2.4.1 can be found here: http://www.etsi.org/deliver/etsi_ts/101100_101199/101154/02.04.01_60/ts_101154v020401p.pdf

Our questions:

  1. It looks like the NVENC HEVC encoder by default outputs a coded height which is a multiple of 32 lines; is this correct?
  2. Is there a setting/parameter in the NVENC API which can be used to change the coded height alignment, so that it results in a coded height of 2160 lines instead of 2176? We've tried playing with the minCUSize and maxCUSize fields of the NV_ENC_CONFIG_HEVC structure, but they don't seem to influence the coded height (pic_height_in_luma_samples in the output stream). The H.265 specification says that pic_height_in_luma_samples shall be an integer multiple of MinCbSizeY; with minCUSize set to 8x8, MinCbSizeY should be 8, yet pic_height_in_luma_samples stays a multiple of 32 (2176 in this case). A sketch of how we set these fields is shown below the list.
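
Roughly how we set the CU size fields before initializing the encoder (a minimal sketch only; session setup, device creation, preset configuration and error handling are omitted, and the field and enum names come from the Video Codec SDK headers):

    #include "nvEncodeAPI.h"

    /* Sketch: adjust the HEVC CU size fields in the encoder configuration.
     * The config would normally be filled from the preset defaults
     * (e.g. via nvEncGetEncodePresetConfig); everything else is omitted. */
    static void set_hevc_cu_sizes(NV_ENC_CONFIG *encodeConfig)
    {
        NV_ENC_CONFIG_HEVC *hevcConfig = &encodeConfig->encodeCodecConfig.hevcConfig;

        hevcConfig->minCUSize = NV_ENC_HEVC_CUSIZE_8x8;   /* smallest coding unit */
        hevcConfig->maxCUSize = NV_ENC_HEVC_CUSIZE_32x32; /* largest coding unit  */
    }

    /* Even with minCUSize = 8x8, the SPS in the output stream still reports
     * pic_height_in_luma_samples = 2176 for a 3840x2160 source. */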

Here is some additional information:

  • application: ffmpeg 3.4.2 with NVENC support enabled
  • platform: Ubuntu Server 16.04 LTS
  • GPU: NVIDIA Tesla M60, but the issue can also be reproduced on an NVIDIA Quadro P600
  • NVIDIA driver version: 384.98
  • Video Codec SDK version: 8.0.14
  • This situation can be easily reproduced by executing the following ffmpeg command line:

    ffmpeg -f lavfi -i smptebars=duration=5:size=3840x2160:rate=50 -c:v hevc_nvenc -s 3840x2160 ./hevc.ts
    

    The height and coded_height can be checked with ffprobe:

    ffprobe ./hevc.ts -show_streams | grep "height"
    

    This results in:

    height=2160
    coded_height=2176
    

    When analyzing the HEVC NAL units one can see that the Sequence Parameter Set contains pic_height_in_luma_samples of 2176, conformance_window_flag of 1 and conf_win_bottom_offset of 8 (for YUV 4:2:0 this value is multiplied by 2, so 16 lines need to be cropped off the bottom). The arithmetic is shown below.
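
    For reference, this is how the displayed height follows from the SPS fields according to H.265 (a minimal illustration; SubHeightC is 2 for 4:2:0 chroma):

        #include <stdio.h>

        /* The conformance window offsets are expressed in chroma sample units,
         * so they are scaled by SubHeightC (2 for 4:2:0) to get luma lines. */
        int main(void) {
            int pic_height_in_luma_samples = 2176;
            int conf_win_top_offset = 0;
            int conf_win_bottom_offset = 8;
            int sub_height_c = 2; /* 4:2:0 */

            int displayed_height = pic_height_in_luma_samples
                - sub_height_c * (conf_win_top_offset + conf_win_bottom_offset);

            printf("displayed height = %d\n", displayed_height); /* prints 2160 */
            return 0;
        }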

    The spec actually says:

    HEVC UHDTV encoders shall, as a minimum, represent video with the luminance resolutions shown in table 21 and the luminance resolutions shown in table 20. … Additional luminance resolutions may be supported, but they shall be square pixel formats indicated by aspect_ratio_idc equal to “1”.

    Also, the table refers to display sizes and not to the coded sizes. So your customer is incorrect in saying that the specification only allows a maximum coded height of 2160.

    Hopefully, NVIDIA can comment on any possible workarounds for you. It seems to me that a UHD IRD that cannot accept 2176 coded lines is brain-damaged and needs to be repaired.

    Hi RDJ80,

    This is by design and cannot be worked around. For generations up to Volta, the smallest CTB size the NVENC hardware encodes for HEVC is 32x32, and 2160 is not a multiple of 32. So the encoder pads the coded height up to the next higher multiple (2176) and signals a conformance cropping window so the displayed height is 2160. Any HEVC-standard-compliant decoder should be able to decode this. The alignment arithmetic is sketched below.
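
    Illustratively, the padding works out like this (a sketch of the arithmetic only, not NVENC source code; the constant 32 reflects the 32x32 CTB size mentioned above):

        #include <stdio.h>

        /* Pad the coded height up to the CTB size and signal the difference
         * through the conformance cropping window. */
        int main(void) {
            int display_height = 2160;
            int ctb_size = 32;    /* smallest CTB encoded by pre-Turing NVENC HEVC */
            int sub_height_c = 2; /* 4:2:0 chroma */

            int coded_height = ((display_height + ctb_size - 1) / ctb_size) * ctb_size;
            int conf_win_bottom_offset = (coded_height - display_height) / sub_height_c;

            printf("coded height = %d, conf_win_bottom_offset = %d\n",
                   coded_height, conf_win_bottom_offset); /* 2176 and 8 */
            return 0;
        }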

    Our encoder is HEVC standard compliant (ITU H.265), and this bitstream is perfectly standard-compliant.

    Like electrodynamics mentioned above, 2160 is a display resolution, not a coded resolution.

    The ETSI document says “HEVC UHDTV encoders shall, as a minimum, represent video with the luminance resolutions shown in table 21”. Your interpretation of this table, that the maximum encoded resolution must be 2160, is incorrect. The correct interpretation is that the maximum resolution of the displayed (represented) picture must be 2160, which is true for this coded bitstream.

    Thanks,
    Ryan Park

    Hi Ryan,

    We agree that the current NVIDIA implementation conforms to the specifications.

    The fact is that there are IRD systems which unfortunately do not support this flavor.
    The manufacturer of those IRD systems claims that other encoder manufacturers such as Ericsson, Thomson, NTT and Ateme are sending 2160 luminance lines.

    It might be important for NVIDIA to interoperate with a broad range of decoders and not to diverge from other encoder implementations.
    So we were wondering whether it is possible to add a configuration setting that sets the coded frame height to 2160 (minimum CTB size 16x16), or is this a real hardware limitation?

    Thanks.

    Hi Ryan,

    We are also facing this important problem.
    According to the “White Paper Blu-ray Disc™ Read-Only Format”, page 40,
    Table 3-5 (Allowed combinations of parameters for 3840x2160 video format)
    limits the SPS as follows (a simple check against these values is sketched after the list):

    horizontal size of frame: 3840
    vertical size of frame: 2160
    aspect_ratio_idc: 1
    general_frame_only_constraint_flag: 1
    conf_win_left_offset: 0
    conf_win_right_offset: 0
    conf_win_top_offset: 0
    conf_win_bottom_offset: 0

    So when we encode with NVENC, the resulting video plays correctly on some players
    but produces errors on others.
    For example, an Xbox One S displays a green line at the top of the screen.

    So we were wondering if there is a software workaround here, via some setting?

    Thanks.

    Hi everyone,

    This is actually not possible. The hardware (across all generations) supports 32x32 CTBs; 16x16 CTBs are not supported by the hardware.

    Thanks,
    Ryan Park

    Dear Ryan

    Is there another GPU model that supports 16x16 CTBs and could easily replace the NVIDIA Tesla M60 GPU in our current codec system?
    I mean, without the codec software needing major changes.

    Much appreciated.

    BR//LJ

    The Turing generation does support 16x16 CTBs for the default preset and 8x8 for the slow/HQ preset.

    Hi Thunderm,

    We work with SDK v8.0 and set the preset to slow/HQ (8x8), but it still results in 3840x2176.

    UPDATE:
    As far as I can tell, this is still an issue with Ampere.

    Side Note:
    I’m not sure why NVIDIA doesn’t consider this an issue. I realize that the resulting stream may be H.265 compliant, but the fact that it causes problems with very popular and mainstream players (software and hardware) should be taken seriously. This issue has forced us to replace our entire GPU rendering systems with high-thread-count CPU systems and discontinue our use of NVIDIA GPUs.

    I wonder if there was ever a solution to this?

    I’m trying to encode 4K using NVENC HEVC, but the output appears not to be decodable on an Xbox One X for the reasons outlined in this thread.