nvcamera-daemon breakage

I’m failing to get images from an imx290 sensor with nvcamera-daemon when v4l2-ctl works. This is our setup in the DTB for the video mode (the “use_sensor_mode_id” flag is set to “true”):

mode0 { // IMX290_MODE_1948x1108
            mclk_khz = "25000";
            num_lanes = "4";
            tegra_sinterface = "serial_a";
            discontinuous_clk = "yes";
            dpcm_enable = "false";
            cil_settletime = "0";

            dynamic_pixel_bit_depth = "12";
            /* kernel expects this setting */
            pixel_t = "bayer_rggb12";
            /* nvcamera-daemon expects these settings. */
            csi_pixel_bit_depth = "12";
            mode_type = "bayer";
            pixel_phase = "rggb";

            active_w = "1948";              // X_OUT_SIZE set-up in driver
            active_h = "1096";              // Y_OUT_SIZE set-up in driver
            readout_orientation = "0";
            line_length = "4400";           // hmax set-up in driver for 30 FPS

            gain_factor = "1";
            framerate_factor = "1";
            inherent_gain = "1";
            mclk_multiplier = "6";          // must be larger than pixel_clock / mclk
            pix_clk_hz = "148500000";       // 37.125 MHz * 4 Lanes

            min_gain_val = "1";             // defined in imx290.c
            max_gain_val = "31";            // defined in imx290.c
            min_hdr_ratio = "1";
            max_hdr_ratio = "64";
            min_framerate = "30";
            max_framerate = "30";
            min_exp_time = "15";            // 4 line time (minimal 4 lines), in micro-second
            max_exp_time = "236610";
            embedded_metadata_height = "1"; // specific to the sensor
        };
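The timing values in this node can be cross-checked with a little arithmetic. The following is only a sketch: it assumes the usual relation frame rate = pixel clock / (line_length × frame_length), and it takes frame_length = 1125 from the figure quoted later in this thread, since the node itself does not contain it. The function names are made up for illustration.

```python
# Values copied from the mode0 node above.
MCLK_KHZ = 25000
MCLK_MULTIPLIER = 6
PIX_CLK_HZ = 148_500_000
LINE_LENGTH = 4400
# Assumption: frame_length (VMAX) is not in the DT node; 1125 is the value
# mentioned later in the thread.
FRAME_LENGTH = 1125


def multiplier_ok(mclk_khz, multiplier, pix_clk_hz):
    """True if mclk * multiplier is at least as large as the pixel clock."""
    return mclk_khz * 1000 * multiplier >= pix_clk_hz


def frame_rate(pix_clk_hz, line_length, frame_length):
    """Frames per second implied by the pixel clock and the total frame size."""
    return pix_clk_hz / (line_length * frame_length)


print(multiplier_ok(MCLK_KHZ, MCLK_MULTIPLIER, PIX_CLK_HZ))   # True
print(frame_rate(PIX_CLK_HZ, LINE_LENGTH, FRAME_LENGTH))      # 30.0
```

Under these assumptions the node is self-consistent: 25 MHz × 6 = 150 MHz covers the 148.5 MHz pixel clock, and the line and frame lengths give exactly 30 fps.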

I have verified more than once that the image sensor is set up accordingly. In fact, v4l2-ctl captures images with exactly that geometry. Telling v4l2-ctl a different geometry leads to the expected errors: either PIXEL_SHORT_LINE, PIXEL_LONG_LINE, PIXEL_RUNAWAY or a missing EOF. Here is a working example:

v4l2-ctl --verbose --set-fmt-video=width=1948,height=1096,pixelformat=RG12 --stream-mmap --set-ctrl bypass_mode=0 --stream-count=100 -d /dev/video0
VIDIOC_QUERYCAP: ok
VIDIOC_S_EXT_CTRLS: ok
VIDIOC_G_FMT: ok
VIDIOC_S_FMT: ok
Format Video Capture:
	Width/Height      : 1948/1096
	Pixel Format      : 'RG12'
	Field             : None
	Bytes per Line    : 4096
	Size Image        : 4489216
	Colorspace        : sRGB
	Transfer Function : Default
	YCbCr/HSV Encoding: Default
	Quantization      : Default
	Flags             : 
VIDIOC_REQBUFS: ok
VIDIOC_QUERYBUF: ok
VIDIOC_QBUF: ok
VIDIOC_QUERYBUF: ok
VIDIOC_QBUF: ok
VIDIOC_QUERYBUF: ok
VIDIOC_QBUF: ok
VIDIOC_QUERYBUF: ok
VIDIOC_QBUF: ok
VIDIOC_STREAMON: ok
	Index    : 0
	Type     : Video Capture
	Flags    : mapped, done
	Field    : None
	Sequence : 0
	Length   : 4489216
	Bytesused: 4489216
	Timestamp: 2492.284220s (Monotonic, End-of-Frame)
...

It really captures 30 frames per second.
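The buffer geometry printed by v4l2-ctl can be reproduced by hand, which is a quick way to confirm that the kernel driver sees the intended frame size. This sketch assumes that RG12 samples are stored in 16-bit containers and that the line stride is rounded up to 256 bytes (the TEGRA_STRIDE_ALIGNMENT value quoted later in this thread); the helper name is hypothetical.

```python
def aligned_stride(width_px, bytes_per_px=2, alignment=256):
    """Bytes per line after rounding the raw line size up to the alignment."""
    raw = width_px * bytes_per_px
    return -(-raw // alignment) * alignment  # integer ceiling division


stride = aligned_stride(1948)
print(stride)          # 4096, matches "Bytes per Line"
print(stride * 1096)   # 4489216, matches "Size Image"
```

Each line therefore carries 3896 bytes of pixel data followed by 200 bytes of padding, which matters when the raw frames are decoded later.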

When I try to capture using nvcamera-daemon, everything appears to look OK at first. This is how it reports setting up VI/CSI:

apply:-----------------------
NvPclSettingsApply: Reading PCL settings
PowerServiceUtils:calculateReqClock: entered
PowerServiceHw:addRequest: table size: before: 0, after:1
	request table for VI 0:
	req[0]: guID=0, stageID=SensorCapture
	req[0]: inW=1948, inH=1096, inBpp = 12, fps=30
	req[0]: outW=1948, outH=1096, outBpp=12
	req[0]: out1W=0, out1H=0, out1Bpp=0
	req[0]: out2W=0, out2H=0, out2Bpp=0
	req[0]: clock=25987500, pixelRate=148500000, timeout=900
	req[0]: isoBw=311844, timeout=900
	req[0]: non_isoBw=0, timeout=900
PowerServiceUtils:calculateReqClock: entered
PowerServiceHw:addRequest: table size: before: 0, after:1
	request table for CSI 0:
	req[0]: guID=0, stageID=SensorCapture
	req[0]: inW=1948, inH=1096, inBpp = 12, fps=30
	req[0]: outW=1948, outH=1096, outBpp=12
	req[0]: out1W=0, out1H=0, out1Bpp=0
	req[0]: out2W=0, out2H=0, out2Bpp=0
	req[0]: clock=36196872, pixelRate=148500000, timeout=900
	req[0]: isoBw=0, timeout=900
	req[0]: non_isoBw=0, timeout=900
PowerServiceHwVi:setIso: m_bwVal_Iso=311844
PowerServiceHw:setClock: PowerServiceHw[1]: requested_clock_Hz=25987500
PowerServiceHw:setClock: PowerServiceHw[0]: requested_clock_Hz=36196872
PowerServiceCore:setCameraBw: totalIsoBw=311844
NvPclSettingsUpdate: Sending Updated Settings through PCL
NvPclSettingsApply: Applying last settings through PCL
apply:+++++++++++++++++++++++

Note that it says 1948x1096@30 as input parameters to the VI and the CSI. However, a little later the following line appears:

captureErrorCallback Stream 0.0 capture 1 failed: ts 2019813512544 frame 10 error 4 data 0x00000200

This error code means that the first valid pixel line (line #0) is too short. I’ve seen similar errors occur on other lines, but always in the first three ones.

I have confirmed that

  • The image sensor outputs 30fps (by measuring the XVS pin frequency).
  • The line length transmitted by the sensor is 1948 pixels (looked up the registers in the DS90UB954 deserializer between camera and imx290).

For further debugging, I have enabled the DS90UB954 test pattern generator with a geometry of 1948x1096 pixels. It also works fine with v4l2-ctl and fails with exactly the same error in nvcamera-daemon. I was playing a bit with the TPG geometry (which is far more flexible than the image sensor) and increased the generated line width until the PIXEL_SHORT_LINE error was gone. The following behaviour was observed:

  • TPG line length == 1974 gives PIXEL_SHORT_LINE
  • TPG line length == 1978 gives PIXEL_LONG_LINE
  • TPG line length == 1976 gives Packet Payload CRC errors

This seems to indicate that the VI is set-up to accept 1948 + 28 pixel long lines by nvcamera-daemon, boldly ignoring the specified image geometry and boldly ignoring the previously printed “inW=1948” for VI.

How can that be? I’m stuck with exactly this problem for more than a week now. Could someone please explain what nvcamera-daemon is doing? I would even be willing to sign an NDA and look at the nvcamera-daemon (plus library) source code myself if you let me. No kidding!

@stefan.zegenhagen
Everything looks reasonable. Have you tried discontinuous_clk = “no”?

@ShaneCCC: I’ve tried that, even though I know that we must set discontinuous_clk = “yes” because we have set-up the DS90UB954 with discontinuous_clk also. It gave no improvement.

Over the last week I have carefully verified every single parameter in the DTB mode definition where I knew what it does, and I have randomly played with the other ones. No success, unfortunately.

Is there a way to dump the VI Channel specific register settings such as VI_CH_FRAME_X_0 and VI_CH_FRAME_Y_0 when they are written by nvcamera-daemon or after? I would like to confirm that these are configured as described by the DTB mode spec and we can rule this out. I didn’t even find out how nvcamera-daemon gets access to these registers and which kernel driver I have to tweak for debugging info.

I am not sure whether you have checked the trace; that is the only debugging aid we currently have.
https://elinux.org/Jetson_TX2/28.1_Camera_BringUp

  1. Did you check that the raw data is valid?
  2. I got this sensor working before, but without the DS90UB954. Is it possible to bypass it to narrow down the problem?
  3. Try to mark the set gain/coarse/frame_length to check if the v4l2-ctl and nvgstcapture

@ShaneCCC: I had a look at the trace and it reveals nothing new. It doesn’t show the VI channel X/Y and Cropping register setups.

As for your questions:

  1. I couldn't. nvcamera-daemon isn't working. The gstreamer plugin bayer2rgb does not work with v4l2src because it does not like the V4L buffer line stride. Raw Image converters expect a manufacturer-specific file header. Do you have any solution to "decode" a raw bayer stream?
  2. It's not easy because our own hardware has soldered the CSI-2 lanes directly to the UB954 and we don't have an adaptor for the Jetson board.
  3. We work on checking gain/exposure settings but I don't expect that these have an impact on the problem, which is that nvcamera-daemon receives PIXEL_SHORT_LINE errors when v4l2-ctl works fine. We have carefully verified the frame_length and it is OK at 1125 as the data sheet recommends.
  1. You can dump the raw data and use the ImageJ tool to check it. You may need to change the alignment below to 1:

./camera/vi/core.h:#define TEGRA_STRIDE_ALIGNMENT 256

  3. PIXEL_SHORT_LINE means that the lines output by the sensor are not as long as expected. It is not related to frame_length.
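As an alternative to rebuilding the kernel with a stride alignment of 1, the padding can be stripped in user space before the frame is handed to a viewer. This is a minimal sketch using only the Python standard library. It assumes 16-bit little-endian samples and a known line stride (4096 bytes for the 1948-pixel mode above); whether the 12 significant bits sit in the low or the high bits of each word is not confirmed here and should be checked before scaling. The function name is hypothetical.

```python
import array
import sys


def strip_stride(buf, width, height, stride_bytes):
    """Return one array of 16-bit samples per image line, padding removed."""
    if len(buf) < stride_bytes * height:
        raise ValueError("buffer shorter than stride * height")
    rows = []
    for y in range(height):
        start = y * stride_bytes
        line = array.array("H")  # unsigned 16-bit on common platforms
        line.frombytes(buf[start:start + width * 2])
        if sys.byteorder != "little":
            line.byteswap()  # frame data is assumed to be little-endian
        rows.append(line)
    return rows
```

For a single frame dumped to a file, something like `strip_stride(open("frame.raw", "rb").read(), 1948, 1096, 4096)` yields 1096 lines of 1948 samples each, which can then be written back out contiguously for a raw image viewer.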

@ShaneCCC:
I now have some tools to convert RAW12 frames captured with v4l2-ctl to RGB. Capturing with v4l2-ctl works with the 12-bit ADC and 30 FPS sensor setup.

I can confirm that I receive valid, non-distorted video data. There is no binary garbage or just black there: I can see myself moving my arm in front of the lens.

But still, nvcamera-daemon isn’t working. Same error: PIXEL_SHORT_LINE.

I am using nvcamera-daemon from the following download address:

http://developer2.download.nvidia.com/embedded/L4T/r28_Release_v1.0/BSP/Tegra186_Linux_R28.1.0_aarch64.tbz2

Can you please confirm that this is the correct version for R28.1?

Yes, that link is correct. I am sure the imx290 itself is not the problem. Could it be that the DS90UB954 outputs redundant packets?

Hello Shane,

in the meantime I’ve run more tests.

If the IMX290 is attached to the Leopard Imaging MIPI adaptor on a Jetson board, both v4l2-ctl and nvcamera-daemon are able to access the sensors. With our DS90UB953/4 FPD link, only v4l2-ctl works and nvcamera-daemon does not.

I’ve done the following tests with the FPD link in between:

  • 2-lane, 30 FPS, 1080p, RAW12
  • 4-lane, 30 FPS, 1080p, RAW12
  • 2-lane, 60 FPS, 1080p, RAW12
  • 4-lane, 60 FPS, 1080p, RAW12
  • FPD Link speed of 2Gb and 4Gb
  • max. CSI-2 Lane speeds of 1.6Gb and 800Mb

And the outcome is always the same: v4l2-ctl works and nvcamera-daemon does not. How can that be? In my tests I have noticed that if I do not set up the FPD link correctly, v4l2-ctl either captures no frames at all or stops with an error (CSIMUX_FRAME or CHANSEL_FAULT) after a few frames. This implies that v4l2-ctl is able to detect and report errors in the CSI-2 communication between TX2 and DS90UB954.

I’ve captured 10,000+ frames in a row at 30/60 FPS with a single v4l2-ctl command and no problems. The image data is valid. So I know that the hardware works and I hope you agree with that.

What does not work is nvcamera-daemon, which is unable to capture even a single frame with the DS90UB953/4 involved - and only on TX2 because on TX1 everything works fine.

The difference is how the CSI/VI registers are set-up. In case of v4l2-ctl, the kernel does the register setup and in the nvcamera-daemon case, the kernel drivers are put into “bypass” and the setup is done by nvcamera-daemon itself. The kernel’s register configuration is working and nvcamera-daemon’s is not. The errors that nvcamera-daemon reports are those of the VI or CSI and not of the ISP.

Please explain!

The only difference should be that the V4L driver in the kernel directs the VI capture to main memory, whereas nvcamera-daemon sends the image to the ISP.

nvcamera-daemon is also unable to handle errors gracefully. It chokes at the first error it detects, starts tons of new frame captures only to abort them again, throws megabytes of logging data at the user and eventually crashes. NVidia can surely do better than that!

I am starting to get angry because of the slow progress the issue is making - I am not doing this for fun. If you are unable to get your tools working yourself, make them open-source so that someone else can.

v4l2-ctl and argus/gstreamer use different VI drivers and different pipelines. The TX2 VI driver also performs more checks on the packets, which is why I suspect the DS90UB954 may be adding something extra that causes this failure.

Hello Shane,

I agree with you when you say that v4l2-ctl and argus/nvcamera-daemon are using different VI drivers. I agree with you that the TX2 VI block does more checks on the validity of data packets received via CSI-2.

But my point is that the DS90UB954 is generating valid image data. I can capture this data with v4l2-ctl and the images look fine. I can make the UB954 generate invalid data and see that v4l2-ctl stops capturing.

But nvcamera-daemon and argus are both unable to capture this valid image data and I do not know how to make them capture. You said yourself that the DTB looks good. But we need these tools working.

I have experimented with cil_settletime and tried every number between 0 and 50 without success. I have tried to set-up different combinations of values for CSI timing parameters (TCLK_* and THS_*) in the UB954 without success.

Is there any way to change the parameters that nvcamera-daemon uses to configure CSI/VI registers?
Or could you tell me the register values that nvcamera-daemon uses so that I can check what might go wrong?
Or is there any way to make nvcamera-daemon behave like the in-kernel CSI/VI drivers that are used by v4l2-ctl?
Or is the only answer I’m getting here to stop the project because it’s never going to work?

hello stefan.zegenhagen,

i would suggest contacting your serializer/deserializer vendor to check whether additional packets are included in the signaling.
also, could you summarize the failures across all your configurations?
have you tested with different sensor resolutions?
thanks

@JerryChang:

We are using the DS90UB953 as serializer and the DS90UB954 as deserializer. These chips take a CSI-2 stream as input (DS90UB953) and unpack/re-generate the stream to CSI-2 (DS90UB954) again. In that combination, they don’t generate CSI-2 packets on their own. Here’s a short summary of what we’ve already tested:

  • different video modes (1080p30, 1080p60, 720p30, 720p60)
  • different number of CSI-2 lanes (2/4)
  • different cil_settletime values (0-50)
  • different CSI-2 timing parameters in DS90UB954
  • discontinuous clock / continuous clock in both, TX2 and UB954
  • many other parameters from the DTB
  • Image sensor data vs. UB954 TPG

All these different tests usually have the same result:

  1. v4l2-ctl works on TX1 and TX2
  2. nvcamera-daemon/libargus work on the TX1
  3. nvcamera-daemon/libargus return PIXEL_SHORT_LINE errors on TX2

To test video input, I have used the following command lines:

v4l2-ctl --set-fmt-video=width=1948,height=1096,pixelformat=RG12 --stream-mmap --set-ctrl bypass_mode=0 --stream-count=100000 -d /dev/video0

and

gst-launch-1.0 -e -v nvcamerasrc sensor-id=1 ! 'video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, format=(string)I420, framerate=(fraction)30/1' ! nvvidconv ! xvimagesink

We have two camera sensors and two UB953/4 links, one for each camera. /dev/video0 is the master camera and /dev/video1 is a slave that is clocked by /dev/video0 and cannot operate stand-alone. /dev/video0 can, however, be operated stand-alone.

Somehow, the module numbers are mixed up in nvcamera-daemon: tegra_camera_platform/modules/module0 refers to the same IMX290 sensor as /dev/video0 (as can be seen from the proc-device-tree attribute), and is connected to CSI-Port #0, but it has sensor-id=1 instead of sensor-id=0. I do not know why.

Just by chance I found out yesterday that nvcamera-daemon appears to work if I start captures on both image sensors simultaneously. That got me back to studying the DTB entries and our board schematics, but there’s no obvious error in there:

  • The CSI-2 port numbers are not swapped.
  • The CSI-2 ports are correctly mapped through NVCSI to VI.
  • Clocks are not swapped: we have a separate XO generating 25 MHz and don't need extperiph1/2 at all.

Anyway, v4l2-ctl capture is working with a single camera only. So I suspect that the hardware is working correctly and that nvcamera-daemon is confusing something. However, I have no idea what that might be.

Why do you use a different resolution for v4l2-ctl and nvcamerasrc? Does 720p work with v4l2-ctl?

hello stefan.zegenhagen,

Just by chance I found out yesterday that nvcamera-daemon appears to work if I start captures on both image sensors simultaneously.

i’m wondering whether gst-launch fails with sensor-id=0 or sensor-id=1 individually,
but works when both sensors are launched together?

please confirm this too,
thanks

In fact, we have only a single “mode0” for 1080p defined for our camera. The different resolution in the gst-launch command line expects the ISP to scale the image down, which it normally does. I’ve verified from the nvcamera-daemon PCL/SCF logs that the camera is being set-up for 1948x1096, see above.

I can confirm this.

  • A single gst-launch with sensor-id=0 gives nothing but a regular timeout because it relates to the slave camera at /dev/video1 which gets no clock if /dev/video0 is not also running.
  • A single gst-launch with sensor-id=1 gives the PIXEL_SHORT_LINE errors I mentioned above.
  • Two parallel gst-launch commands display both camera streams correctly. They must be started at the same time, e.g. from a script, not entered by hand.

I know from TX1 that gst-launch with sensor-id=1 should work and display the capture from /dev/video0. Just replacing the SOM on the same base-board (with different DTB and software, naturally) makes the error appear.

@stefan
Could you try increasing pix_clk_hz?

mclk_multiplier = "";
                    pix_clk_hz = "";

Hi ShaneCCC,

I’ve increased pix_clk_hz to 190MHz instead of 148.5MHz and now it works. I have also tried 297MHz and it still works. I can capture from /dev/video0 with nvcamera-daemon and with v4l2-ctl, and I can also capture from both cameras (/dev/video0 and /dev/video1) simultaneously.

I’ve left mclk_multiplier at “6” because I know from previous investigation that other values (such as 24 or 25) do not change the situation.

Is there any special set of values for pix_clk_hz and mclk_multiplier that you would suggest for our set-up? To summarize:

  • 1948x1096, 30 fps, RAW12
  • MCLK = 25 MHz
  • PIXEL_CLK = 148.5 MHz

hello stefan.zegenhagen,

theoretically, pixel_clk = mclk * mclk_multiplier
please measure your sensor output data clock to have correct pixel clock settings.
thanks