CSI camera: maximum pixel speed vs line length

Hello!
We are developing a V4L2-based driver for our 3D light field image sensor. When implementing higher resolutions, we stumbled upon a strange problem:
Although the TRM states on page 2381 that there is no maximum line length as long as

Minimum Image Size: WxH = 32x32 Pixels
Maximum Image Size: WxH = 128 Mega Pixels

are met, we observed that the maximum line length depends on the transfer speed. Here is an example:

With a pixel clock of 507840000 Hz and a CSI clock of 1269600000 Hz (using 4 lanes), we can transfer 4k (3840x2160) pictures and smaller ones, e.g. 1080p (1920x1080). We cannot transfer 8k pictures (7680x4320). The following error occurs:

Oct 26 15:12:58 tegra-mini kernel: [ 6940.356258] vi vi: CSI 4 syncpt timeout, syncpt = 15, err = -11
Oct 26 15:12:58 tegra-mini kernel: [ 6940.362700] vi vi: TEGRA_CSI_DEBUG_COUNTER 0x00000000
Oct 26 15:12:58 tegra-mini kernel: [ 6940.368045] vi vi: TEGRA_CSI_CSI_CIL_STATUS 0x00000013
Oct 26 15:12:58 tegra-mini kernel: [ 6940.374822] vi vi: TEGRA_CSI_CSI_CILX_STATUS 0x00070071
Oct 26 15:12:58 tegra-mini kernel: [ 6940.380434] vi vi: TEGRA_CSI_PIXEL_PARSER_STATUS 0x000040b4
Oct 26 15:12:58 tegra-mini kernel: [ 6940.386131] vi vi: TEGRA_VI_CSI_ERROR_STATUS 0x00000004
Oct 26 15:12:58 tegra-mini kernel: [ 6940.556296] vi vi: MW_ACK_DONE syncpoint time out!
Oct 26 15:13:00 tegra-mini kernel: [ 6942.385756] vi vi: CSI 4 syncpt timeout, syncpt = 15, err = -11
Oct 26 15:13:00 tegra-mini kernel: [ 6942.391928] vi vi: TEGRA_CSI_DEBUG_COUNTER 0x00000004
Oct 26 15:13:00 tegra-mini kernel: [ 6942.397210] vi vi: TEGRA_CSI_CSI_CIL_STATUS 0x00000003
Oct 26 15:13:00 tegra-mini kernel: [ 6942.402593] vi vi: TEGRA_CSI_CSI_CILX_STATUS 0x00020030
Oct 26 15:13:00 tegra-mini kernel: [ 6942.408211] vi vi: TEGRA_CSI_PIXEL_PARSER_STATUS 0x000041b5
Oct 26 15:13:00 tegra-mini kernel: [ 6942.413955] vi vi: TEGRA_VI_CSI_ERROR_STATUS 0x00000004
Oct 26 15:13:00 tegra-mini kernel: [ 6942.555681] vi vi: MW_ACK_DONE syncpoint time out!
Oct 26 15:13:03 tegra-mini kernel: [ 6944.416477] vi vi: CSI 4 syncpt timeout, syncpt = 15, err = -11
Oct 26 15:13:03 tegra-mini kernel: [ 6944.925639] Host read timeout at address 54081a5c
Oct 26 15:13:03 tegra-mini kernel: [ 6944.927482] vi vi: MW_ACK_DONE syncpoint time out!
Oct 26 15:13:03 tegra-mini kernel: [ 6944.938085] vi vi: TEGRA_CSI_DEBUG_COUNTER 0xffffffff
Oct 26 15:13:03 tegra-mini kernel: [ 6945.445840] Host read timeout at address 5408193c
Oct 26 15:13:03 tegra-mini kernel: [ 6945.451749] vi vi: TEGRA_CSI_CSI_CIL_STATUS 0xffffffff
Oct 26 15:13:04 tegra-mini kernel: [ 6945.959528] Host read timeout at address 54081940
Oct 26 15:13:04 tegra-mini kernel: [ 6945.964644] vi vi: TEGRA_CSI_CSI_CILX_STATUS 0xffffffff
Oct 26 15:13:04 tegra-mini kernel: [ 6946.472554] Host read timeout at address 54081854
Oct 26 15:13:05 tegra-mini kernel: [ 6946.477730] vi vi: TEGRA_CSI_PIXEL_PARSER_STATUS 0xffffffff
Oct 26 15:13:05 tegra-mini kernel: [ 6946.985934] Host read timeout at address 54080584
Oct 26 15:13:05 tegra-mini kernel: [ 6946.986135] vi vi: MW_ACK_DONE syncpoint time out!

The Tegra CSI debug registers don’t provide much insight to us, since they just state that there were some bit problems. Usually mc-errs follow:

Oct 26 14:32:03 tegra-mini kernel: [ 4484.989400] mc-err:   status = 0x60012072; addr = 0x809fc680
Oct 26 14:32:03 tegra-mini kernel: [ 4484.995260] mc-err:   secure: no, access-type: write, SMMU fault: nr-nw-s
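
To compare the register dumps from successive timeouts, it can help to pull the values out of dmesg and list which bits are set. The following is a minimal Python sketch; the helper names are made up for illustration, and it deliberately does not assign any meaning to individual bits, since that would have to come from the TRM register descriptions.

```python
import re

# Matches lines such as "vi vi: TEGRA_VI_CSI_ERROR_STATUS 0x00000004"
REG_LINE = re.compile(r"vi vi: (TEGRA_\w+) (0x[0-9a-fA-F]{8})")

def set_bits(value):
    """Return the positions of all bits set in a 32-bit register value."""
    return [bit for bit in range(32) if (value >> bit) & 1]

def parse_register_dump(log_text):
    """Collect (register name, value, set bit positions) from a dmesg excerpt."""
    rows = []
    for line in log_text.splitlines():
        match = REG_LINE.search(line)
        if match:
            value = int(match.group(2), 16)
            rows.append((match.group(1), value, set_bits(value)))
    return rows

# Three lines copied from the log above
SAMPLE = """\
[ 6940.368045] vi vi: TEGRA_CSI_CSI_CIL_STATUS 0x00000013
[ 6940.386131] vi vi: TEGRA_VI_CSI_ERROR_STATUS 0x00000004
[ 6944.938085] vi vi: TEGRA_CSI_DEBUG_COUNTER 0xffffffff
"""

for name, value, bits in parse_register_dump(SAMPLE):
    # In the log, 0xffffffff appears next to "Host read timeout" messages,
    # so it is flagged here as a possibly failed read (an assumption).
    note = "read failed?" if value == 0xFFFFFFFF else f"bits set: {bits}"
    print(f"{name:36s} 0x{value:08x}  {note}")
```

Running this over the dumps of two consecutive timeouts makes it easier to see which status bits change between them.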

If we now change the pixel clock to 634800000 Hz, leaving the CSI clock at 1269600000 Hz, we can no longer transfer 4k pictures, but 1080p still works. Trying to transfer 4k pictures generates the same error.

To us this looks like some sort of receive-buffer overflow, but the dmesg output has no leads. The TRM has a chapter about line buffer overflows (at 31.5.6 Line Buffer Overflow), but doesn’t really say how to debug them.
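
To illustrate why an overflow would tie the maximum line length to the pixel clock, here is a toy model in Python. It assumes a buffer that starts each line empty, fills at the pixel clock and drains at a fixed rate. The drain rate and buffer depth below are purely hypothetical numbers picked for illustration; they are not taken from the TRM and say nothing about the real hardware.

```python
def peak_backlog(line_pixels, fill_hz, drain_hz):
    """Pixels queued at the end of one line if the buffer fills at fill_hz
    and drains at drain_hz, starting empty (toy model)."""
    if fill_hz <= drain_hz:
        return 0.0
    return line_pixels * (1.0 - drain_hz / fill_hz)

DRAIN_HZ = 400_000_000  # HYPOTHETICAL drain rate, not from the TRM
CAPACITY = 1024         # HYPOTHETICAL buffer depth in pixels, not from the TRM

def overflows(fill_hz, line_pixels):
    """True if the toy buffer would overflow within a single line."""
    return peak_backlog(line_pixels, fill_hz, DRAIN_HZ) > CAPACITY

# (pixel clock in Hz, line length in pixels) combinations from the post
cases = [
    (507_840_000, 3840),
    (507_840_000, 7680),
    (634_800_000, 1920),
    (634_800_000, 3840),
]

for clock, width in cases:
    backlog = peak_backlog(width, clock, DRAIN_HZ)
    state = "overflow" if overflows(clock, width) else "ok"
    print(f"{clock} Hz, {width} px/line: backlog {backlog:.0f} px -> {state}")
```

In this model the backlog grows with both line length and pixel clock, so a faster clock lowers the longest line that still fits. That is the same qualitative pattern as the observations above, but it is only a sketch of the hypothesis, not evidence for it.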

Has anyone ever successfully transferred large pictures, or does anyone have any insight on where to look for the overflow? Any help would be highly appreciated.

Best regards!

Olav Schwartz

Update:

We figured out a way to avoid the syncpoint timeouts, although it is just a workaround: decreasing the pixel clock. We can now transfer 8k images without timeouts, yet the images are incomplete, like the one I have attached.
Further changes to the pixel clock do not change the resulting image.

Inspecting the picture, we see that not even the first line is transferred completely; data stops at around pixel 6788. The following lines get progressively shorter until there is a sharp break, after which no data is transferred beyond pixel 1020.
We confirmed our sensor settings, as we already have a working implementation on another platform. The image data you see on the left side is a functional video as well, just with a lot of black to the right.
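
The truncation pattern can be measured rather than eyeballed by finding, for every line, the last pixel that is not black. The sketch below assumes the captured frame is available as a list of rows of integer pixel values and that missing data reads back as 0; both are assumptions, and the function names are hypothetical.

```python
def last_filled_column(row, black=0):
    """Index of the last pixel in `row` that differs from `black`, or -1 if none."""
    for x in range(len(row) - 1, -1, -1):
        if row[x] != black:
            return x
    return -1

def fill_profile(frame, black=0):
    """Last filled column for every row of the frame."""
    return [last_filled_column(row, black) for row in frame]

# Tiny synthetic frame standing in for a real capture
frame = [
    [5, 7, 9, 3, 0, 0],
    [4, 6, 2, 0, 0, 0],
    [8, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(fill_profile(frame))  # [3, 2, 1, -1]
```

Plotting this profile over the row index for captures at different pixel clocks would show whether the break always happens at the same row and column.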

So we come to the conclusion that somewhere in the video interface driver something breaks when using high resolutions. We are working with the 24.1 kernel.

So my question is: Has anyone ever successfully used an image sensor with a resolution higher than 4k, preferably 8k? Or better, does anyone have a hint on where to look to resolve this error?

Best regards!

Olav Schwartz

Schwartz,
Thanks for your effort. Due to the lack of available high-MP sensors, we have not had a chance to bring up such a sensor. In other words, resolutions higher than 4K are not being worked on, although the TRM does mention a maximum resolution for image capture. What is the sensor spec you intend to support?

Hi Chijen,
thank you for your answer. Our resolution goal would be 8k (7680x4320) at around 10 FPS, maybe even 15. Maybe 6 cameras with 8k resolution and a lower frame rate. Right now we have actually reached the 1400 MPix/s limit, because our sensors can produce 4k@30FPS on two CSI lanes, which with 6 cameras would be about 1490 MPix/s. Even if we try 6x 4k@25FPS (which would only be around 1244 MPix/s), we get crazy, hard-to-reproduce errors and even complete shutdowns of the Tegra. Starting just one stream with 4k@30FPS on two lanes works fine, but after the third stream the problems start.
Debugging is very hard, since the only errors we are seeing are those “vi: MW_ACK_DONE syncpoint timed out!” errors which basically tell nothing.
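
The throughput figures above can be checked with a few lines of Python. This is a sketch that counts active pixels only and ignores blanking, which is an assumption; real link rates would be somewhat higher. The 1400 MPix/s limit is the figure quoted in this post.

```python
def pixel_rate(width, height, fps, cameras=1):
    """Aggregate pixel rate in pixels per second (active pixels only)."""
    return width * height * fps * cameras

LIMIT = 1_400_000_000  # the 1400 MPix/s limit quoted in the post

cases = {
    "6 x 4k @ 30 FPS": pixel_rate(3840, 2160, 30, cameras=6),
    "6 x 4k @ 25 FPS": pixel_rate(3840, 2160, 25, cameras=6),
    "1 x 8k @ 10 FPS": pixel_rate(7680, 4320, 10),
    "1 x 8k @ 15 FPS": pixel_rate(7680, 4320, 15),
}

for label, rate in cases.items():
    verdict = "over" if rate > LIMIT else "under"
    print(f"{label}: {rate / 1e6:.0f} MPix/s ({verdict} the limit)")
```

A single 8k stream at 10 or 15 FPS stays well under the limit in terms of raw pixel rate, which suggests the 8k failures are related to line length rather than total throughput.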

Right now we are working on porting our soc_camera driver to 24.2; unfortunately, the device tree setup is giving us a bit of a headache. It's unfortunate that we started with 23.1 and now have to re-implement a good part of our driver to be compatible with the latest kernel.

I will post about the device tree problems soon.

Best regards,

Olav Schwartz

Schwartz,

For r23.2 to r24.1/r24.2, here is the posting with more details I provided earlier,
https://devtalk.nvidia.com/default/topic/973430/jetson-tx1/csi-camera-maximum-pixel-speed-vs-line-length/post/5011842/#5011842

Is this 8K sensor bayer format that will use Tegra ISP or YUV format with its own ISP? Also, what is your typical multiple camera use case?

wrong URL?

Thanks DavidSoto …

Toward end of the thread,
https://devtalk.nvidia.com/default/topic/937624/jetson-tx1/enabling-v4l2-driver-in-r24-1-by-building-kernel-from-source-code/