USB host controller dies when attempting to receive frames from USB3 camera

Hi,

I’ve got a Jetson TX1 running L4T 3.1 and a selection of USB3 UVC video cameras.

I’m interfacing the cameras both through gstreamer and using the V4L2 api directly.

The problem I’m having is that when I attempt to start streaming, the USB bus almost immediately stops working, causing the camera to stop sending frames, the network to go offline, and all USB devices to disappear, except for the root hubs.

At first I thought it might be a power-delivery issue, so I tried running the cameras through a powered USB hub, but that made no difference.

All cameras tested work flawlessly on my dev machine running Ubuntu 16.04 with kernel ‘4.13.0-26-generic’

The following is an excerpt of dmesg when the camera stops responding:

[  215.992931] xhci-tegra 70090000.xusb: xHCI host not responding to stop endpoint command.
[  216.024651] xhci-tegra 70090000.xusb: Assuming host is dying, halting host.
[  216.105516] xhci-tegra 70090000.xusb: Host not halted after 16000 microseconds.
[  216.137903] xhci-tegra 70090000.xusb: Non-responsive xHCI host is not halting.
[  216.172401] xhci-tegra 70090000.xusb: Completing active URBs anyway.
[  216.195825] xhci-tegra 70090000.xusb: HC died; cleaning up
[  217.534359] tegra-xusb-mbox 70098000.mailbox: Controller firmware hang
[  217.556584] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_OWNER 0x0
[  217.579071] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_CMD 0x80000000
[  217.616055] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_DATA_IN 0x0
[  217.638913] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_DATA_OUT 0x0
[  244.064524] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [swapper/2:0]
[  244.102489] Modules linked in: bcmdhd bluedroid_pm
[  244.123152] 
[  244.139643] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.4.38+ #5
[  244.161165] Hardware name: jetson_tx1 (DT)
[  244.180800] task: ffffffc0fb321900 ti: ffffffc0fb32c000 task.ti: ffffffc0fb32c000
[  244.220058] PC is at __delay+0x1c/0x38
[  244.239548] LR is at __const_udelay+0x24/0x2c
[  244.259257] pc : [<ffffffc00036b61c>] lr : [<ffffffc00036b65c>] pstate: 20000145
[  244.296966] sp : ffffffc0fb32fb20
[  244.315058] x29: ffffffc0fb32fb20 x28: 0000000000000200 
[  244.335736] x27: ffffffc00146c3d0 x26: ffffffc0fb32c000 
[  244.356247] x25: ffffffc000722d78 x24: ffffffc07d26a260 
[  244.376306] x23: 00000000000010c7 x22: 0000000000000008 
[  244.396067] x21: 0000000000000000 x20: ffffff80052f0038 
[  244.415664] x19: 00000000000a1f6e x18: 0000000000000000 
[  244.434874] x17: 0000000000000000 x16: 0000000000000000 
[  244.453543] x15: 0000000000000008 x14: 0000000000000001 
[  244.471663] x13: ffffffc0f9edefbf x12: ffffff8000bec600 
[  244.489535] x11: 00000000ffffffff x10: 0000000000000008 
[  244.507422] x9 : 0000000000cdcdcd x8 : ffffffc0002f061c 
[  244.525029] x7 : ffffffc0ffe68618 x6 : 0000000000000000 
[  244.542669] x5 : 0000000000003fff x4 : 0000000000000008 
[  244.560368] x3 : 00000000002dc6c0 x2 : 0000000122e57e0f 
[  244.577611] x1 : 0000000000000009 x0 : 0000000000000013 
[  244.594424] 
[  247.111925] xhci-tegra 70090000.xusb: Stopped the command ring failed, maybe the host is dead
[  247.203898] xhci-tegra 70090000.xusb: Host not halted after 16000 microseconds.
[  247.234880] xhci-tegra 70090000.xusb: Abort command ring failed
[  247.253203] xhci-tegra 70090000.xusb: HC died; cleaning up
[  247.271329] usb 1-3: USB disconnect, device number 2
[  247.272559] INFO: rcu_preempt detected stalls on CPUs/tasks:
[  247.272607]  2-...: (1 GPs behind) idle=7e9/140000000000000/0 softirq=4003/4003 fqs=0 
[  247.272619]  (detected by 3, t=7804 jiffies, g=3830, c=3829, q=197)
[  247.272634] Task dump for CPU 2:
[  247.272657] kworker/2:1     R  running task        0   101      2 0x00000000
[  247.272806] Workqueue: usb_hub_wq hub_event
[  247.272825] Call trace:
[  247.272944] [<ffffffc000085e54>] __switch_to+0x3c/0x48
[  247.272951] [<          (null)>]           (null)
[  247.272988] rcu_preempt kthread starved for 7804 jiffies! g3830 c3829 f0x0 s3 ->state=0x1
[  247.350863] usb 1-3.4: USB disconnect, device number 3
[  247.357209] usb 2-1: USB disconnect, device number 2
[  247.428659] usb 2-2: USB disconnect, device number 3
[  247.434895] usb 2-2.1: USB disconnect, device number 6
[  247.489604] usb 2-2.4: USB disconnect, device number 4

To me this looks like a problem with the USB driver. Any thoughts?
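
For anyone comparing their own logs against this one, here is a minimal sketch that pulls the key failure events out of dmesg text. The message strings are copied from the excerpt above and may be worded differently on other kernel versions; the names `SIGNATURES` and `scan_dmesg` are made up for illustration.

```python
import re

# Message fragments copied from the dmesg excerpt in this thread.
# Assumption: other kernel versions may phrase these differently.
SIGNATURES = {
    "stop_endpoint_timeout": re.compile(r"xHCI host not responding to stop endpoint command"),
    "host_died": re.compile(r"HC died; cleaning up"),
    "firmware_hang": re.compile(r"Controller firmware hang"),
    "usb_disconnect": re.compile(r"USB disconnect, device number \d+"),
}

# Leading dmesg timestamp such as "[  215.992931]".
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def scan_dmesg(text):
    """Return (timestamp_seconds, signature_name) tuples in log order."""
    events = []
    for line in text.splitlines():
        stamp = TIMESTAMP.match(line)
        if not stamp:
            continue  # skip lines without a timestamp
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                events.append((float(stamp.group(1)), name))
                break
    return events


sample = """\
[  215.992931] xhci-tegra 70090000.xusb: xHCI host not responding to stop endpoint command.
[  216.195825] xhci-tegra 70090000.xusb: HC died; cleaning up
[  217.534359] tegra-xusb-mbox 70098000.mailbox: Controller firmware hang
[  247.271329] usb 1-3: USB disconnect, device number 2
"""

for ts, name in scan_dmesg(sample):
    print(f"{ts:10.3f}  {name}")
```

Feeding it the saved output of `dmesg` instead of `sample` gives a short timeline of when the controller stopped responding and when the devices dropped off the bus.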

I’ve attached the output of “udevadm info -a /dev/video0” just in case.

udevadm-info-a-dev-video0.txt (3.88 KB)

Hi pbelanger,
Is this specific to the USB3 camera you are using, or does it happen with all USB3 cameras?

The same problem occurs with the following cameras I’ve tested:

  • TheImagingSource DFK33UX265
  • TheImagingSource DMK33UX265
  • E-Con Systems See3CAM_10CUG
  • E-Con Systems See3CAM_CU30

All of these cameras work without issue on my development machine.

Hi pbelanger,
For running multiple USB3 devices, we suggest using a PCIe-USB hub like https://devtalk.nvidia.com/default/topic/1027100/

I am running only a single USB3 camera on the device when this issue occurs.

Hi pbelanger,

We ran v4l2 with a single Econ See3CAM_CU135 USB3 camera on r28.1/TX1, but could not reproduce your issue.

Test pipeline:

gst-launch-1.0 v4l2src device=/dev/video1 ! 'video/x-raw,width=1920,height=1080,framerate=60/1,format=UYVY' ! nvvidconv ! 'video/x-raw(memory:NVMM),format=NV12' ! nvoverlaysink
v4l2-ctl -d /dev/video1 --set-fmt-video=width=1920,height=1080,pixelformat=UYVY --stream-mmap --stream-count=100

Could you share your test pipeline? We can then check the issue again. Thanks!

Testing with the See3CAM_CU30:

Testing your second line, I get results only about 50% of the time; the other 50% of the time the command just hangs without any output, eventually leading to the xHCI controller crash shown above.

The first line above shows the same behaviour. Of three consecutive tests, the first appeared to work without issue, the second caused gstreamer to hang without any USB controller issues, and the third caused the xHCI controller to hang and reset as before.
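
To put a number on the intermittent behaviour, a rough sketch like the following could repeat a command and count how many runs succeed, fail, or time out. The `v4l2-ctl` arguments are copied from the reply above; `run_trials`, `V4L2_CTL_CMD`, the trial count, and the 30 second timeout are assumptions for illustration. One caveat: if the process is stuck inside the kernel it may not be killable, in which case the script itself would block.

```python
import subprocess
import sys

# Command copied from the reply above; adjust the device node for your setup.
V4L2_CTL_CMD = [
    "v4l2-ctl", "-d", "/dev/video1",
    "--set-fmt-video=width=1920,height=1080,pixelformat=UYVY",
    "--stream-mmap", "--stream-count=100",
]


def run_trials(cmd, trials=10, timeout_s=30.0):
    """Run cmd repeatedly and count 'ok', 'failed', and 'hang' outcomes."""
    counts = {"ok": 0, "failed": 0, "hang": 0}
    for i in range(1, trials + 1):
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,  # child is killed if it runs longer than this
            )
            outcome = "ok" if result.returncode == 0 else "failed"
        except subprocess.TimeoutExpired:
            outcome = "hang"
        counts[outcome] += 1
        print(f"trial {i}: {outcome}", file=sys.stderr)
    return counts


# On the Jetson, something like:
#   print(run_trials(V4L2_CTL_CMD))
```

Once the controller has died, every later trial will fail as well, so the count is only meaningful up to the first crash.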

I’ve attached the gstreamer logs of the second and third test, as well as the dmesg log following the xhci hangup in the third test.

gst-hang.txt (38.5 KB)
gst-hang-controller-dies.txt (15.9 KB)
gst-hang-controller-dies-dmesg-output.txt (3.28 KB)

Hello,

We are experiencing the same problem: the USB controller firmware crashes (kernel messages below) whenever we attempt to use a USB 3.0 camera. We see it on both an NVIDIA dev board and a ConnectTech Elroy board, running ConnectTech’s V017 BSP patches on L4T 3.1. The cameras are a ZED stereo camera (we have two, purchased a year apart, and both have the same effect) and an AR0330 USB 3.0 camera from e-con Systems (e-CAM30_CUMI0330_MO). We tried each camera separately, one at a time, directly on the USB 3.0 port, via an unpowered hub, and via a powered hub; all configurations result in the same USB firmware crash.

Kernel logs:
ZED camera through a powered hub on the USB 3.0 receptacle (no other USB devices):
[ 1005.914203] usb 1-3: new high-speed USB device number 14 using xhci-tegra
[ 1005.930769] usb 2-2: new SuperSpeed USB device number 4 using xhci-tegra
[ 1005.953372] usb 2-2: New USB device found, idVendor=05e3, idProduct=0616
[ 1005.960142] usb 1-3: New USB device found, idVendor=05e3, idProduct=0610
[ 1005.967096] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 1005.974388] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 1005.990202] usb 1-3: Product: USB2.0 Hub
[ 1005.994124] usb 1-3: Manufacturer: GenesysLogic
[ 1005.998874] usb 2-2: Product: USB3.0 Hub
[ 1006.006219] usb 2-2: Manufacturer: GenesysLogic
[ 1006.017913] hub 1-3:1.0: USB hub found
[ 1006.022437] hub 1-3:1.0: 4 ports detected
[ 1006.027539] hub 2-2:1.0: USB hub found
[ 1006.034217] hub 2-2:1.0: 4 ports detected
[ 1142.878431] usb 2-2.2: new SuperSpeed USB device number 5 using xhci-tegra
[ 1142.899151] usb 2-2.2: New USB device found, idVendor=2b03, idProduct=f580
[ 1142.906087] usb 2-2.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 1142.913623] usb 2-2.2: Product: ZED
[ 1142.917274] usb 2-2.2: Manufacturer: Leopard
[ 1142.930242] uvcvideo: Found UVC 1.00 device ZED (2b03:f580)
[ 1233.861758] xhci-tegra 70090000.xusb: xHCI host not responding to stop endpoint command.
[ 1233.869844] xhci-tegra 70090000.xusb: Assuming host is dying, halting host.
[ 1233.914551] xhci-tegra 70090000.xusb: Host not halted after 16000 microseconds.
[ 1233.921862] xhci-tegra 70090000.xusb: Non-responsive xHCI host is not halting.
[ 1233.929072] xhci-tegra 70090000.xusb: Completing active URBs anyway.
[ 1233.935509] r8152 2-1:1.0 eth0: Tx status -108
[ 1233.939950] r8152 2-1:1.0 eth0: Tx status -108
[ 1233.944390] r8152 2-1:1.0 eth0: Tx status -108
[ 1233.948971] xhci-tegra 70090000.xusb: HC died; cleaning up
[ 1235.406584] tegra-xusb-mbox 70098000.mailbox: Controller firmware hang
[ 1235.413110] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_OWNER 0x0
[ 1235.419888] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_CMD 0x80000000
[ 1235.427097] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_DATA_IN 0x0
[ 1235.434046] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_DATA_OUT 0x60001f4
[ 1237.302967] uvcvideo: Failed to set UVC probe control : -19 (exp. 26).
[ 1239.710983] uvcvideo: Failed to set UVC probe control : -19 (exp. 26).
[ 1252.791910] xhci-tegra 70090000.xusb: Stopped the command ring failed, maybe the host is dead
[ 1252.838113] xhci-tegra 70090000.xusb: Host not halted after 16000 microseconds.
[ 1252.845415] xhci-tegra 70090000.xusb: Abort command ring failed
[ 1252.851331] xhci-tegra 70090000.xusb: HC died; cleaning up
[ 1252.856925] usb 1-3: USB disconnect, device number 14
[ 1252.868734] usb 2-1: USB disconnect, device number 2
[ 1252.949924] usb 2-2: USB disconnect, device number 4
[ 1252.954977] usb 2-2.2: USB disconnect, device number 5

AR0330 camera (same):
[ 29.592277] usb 2-2.2: new SuperSpeed USB device number 4 using xhci-tegra
[ 29.620463] usb 2-2.2: LPM exit latency is zeroed, disabling LPM.
[ 29.629886] usb 2-2.2: New USB device found, idVendor=2560, idProduct=c130
[ 29.637326] usb 2-2.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 29.645155] usb 2-2.2: Product: See3CAM_CU30
[ 29.649915] usb 2-2.2: Manufacturer: e-con systems
[ 29.655042] usb 2-2.2: SerialNumber: 361AC204
[ 29.667074] uvcvideo: Found UVC 1.00 device See3CAM_CU30 (2560:c130)
[ 29.701880] input: See3CAM_CU30 as /devices/70090000.xusb/usb2/2-2/2-2.2/2-2.2:1.0/input/input3
[ 29.715615] hid-generic 0003:2560:C130.0001: hiddev0,hidraw0: USB HID v1.11 Device [e-con systems See3CAM_CU30] on usb-70090000.xusb-2.2/input2
[ 75.459285] uvcvideo: Failed to query (GET_DEF) UVC control 2 on unit 2: -110 (exp. 2).
[ 91.923206] xhci-tegra 70090000.xusb: xHCI host not responding to stop endpoint command.
[ 91.931311] xhci-tegra 70090000.xusb: Assuming host is dying, halting host.
[ 91.978039] xhci-tegra 70090000.xusb: Host not halted after 16000 microseconds.
[ 91.985360] xhci-tegra 70090000.xusb: Non-responsive xHCI host is not halting.
[ 91.992577] xhci-tegra 70090000.xusb: Completing active URBs anyway.
[ 91.999119] r8152 2-1:1.0 eth0: Tx status -108
[ 92.003579] r8152 2-1:1.0 eth0: Tx status -108
[ 92.008034] r8152 2-1:1.0 eth0: Tx status -108
[ 92.012602] xhci-tegra 70090000.xusb: HC died; cleaning up
[ 93.456872] tegra-xusb-mbox 70098000.mailbox: Controller firmware hang
[ 93.463427] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_OWNER 0x0
[ 93.470215] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_CMD 0x80000000
[ 93.477434] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_DATA_IN 0x0
[ 93.484392] tegra-xusb-mbox 70098000.mailbox: XUSB_CFG_ARU_MBOX_DATA_OUT 0x60001f4
[ 112.024311] xhci-tegra 70090000.xusb: Stopped the command ring failed, maybe the host is dead
[ 112.070533] xhci-tegra 70090000.xusb: Host not halted after 16000 microseconds.
[ 112.077836] xhci-tegra 70090000.xusb: Abort command ring failed
[ 112.083754] xhci-tegra 70090000.xusb: HC died; cleaning up
[ 112.089386] usb 1-3: USB disconnect, device number 3
[ 112.094769] usb 2-1: USB disconnect, device number 2
[ 112.163671] usb 2-2: USB disconnect, device number 3
[ 112.170751] usb 2-2.2: USB disconnect, device number 4
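
To compare which devices were on the bus in each of these logs, a small sketch along these lines could list the enumerated devices by port. It relies only on the "New USB device found" and "Product:" lines shown above; `list_devices` and the regex names are hypothetical.

```python
import re

# Patterns based on the enumeration lines in the logs above.
FOUND = re.compile(
    r"usb ([\d.\-]+): New USB device found, "
    r"idVendor=([0-9a-f]{4}), idProduct=([0-9a-f]{4})"
)
PRODUCT = re.compile(r"usb ([\d.\-]+): Product: (.+)$")


def list_devices(text):
    """Map each USB port path to its vendor:product ID and product string."""
    devices = {}
    for line in text.splitlines():
        found = FOUND.search(line)
        if found:
            port, vid, pid = found.groups()
            devices[port] = {"id": f"{vid}:{pid}", "product": None}
            continue
        product = PRODUCT.search(line)
        if product and product.group(1) in devices:
            devices[product.group(1)]["product"] = product.group(2).strip()
    return devices


sample = """\
[ 1005.953372] usb 2-2: New USB device found, idVendor=05e3, idProduct=0616
[ 1005.960142] usb 1-3: New USB device found, idVendor=05e3, idProduct=0610
[ 1005.990202] usb 1-3: Product: USB2.0 Hub
[ 1005.998874] usb 2-2: Product: USB3.0 Hub
[ 1142.899151] usb 2-2.2: New USB device found, idVendor=2b03, idProduct=f580
[ 1142.917274] usb 2-2.2: Product: ZED
"""

for port, info in list_devices(sample).items():
    print(port, info["id"], info["product"])
```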

Hi ebeall,
Do you have other cameras to try on the NVIDIA dev board? We have an Econ See3CAM_CU135 and it runs fine. In your test, does a single e-CAM30_CUMI0330_MO not run well on the NVIDIA dev board?

I’ve been testing on the Jetson TX2 and this issue with the xHCI controller does not seem to be present there, so the issue appears to be limited to the TX1 USB hardware.

Hi pbelanger,

We tried a ZED USB camera and still can’t reproduce your issue on r28.1/TX1.
Did you flash the image with JetPack 3.1, or did you build the kernel yourself?

Thanks!

Hi DaneLLL,

Our carrier board maker just sent us new cables, which solved the problem. The cause may have been a data integrity problem or some other cable issue.

Erik