TX2 fails to boot without HDMI monitor with rootfs on SATA SSD

I tried putting the L4T R28.2-DP rootfs on a SATA SSD to get more storage.
I used a Kingston SUV400S SSD, which was recognized and worked fine with R28.2, and cloned an image of the R28.2-DP APP partition onto it.

I’ve added an entry for it in /boot/extlinux/extlinux.conf on MMC0 with the same kernel args, only changing root= to /dev/sda1 instead of /dev/mmcblk0p1:

LABEL ssd
      MENU LABEL ssd kernel
      LINUX /boot/Image
      APPEND root=/dev/sda1 rw rootwait console=ttyS0,115200n8 console=tty0 OS=l4t fbcon=map:0 net.ifnames=0 memtype=0 video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x03100000 nvdumper_reserved=0x2772e0000 gpt tegra_fbmem2=0x140000@0x969e9000 lut_mem2=0x2008@0x969e6000 tegraid=18.1.2.0.0 tegra_keep_boot_clocks maxcpus=6 boot.slot_suffix= boot.ratchetvalues=0.1.1 androidboot.serialno=0334916010240 bl_prof_dataptr=0x10000@0x277040000 sdhci_tegra.en_boot_part_access=1 root=/dev/sda1 rw rootwait rootfstype=ext4

So the system boots from MMC0 up to U-Boot, which finds extlinux.conf, loads the kernel image from /boot on MMC0, and runs it with args pointing the rootfs at /dev/sda1.

This works fine: my rootfs now has 64 GB, and MMC0 simply shows up like an SD card (which is convenient for editing extlinux.conf):

df -H -T
Filesystem     Type      Size  Used Avail Use% Mounted on
/dev/root      ext4       68G   13G   52G  19% /
devtmpfs       devtmpfs  8.2G     0  8.2G   0% /dev
tmpfs          tmpfs     8.3G   30M  8.3G   1% /dev/shm
tmpfs          tmpfs     8.3G   15M  8.3G   1% /run
tmpfs          tmpfs     5.3M  4.1k  5.3M   1% /run/lock
tmpfs          tmpfs     8.3G     0  8.3G   0% /sys/fs/cgroup
tmpfs          tmpfs     824M   50k  824M   1% /run/user/1001
/dev/mmcblk0p1 ext4       30G   13G   16G  43% /media/nvidia/1797fc2d-e18b-4fa5-b512-7fd7287bdcaf

The problem is that it fails to boot when no monitor is plugged into HDMI.
In some cases, when I plugged in the monitor about 40 s after power-on, it booted a few minutes later.
In my case, with a USB hub (powered by the Jetson) plugged in, the system could boot only if just the keyboard and mouse were attached to it. With my CM108 USB sound card or a USB drive attached, it failed to boot even after plugging in the monitor… I have to say that after 3 minutes I lost hope and tried something else.

I looked at the serial console. Without a monitor, the boot process stalls at two points and then runs into faults such as:

[    2.655543] gk20a 17000000.gp10b: failed to allocate secure buffer -12                                                                                                     
    [   12.699002] random: nonblocking pool is initialized                                                                                                                        
    [   21.486911] INFO: rcu_preempt detected stalls on CPUs/tasks:                                                                                                               
    [   21.492582]  3-...: (1 GPs behind) idle=28f/140000000000000/0 softirq=15/16 fqs=5251                                                                                       
    [   21.500406]  (detected by 0, t=5252 jiffies, g=-287, c=-288, q=8)                                                                                                          
    [   21.506507] Task dump for CPU 3:                                                                                                                                           
    [   21.509732] kworker/u12:0   R  running task        0     6      2 0x00000002                                                                                               
    [   21.516803] Workqueue: events_unbound async_run_entry_fn                                                                                                                   
    [   21.522121] Call trace:                                                                                                                                                    
    [   21.524569] [<ffffffc0000865b8>] __switch_to+0xa4/0xb0                                                                                                                     
    [   21.529706] [<ffffffc0012a3030>] init_mm+0x0/0x310                                                                                                                         
    [   24.074925] Watchdog detected hard LOCKUP on cpu 3                                                                                                                         
    [   24.079600] ------------[ cut here ]------------                                                                                                                           
    [   24.084455] WARNING: at ffffffc00013ed30 [verbose debug info unavailable]                                                                                                  
    [   24.091274] Modules linked in:                                                                                                                                             
    [   24.094346]                                                                                                                                                                
    [   24.095850] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.4.38-tegra #1                                                                                                      
    [   24.102291] Hardware name: quill (DT)                                                                                                                                      
    [   24.105977] task: ffffffc1ecec9900 ti: ffffffc1eced8000 task.ti: ffffffc1eced8000                                                                                          
    [   24.113474] PC is at watchdog_timer_fn+0x230/0x33c                                                                                                                         
    [   24.118270] LR is at watchdog_timer_fn+0x230/0x33c                                                                                                                         
    [   24.123083] pc : [<ffffffc00013ed30>] lr : [<ffffffc00013ed30>] pstate: 600001c5                                                                                           
    [   24.130489] sp : ffffffc1ecedbb20                                                                                                                                          
    [   24.133809] x29: ffffffc1ecedbb20 x28: 0000000000000003                                                                                                                    
    [   24.139151] x27: ffffffc001279b30 x26: ffffffc1f5d16278                                                                                                                    
    [   24.144491] x25: ffffffc0012462d8 x24: ffffffc1ecedbe00                                                                                                                    
    [   24.149831] x23: 0000000000000000 x22: 0000000000000000                                                                                                                    
    [   24.155170] x21: ffffffc001279000 x20: ffffffc001246000                                                                                                                    
    [   24.160509] x19: ffffffc001246260 x18: ffffffc000bf5060                                                                                                                    
    [   24.165847] x17: 000000000000000e x16: ffffffc000b82a60                                                                                                                    
    [   24.171188] x15: 00000000efe4b99a x14: 0000000000000000                                                                                                                    
    [   24.176527] x13: 00000000efe4b99a x12: 0000000000000000                                                                                                                    
    [   24.181865] x11: 0000000000000400 x10: ffffffc001464e88                                                                                                                    
    [   24.187206] x9 : 000000000000017f x8 : 0000000000000002                                                                                                                    
    [   24.192548] x7 : ffffffc0012916b8 x6 : 0000000000000030                                                                                                                    
    [   24.197887] x5 : 0000000000000000 x4 : 0000000000000000                                                                                                                    
    [   24.203225] x3 : 0000000000000000 x2 : 0000000000010001                                                                                                                    
    [   24.208565] x1 : ffffffc1eced8000 x0 : 0000000000000026                                                                                                                    
    [   24.213904]                                                                                                                                                                
    [   24.215949] ---[ end trace 71cc46c427a3dce3 ]---                                                                                                                           
    [   24.220569] Call trace:                                                                                                                                                    
    [   24.223026] [<ffffffc00013ed30>] watchdog_timer_fn+0x230/0x33c                                                                                                             
    [   24.228868] [<ffffffc000107c7c>] __hrtimer_run_queues+0x140/0x350                                                                                                          
    [   24.234965] [<ffffffc0001086dc>] hrtimer_interrupt+0x9c/0x1e0                                                                                                              
    [   24.240717] [<ffffffc0009315dc>] tegra186_timer_isr+0x24/0x30                                                                                                              
    [   24.246468] [<ffffffc0000f5568>] handle_irq_event_percpu+0x84/0x290                                                                                                        
    [   24.252737] [<ffffffc0000f57b8>] handle_irq_event+0x44/0x74                                                                                                                
    [   24.258316] [<ffffffc0000f8ac0>] handle_fasteoi_irq+0xb4/0x188                                                                                                             
    [   24.264153] [<ffffffc0000f4b88>] generic_handle_irq+0x24/0x38                                                                                                              
    [   24.269902] [<ffffffc0000f4e90>] __handle_domain_irq+0x60/0xb4                                                                                                             
    [   24.275739] [<ffffffc0000815dc>] gic_handle_irq+0x5c/0xb4                                                                                                                  
    [   24.281141] [<ffffffc000084740>] el1_irq+0x80/0xf8                                                                                                                         
    [   24.285938] [<ffffffc0000e82f4>] default_idle_call+0x1c/0x2c                                                                                                               
    [   24.291600] [<ffffffc0000e8510>] cpu_startup_entry+0x1bc/0x340                                                                                                             
    [   24.297437] [<ffffffc00008ede8>] secondary_start_kernel+0x12c/0x164                                                                                                        
    [   24.303706] [<000000008008192c>] 0x8008192c

This turns into various faults, but once I saw the system detect an unresponsive CPU and reboot. Usually, with the serial console attached, it fails to reboot after the monitor has been plugged in.

I’m not sure about this, but it seems that with no serial console and no USB devices attached, it reboots easily, and the system can then boot successfully if a monitor was plugged in before it reboots.

Looking at the kernel logs when it succeeds in booting after a late HDMI plug-in, I suspect a race condition in the Linux boot that depends on which devices are connected.

I have also seen something really weird. The Denver cores were detected as type 6, but version 0 instead of 2:

[    0.480062] 3100000.serial: ttyS0 at MMIO 0x3100000 (irq = 37, base_baud = 25500000) is a Tegra
    [    2.613504] console [ttyS0] enabled
    [    2.615449] 3110000.serial: ttyTHS1 at MMIO 0x3110000 (irq = 38, base_baud = 0) is a TEGRA_UART
    [    2.615716] Console: switching to colour frame buffer device 80x30
    [    2.616546] c280000.serial: ttyTHS2 at MMIO 0xc280000 (irq = 39, base_baud = 0) is a TEGRA_UART
    [    2.616781] serial-tegra 3130000.serial: RX in PIO mode
    [    2.617682] 3130000.serial: ttyTHS3 at MMIO 0x3130000 (irq = 40, base_baud = 0) is a TEGRA_UART
    [    2.644024] brd: module loaded
    [    2.648384] loop: module loaded
    [    2.648648] nct1008_nct72 7-004c: find device tree node, parsing dt
    [    2.648652] nct1008_nct72 7-004c: starting parse dt
    [    2.648739] nct1008_nct72 7-004c: success parsing dt
    [    2.648869] nct1008_nct72 7-004c: success in enabling tmp451 VDD rail
    [    2.708748] tegradc 15210000.nvdisplay: fb registered
    [    2.715216] PD DISP0 index2 UP
    [    2.723341] PD DISP1 index3 UP
    [    2.723390] gk20a 17000000.gp10b: failed to allocate secure buffer -12
    [    2.734886] PD DISP2 index4 UP
    [    2.748294] Parent Clock set for DC plld2
    [    2.749677] tmp451: Enabled overheat logging at 104.00C
    [    2.749778] nct1008_nct72 7-004c: nct1008_probe: initialized
    [    2.752006] THERMAL EST: found 3 subdevs
    [    2.752010] THERMAL EST num_resources: 0
    [    2.752012] [THERMAL EST subdev 0]
    [    2.752016] [THERMAL EST subdev 1]
    [    2.752019] [THERMAL EST subdev 2]
    [    2.752236] thermal thermal_zone8: Registering thermal zone thermal_zone8 for type thermal-fan-est
    [    2.752237] THERMAL EST: thz register success.
    [    2.752314] THERMAL EST: end of probe, return err: 0
    [    2.752347] tegra_profiler: Branch: Dev
    [    2.752348] tegra_profiler: Version: 1.112
    [    2.752349] tegra_profiler: Samples version: 39
    [    2.752351] tegra_profiler: IO version: 22
    [    2.752354] armv8_pmu: imp: 0x41, idcode: 0x1
    [    2.752357] armv8_pmu: [0] arch: AA64 PmuV3 ARM CORTEX-A57, type: 5, ver: 0
    [    2.752359] armv8_pmu: imp: 0x4e, idcode: 0x1
    [    2.752360] armv8_pmu: [1] arch: AA64 PmuV3 NVIDIA (Denver), type: 6, ver: 0
    [    2.752362] armv8_pmu: imp: 0x4e, idcode: 0x1
    [    2.752364] armv8_pmu: [2] arch: AA64 PmuV3 NVIDIA (Denver), type: 6, ver: 0
    [    2.752366] armv8_pmu: imp: 0x41, idcode: 0x1
    [    2.752368] armv8_pmu: [3] arch: AA64 PmuV3 ARM CORTEX-A57, type: 5, ver: 0
    [    2.752369] armv8_pmu: imp: 0x41, idcode: 0x1
    [    2.752371] armv8_pmu: [4] arch: AA64 PmuV3 ARM CORTEX-A57, type: 5, ver: 0
    [    2.752373] armv8_pmu: imp: 0x41, idcode: 0x1
    [    2.752375] armv8_pmu: [5] arch: AA64 PmuV3 ARM CORTEX-A57, type: 5, ver: 0
    [    2.752495] tegra_profiler: auth: init
    [    2.755279] tegra-ahci 3507000.ahci-sata: AHCI 0001.0301 32 slots 2 ports 3 Gbps 0x1 impl platform mode
    [    2.755285] tegra-ahci 3507000.ahci-sata: flags: 64bit ncq sntf pm led pmp pio slum part deso sadm apst 
    [    2.755925] scsi host0: tegra_ahci
    [    2.756155] scsi host1: tegra_ahci
    [    2.756267] ata1: SATA max UDMA/133 mmio [mem 0x03507000-0x03508fff] port 0x100 irq 25
    [    2.756269] ata2: DUMMY
    [    2.757049] spi-tegra114 3210000.spi: Static pin configuration used
    [    2.757416] spi-tegra114 c260000.spi: Static pin configuration used
    [    2.757746] spi-tegra114 3240000.spi: Static pin configuration used
    [    2.758373] tun: Universal TUN/TAP device driver, 1.6
    [    2.758375] tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
    [    2.758513] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
    [    2.758515] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
    [    2.758542] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.3.0-k
    [    2.758543] igb: Copyright (c) 2007-2014 Intel Corporation.
    [    2.759365] PPP generic driver version 2.4.2
    [    2.759462] PPP BSD Compression module registered
    [    2.759467] PPP Deflate Compression module registered
    [    2.759481] PPP MPPE Compression module registered
    [    2.759489] NET: Registered protocol family 24
    [    2.759530] usbcore: registered new interface driver r8152
    [    2.759557] usbcore: registered new interface driver asix
    [    2.759582] usbcore: registered new interface driver ax88179_178a
    [    2.759603] usbcore: registered new interface driver cdc_ether
    [    2.759634] usbcore: registered new interface driver smsc75xx
    [    2.759655] usbcore: registered new interface driver net1080
    [    2.759676] usbcore: registered new interface driver cdc_subset
    [    2.759697] usbcore: registered new interface driver zaurus
    [    2.759732] usbcore: registered new interface driver cdc_ncm
    [    2.759899] Wake76 for irq=199
    [    2.759900] Wake77 for irq=199
    [    2.759901] Wake78 for irq=199
    [    2.759902] Wake79 for irq=199
    [    2.759903] Wake80 for irq=199
    [    2.759904] Wake81 for irq=199
    [    2.759905] Wake82 for irq=199
    [    2.759952] tegra-xotg xotg: usb2 phy is not available yet
    [    2.760385] usbcore: registered new interface driver usb-storage
    [    2.760438] usbcore: registered new interface driver usbserial
    [    2.760461] usbcore: registered new interface driver cp210x
    [    2.760477] usbserial: USB Serial support registered for cp210x
    [    2.760497] usbcore: registered new interface driver ftdi_sio
    [    2.760511] usbserial: USB Serial support registered for FTDI USB Serial Device
    [    2.760530] usbcore: registered new interface driver option
    [    2.760545] usbserial: USB Serial support registered for GSM modem (1-port)
    [    2.760564] usbcore: registered new interface driver pl2303
    [    2.760577] usbserial: USB Serial support registered for pl2303
    [    2.760771] tegra-usb-cd usb_cd: otg phy is not available yet
    [    2.762784] eqos 2490000.ether_qos: Setting local MAC: 0 4 4b xx xx xx
    [    2.763285] libphy: dwc_phy: probed
    [    2.763737] tegra-xudc-new 3550000.xudc: usb2 phy is not available yet
    [    2.828148] bcm54xx_low_power_mode(): put phy in iddq-lp mode
    [    2.892656] max77686-rtc max77620-rtc: rtc core: registered max77620-rtc as rtc0
    [    2.894899] tegra_rtc c2a0000.rtc: rtc core: registered c2a0000.rtc as rtc1
    [    2.894912] tegra_rtc c2a0000.rtc: Tegra internal Real Time Clock
    [    2.895066] i2c /dev entries driver
    [    2.895820] [OV5693]: probing v4l2 sensor.
    [    3.246977] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    [    3.247219] ata1.00: ATA-11: KINGSTON SUV400S37120G, 0C3FD6SD, max UDMA/133
    [    3.247223] ata1.00: 234441648 sectors, multi 1: LBA48 NCQ (depth 31/32)
    [    3.247511] ata1.00: configured for UDMA/133
    [    3.280101] edid invalid
    [    3.332162] edid invalid
    [    3.335517] edid invalid
    [    3.338822] tegradc 15210000.nvdisplay: hdmi: pclk:50349K, set prod-setting:prod_c_54M
    [    3.365959] scsi 0:0:0:0: Direct-Access     ATA      KINGSTON SUV400S D6SD PQ: 0 ANSI: 5
    [    3.376979] sd 0:0:0:0: [sda] 234441648 512-byte logical blocks: (120 GB/112 GiB)
    [    3.386084] sd 0:0:0:0: [sda] 4096-byte physical blocks
    [    3.395292] sd 0:0:0:0: [sda] Write Protect is off
    [    3.400891] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
    [    3.400965] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    [    3.424167]  sda: sda1 sda2 sda3
    [    3.429378] sd 0:0:0:0: [sda] Attached SCSI disk
    [    3.479285] tegradc 15210000.nvdisplay: blank - normal
    [    3.482551] tegradc 15210000.nvdisplay: blank - powerdown
    [    3.541046] PD DISP2 index4 DOWN
    [    3.542157] PD DISP1 index3 DOWN
    [    3.542840] PD DISP0 index2 DOWN
    [    3.557346] tegradc 15210000.nvdisplay: unblank
    [    3.557410] PD DISP0 index2 UP
    [    3.563294] PD DISP1 index3 UP
    [    3.563848] PD DISP2 index4 UP
    [    3.579350] Parent Clock set for DC plld2
    [    3.582814] tegradc 15210000.nvdisplay: hdmi: pclk:27027K, set prod-setting:prod_c_54M
    [    3.594397] random: nonblocking pool is initialized

…maybe some microcode failed to load.

Any idea what’s going wrong?

Of course, it works fine without monitor from MMC0 and this setup is not officially supported, but I think it is easy to reproduce if NVIDIA wants to investigate.

I have seen the same wrong Denver version when simply booting standard R28.2-DP from MMC0.

Same sequence as above (tegradc nvdisplay registering frame buffer before gk20a issues the secure buffer allocation failure message).

I haven’t had the same errors, but just put R28.2 prerelease on the Jetson and noticed it says rev 2 in dmesg, but after “nvpmodel -m0” enables the Denver cores the “/proc/cpuinfo” lists rev 0. Dmesg and cpuinfo do not match.

I still haven’t found a way to trigger it reproducibly. It seems to happen quite randomly, but when rebooting and checking, I see the wrong Denver version in about 1 in 5 trials…
I have also removed my SATA SSD, my USB disk, my USB sound card, and the SD card, keeping just wired Ethernet and a USB hub with mouse and keyboard… it happens too. My monitor stayed plugged into HDMI the whole time for this test.

Hi HP,

May I ask if this summary is correct: “The system needs an HDMI monitor plugged in when booting from SSD. Otherwise, it hits a GPU error during fb registration.”

Does this issue only happen since rel-28.2 DP? Have you tried it on rel-28.1?

Hi Wayne,

Well, I’ve reported 2 problems in this topic. They may have the same cause, but I don’t really know:

1. R28.2-DP fails to boot without HDMI Monitor plugged:
This happens only with R28.2-DP when the rootfs is on my SSD (plugged directly into the TX2 SATA port, no cable).
The error message from the GPU driver about the secure buffer allocation failure can also be seen in R28.2-DP when it works with the monitor plugged in.
Booting an R28.1 Linux image with an R28.1 rootfs on the same SSD without a monitor succeeds (note, though, that the early stages up to U-Boot are from R28.2-DP on MMC0).

2. Denver cores detected as version 0 instead of version 2:
I have only seen this on R28.2-DP (though I have not checked extensively on R28.1).
It happens randomly even in standard configuration on MMC0 (about once in 5 trials for me).
It seems this happens when tegradc nvdisplay registers frame buffer before gk20a issues the secure buffer allocation failure message. Looks like a race condition between fb / gpu.
Unless it is related to my TX2, you should be able to reproduce this, just rebooting and checking dmesg less than 10 times.
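For anyone trying to reproduce problem 2 over several reboots, here is a minimal Python sketch that checks a saved dmesg or serial log for the two symptoms described above: whether the "fb registered" message comes before the gk20a secure buffer message, and which version the armv8_pmu lines report for the Denver cores. The function name analyze_boot_log and the result keys are made up for this example, and the sample lines are copied from the log in the first post. It only looks at message text and timestamps, so it is a convenience check rather than a diagnosis.

```python
import re

# Matches "[    2.708748] message" with optional leading indentation.
TIMESTAMPED = re.compile(r"^\s*\[\s*(\d+\.\d+)\]\s*(.*)$")
# Matches the armv8_pmu lines that describe a Denver core.
DENVER = re.compile(r"armv8_pmu: \[(\d+)\] arch: .*\(Denver\), type: (\d+), ver: (\d+)")

# A few lines taken from the log posted earlier in this thread.
SAMPLE = """\
    [    2.708748] tegradc 15210000.nvdisplay: fb registered
    [    2.723390] gk20a 17000000.gp10b: failed to allocate secure buffer -12
    [    2.752357] armv8_pmu: [0] arch: AA64 PmuV3 ARM CORTEX-A57, type: 5, ver: 0
    [    2.752360] armv8_pmu: [1] arch: AA64 PmuV3 NVIDIA (Denver), type: 6, ver: 0
    [    2.752364] armv8_pmu: [2] arch: AA64 PmuV3 NVIDIA (Denver), type: 6, ver: 0
"""


def analyze_boot_log(log_text):
    """Return the fb/gk20a message ordering and the reported Denver versions."""
    fb_time = None
    gpu_time = None
    denver_versions = {}
    for line in log_text.splitlines():
        match = TIMESTAMPED.match(line)
        if not match:
            continue
        timestamp, message = float(match.group(1)), match.group(2)
        # Remember only the first occurrence of each message.
        if fb_time is None and "nvdisplay: fb registered" in message:
            fb_time = timestamp
        if gpu_time is None and "failed to allocate secure buffer" in message:
            gpu_time = timestamp
        denver = DENVER.search(message)
        if denver:
            denver_versions[int(denver.group(1))] = int(denver.group(3))
    fb_first = fb_time is not None and gpu_time is not None and fb_time < gpu_time
    return {"fb_before_gpu_msg": fb_first, "denver_versions": denver_versions}


if __name__ == "__main__":
    print(analyze_boot_log(SAMPLE))
```

To use it on a real boot, save the output of dmesg to a file after each reboot and pass the file contents to analyze_boot_log instead of SAMPLE.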

Looks like the first issue is easier to reproduce. I can try this first.

Hi Honey_Patouceul,

We tried your problem 1: the SSD can be booted with the monitor plugged in or unplugged on R28-DP/TX2.
After booting up without a monitor plugged in, we tried hot-plugging the HDMI cable, and it works too.

Test SSD: ADATA XS900 512GB

Hi Carolyuu,

Thanks for testing.

Do you see any errors on the serial console? As said before, in some cases it may reset after some time, and if the monitor has been plugged in before it resets, then it can boot successfully in my case.

Otherwise, the problem may be related either to my SSD (Kingston SUV400S) or my TX2.

Would you please send me the output of gdisk -l for your SSD? Mine has its first partition starting at sector 2048, while MMC0 has APP starting at 4096:

sudo gdisk -l /dev/sda
GPT fdisk (gdisk) version 1.0.1

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 234441648 sectors, 111.8 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 8E090E0A-B71B-4226-92D5-DF936ED1DE70
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 234441614
Partitions will be aligned on 2048-sector boundaries
Total free space is 2925 sectors (1.4 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048       134219775   64.0 GiB    0700  L4T
   2       134219776       150996991   8.0 GiB     8200  SWAP
   3       150996992       234440703   39.8 GiB    8300  LOCAL
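As a quick sanity check on the start sectors mentioned above, here is a small Python sketch that converts a start sector to a byte offset and tests it against the 4096-byte physical block size the kernel reported for sda. The function names are made up for this example, and it assumes 512-byte logical sectors on MMC0 as well, which the thread does not state.

```python
SECTOR_SIZE = 512   # logical sector size reported by gdisk for /dev/sda
PHYS_BLOCK = 4096   # physical block size reported by the kernel for sda


def start_offset_bytes(start_sector, sector_size=SECTOR_SIZE):
    """Byte offset of a partition that starts at the given logical sector."""
    return start_sector * sector_size


def is_aligned(start_sector, block=PHYS_BLOCK, sector_size=SECTOR_SIZE):
    """True if the partition start falls on a physical block boundary."""
    return start_offset_bytes(start_sector, sector_size) % block == 0


if __name__ == "__main__":
    # Start sectors quoted in the post: SSD partition 1 and the APP partition on MMC0.
    for name, start in (("SSD L4T", 2048), ("MMC0 APP", 4096)):
        offset_mib = start_offset_bytes(start) // (1024 * 1024)
        print(f"{name}: sector {start} = {offset_mib} MiB, aligned: {is_aligned(start)}")
```

Under that assumption, both starts (1 MiB and 2 MiB) fall on 4096-byte boundaries, so the different start sector by itself does not indicate a misaligned partition.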

Did you use the same boot args as my extlinux.conf entry in post #1?

Thanks

I am curious if lsmod of the two cases (working versus not) is the same…a mix of drivers may have an effect on it.

lsmod gives exactly the same output when booting with the rootfs on the SSD and a monitor plugged in as when booting with the rootfs on MMC0. For reference:

lsmod
Module                  Size  Used by
fuse                   82192  5
snd_usb_audio         168080  2
snd_usbmidi_lib        24102  1 snd_usb_audio
snd_hwdep               7566  1 snd_usb_audio
bcmdhd               7441995  0
pci_tegra              61290  0
bluedroid_pm           11195  0

I also checked that /lib/modules and /lib/firmware are exactly the same on both rootfs.

Considering that it succeeds in booting an R28.1 Image with the rootfs on my SATA SSD, I think we can rule out the SSD partition table.

It seems this bug happens only with R28.2-DP and under some conditions.

Are there several revisions of the TX2? If so, how can I tell which one I have? Same question for the carrier board.

I don’t know the details, but my TX2 was from the very first lot of TX2 dev kits NVIDIA had. There were some earlier kits which had WiFi issues which were traced back to mismatched firmware, so perhaps there is a difference…but I couldn’t tell you if there is any firmware related to GPU or SATA. I would be very curious to find out if there are any variations based on revision.

Just got a new Crucial MX500 250GB SSD (CT250MX500SSD1) and tried.
Exactly the same problem.

Although we cannot hit the kernel error, we do see “gk20a 17000000.gp10b: failed to allocate secure buffer -12” before fb is registered. But as I remember, that error always appears, even when not booting from SSD.

Yes, this trace is always there, even when it works fine.

Problem 2 is seen when this trace appears after the following trace:

tegradc 15210000.nvdisplay: fb registered

I would like to see the full dmesg. Could you share?

Here is the log from serial console when it fails to boot without monitor.
FailsWithoutMonitor.log (47.1 KB)

Flashed the latest L4T R28.2-DP2 and found it was able to boot without an HDMI monitor connected.

I have been able to boot the R28.2-DP1 kernel (with USB_ACM support, which R28.2-DP2 doesn’t have) and the L4T R28.2-DP1 rootfs without a monitor connected, using an extlinux.conf entry like this:

LABEL sata0p1-R28.2-DP1
      MENU LABEL sata0p1-R28.2-DP1 kernel
      LINUX /boot/Image-R28.2-DP1
      APPEND ${cbootargs} root=/dev/sda1 rw rootwait rootfstype=ext4

After some experiments, it seems to fail to boot without a monitor when tegra_fbmem2 and lut_mem2 are present in the args:

LABEL sata0p1-R28.2-DP1-oldArgs
      MENU LABEL sata0p1-R28.2-DP1-oldArgs kernel
      LINUX /boot/Image-R28.2-DP1
      APPEND root=/dev/sda1 rw rootwait console=ttyS0,115200n8 console=tty0 OS=l4t fbcon=map:0 net.ifnames=0 memtype=0 video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x03100000 nvdumper_reserved=0x2772e0000 gpt tegra_fbmem2=0x140000@0x969e9000 lut_mem2=0x2008@0x969e6000 tegraid=18.1.2.0.0 tegra_keep_boot_clocks maxcpus=6 boot.slot_suffix= boot.ratchetvalues=0.1.1 androidboot.serialno=0334916010240 bl_prof_dataptr=0x10000@0x277040000 sdhci_tegra.en_boot_part_access=1 root=/dev/sda1 rw rootwait rootfstype=ext4

while it succeeds with only these:

LABEL sata0p1-R28.2-DP1-newArgs
      MENU LABEL sata0p1-R28.2-DP1-newArgs kernel
      LINUX /boot/Image-R28.2-DP1
      APPEND root=/dev/sda1 rw rootwait console=ttyS0,115200n8 console=tty0 OS=l4t fbcon=map:0 net.ifnames=0 memtype=0 video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x03100000 nvdumper_reserved=0x2772e0000 gpt tegraid=18.1.2.0.0 tegra_keep_boot_clocks maxcpus=6 boot.slot_suffix= boot.ratchetvalues=0.1.1 androidboot.serialno=0334916010240 bl_prof_dataptr=0x10000@0x277040000 sdhci_tegra.en_boot_part_access=1 root=/dev/sda1 rw rootwait rootfstype=ext4
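The two long APPEND lines are hard to compare by eye, so here is a minimal Python sketch that splits each line into kernel arguments and lists what appears in only one of them. The helper names parse_append and diff_args are made up for this example; the two argument strings are copied from the oldArgs and newArgs entries above. It treats each whitespace-separated token as one argument, which is a simplification (quoted arguments containing spaces are not handled).

```python
# APPEND line of the entry that fails to boot without a monitor.
OLD_ARGS = (
    "APPEND root=/dev/sda1 rw rootwait console=ttyS0,115200n8 console=tty0 "
    "OS=l4t fbcon=map:0 net.ifnames=0 memtype=0 video=tegrafb "
    "no_console_suspend=1 earlycon=uart8250,mmio32,0x03100000 "
    "nvdumper_reserved=0x2772e0000 gpt "
    "tegra_fbmem2=0x140000@0x969e9000 lut_mem2=0x2008@0x969e6000 "
    "tegraid=18.1.2.0.0 tegra_keep_boot_clocks maxcpus=6 boot.slot_suffix= "
    "boot.ratchetvalues=0.1.1 androidboot.serialno=0334916010240 "
    "bl_prof_dataptr=0x10000@0x277040000 sdhci_tegra.en_boot_part_access=1 "
    "root=/dev/sda1 rw rootwait rootfstype=ext4"
)

# APPEND line of the entry that boots without a monitor.
NEW_ARGS = (
    "APPEND root=/dev/sda1 rw rootwait console=ttyS0,115200n8 console=tty0 "
    "OS=l4t fbcon=map:0 net.ifnames=0 memtype=0 video=tegrafb "
    "no_console_suspend=1 earlycon=uart8250,mmio32,0x03100000 "
    "nvdumper_reserved=0x2772e0000 gpt "
    "tegraid=18.1.2.0.0 tegra_keep_boot_clocks maxcpus=6 boot.slot_suffix= "
    "boot.ratchetvalues=0.1.1 androidboot.serialno=0334916010240 "
    "bl_prof_dataptr=0x10000@0x277040000 sdhci_tegra.en_boot_part_access=1 "
    "root=/dev/sda1 rw rootwait rootfstype=ext4"
)


def parse_append(line):
    """Split an APPEND line into its arguments, dropping the APPEND keyword."""
    tokens = line.split()
    if tokens and tokens[0] == "APPEND":
        tokens = tokens[1:]
    return tokens


def diff_args(old_line, new_line):
    """Return (args only in old_line, args only in new_line), in original order."""
    old_tokens = parse_append(old_line)
    new_tokens = parse_append(new_line)
    old_set, new_set = set(old_tokens), set(new_tokens)
    only_old = [tok for tok in old_tokens if tok not in new_set]
    only_new = [tok for tok in new_tokens if tok not in old_set]
    return only_old, only_new


if __name__ == "__main__":
    removed, added = diff_args(OLD_ARGS, NEW_ARGS)
    print("only in oldArgs:", removed)
    print("only in newArgs:", added)
```

For these two entries, the only arguments missing from newArgs are tegra_fbmem2=0x140000@0x969e9000 and lut_mem2=0x2008@0x969e6000.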

Why did you add tegra_fbmem2 to the kernel command line?