SATA Controller Marvell 9230 doesn't work on TX1

Hello,

I am using the Jetson TX1 Development Kit with a Marvell 9230 SATA controller on a PCIe extension card.
The kernel version is 3.10.96-l4t-r24.1.

PCIe registration information:

[    3.179452] pci 0000:01:00.0: [1b4b:9230] type 00 class 0x010601
[    3.179493] pci 0000:01:00.0: reg 10: [io  0x8000-0x8007]
[    3.179522] pci 0000:01:00.0: reg 14: [io  0x8040-0x8043]
[    3.179551] pci 0000:01:00.0: reg 18: [io  0x8100-0x8107]
[    3.179579] pci 0000:01:00.0: reg 1c: [io  0x8140-0x8143]
[    3.179607] pci 0000:01:00.0: reg 20: [io  0x800000-0x80001f]
[    3.179637] pci 0000:01:00.0: reg 24: [mem 0x00900000-0x009007ff]
[    3.179666] pci 0000:01:00.0: reg 30: [mem 0xd0000000-0xd000ffff pref]
[    3.179769] pci 0000:01:00.0: PME# supported from D3hot
[    3.217971] pci 0000:01:00.0: BAR 6: assigned [mem 0x20000000-0x2000ffff pref]
[    3.225184] pci 0000:01:00.0: BAR 5: assigned [mem 0x13000000-0x130007ff]
[    3.231963] pci 0000:01:00.0: BAR 4: assigned [io  0x1000-0x101f]
[    3.237989] pci 0000:01:00.0: BAR 0: assigned [io  0x1020-0x1027]
[    3.244070] pci 0000:01:00.0: BAR 2: assigned [io  0x1028-0x102f]
[    3.250161] pci 0000:01:00.0: BAR 1: assigned [io  0x1030-0x1033]
[    3.256187] pci 0000:01:00.0: BAR 3: assigned [io  0x1034-0x1037]
[    3.299798] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
[    4.135950] ahci 0000:01:00.0: version 3.0
[    4.136178] ahci 0000:01:00.0: controller can do FBS, turning on CAP_FBS
[    4.499892] ahci 0000:01:00.0: AHCI 0001.0200 32 slots 8 ports 6 Gbps 0xff impl SATA mode
[    4.510320] ahci 0000:01:00.0: flags: 64bit ncq fbs pio

I get the following error messages from the kernel:

[   10.009859] ata8.00: qc timeout (cmd 0xa1)
[   10.014619] ata1.00: qc timeout (cmd 0xec)
[   10.029804] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[   10.036593] ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[   10.399814] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   10.406678] mc-err: (0) csw_afiw: EMEM address decode error
[   10.412919] mc-err:   status = 0x20010031; addr = 0x7e4bb000
[   10.412965] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[   10.426078] mc-err:   secure: no, access-type: write, SMMU fault: none
[   20.409793] ata8.00: qc timeout (cmd 0xa1)
[   20.414565] ata1.00: qc timeout (cmd 0xec)
[   20.429802] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[   20.436553] ata1: limiting SATA link speed to 1.5 Gbps
[   20.442368] ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[   20.449128] ata8: limiting SATA link speed to 1.5 Gbps
[   20.809808] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
[   20.809827] mc-err: (0) csw_afiw: EMEM address decode error
[   20.809830] mc-err:   status = 0x20010031; addr = 0x7e4bb000
[   20.809833] mc-err:   secure: no, access-type: write, SMMU fault: none
[   20.836548] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   20.843625] mc-err: (0) csw_afiw: EMEM address decode error
[   20.849934] mc-err:   status = 0x20010031; addr = 0x7e4bb000
[   20.856307] mc-err:   secure: no, access-type: write, SMMU fault: none
[   50.829849] ata1.00: qc timeout (cmd 0xec)
[   50.839873] ata8.00: qc timeout (cmd 0xa1)
[   50.849799] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[   50.859806] ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[   51.219806] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
[   51.219824] mc-err: (0) csw_afiw: EMEM address decode error
[   51.219827] mc-err:   status = 0x20010031; addr = 0x7e4bb000
[   51.219829] mc-err:   secure: no, access-type: write, SMMU fault: none
[   51.246697] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

For testing, I use an x86 PC with the same PCIe extension card and kernel 3.10.69. On the x86 architecture, the extension card is fully working.

How can I get this SATA card working?
What do these errors mean?

Best regards,
Sebastian
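
For reference, the libata messages in the log above can be decoded mechanically. The Python sketch below is not part of any NVIDIA or kernel tooling, and its function names are made up for illustration. The command names come from the ATA command set (0xec is IDENTIFY DEVICE, 0xa1 is IDENTIFY PACKET DEVICE), the error bits mirror the AC_ERR_* flags in the kernel's include/linux/libata.h, and the SStatus layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) follows the SATA specification. The kernel prints SStatus in hexadecimal.

```python
import re

# Opcodes that appear in the log, named as in the ATA command set.
ATA_COMMANDS = {0xEC: "IDENTIFY DEVICE", 0xA1: "IDENTIFY PACKET DEVICE"}

# Bit positions of the AC_ERR_* flags from include/linux/libata.h.
AC_ERR_BITS = ["DEV", "HSM", "TIMEOUT", "MEDIA", "ATA_BUS", "HOST_BUS",
               "SYSTEM", "INVALID", "OTHER", "NODEV_HINT", "NCQ"]

# SPD field of SStatus: negotiated SATA generation.
SATA_SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}


def decode_err_mask(mask):
    """Return the names of all error flags set in a libata err_mask."""
    return [name for bit, name in enumerate(AC_ERR_BITS) if mask & (1 << bit)]


def decode_sstatus(value):
    """Split an SStatus register value into its DET, SPD and IPM fields."""
    return {"det": value & 0xF, "spd": (value >> 4) & 0xF, "ipm": (value >> 8) & 0xF}


def explain(line):
    """Give a one-line reading of a libata log line, or None if unrecognized."""
    match = re.search(r"qc timeout \(cmd 0x([0-9a-f]+)\)", line)
    if match:
        cmd = int(match.group(1), 16)
        return "timeout waiting for " + ATA_COMMANDS.get(cmd, "command 0x%02x" % cmd)
    match = re.search(r"err_mask=0x([0-9a-f]+)", line)
    if match:
        return "error flags: " + ", ".join(decode_err_mask(int(match.group(1), 16)))
    match = re.search(r"SStatus ([0-9A-Fa-f]+)", line)
    if match:
        fields = decode_sstatus(int(match.group(1), 16))
        speed = SATA_SPEEDS.get(fields["spd"], "unknown speed")
        state = "device present, PHY up" if fields["det"] == 3 else "no established link"
        return "%s at %s" % (state, speed)
    return None


for line in ["ata1.00: qc timeout (cmd 0xec)",
             "ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)",
             "ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)"]:
    print(explain(line))
```

Read this way, the log says that the SATA links come up (DET=3) but the IDENTIFY commands never complete (err_mask=0x4 is a timeout). That pattern would fit a controller whose completions or interrupts never reach the driver, although the log alone does not prove it.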

For both the Jetson and the working x86_64 host, it would help to post the relevant “sudo lspci -vvv” output for that card (a side-by-side comparison of the two would be useful). This might narrow down whether the issue comes from the PCIe drivers or from the SATA drivers. Note that plain “lspci” lists the slot on the left, and you can limit the lspci output to just that one device with the “-s” parameter, e.g. “sudo lspci -vvv -s 01:00.0” for the “01:00.0” slot.
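
One way to do such a comparison is to save both lspci outputs and compare them field by field. The following Python sketch is only an illustration: the function names are invented here, and the parsing assumes the “Key: value” layout that lspci -vvv prints. Continuation lines that do not start with a key are skipped, and repeated keys (such as “Capabilities”) are collected in order.

```python
import re


def lspci_fields(text):
    """Map each 'Key: value' line of lspci -vvv output to a list of values."""
    fields = {}
    for raw in text.splitlines():
        line = " ".join(raw.split())      # collapse tabs and runs of spaces
        match = re.match(r"([A-Za-z][A-Za-z0-9 ]*?):\s*(.*)", line)
        if match:
            fields.setdefault(match.group(1), []).append(match.group(2))
    return fields


def lspci_diff(text_a, text_b, ignore=()):
    """Return {key: (values_a, values_b)} for every key whose values differ."""
    a, b = lspci_fields(text_a), lspci_fields(text_b)
    diff = {}
    for key in sorted(set(a) | set(b)):
        if key not in ignore and a.get(key) != b.get(key):
            diff[key] = (a.get(key), b.get(key))
    return diff


# Two short excerpts in the style of lspci -vvv, one per machine.
jetson = """
        Interrupt: pin A routed to IRQ 547
        LnkSta: Speed 5GT/s, Width x2, TrErr- Train- SlotClk+
        Kernel driver in use: ahci
"""
desktop = """
\tInterrupt: pin A routed to IRQ 56
\t\tLnkSta:\tSpeed 2.5GT/s, Width x1, TrErr- Train- SlotClk+
\tKernel driver in use: ahci
"""
for key, (left, right) in lspci_diff(jetson, desktop, ignore=("Interrupt",)).items():
    print(key, left, right)
```

Fields that are expected to differ between machines (IRQ numbers, BAR addresses, the MSI address) can be passed in ignore so that the remaining differences stand out.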

lspci output

TX1:

ubuntu@zfas:~$ sudo lspci -vvv -s 01:00.0
01:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller (rev 11) (prog-if 01 [AHCI 1.0])
        Subsystem: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 547
        Region 0: I/O ports at 1020 
        Region 1: I/O ports at 1030 
        Region 2: I/O ports at 1028 
        Region 3: I/O ports at 1034 
        Region 4: I/O ports at 1000 
        Region 5: Memory at 13000000 (32-bit, non-prefetchable) 
        Expansion ROM at 20000000 [disabled] 
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: 7e4ee000  Data: 0000
        Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM L0s Enabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x2, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [e0] SATA HBA v0.0 BAR4 Offset=00000004
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout+ NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
        Kernel driver in use: ahci

x86:

sudo lspci -vvv -s 06:00.0
06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller (rev 11) (prog-if 01 [AHCI 1.0])
	Subsystem: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 32 bytes
	Interrupt: pin A routed to IRQ 56
	Region 0: I/O ports at dc00 
	Region 1: I/O ports at d880 
	Region 2: I/O ports at d800 
	Region 3: I/O ports at d480 
	Region 4: I/O ports at d400 
	Region 5: Memory at fbeff800 (32-bit, non-prefetchable) 
	Expansion ROM at fbee0000 [disabled] 
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0f00c  Data: 41e2
	Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Latency L0 <512ns, L1 <64us
			ClockPM- Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [e0] SATA HBA v0.0 BAR4 Offset=00000004
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
	Kernel driver in use: ahci

Do you have any idea what the problem could be?

The PCIe side of it looks like it is working correctly. The desktop had to throttle back to gen. 1 PCIe speed (LnkSta shows 2.5GT/s, x1), while the Jetson is running at full gen. 2 speed (5GT/s, x2). Both use the ahci driver.
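
To put numbers on those link speeds, the LnkSta lines from the two lspci dumps can be converted into approximate bandwidth. This is a rough Python sketch; the function name and the table are made up for illustration, and the per-lane figures are the usual values after line encoding (8b/10b for gen. 1 and gen. 2), before any protocol overhead.

```python
import re

# Transfer rate in GT/s -> (PCIe generation, usable MB/s per lane).
# Gen. 1 and gen. 2 use 8b/10b encoding, so 2.5 GT/s carries 250 MB/s.
PER_LANE = {"2.5": (1, 250), "5": (2, 500)}


def link_bandwidth(lnksta):
    """Return (generation, lane count, MB/s) for an lspci LnkSta line."""
    match = re.search(r"Speed ([0-9.]+)GT/s, Width x(\d+)", lnksta)
    if match is None:
        raise ValueError("not a LnkSta line: %r" % lnksta)
    generation, per_lane = PER_LANE[match.group(1)]
    lanes = int(match.group(2))
    return generation, lanes, per_lane * lanes


tx1 = "LnkSta: Speed 5GT/s, Width x2, TrErr- Train- SlotClk+ DLActive-"
x86 = "LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive-"
print(link_bandwidth(tx1))   # (2, 2, 1000)
print(link_bandwidth(x86))   # (1, 1, 250)
```

Even the slower x86 link (about 250 MB/s) is enough for a drive to be identified, so link speed by itself does not explain the failure on the Jetson.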

So if the issue is in software, odds are that it comes from a difference between the desktop’s ahci driver and the Jetson’s ahci driver (the desktop probably uses a 4.x kernel, the Jetson uses the 3.x kernel).

It could also be hardware, especially cables. Do you use the same exact hard drive and SATA cables for both Jetson and desktop?

The TX1 uses this kernel:

root@zfas:~# uname -a
Linux zfas 3.10.96-l4t-r24.1+gede73de #1 SMP PREEMPT Tue Oct 4 15:35:58 CEST 2016 aarch64 aarch64 aarch64 GNU/Linux

The x86 machine uses this kernel:

su@su-Aspire-M5811 ~ $ uname -a
Linux su-Aspire-M5811 3.10.69 #2 SMP Wed Oct 5 13:43:50 CEST 2016 x86_64 x86_64 x86_64 GNU/Linux

I use the same card. I get the error messages even without a device attached to the card.

Could it be that the firmware of the SATA controller has to be managed by the BIOS?

Embedded systems do not have a BIOS. Everything traditionally done by a desktop BIOS is done by the boot loader until the kernel takes over. If the SATA controller is not used before or while the kernel is loaded into memory, the boot loader probably won’t do anything special for it. The one thing that might happen for embedded parts of the system is that the device tree blob (dtb) sets up some controller configuration (register values) before the kernel uses the device, in cases where the register settings are meant to be abstracted out of the kernel source itself (device-specific values inside the kernel code for an architecture are generally discouraged). After the kernel takes over, it is possible that the “/lib/firmware” files contain something to be uploaded into the controller, but this isn’t typical for a SATA controller (wireless devices almost always do this).

That said, the boot loader and dtb will contain Jetson-specific SATA setup only for the integrated controller. Optional PCIe devices won’t be set up during the boot loader stages. Should drivers be needed during boot, the initrd would contain them, and possibly u-boot would be modified.

There have been issues before with DMA being incorrect in drivers that were written for 32-bit armhf and then ported to 64-bit ARMv8-A. This would be consistent with the I/O error you are seeing while PCIe itself appears to be working correctly. I’m not sure how to test this, but I think the changes needed to go from 32-bit to 64-bit, especially anything involving DMA, would be the place to start.

Does anyone here know what he can check to see whether these SATA errors are related to porting drivers from 32-bit to 64-bit?

Please try with the following patch.

diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
index 6c070a9..6c06a76 100644
--- a/drivers/pci/host/pci-tegra.c
+++ b/drivers/pci/host/pci-tegra.c
@@ -2791,7 +2791,7 @@ static int tegra_pcie_enable_msi(struct tegra_pcie *pcie, bool no_init)
        }

        /* setup AFI/FPCI range */
-       msi->pages = __get_free_pages(GFP_KERNEL, 0);
+       msi->pages = __get_free_pages(GFP_DMA32, 0);
    }
    base = virt_to_phys((void *)msi->pages);
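
One plausible reading of why this change could matter, offered as an assumption rather than as a statement from the patch author: the lspci output shows the card's MSI capability as “64bit-”, so the card can only be given a 32-bit MSI target address, and the address in the mc-err lines (0x7e4bb000) lies below 0x80000000. If the TX1's DRAM window starts at 0x80000000 and the page returned for GFP_KERNEL sits above 4 GiB, keeping only the low 32 bits of its physical address yields an address outside DRAM. GFP_DMA32 asks the allocator for a page that is addressable with 32 bits. The Python sketch below only illustrates the arithmetic; the DRAM layout and the example page addresses are assumptions, and the function names are invented.

```python
DRAM_BASE = 0x8000_0000       # assumption: start of the TX1 DRAM window
DRAM_SIZE = 4 * 1024 ** 3     # assumption: 4 GiB of DRAM


def in_dram(addr):
    """True if addr falls inside the assumed DRAM window."""
    return DRAM_BASE <= addr < DRAM_BASE + DRAM_SIZE


def msi_target(phys_addr, msi_64bit):
    """Address the device writes to: truncated if it lacks 64-bit MSI."""
    return phys_addr if msi_64bit else phys_addr & 0xFFFF_FFFF


high_page = 0x1_7E4B_B000     # hypothetical page above 4 GiB
low_page = 0xF000_0000        # hypothetical page below 4 GiB

for page in (high_page, low_page):
    target = msi_target(page, msi_64bit=False)
    print(hex(page), "->", hex(target), "in DRAM:", in_dram(target))
```

Under these assumptions the high page is valid DRAM, but its truncated form 0x7e4bb000 is not, which would match a write from the PCIe side being rejected as an address decode error. A page below 4 GiB survives the truncation unchanged.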

Hey OP,
Did the patch vidyas suggested work for you?
Because it doesn’t work for me…

Hello vidyas,

I tried your patch and it works, thanks.