Jetson TX1 strange network performance behaviour (still)
Hi!

I have run into a lot of very strange network performance behaviour with the TX1. To verify that it is not my personal setup, I did the same with a TX1 on the TX1 development board with a vanilla JetPack 3.1 installation on it.

When I use the internal r8152 network card and iperf3 to measure the performance, I get around 300 Mbits/sec when sending to the device but the full 900 Mbits/sec or so when receiving from the device.

On the TX1:
nvidia@tegra-ubuntu:~$ iperf3 -s -p 12345


On any other working machine connected to the same GigE switch:
Receiving from the TX1:
$ iperf3 -c tegra-ubuntu.local -p 12345 -t 60 -i 10 -b 1G -R
Connecting to host tegra-ubuntu.local, port 12345
Reverse mode, remote host tegra-ubuntu.local is sending
[ 7] local fe80::10b9:e3af:1f20:c50a port 52937 connected to fe80::f236:19c3:65f4:d4c0 port 12345
[ ID] Interval Transfer Bitrate
[ 7] 0.00-10.00 sec 1.06 GBytes 911 Mbits/sec
[ 7] 10.00-20.00 sec 1.08 GBytes 928 Mbits/sec
[ 7] 20.00-30.00 sec 1.08 GBytes 928 Mbits/sec
[ 7] 30.00-40.00 sec 1.08 GBytes 928 Mbits/sec
[ 7] 40.00-50.00 sec 1.08 GBytes 928 Mbits/sec
[ 7] 50.00-60.00 sec 1.08 GBytes 928 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-60.00 sec 6.46 GBytes 925 Mbits/sec 0 sender
[ 7] 0.00-60.00 sec 6.46 GBytes 925 Mbits/sec receiver

iperf Done.


Sending to the TX1:
$ iperf3 -c tegra-ubuntu.local -p 12345 -t 60 -i 10 -b 1G
Connecting to host tegra-ubuntu.local, port 12345
[ 7] local fe80::10b9:e3af:1f20:c50a port 52944 connected to fe80::f236:19c3:65f4:d4c0 port 12345
[ ID] Interval Transfer Bitrate
[ 7] 0.00-10.00 sec 399 MBytes 335 Mbits/sec
[ 7] 10.00-20.00 sec 431 MBytes 361 Mbits/sec
[ 7] 20.00-30.00 sec 478 MBytes 401 Mbits/sec
[ 7] 30.00-40.00 sec 408 MBytes 342 Mbits/sec
[ 7] 40.00-50.00 sec 399 MBytes 335 Mbits/sec
[ 7] 50.00-60.00 sec 517 MBytes 434 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 7] 0.00-60.00 sec 2.57 GBytes 368 Mbits/sec sender
[ 7] 0.00-60.00 sec 2.57 GBytes 368 Mbits/sec receiver

iperf Done.
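
The summary lines of both runs can be sanity-checked by recomputing the bitrate from the transferred volume. This is a minimal Python sketch, assuming iperf3 reports GBytes in binary units (2^30 bytes) and Mbits/sec in decimal units (10^6 bits); the function name is only illustrative.

```python
def mbits_per_sec(gbytes: float, seconds: float) -> float:
    """Convert an iperf3 'Transfer' figure in GBytes into Mbits/sec.

    Assumption: GBytes means 2**30 bytes and Mbits means 10**6 bits.
    """
    bits = gbytes * 2**30 * 8
    return bits / seconds / 1e6


# Totals taken from the two 60 s runs quoted in this post.
receiving_from_tx1 = mbits_per_sec(6.46, 60)  # reverse mode (-R)
sending_to_tx1 = mbits_per_sec(2.57, 60)

print(f"receiving from TX1: {receiving_from_tx1:.0f} Mbits/sec")
print(f"sending to TX1:     {sending_to_tx1:.0f} Mbits/sec")
print(f"sending/receiving:  {sending_to_tx1 / receiving_from_tx1:.2f}")
```

Under those unit assumptions the recomputed values round to the 925 and 368 Mbits/sec that iperf3 printed, so the asymmetry is in the transfer itself and not a reporting quirk.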


I found this thread https://devtalk.nvidia.com/default/topic/979635/jetson-tx1/ethernet-speed-increases-when-micro-usb-2-0-connector-is-connected/ which describes similar problems, but the apparent fix at the end of that thread is no longer applicable due to changes in the kernel over the last year.

Solid network performance is very important for our company's application, so I need reliable speeds above 900 Mbits/sec.

Has anyone had the same problem and solved it for a JetPack 3.1 / L4T 28.1 installation?

#1
Posted 02/14/2018 07:20 PM   
This probably won't matter, but you may want to first make sure that "ifconfig" reports the same MTU on both sides (each direction can have a different MTU). Also, if your traffic goes through a router (or an expensive managed switch), it might behave differently depending on direction. Do make sure the traffic goes through a switch and not a router (or perhaps even a direct connection; a crossover cable is needed if neither end auto-switches on a non-crossover cable, and of course one end would then also have to be a DHCP server unless static IP addresses are used).
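
The MTU comparison can be done by eye, but a small script makes it repeatable. This is a rough Python sketch, assuming the older net-tools ifconfig layout used in this thread ("Link encap:" on the first line of each interface, "MTU:1500" further down); newer ifconfig versions print the MTU differently, and the function name is made up for illustration.

```python
import re
from typing import Dict


def mtu_by_interface(ifconfig_text: str) -> Dict[str, int]:
    """Map interface name -> MTU from old-style net-tools ifconfig output."""
    mtus: Dict[str, int] = {}
    current = None
    for line in ifconfig_text.splitlines():
        # In this layout, the line naming an interface contains "Link encap:".
        if "Link encap:" in line:
            current = line.split()[0]
        match = re.search(r"MTU:(\d+)", line)
        if match and current is not None:
            mtus[current] = int(match.group(1))
    return mtus


# Shortened sample in the same layout as the ifconfig listings in this thread.
sample = """\
eth0 Link encap:Ethernet HWaddr 00:04:4b:5a:ec:56
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

lo Link encap:Local Loopback
UP LOOPBACK RUNNING MTU:65536 Metric:1
"""
print(mtu_by_interface(sample))
```

Running this over the ifconfig output of both hosts and comparing the "eth0" entries shows at a glance whether the two ends agree.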

#2
Posted 02/14/2018 08:17 PM   
Thanks for the tips, but bogus MTU settings were my very first idea as well. The switch really shouldn't be the problem, since the Jetson and the "other" computer are basically the only devices on that particular switch, apart from an uplink to the DHCP server. So I was very much with you on the first level of the error hunt!

Interestingly, I have no errors in the ifconfig outputs, but if I use an iperf3 version which reports retries, they are very high (three digits) for the TX1-receiving case.

I currently have no access to the systems, but I can post some examples tomorrow.
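
For collecting the retry counts systematically, iperf3 can write its results as JSON with the -J flag. The following Python sketch pulls out the totals; it assumes a TCP test whose JSON contains "end.sum_sent" and "end.sum_received" objects, and the key names may differ between iperf3 versions. The sample data is hypothetical, not output from these machines.

```python
import json


def summarize(iperf3_json: str) -> dict:
    """Extract throughput and retransmit totals from `iperf3 -J` output.

    Assumption: TCP test with end.sum_sent / end.sum_received present.
    """
    end = json.loads(iperf3_json)["end"]
    sent, received = end["sum_sent"], end["sum_received"]
    return {
        "sent_mbps": sent["bits_per_second"] / 1e6,
        "received_mbps": received["bits_per_second"] / 1e6,
        # Missing when the sending side cannot report TCP retransmits.
        "retransmits": sent.get("retransmits"),
    }


# Hypothetical, heavily trimmed sample for illustration only.
sample = json.dumps({
    "end": {
        "sum_sent": {"bits_per_second": 368.0e6, "retransmits": 250},
        "sum_received": {"bits_per_second": 367.5e6},
    }
})
print(summarize(sample))
```

A run such as `iperf3 -c tegra-ubuntu.local -p 12345 -t 60 -J > run.json` would produce input for this function, once with and once without -R.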

#3
Posted 02/14/2018 09:47 PM   
An ifconfig listing for both systems after a session where there were errors would be enlightening. Before you start the traffic which has the errors you might want to run "dmesg --follow" so you can see if the kernel announces anything during that same session.
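
Instead of comparing two ifconfig listings by hand, the same counters can be snapshotted before and after a test run and subtracted. This is a minimal Python sketch, assuming a Linux system that exposes per-interface counters as files under /sys/class/net/<iface>/statistics/; the helper names and the sample numbers are hypothetical.

```python
from pathlib import Path
from typing import Dict

# Counter files expected under /sys/class/net/<iface>/statistics/ on Linux.
COUNTERS = ("rx_packets", "tx_packets", "rx_errors", "tx_errors",
            "rx_dropped", "tx_dropped")


def read_counters(iface: str, root: str = "/sys/class/net") -> Dict[str, int]:
    """Read the current values of COUNTERS for one interface."""
    stats = Path(root) / iface / "statistics"
    return {name: int((stats / name).read_text()) for name in COUNTERS}


def delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Per-counter difference between two snapshots."""
    return {name: after[name] - before[name] for name in before}


# Hypothetical snapshots, only to show the shape of the result.
before = {"rx_packets": 4000, "tx_packets": 300, "rx_errors": 0,
          "tx_errors": 0, "rx_dropped": 0, "tx_dropped": 0}
after = {"rx_packets": 5400000, "tx_packets": 2100000, "rx_errors": 0,
         "tx_errors": 0, "rx_dropped": 12, "tx_dropped": 0}
print(delta(before, after))
```

On the real systems, `read_counters("eth0")` would be called once before starting iperf3 and once after it finishes, on both ends, with "dmesg --follow" running alongside.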

#4
Posted 02/14/2018 09:50 PM   
Today I did a full range of tests on the hardware available to me, which at the moment is:
  • TX1 on an NVIDIA development board running a clean JetPack
  • TX1 on an Auvidea J120 running a custom 4.4.38 kernel and an Ubuntu 16.04 base system
  • TX2 on an Auvidea J140 running a custom 4.4.38 kernel and an Ubuntu 16.04 base system
  • Various PCs running Ubuntu and one MacBook Pro


During all the tests the computers involved were connected to the same gigabit ethernet switch with only infrastructure (DHCP server) coming in on one port. I also tested 3 different switches:

  1. TP Link TL-SG105E (5 Port GigE)
  2. CyberData 011236A Embedded (3 Port GigE)
  3. Netgear ProSafe GS108 (8 Port GigE)


The bandwidth to and from the TX1 (no big difference between the Nvidia Dev board and the auvidea J120) varies but never reaches the expected 900MBits/sec in both directions.

Interestingly there were no kernel messages related to the network whatsoever.

Two good examples are the runs between the two TX1s and between the TX1 and the TX2. In the two outputs below, the server was always running on the TX1 on the Nvidia development board (tegra-ubuntu) and the client on the other side.

At the moment I am a bit puzzled, since the only hardware option left is the cables. I ordered a set of high-end Cat 6a cables, which will hopefully arrive before the weekend, but I really think there is something else wrong here, since the TX2 behaves pretty well on the same cable and port on which the TX1s misbehave. I have the feeling that something is wrong somewhere along the USB-to-Ethernet conversion. Otherwise I can't explain the huge number of retries either.

Any ideas from anyone?!?

Connected to TX1 on Auvidea J120:
TX1J120 before:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:5a:ec:56
inet addr:192.168.1.29 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::204:4bff:fe5a:ec56/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:4138 errors:0 dropped:0 overruns:0 frame:0
TX packets:355 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:690435 (690.4 KB) TX bytes:117975 (117.9 KB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:162 errors:0 dropped:0 overruns:0 frame:0
TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:11938 (11.9 KB) TX bytes:11938 (11.9 KB)

TX1DEV before:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:84
inet addr:192.168.1.26 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::68b7:a7a3:435a:38fd/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:13023348 errors:0 dropped:0 overruns:0 frame:0
TX packets:4775656 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:18478442603 (18.4 GB) TX bytes:8620983329 (8.6 GB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:234 errors:0 dropped:0 overruns:0 frame:0
TX packets:234 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:17815 (17.8 KB) TX bytes:17815 (17.8 KB)

wlan0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:82
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

Results:

$ iperf3 -c tegra-ubuntu -p 12345 -i 10 -t 60 -b 1G
Connecting to host tegra-ubuntu, port 12345
[ 4] local 192.168.1.29 port 41176 connected to 192.168.1.26 port 12345
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-10.00 sec 874 MBytes 733 Mbits/sec 297 215 KBytes
[ 4] 10.00-20.00 sec 887 MBytes 744 Mbits/sec 211 283 KBytes
[ 4] 20.00-30.00 sec 882 MBytes 740 Mbits/sec 267 262 KBytes
[ 4] 30.00-40.00 sec 875 MBytes 734 Mbits/sec 258 249 KBytes
[ 4] 40.00-50.00 sec 877 MBytes 736 Mbits/sec 278 235 KBytes
[ 4] 50.00-60.00 sec 875 MBytes 734 Mbits/sec 300 256 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-60.00 sec 5.15 GBytes 737 Mbits/sec 1611 sender
[ 4] 0.00-60.00 sec 5.14 GBytes 737 Mbits/sec receiver

iperf Done.
$ iperf3 -c tegra-ubuntu -p 12345 -i 10 -t 60 -b 1G -R
Connecting to host tegra-ubuntu, port 12345
Reverse mode, remote host tegra-ubuntu is sending
[ 4] local 192.168.1.29 port 41180 connected to 192.168.1.26 port 12345
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 799 MBytes 670 Mbits/sec
[ 4] 10.00-20.00 sec 806 MBytes 677 Mbits/sec
[ 4] 20.00-30.00 sec 814 MBytes 683 Mbits/sec
[ 4] 30.00-40.00 sec 822 MBytes 689 Mbits/sec
[ 4] 40.00-50.00 sec 821 MBytes 688 Mbits/sec
[ 4] 50.00-60.00 sec 821 MBytes 689 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-60.00 sec 4.77 GBytes 683 Mbits/sec 5294 sender
[ 4] 0.00-60.00 sec 4.77 GBytes 683 Mbits/sec receiver

iperf Done.

TX1J120 after:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:84
inet addr:192.168.1.26 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::68b7:a7a3:435a:38fd/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:18627771 errors:0 dropped:0 overruns:0 frame:0
TX packets:7065839 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:24373140690 (24.3 GB) TX bytes:13900665875 (13.9 GB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:234 errors:0 dropped:0 overruns:0 frame:0
TX packets:234 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:17815 (17.8 KB) TX bytes:17815 (17.8 KB)

wlan0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:82
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)


TX1DEV After:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:5a:ec:56
inet addr:192.168.1.29 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::204:4bff:fe5a:ec56/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:5458206 errors:0 dropped:0 overruns:0 frame:0
TX packets:2169169 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:5481488008 (5.4 GB) TX bytes:5670394266 (5.6 GB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:162 errors:0 dropped:0 overruns:0 frame:0
TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:11938 (11.9 KB) TX bytes:11938 (11.9 KB)


Connected to TX2 on Auvidea J140 (realtek port):
TX2J140 before:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:8d:46:88
inet addr:192.168.1.37 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::204:4bff:fe8d:4688/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3662 errors:0 dropped:0 overruns:0 frame:0
TX packets:845 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:723092 (723.0 KB) TX bytes:171853 (171.8 KB)
Interrupt:42

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:162 errors:0 dropped:0 overruns:0 frame:0
TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:11938 (11.9 KB) TX bytes:11938 (11.9 KB)

TX1DEV before:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:84
inet addr:192.168.1.26 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::68b7:a7a3:435a:38fd/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:8955188 errors:0 dropped:0 overruns:0 frame:0
TX packets:2601581 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:13159965433 (13.1 GB) TX bytes:1603308787 (1.6 GB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:234 errors:0 dropped:0 overruns:0 frame:0
TX packets:234 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:17815 (17.8 KB) TX bytes:17815 (17.8 KB)

wlan0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:82
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

Result:

$ iperf3 -c tegra-ubuntu -p 12345 -i 10 -t 60 -b 1G
Connecting to host tegra-ubuntu, port 12345
[ 4] local 192.168.1.37 port 40746 connected to 192.168.1.26 port 12345
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-10.00 sec 806 MBytes 676 Mbits/sec 538 253 KBytes
[ 4] 10.00-20.00 sec 811 MBytes 681 Mbits/sec 587 276 KBytes
[ 4] 20.00-30.00 sec 804 MBytes 675 Mbits/sec 474 284 KBytes
[ 4] 30.00-40.00 sec 817 MBytes 685 Mbits/sec 472 270 KBytes
[ 4] 40.00-50.00 sec 788 MBytes 661 Mbits/sec 458 264 KBytes
[ 4] 50.00-60.00 sec 791 MBytes 663 Mbits/sec 505 281 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-60.00 sec 4.70 GBytes 673 Mbits/sec 3034 sender
[ 4] 0.00-60.00 sec 4.70 GBytes 673 Mbits/sec receiver

iperf Done.
$ iperf3 -c tegra-ubuntu -p 12345 -i 10 -t 60 -b 1G -R
Connecting to host tegra-ubuntu, port 12345
Reverse mode, remote host tegra-ubuntu is sending
[ 4] local 192.168.1.37 port 40750 connected to 192.168.1.26 port 12345
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 1.05 GBytes 903 Mbits/sec
[ 4] 10.00-20.00 sec 1.06 GBytes 908 Mbits/sec
[ 4] 20.00-30.00 sec 1.04 GBytes 896 Mbits/sec
[ 4] 30.00-40.00 sec 1.07 GBytes 918 Mbits/sec
[ 4] 40.00-50.00 sec 1.10 GBytes 941 Mbits/sec
[ 4] 50.00-60.00 sec 1.08 GBytes 932 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-60.00 sec 6.40 GBytes 917 Mbits/sec 360 sender
[ 4] 0.00-60.00 sec 6.40 GBytes 916 Mbits/sec receiver

iperf Done.

TX2J140 after:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:8d:46:88
inet addr:192.168.1.37 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::204:4bff:fe8d:4688/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6491798 errors:0 dropped:0 overruns:0 frame:0
TX packets:691697 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:7302589896 (7.3 GB) TX bytes:10984071550 (10.9 GB)
Interrupt:42

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:162 errors:0 dropped:0 overruns:0 frame:0
TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:11938 (11.9 KB) TX bytes:11938 (11.9 KB)

TX1DEV after:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:84
inet addr:192.168.1.26 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::68b7:a7a3:435a:38fd/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:13023329 errors:0 dropped:0 overruns:0 frame:0
TX packets:4775648 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:18478440579 (18.4 GB) TX bytes:8620981145 (8.6 GB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:234 errors:0 dropped:0 overruns:0 frame:0
TX packets:234 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:17815 (17.8 KB) TX bytes:17815 (17.8 KB)

wlan0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:82
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)



dmesg output

(...)
[ 6.710355] systemd[1]: Created slice User and Session Slice.
[ 6.718499] systemd[1]: Listening on Syslog Socket.
[ 6.729052] systemd[1]: Listening on udev Control Socket.
[ 6.736717] systemd[1]: Reached target User and Group Name Lookups.
[ 6.745338] systemd[1]: Listening on Journal Audit Socket.
[ 6.753056] systemd[1]: Listening on udev Kernel Socket.
[ 6.760560] systemd[1]: Listening on Journal Socket (/dev/log).
[ 6.768643] systemd[1]: Listening on LVM2 poll daemon socket.
[ 6.776543] systemd[1]: Created slice System Slice.
[ 6.783477] systemd[1]: Reached target Slices.
[ 6.789878] systemd[1]: Reached target Swap.
[ 6.796030] systemd[1]: Reached target Encrypted Volumes.
[ 6.803360] systemd[1]: Listening on /dev/initctl Compatibility Named Pipe.
[ 6.812234] systemd[1]: Listening on LVM2 metadata daemon socket.
[ 6.820318] systemd[1]: Created slice system-serial\x2dgetty.slice.
[ 6.832016] systemd[1]: Listening on Journal Socket.
[ 6.839530] systemd[1]: Started Braille Device Support.
[ 6.847829] systemd[1]: Starting Create list of required static device nodes for the current kernel...
[ 6.853306] tegra-pcie 1003000.pcie-controller: link 0 down, retrying
[ 6.868239] systemd[1]: Starting Journal Service...
[ 6.879745] systemd[1]: Mounting Debug File System...
[ 6.889222] systemd[1]: Starting Remount Root and Kernel File Systems...
[ 6.898194] systemd[1]: Reached target Remote File Systems (Pre).
[ 6.906746] systemd[1]: Reached target Remote File Systems.
[ 6.914497] systemd[1]: Listening on Device-mapper event daemon FIFOs.
[ 6.923956] systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
[ 6.941051] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[ 6.952485] systemd[1]: Starting Set console keymap...
[ 6.968416] systemd[1]: Starting Load Kernel Modules...
[ 6.977348] systemd[1]: Started Create list of required static device nodes for the current kernel.
[ 6.989988] systemd[1]: Started Remount Root and Kernel File Systems.
[ 7.014442] systemd[1]: Starting udev Coldplug all Devices...
[ 7.023545] systemd[1]: Starting Load/Save Random Seed...
[ 7.032354] systemd[1]: Starting Create Static Device Nodes in /dev...
[ 7.045984] systemd[1]: Mounted Debug File System.
[ 7.055938] systemd[1]: Started Load Kernel Modules.
[ 7.065118] systemd[1]: Started Load/Save Random Seed.
[ 7.073213] systemd[1]: Started Journal Service.
[ 7.135448] systemd-journald[207]: Received request to flush runtime journal from PID 1
[ 7.283307] tegra-pcie 1003000.pcie-controller: link 0 down, retrying
[ 7.368054] xhci-tegra 70090000.xusb: cannot find firmware....retry after 1 second
[ 7.619213] dhd_module_init in
[ 7.619407] found wifi platform device bcmdhd_wlan
[ 7.620444] Power-up adapter 'DHD generic adapter'
[ 7.620463] wifi_platform_set_power = 1
[ 7.705613] random: nonblocking pool is initialized
[ 7.717811] tegra-pcie 1003000.pcie-controller: link 0 down, retrying
[ 7.727973] tegra-pcie 1003000.pcie-controller: link 0 down, ignoring
[ 7.823313] wifi_platform_bus_enumerate device present 1
[ 7.857626] wifi_platform_bus_enumerate device present 0
[ 7.881198] F1 signature read @0x18000000=0x17214354
[ 7.939060] F1 signature OK, socitype:0x1 chip:0x4354 rev:0x1 pkg:0x2
[ 7.939789] DHD: dongle ram size is set to 786432(orig 786432) at 0x180000
[ 7.939864] wifi_platform_prealloc: failed to alloc static mem section 7
[ 7.939872] wifi_platform_get_mac_addr
[ 7.953641] CFG80211-ERROR) wl_setup_wiphy : Registering Vendor80211
[ 7.955888] wl_create_event_handler(): thread:wl_event_handler:210 started
[ 7.956003] CFG80211-ERROR) wl_event_handler : tsk Enter, tsk = 0xffffffc07b601a70
[ 7.963796] dhd_attach(): thread:dhd_watchdog_thread:213 started
[ 7.963941] dhd_attach(): thread:dhd_dpc:21c started
[ 7.963985] dhd_attach(): thread:dhd_rxf:21d started
[ 7.963990] dhd_deferred_work_init: work queue initialized
[ 7.964253] Dongle Host Driver, version 1.201.82 (r)
Compiled in drivers/net/wireless/bcmdhd on Jul 20 2017 at 00:39:01
[ 7.964622] tegra_sysfs_register
[ 7.964670] Register interface [wlan0] MAC: 00:04:4b:a1:dd:82

[ 7.964673] dhd_prot_ioctl : bus is down. we have nothing to do
[ 7.965278] sdhci-tegra sdhci-tegra.1: Tuning already done, restoring the best tap value : 85
[ 7.966329] wifi_platform_set_power = 0
[ 8.148414] tegra-pcie 1003000.pcie-controller: link 1 down, retrying
[ 8.391952] xhci-tegra 70090000.xusb: Firmware timestamp: 2016-11-24 02:31:08 UTC, Version: 50.18 release
[ 8.411542] xhci-tegra 70090000.xusb: xHCI Host Controller
[ 8.419375] xhci-tegra 70090000.xusb: new USB bus registered, assigned bus number 1
[ 8.430631] xhci-tegra 70090000.xusb: hcc params 0x0184f525 hci version 0x100 quirks 0x00010810
[ 8.441958] xhci-tegra 70090000.xusb: irq 319, io mem 0x70090000
[ 8.450252] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[ 8.450256] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 8.450259] usb usb1: Product: xHCI Host Controller
[ 8.450261] usb usb1: Manufacturer: Linux 4.4.38-tegra xhci-hcd
[ 8.450263] usb usb1: SerialNumber: 70090000.xusb
[ 8.450693] hub 1-0:1.0: USB hub found
[ 8.450718] hub 1-0:1.0: 5 ports detected
[ 8.479353] xhci-tegra 70090000.xusb: xHCI Host Controller
[ 8.479362] xhci-tegra 70090000.xusb: new USB bus registered, assigned bus number 2
[ 8.479532] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003
[ 8.479535] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 8.479538] usb usb2: Product: xHCI Host Controller
[ 8.479540] usb usb2: Manufacturer: Linux 4.4.38-tegra xhci-hcd
[ 8.479542] usb usb2: SerialNumber: 70090000.xusb
[ 8.484705] hub 2-0:1.0: USB hub found
[ 8.484742] hub 2-0:1.0: 4 ports detected
[ 8.549455] tegra-pcie 1003000.pcie-controller: link 1 down, retrying
[ 8.795807] usb 2-1: new SuperSpeed USB device number 2 using xhci-tegra
[ 8.812575] usb 2-1: New USB device found, idVendor=0955, idProduct=09ff
[ 8.812579] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[ 8.812581] usb 2-1: Product: USB 10/100/1000 LAN
[ 8.812584] usb 2-1: Manufacturer: Nvidia
[ 8.812586] usb 2-1: SerialNumber: 000001000000
[ 8.813257] xhci-tegra 70090000.xusb: tegra_xhci_mbox_work mailbox command 6
[ 8.815503] xhci-tegra 70090000.xusb: tegra_xhci_mbox_work mailbox command 6
[ 8.816048] xhci-tegra 70090000.xusb: tegra_xhci_mbox_work mailbox command 6
[ 8.848965] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 8.849070]
Dongle Host Driver, version 1.201.82 (r)
Compiled in drivers/net/wireless/bcmdhd on Jul 20 2017 at 00:39:01
[ 8.849072] wl_android_wifi_on in
[ 8.849075] wifi_platform_set_power = 1
[ 8.928225] usb 2-1: reset SuperSpeed USB device number 2 using xhci-tegra
[ 8.961316] tegra-pcie 1003000.pcie-controller: link 1 down, retrying
[ 8.971314] tegra-pcie 1003000.pcie-controller: link 1 down, ignoring
[ 8.978718] tegra-pcie 1003000.pcie-controller: PCIE: no end points detected
[ 8.987556] tegra-pcie 1003000.pcie-controller: PCIE: Disable power rails
[ 9.120684] mmc1: queuing unknown CIS tuple 0x80 (5 bytes)
[ 9.206034] sdhci-tegra sdhci-tegra.1: Tuning already done, restoring the best tap value : 85
[ 9.219304] F1 signature read @0x18000000=0x17214354
[ 9.229684] F1 signature OK, socitype:0x1 chip:0x4354 rev:0x1 pkg:0x2
[ 9.236662] DHD: dongle ram size is set to 786432(orig 786432) at 0x180000
[ 9.295782] dhdsdio_write_vars: Download, Upload and compare of NVRAM succeeded.
[ 9.344664] dhd_bus_init: enable 0x06, ready 0x06 (waited 0us)
[ 9.351819] wifi_platform_get_mac_addr
[ 9.361085] Firmware up: op_mode=0x0005, MAC=00:04:4b:a1:dd:82
[ 9.372653] dhd_preinit_ioctls pspretend_threshold for HostAPD failed -23
[ 9.384015] Firmware version = wl0: Sep 14 2016 11:38:27 version 7.35.221.18 (r657725) FWID 01-9001dfb5
[ 9.395888] dhd_interworking_enable: failed to set WNM info, ret=-23
[ 9.402439] tegra_sysfs_on
[ 9.463800] r8152 2-1:1.0 eth0: v2.03.3 (2015/01/29)
[ 9.469156] r8152 2-1:1.0 eth0: This product is covered by one or more of the following patents:
US6,570,884, US6,115,776, and US6,327,625.

[ 9.523478] CFGP2P-ERROR) wl_cfgp2p_add_p2p_disc_if : P2P interface registered
[ 9.546776] WLC_E_IF: NO_IF set, event Ignored
[ 9.820592] cfg80211: World regulatory domain updated:
[ 9.829773] cfg80211: DFS Master region: unset
[ 9.834176] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)
[ 9.843989] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
[ 9.852019] cfg80211: (2457000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
[ 9.860035] cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (N/A, 2000 mBm), (N/A)
[ 9.868059] cfg80211: (5170000 KHz - 5250000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (N/A)
[ 9.877550] cfg80211: (5250000 KHz - 5330000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (0 s)
[ 9.887057] cfg80211: (5490000 KHz - 5730000 KHz @ 160000 KHz), (N/A, 2000 mBm), (0 s)
[ 9.895167] cfg80211: (5735000 KHz - 5835000 KHz @ 80000 KHz), (N/A, 2000 mBm), (N/A)
[ 9.903182] cfg80211: (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 0 mBm), (N/A)
[ 10.671092] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 10.763495] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 10.932103] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 10.984605] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.018068] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.063481] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.071824] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.080787] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.120737] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.129083] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.148850] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.186197] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 13.637089] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 14.549199] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 52.772777] nf_conntrack: automatic helper assignment is deprecated and it will be removed soon. Use the iptables CT target to attach helpers instead.
[ 54.745951] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 61.314019] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 68.172535] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 69.313281] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 78.228185] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 79.573457] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 97.844008] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 127.032820] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000

#5
Posted 02/15/2018 05:13 PM   
I see "ifconfig" with the "before". Is there any "ifconfig" which is after things have finished which show any kind of error?

EDIT: Forgot to ask, have you run the "~ubuntu/jetson_clocks.sh" program to bring up core speed before testing?
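A before/after comparison of the interface counters is easier to read as a diff than as two full ifconfig dumps. The following is only a sketch: the function names `snapshot_stats` and `diff_stats` are made up for illustration, and it assumes a Linux system that exposes per-interface counters under /sys/class/net/<interface>/statistics.

```shell
# Print "counter value" pairs for one interface (Linux sysfs assumed).
snapshot_stats() {
  # $1: interface name, e.g. eth0
  for f in /sys/class/net/"$1"/statistics/*; do
    printf '%s %s\n' "$(basename "$f")" "$(cat "$f")"
  done
}

# Compare two snapshot files and print only the counters that changed,
# together with the difference (after minus before).
diff_stats() {
  # $1: snapshot taken before the test, $2: snapshot taken after the test
  awk 'NR == FNR { before[$1] = $2; next }
       ($1 in before) && $2 != before[$1] { print $1, $2 - before[$1] }' "$1" "$2"
}
```

Typical use would be `snapshot_stats eth0 > before.txt`, then the iperf3 run, then `snapshot_stats eth0 > after.txt` and `diff_stats before.txt after.txt`. Any error or drop counter showing up in that diff would be worth posting.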

#6
Posted 02/15/2018 09:04 PM   
[quote]I see "ifconfig" with the "before". Is there any "ifconfig" which is after things have finished which show any kind of error?[/quote]

There is an ifconfig output from the beginning and one from after the test, directly behind the iperf3 output(s), for each of the two examples...

[quote]EDIT: Forgot to ask, have you run the "~ubuntu/jetson_clocks.sh" program to bring up core speed before testing?[/quote]

Oops, I definitely should have written that! The primary receiver, the Jetson TX1 on the Nvidia development board, was running at full speed. The other two weren't.

#7
Posted 02/16/2018 01:53 PM   
I see no errors with the network itself. You will probably want everything running at full speed (unless you are profiling for some other mode).

If you run "dmesg --follow" do you see the interface going up and down during the test? I didn't see a note of that on the dmesg, but this is something which would probably occur only during the test and I don't know if the dmesg was from after the test or before the test.
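To make link flaps stand out from the rest of the kernel log, the dmesg stream can be filtered for link-state lines. This is a rough sketch; `link_events` is a made-up helper name, and the pattern is only a guess at what link messages look like, based on the "eth0: link becomes ready" lines in the log posted above.

```shell
# Keep only kernel log lines that mention link or carrier state for one interface.
link_events() {
  # $1: interface name; kernel log text is read from stdin
  grep -iE "${1}:.*(link|carrier)"
}
```

During a test run this could be used as `dmesg --follow | link_events eth0` in a second terminal. More than the initial "link becomes ready" line would suggest the interface is bouncing.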

#8
Posted 02/16/2018 03:06 PM   
The other two Jetson don't have sufficient cooling at the moment, that's why I kept them at the slower speed yesterday.

The dmesg output attached to my post above basically covers the complete time I ran the tests. I just skipped the first 7 s of the startup, because the forum didn't like posts that long. I copy-pasted it while writing my post.

What puzzles me most is that there is no indication whatsoever from the network stack itself. In my experience a bad cable, bogus negotiation (duplex, speed, etc.) between the NICs involved, or a misconfiguration almost always produces transmission errors at the packet level. The only indication that something is off is the extremely high retry count from iperf3, and that seems to point to something being wrong between the network stack and userland (iperf3).
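For comparing runs, the per-interval Retr column can be totalled straight from saved iperf3 text output. A small sketch, with `sum_retr` as a made-up name; it assumes the sender-side layout shown in the results above, where interval lines end with the Cwnd value in KBytes.

```shell
# Sum the Retr column over the interval lines of iperf3 sender output.
# Summary lines (ending in "sender"/"receiver") and reverse-mode lines
# (no Retr/Cwnd columns) are skipped.
sum_retr() {
  awk '$NF ~ /Bytes$/ {
         for (i = 1; i < NF; i++)
           if ($i ~ /bits\/sec$/ && $(i + 1) ~ /^[0-9]+$/) { total += $(i + 1); break }
       }
       END { print total + 0 }'
}
```

Usage would be something like `iperf3 -c tegra-ubuntu -p 12345 -t 60 -i 10 | tee run.txt` followed by `sum_retr < run.txt`. For the first TX1-to-TX1 run above, the six interval values add up to the 1611 shown in the summary line.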

Two hosts connected directly to the same switch with no further network load, running clean interfaces and nothing other than housekeeping besides iperf3, almost always reach around 950MBits/sec with no retries on a GigE network.
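As a sanity check on what "full speed" means here, the theoretical TCP payload rate on gigabit Ethernet can be worked out from the per-frame overhead. This is a back-of-the-envelope sketch with a made-up function name; it assumes IPv4, TCP timestamps enabled (12 bytes of options per segment), no VLAN tag, and full-size frames.

```shell
# Upper bound for TCP goodput in Mbit/s (rounded down) on an Ethernet link.
max_tcp_goodput_mbps() {
  # $1: MTU in bytes; $2: link rate in Mbit/s (defaults to 1000)
  awk -v mtu="$1" -v rate="${2:-1000}" 'BEGIN {
    payload = mtu - 20 - 20 - 12     # minus IPv4 header, TCP header, TCP timestamp option
    wire    = mtu + 14 + 4 + 8 + 12  # plus Ethernet header, FCS, preamble, inter-frame gap
    printf "%d\n", rate * payload / wire
  }'
}
```

With `max_tcp_goodput_mbps 1500` this gives 941, so anything in the 930 to 940 Mbits/sec range is effectively line rate for a 1500-byte MTU.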

I would really like to have something in the range of 900MBits/sec to 950MBits/sec on the Jetsons. The recorded speed won't necessarily break my application, but I fear the existence of a bigger problem buried deeper in the network that will come back to haunt me at the most inappropriate moment.

So I would really prefer if I could get down to the root of what is happening...

#9
Posted 02/16/2018 03:50 PM   
Normally I wouldn't expect a local network to have retries, but I also don't know how iperf generates this number, i.e., whether it is a TCP retry or something that is part of iperf itself. Since there are no ifconfig errors it isn't obvious. Certainly a retry from TCP would make me very curious, but if this is some iperf code on top of UDP it would drastically change the questions (I am not an iperf guru).

FYI, adding a reliability layer on top of UDP at each end point, even one exactly like the Nagle algorithm in TCP, does not result in the same behavior as TCP: in TCP each hop on the route (including the switch) has an understanding of TCP and the rules of retransmit, whereas for a UDP version intermediate hops will have no such knowledge. Perhaps if both ends are directly connected with no switch in between, reliability added to UDP would be equivalent to TCP/Nagle.

Does anyone here know whether iperf has its own retry mechanism, or if instead it is simply monitoring TCP retries? If it is TCP, then there are parameters which can be changed in "/proc" to test against.

I am curious if you can post the output of "ethtool <interface>" on each of the involved interfaces? This might point out differences such as half-duplex/full-duplex.

Also, do you know if everything in the tests involved were purely IPv4 and none of the hosts/clients are using IPv6 in the communications chain? I suspect a "traceroute" from both ends would show purely IPv4 addresses, but I want to be sure.
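To compare the negotiated link settings across several machines quickly, the two interesting lines of the ethtool output can be condensed into one. A sketch only; `link_summary` is a made-up name and it assumes the usual "Speed:" and "Duplex:" lines that ethtool prints for an Ethernet interface.

```shell
# Condense ethtool output (read from stdin) into one line with speed and duplex.
link_summary() {
  awk -F': *' '/Speed:/  { speed = $2 }
               /Duplex:/ { duplex = $2 }
               END { print "speed=" speed, "duplex=" duplex }'
}
```

On each host this would be run as `ethtool eth0 | link_summary` (possibly with sudo). To rule out IPv6 for the test itself, iperf3 also accepts `-4` on the client to force IPv4.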

#10
Posted 02/16/2018 05:01 PM   
iperf3 uses TCP the way I run it. You have to explicitly switch to UDP on the command line if you want to use it.

The retries are re-sent TCP segments that were lost due to congestion or corruption (see https://github.com/esnet/iperf/issues/343).
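Since these are kernel-level TCP retransmits, the sender's own counter should move by roughly the same amount during a run, which gives a cross-check that is independent of iperf3. A sketch, assuming Linux and the usual two-line "Tcp:" header/value layout of /proc/net/snmp; `tcp_retrans_segs` is a made-up name.

```shell
# Print the RetransSegs counter from /proc/net/snmp-formatted text on stdin.
tcp_retrans_segs() {
  awk '$1 == "Tcp:" {
         if (!seen) {
           # first Tcp: line is the header, find the column by name
           for (i = 2; i <= NF; i++) if ($i == "RetransSegs") col = i
           seen = 1
         } else if (col) {
           # second Tcp: line carries the values
           print $col
         }
       }'
}
```

Running `tcp_retrans_segs < /proc/net/snmp` on the sending host before and after a test and subtracting the two numbers should land close to the Retr total, as long as nothing else is sending TCP traffic at the time.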

It is purely IPv4, but I will test again nevertheless! Currently I am out of the office, but I will follow up as soon as I am back on the weekend.

#11
Posted 02/16/2018 05:18 PM   
If the issue was with corruption I think you'd see other errors. Congestion would not show up as an error.

I'm thinking you may be running into memory buffer limits set in "/proc/sys/net/ipv4/tcp_mem". For information on files there see:
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

In a terminal you can watch, while the perf test runs, how the queue sizes go up at moments when you think it might be doing a retransmit:
[code]
sudo -s
watch -n 1 "ss -m -n | egrep '(tcp|Netid)'"
exit
[/code]


If possible try to associate a particular line of the "ss" socket stat output to what the perf command is using. If it turns out that you are hitting memory constraints you might be able to increase the tcp_mem and at least lower the retries. I don't know of a better way to watch memory used by a particular socket, but it might still point to something which can be tuned.
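One thing that is easy to trip over when reading tcp_mem: according to ip-sysctl.txt its three values (min, pressure, max) are counted in memory pages, not bytes, and they apply to TCP as a whole. The per-socket buffer limits live in tcp_rmem and tcp_wmem, which are in bytes. A small sketch for converting the page counts, with a made-up function name:

```shell
# Convert the three tcp_mem values from pages to KiB.
tcp_mem_kib() {
  # $1: contents of tcp_mem ("min pressure max", in pages); $2: page size in bytes
  echo "$1" | awk -v ps="$2" '{
    printf "min=%d pressure=%d max=%d\n", $1 * ps / 1024, $2 * ps / 1024, $3 * ps / 1024
  }'
}
```

On the Jetson this would be called as `tcp_mem_kib "$(cat /proc/sys/net/ipv4/tcp_mem)" "$(getconf PAGESIZE)"`. Looking at `cat /proc/sys/net/ipv4/tcp_rmem /proc/sys/net/ipv4/tcp_wmem` alongside it shows the per-socket min/default/max in bytes.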

#12
Posted 02/16/2018 07:50 PM   