Jetson TX1 strange network performance behaviour (still)
Hi!

I have run into some very strange network performance behaviour with the TX1. To verify that it is not my personal setup, I repeated the test with a TX1 on the TX1 development board with a vanilla JetPack 3.1 installation on it.

When I use the internal r8152 network card and iperf3 to measure the performance, I get around 300 Mbits/sec when sending to the device, but the full 900 Mbits/sec or so when receiving from the device.

On the TX1:
nvidia@tegra-ubuntu:~$ iperf3 -s -p 12345


On any other working machine connected to the same GigE switch:
Receiving from the TX1:
$ iperf3 -c tegra-ubuntu.local -p 12345 -t 60 -i 10 -b 1G -R
Connecting to host tegra-ubuntu.local, port 12345
Reverse mode, remote host tegra-ubuntu.local is sending
[ 7] local fe80::10b9:e3af:1f20:c50a port 52937 connected to fe80::f236:19c3:65f4:d4c0 port 12345
[ ID] Interval Transfer Bitrate
[ 7] 0.00-10.00 sec 1.06 GBytes 911 Mbits/sec
[ 7] 10.00-20.00 sec 1.08 GBytes 928 Mbits/sec
[ 7] 20.00-30.00 sec 1.08 GBytes 928 Mbits/sec
[ 7] 30.00-40.00 sec 1.08 GBytes 928 Mbits/sec
[ 7] 40.00-50.00 sec 1.08 GBytes 928 Mbits/sec
[ 7] 50.00-60.00 sec 1.08 GBytes 928 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 7] 0.00-60.00 sec 6.46 GBytes 925 Mbits/sec 0 sender
[ 7] 0.00-60.00 sec 6.46 GBytes 925 Mbits/sec receiver

iperf Done.


Sending to the TX1:
$ iperf3 -c tegra-ubuntu.local -p 12345 -t 60 -i 10 -b 1G
Connecting to host tegra-ubuntu.local, port 12345
[ 7] local fe80::10b9:e3af:1f20:c50a port 52944 connected to fe80::f236:19c3:65f4:d4c0 port 12345
[ ID] Interval Transfer Bitrate
[ 7] 0.00-10.00 sec 399 MBytes 335 Mbits/sec
[ 7] 10.00-20.00 sec 431 MBytes 361 Mbits/sec
[ 7] 20.00-30.00 sec 478 MBytes 401 Mbits/sec
[ 7] 30.00-40.00 sec 408 MBytes 342 Mbits/sec
[ 7] 40.00-50.00 sec 399 MBytes 335 Mbits/sec
[ 7] 50.00-60.00 sec 517 MBytes 434 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 7] 0.00-60.00 sec 2.57 GBytes 368 Mbits/sec sender
[ 7] 0.00-60.00 sec 2.57 GBytes 368 Mbits/sec receiver

iperf Done.
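
As a sanity check on the numbers above, the reported bitrates follow directly from the transfer totals. This is only a small sketch: it assumes iperf3 counts a GByte as 2^30 bytes and an Mbit as 10^6 bits, and the helper name `mbits_per_sec` is made up for illustration.

```python
def mbits_per_sec(gbytes: float, seconds: float) -> float:
    """Convert an iperf3 transfer total into an average bitrate.

    Assumption: GBytes are binary (2**30 bytes), Mbits are decimal (10**6 bits).
    """
    total_bits = gbytes * 2**30 * 8
    return total_bits / seconds / 1e6


# Transfer totals taken from the two 60-second runs above.
print(round(mbits_per_sec(6.46, 60)))  # receiving from the TX1
print(round(mbits_per_sec(2.57, 60)))  # sending to the TX1
```

Under that unit assumption the two totals come out at the 925 and 368 Mbits/sec that iperf3 printed, so the summary lines are at least internally consistent.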


I found this thread https://devtalk.nvidia.com/default/topic/979635/jetson-tx1/ethernet-speed-increases-when-micro-usb-2-0-connector-is-connected/ which describes similar problems, but the apparent fix at the end of that thread is no longer applicable due to changes in the kernel over the last year.

Solid network performance is very important for our company's application, so I need reliable speeds above 900 Mbits/sec.

Has anyone had the same problem and solved it for a JetPack 3.1 / L4T 28.1 installation?

#1
Posted 02/14/2018 07:20 PM   
This probably won't matter, but you may want to first make sure that "ifconfig" reports the same MTU on both sides (each direction can have a different MTU). Also, if your traffic goes through a router (or an expensive managed switch), it might behave differently depending on direction. Do make sure the traffic goes through a switch and not a router (or perhaps even connect the two machines directly: a crossover cable is needed if neither end auto-switches on a non-crossover cable, and of course one end would then also have to be a DHCP server unless static IP addresses are used).
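
If you want to compare the MTU values mechanically rather than by eye, something like the following could do it. It is just a sketch that assumes the older net-tools "MTU:1500" output style with indented continuation lines; the function name and the sample text are made up for illustration.

```python
import re


def mtu_by_interface(ifconfig_text: str) -> dict:
    """Map interface name -> MTU for old net-tools style ifconfig output."""
    mtus = {}
    current = None
    for line in ifconfig_text.splitlines():
        # A new interface block starts with its name in column 0.
        if line and not line[0].isspace():
            current = line.split()[0]
        match = re.search(r"MTU:(\d+)", line)
        if match and current is not None:
            mtus[current] = int(match.group(1))
    return mtus


# Illustrative sample, shaped like the output of /sbin/ifconfig.
SAMPLE = """\
eth0      Link encap:Ethernet  HWaddr 00:04:4b:5a:ec:56
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
"""

print(mtu_by_interface(SAMPLE))
```

Run it on the saved ifconfig output from each machine and compare the "eth0" entries.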

#2
Posted 02/14/2018 08:17 PM   
Thanks for the tips; bogus MTU settings were my very first idea as well. The switch really shouldn't be the problem, since the Jetson and the "other" computer are basically the only users on that particular switch, with an uplink to the DHCP server. So I was very much with you on the first level of the error hunt!

Interestingly, I have no errors in the ifconfig outputs, but if I use an iperf3 version which reports retries, they are very high (three digits) for the TX1-receiving case.

I currently have no access to the systems, but I can post some examples tomorrow.
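
For collecting the retry counts, iperf3's JSON output (the `-J` option) is easier to process than the text tables. A rough sketch, assuming the report carries an `end.sum_sent` object with `bits_per_second` and `retransmits`; the sample report below is invented and not a real measurement.

```python
import json


def summarize(report_json: str) -> dict:
    """Pull throughput and retransmit count out of an iperf3 -J report."""
    report = json.loads(report_json)
    sent = report["end"]["sum_sent"]
    return {
        "mbits_per_sec": sent["bits_per_second"] / 1e6,
        # Retransmits are only reported for TCP senders on some platforms,
        # so fall back to None when the field is missing.
        "retransmits": sent.get("retransmits"),
    }


# Hypothetical report fragment, only for demonstrating the function.
SAMPLE = json.dumps(
    {"end": {"sum_sent": {"bits_per_second": 400e6, "retransmits": 250}}}
)

print(summarize(SAMPLE))
```

In practice the input would be the saved output of an `iperf3 -c ... -J` run.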

#3
Posted 02/14/2018 09:47 PM   
An ifconfig listing for both systems after a session where there were errors would be enlightening. Before you start the traffic which has the errors, you might want to run "dmesg --follow" so you can see whether the kernel announces anything during that same session.
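
To make the before/after comparison less error-prone, the counters can be diffed with a few lines of Python. This is a sketch under the assumption that ifconfig prints the old "RX packets:N errors:N dropped:N overruns:N" layout; the function names are made up, and the snapshot values are only illustrative.

```python
import re

COUNTER_RE = re.compile(
    r"(RX|TX) packets:(\d+)\s+errors:(\d+)\s+dropped:(\d+)\s+overruns:(\d+)"
)


def counters(ifconfig_text: str) -> dict:
    """Return RX/TX counters for the first interface in the listing."""
    result = {}
    for direction, packets, errors, dropped, overruns in COUNTER_RE.findall(
        ifconfig_text
    ):
        if direction not in result:  # ignore later interfaces such as lo
            result[direction] = {
                "packets": int(packets),
                "errors": int(errors),
                "dropped": int(dropped),
                "overruns": int(overruns),
            }
    return result


def delta(before: dict, after: dict) -> dict:
    """Subtract the 'before' counters from the 'after' counters."""
    return {
        d: {k: after[d][k] - before[d][k] for k in before[d]} for d in before
    }


# Illustrative snapshots in ifconfig's counter format.
BEFORE = (
    "RX packets:4138 errors:0 dropped:0 overruns:0 frame:0\n"
    "TX packets:355 errors:0 dropped:0 overruns:0 carrier:0\n"
)
AFTER = (
    "RX packets:5458206 errors:0 dropped:0 overruns:0 frame:0\n"
    "TX packets:2169169 errors:0 dropped:0 overruns:0 carrier:0\n"
)

print(delta(counters(BEFORE), counters(AFTER)))
```

Any non-zero change in errors, dropped, or overruns during the test session is what to look for.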

#4
Posted 02/14/2018 09:50 PM   
Today I did a full range of tests on the hardware available to me, which at the moment is:
  • TX1 on an NVIDIA development board running a clean JetPack
  • TX1 on an Auvidea J120 running a custom 4.4.38 kernel and an Ubuntu 16.04 base system
  • TX2 on an Auvidea J140 running a custom 4.4.38 kernel and an Ubuntu 16.04 base system
  • Various PCs on Ubuntu and one MacBook Pro


During all the tests, the computers involved were connected to the same Gigabit Ethernet switch, with only infrastructure (the DHCP server) coming in on one port. I also tested three different switches:

  1. TP Link TL-SG105E (5 Port GigE)
  2. CyberData 011236A Embedded (3 Port GigE)
  3. Netgear ProSafe GS108 (8 Port GigE)


The bandwidth to and from the TX1 (no big difference between the Nvidia Dev board and the Auvidea J120) varies, but it never reaches the expected 900MBits/sec in both directions.

Interestingly there were no kernel messages related to the network whatsoever.

Two good examples are the runs between the two TX1s and between the TX1 and the TX2. In the two outputs below, the server was always running on the TX1 on the Nvidia Development board (tegra-ubuntu) and the client on the other side.

At the moment I am a bit puzzled, since the only HW option left is the cables. I ordered a set of high-end Cat 6a cables, which hopefully will arrive before the weekend, but I really think there is something else wrong here, since the TX2 behaves pretty well on the same cable and port where the TX1s do funny things. I have the feeling that something is wrong somewhere along the USB-to-Ethernet conversion. Otherwise I can't explain the really huge number of retries either.

Any ideas from anyone?!?

Connected to TX1 on Auvidea J120:
TX1J120 before:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:5a:ec:56
inet addr:192.168.1.29 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::204:4bff:fe5a:ec56/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:4138 errors:0 dropped:0 overruns:0 frame:0
TX packets:355 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:690435 (690.4 KB) TX bytes:117975 (117.9 KB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:162 errors:0 dropped:0 overruns:0 frame:0
TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:11938 (11.9 KB) TX bytes:11938 (11.9 KB)

TX1DEV before:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:84
inet addr:192.168.1.26 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::68b7:a7a3:435a:38fd/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:13023348 errors:0 dropped:0 overruns:0 frame:0
TX packets:4775656 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:18478442603 (18.4 GB) TX bytes:8620983329 (8.6 GB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:234 errors:0 dropped:0 overruns:0 frame:0
TX packets:234 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:17815 (17.8 KB) TX bytes:17815 (17.8 KB)

wlan0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:82
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

Results:

$ iperf3 -c tegra-ubuntu -p 12345 -i 10 -t 60 -b 1G
Connecting to host tegra-ubuntu, port 12345
[ 4] local 192.168.1.29 port 41176 connected to 192.168.1.26 port 12345
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-10.00 sec 874 MBytes 733 Mbits/sec 297 215 KBytes
[ 4] 10.00-20.00 sec 887 MBytes 744 Mbits/sec 211 283 KBytes
[ 4] 20.00-30.00 sec 882 MBytes 740 Mbits/sec 267 262 KBytes
[ 4] 30.00-40.00 sec 875 MBytes 734 Mbits/sec 258 249 KBytes
[ 4] 40.00-50.00 sec 877 MBytes 736 Mbits/sec 278 235 KBytes
[ 4] 50.00-60.00 sec 875 MBytes 734 Mbits/sec 300 256 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-60.00 sec 5.15 GBytes 737 Mbits/sec 1611 sender
[ 4] 0.00-60.00 sec 5.14 GBytes 737 Mbits/sec receiver

iperf Done.
$ iperf3 -c tegra-ubuntu -p 12345 -i 10 -t 60 -b 1G -R
Connecting to host tegra-ubuntu, port 12345
Reverse mode, remote host tegra-ubuntu is sending
[ 4] local 192.168.1.29 port 41180 connected to 192.168.1.26 port 12345
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 799 MBytes 670 Mbits/sec
[ 4] 10.00-20.00 sec 806 MBytes 677 Mbits/sec
[ 4] 20.00-30.00 sec 814 MBytes 683 Mbits/sec
[ 4] 30.00-40.00 sec 822 MBytes 689 Mbits/sec
[ 4] 40.00-50.00 sec 821 MBytes 688 Mbits/sec
[ 4] 50.00-60.00 sec 821 MBytes 689 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-60.00 sec 4.77 GBytes 683 Mbits/sec 5294 sender
[ 4] 0.00-60.00 sec 4.77 GBytes 683 Mbits/sec receiver

iperf Done.

TX1DEV after:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:84
inet addr:192.168.1.26 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::68b7:a7a3:435a:38fd/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:18627771 errors:0 dropped:0 overruns:0 frame:0
TX packets:7065839 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:24373140690 (24.3 GB) TX bytes:13900665875 (13.9 GB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:234 errors:0 dropped:0 overruns:0 frame:0
TX packets:234 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:17815 (17.8 KB) TX bytes:17815 (17.8 KB)

wlan0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:82
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)


TX1J120 after:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:5a:ec:56
inet addr:192.168.1.29 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::204:4bff:fe5a:ec56/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:5458206 errors:0 dropped:0 overruns:0 frame:0
TX packets:2169169 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:5481488008 (5.4 GB) TX bytes:5670394266 (5.6 GB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:162 errors:0 dropped:0 overruns:0 frame:0
TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:11938 (11.9 KB) TX bytes:11938 (11.9 KB)


Connected to TX2 on Auvidea J140 (realtek port):
TX2J140 before:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:8d:46:88
inet addr:192.168.1.37 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::204:4bff:fe8d:4688/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3662 errors:0 dropped:0 overruns:0 frame:0
TX packets:845 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:723092 (723.0 KB) TX bytes:171853 (171.8 KB)
Interrupt:42

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:162 errors:0 dropped:0 overruns:0 frame:0
TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:11938 (11.9 KB) TX bytes:11938 (11.9 KB)

TX1DEV before:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:84
inet addr:192.168.1.26 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::68b7:a7a3:435a:38fd/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:8955188 errors:0 dropped:0 overruns:0 frame:0
TX packets:2601581 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:13159965433 (13.1 GB) TX bytes:1603308787 (1.6 GB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:234 errors:0 dropped:0 overruns:0 frame:0
TX packets:234 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:17815 (17.8 KB) TX bytes:17815 (17.8 KB)

wlan0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:82
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

Result:

$ iperf3 -c tegra-ubuntu -p 12345 -i 10 -t 60 -b 1G
Connecting to host tegra-ubuntu, port 12345
[ 4] local 192.168.1.37 port 40746 connected to 192.168.1.26 port 12345
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-10.00 sec 806 MBytes 676 Mbits/sec 538 253 KBytes
[ 4] 10.00-20.00 sec 811 MBytes 681 Mbits/sec 587 276 KBytes
[ 4] 20.00-30.00 sec 804 MBytes 675 Mbits/sec 474 284 KBytes
[ 4] 30.00-40.00 sec 817 MBytes 685 Mbits/sec 472 270 KBytes
[ 4] 40.00-50.00 sec 788 MBytes 661 Mbits/sec 458 264 KBytes
[ 4] 50.00-60.00 sec 791 MBytes 663 Mbits/sec 505 281 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-60.00 sec 4.70 GBytes 673 Mbits/sec 3034 sender
[ 4] 0.00-60.00 sec 4.70 GBytes 673 Mbits/sec receiver

iperf Done.
$ iperf3 -c tegra-ubuntu -p 12345 -i 10 -t 60 -b 1G -R
Connecting to host tegra-ubuntu, port 12345
Reverse mode, remote host tegra-ubuntu is sending
[ 4] local 192.168.1.37 port 40750 connected to 192.168.1.26 port 12345
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 1.05 GBytes 903 Mbits/sec
[ 4] 10.00-20.00 sec 1.06 GBytes 908 Mbits/sec
[ 4] 20.00-30.00 sec 1.04 GBytes 896 Mbits/sec
[ 4] 30.00-40.00 sec 1.07 GBytes 918 Mbits/sec
[ 4] 40.00-50.00 sec 1.10 GBytes 941 Mbits/sec
[ 4] 50.00-60.00 sec 1.08 GBytes 932 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-60.00 sec 6.40 GBytes 917 Mbits/sec 360 sender
[ 4] 0.00-60.00 sec 6.40 GBytes 916 Mbits/sec receiver

iperf Done.

TX2J140 after:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:8d:46:88
inet addr:192.168.1.37 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::204:4bff:fe8d:4688/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6491798 errors:0 dropped:0 overruns:0 frame:0
TX packets:691697 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:7302589896 (7.3 GB) TX bytes:10984071550 (10.9 GB)
Interrupt:42

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:162 errors:0 dropped:0 overruns:0 frame:0
TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:11938 (11.9 KB) TX bytes:11938 (11.9 KB)

TX1DEV after:

$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:84
inet addr:192.168.1.26 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::68b7:a7a3:435a:38fd/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:13023329 errors:0 dropped:0 overruns:0 frame:0
TX packets:4775648 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:18478440579 (18.4 GB) TX bytes:8620981145 (8.6 GB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:234 errors:0 dropped:0 overruns:0 frame:0
TX packets:234 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:17815 (17.8 KB) TX bytes:17815 (17.8 KB)

wlan0 Link encap:Ethernet HWaddr 00:04:4b:a1:dd:82
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)



dmesg output

(...)
[ 6.710355] systemd[1]: Created slice User and Session Slice.
[ 6.718499] systemd[1]: Listening on Syslog Socket.
[ 6.729052] systemd[1]: Listening on udev Control Socket.
[ 6.736717] systemd[1]: Reached target User and Group Name Lookups.
[ 6.745338] systemd[1]: Listening on Journal Audit Socket.
[ 6.753056] systemd[1]: Listening on udev Kernel Socket.
[ 6.760560] systemd[1]: Listening on Journal Socket (/dev/log).
[ 6.768643] systemd[1]: Listening on LVM2 poll daemon socket.
[ 6.776543] systemd[1]: Created slice System Slice.
[ 6.783477] systemd[1]: Reached target Slices.
[ 6.789878] systemd[1]: Reached target Swap.
[ 6.796030] systemd[1]: Reached target Encrypted Volumes.
[ 6.803360] systemd[1]: Listening on /dev/initctl Compatibility Named Pipe.
[ 6.812234] systemd[1]: Listening on LVM2 metadata daemon socket.
[ 6.820318] systemd[1]: Created slice system-serial\x2dgetty.slice.
[ 6.832016] systemd[1]: Listening on Journal Socket.
[ 6.839530] systemd[1]: Started Braille Device Support.
[ 6.847829] systemd[1]: Starting Create list of required static device nodes for the current kernel...
[ 6.853306] tegra-pcie 1003000.pcie-controller: link 0 down, retrying
[ 6.868239] systemd[1]: Starting Journal Service...
[ 6.879745] systemd[1]: Mounting Debug File System...
[ 6.889222] systemd[1]: Starting Remount Root and Kernel File Systems...
[ 6.898194] systemd[1]: Reached target Remote File Systems (Pre).
[ 6.906746] systemd[1]: Reached target Remote File Systems.
[ 6.914497] systemd[1]: Listening on Device-mapper event daemon FIFOs.
[ 6.923956] systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
[ 6.941051] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[ 6.952485] systemd[1]: Starting Set console keymap...
[ 6.968416] systemd[1]: Starting Load Kernel Modules...
[ 6.977348] systemd[1]: Started Create list of required static device nodes for the current kernel.
[ 6.989988] systemd[1]: Started Remount Root and Kernel File Systems.
[ 7.014442] systemd[1]: Starting udev Coldplug all Devices...
[ 7.023545] systemd[1]: Starting Load/Save Random Seed...
[ 7.032354] systemd[1]: Starting Create Static Device Nodes in /dev...
[ 7.045984] systemd[1]: Mounted Debug File System.
[ 7.055938] systemd[1]: Started Load Kernel Modules.
[ 7.065118] systemd[1]: Started Load/Save Random Seed.
[ 7.073213] systemd[1]: Started Journal Service.
[ 7.135448] systemd-journald[207]: Received request to flush runtime journal from PID 1
[ 7.283307] tegra-pcie 1003000.pcie-controller: link 0 down, retrying
[ 7.368054] xhci-tegra 70090000.xusb: cannot find firmware....retry after 1 second
[ 7.619213] dhd_module_init in
[ 7.619407] found wifi platform device bcmdhd_wlan
[ 7.620444] Power-up adapter 'DHD generic adapter'
[ 7.620463] wifi_platform_set_power = 1
[ 7.705613] random: nonblocking pool is initialized
[ 7.717811] tegra-pcie 1003000.pcie-controller: link 0 down, retrying
[ 7.727973] tegra-pcie 1003000.pcie-controller: link 0 down, ignoring
[ 7.823313] wifi_platform_bus_enumerate device present 1
[ 7.857626] wifi_platform_bus_enumerate device present 0
[ 7.881198] F1 signature read @0x18000000=0x17214354
[ 7.939060] F1 signature OK, socitype:0x1 chip:0x4354 rev:0x1 pkg:0x2
[ 7.939789] DHD: dongle ram size is set to 786432(orig 786432) at 0x180000
[ 7.939864] wifi_platform_prealloc: failed to alloc static mem section 7
[ 7.939872] wifi_platform_get_mac_addr
[ 7.953641] CFG80211-ERROR) wl_setup_wiphy : Registering Vendor80211
[ 7.955888] wl_create_event_handler(): thread:wl_event_handler:210 started
[ 7.956003] CFG80211-ERROR) wl_event_handler : tsk Enter, tsk = 0xffffffc07b601a70
[ 7.963796] dhd_attach(): thread:dhd_watchdog_thread:213 started
[ 7.963941] dhd_attach(): thread:dhd_dpc:21c started
[ 7.963985] dhd_attach(): thread:dhd_rxf:21d started
[ 7.963990] dhd_deferred_work_init: work queue initialized
[ 7.964253] Dongle Host Driver, version 1.201.82 (r)
Compiled in drivers/net/wireless/bcmdhd on Jul 20 2017 at 00:39:01
[ 7.964622] tegra_sysfs_register
[ 7.964670] Register interface [wlan0] MAC: 00:04:4b:a1:dd:82

[ 7.964673] dhd_prot_ioctl : bus is down. we have nothing to do
[ 7.965278] sdhci-tegra sdhci-tegra.1: Tuning already done, restoring the best tap value : 85
[ 7.966329] wifi_platform_set_power = 0
[ 8.148414] tegra-pcie 1003000.pcie-controller: link 1 down, retrying
[ 8.391952] xhci-tegra 70090000.xusb: Firmware timestamp: 2016-11-24 02:31:08 UTC, Version: 50.18 release
[ 8.411542] xhci-tegra 70090000.xusb: xHCI Host Controller
[ 8.419375] xhci-tegra 70090000.xusb: new USB bus registered, assigned bus number 1
[ 8.430631] xhci-tegra 70090000.xusb: hcc params 0x0184f525 hci version 0x100 quirks 0x00010810
[ 8.441958] xhci-tegra 70090000.xusb: irq 319, io mem 0x70090000
[ 8.450252] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[ 8.450256] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 8.450259] usb usb1: Product: xHCI Host Controller
[ 8.450261] usb usb1: Manufacturer: Linux 4.4.38-tegra xhci-hcd
[ 8.450263] usb usb1: SerialNumber: 70090000.xusb
[ 8.450693] hub 1-0:1.0: USB hub found
[ 8.450718] hub 1-0:1.0: 5 ports detected
[ 8.479353] xhci-tegra 70090000.xusb: xHCI Host Controller
[ 8.479362] xhci-tegra 70090000.xusb: new USB bus registered, assigned bus number 2
[ 8.479532] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003
[ 8.479535] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 8.479538] usb usb2: Product: xHCI Host Controller
[ 8.479540] usb usb2: Manufacturer: Linux 4.4.38-tegra xhci-hcd
[ 8.479542] usb usb2: SerialNumber: 70090000.xusb
[ 8.484705] hub 2-0:1.0: USB hub found
[ 8.484742] hub 2-0:1.0: 4 ports detected
[ 8.549455] tegra-pcie 1003000.pcie-controller: link 1 down, retrying
[ 8.795807] usb 2-1: new SuperSpeed USB device number 2 using xhci-tegra
[ 8.812575] usb 2-1: New USB device found, idVendor=0955, idProduct=09ff
[ 8.812579] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[ 8.812581] usb 2-1: Product: USB 10/100/1000 LAN
[ 8.812584] usb 2-1: Manufacturer: Nvidia
[ 8.812586] usb 2-1: SerialNumber: 000001000000
[ 8.813257] xhci-tegra 70090000.xusb: tegra_xhci_mbox_work mailbox command 6
[ 8.815503] xhci-tegra 70090000.xusb: tegra_xhci_mbox_work mailbox command 6
[ 8.816048] xhci-tegra 70090000.xusb: tegra_xhci_mbox_work mailbox command 6
[ 8.848965] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 8.849070]
Dongle Host Driver, version 1.201.82 (r)
Compiled in drivers/net/wireless/bcmdhd on Jul 20 2017 at 00:39:01
[ 8.849072] wl_android_wifi_on in
[ 8.849075] wifi_platform_set_power = 1
[ 8.928225] usb 2-1: reset SuperSpeed USB device number 2 using xhci-tegra
[ 8.961316] tegra-pcie 1003000.pcie-controller: link 1 down, retrying
[ 8.971314] tegra-pcie 1003000.pcie-controller: link 1 down, ignoring
[ 8.978718] tegra-pcie 1003000.pcie-controller: PCIE: no end points detected
[ 8.987556] tegra-pcie 1003000.pcie-controller: PCIE: Disable power rails
[ 9.120684] mmc1: queuing unknown CIS tuple 0x80 (5 bytes)
[ 9.206034] sdhci-tegra sdhci-tegra.1: Tuning already done, restoring the best tap value : 85
[ 9.219304] F1 signature read @0x18000000=0x17214354
[ 9.229684] F1 signature OK, socitype:0x1 chip:0x4354 rev:0x1 pkg:0x2
[ 9.236662] DHD: dongle ram size is set to 786432(orig 786432) at 0x180000
[ 9.295782] dhdsdio_write_vars: Download, Upload and compare of NVRAM succeeded.
[ 9.344664] dhd_bus_init: enable 0x06, ready 0x06 (waited 0us)
[ 9.351819] wifi_platform_get_mac_addr
[ 9.361085] Firmware up: op_mode=0x0005, MAC=00:04:4b:a1:dd:82
[ 9.372653] dhd_preinit_ioctls pspretend_threshold for HostAPD failed -23
[ 9.384015] Firmware version = wl0: Sep 14 2016 11:38:27 version 7.35.221.18 (r657725) FWID 01-9001dfb5
[ 9.395888] dhd_interworking_enable: failed to set WNM info, ret=-23
[ 9.402439] tegra_sysfs_on
[ 9.463800] r8152 2-1:1.0 eth0: v2.03.3 (2015/01/29)
[ 9.469156] r8152 2-1:1.0 eth0: This product is covered by one or more of the following patents:
US6,570,884, US6,115,776, and US6,327,625.

[ 9.523478] CFGP2P-ERROR) wl_cfgp2p_add_p2p_disc_if : P2P interface registered
[ 9.546776] WLC_E_IF: NO_IF set, event Ignored
[ 9.820592] cfg80211: World regulatory domain updated:
[ 9.829773] cfg80211: DFS Master region: unset
[ 9.834176] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)
[ 9.843989] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
[ 9.852019] cfg80211: (2457000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
[ 9.860035] cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (N/A, 2000 mBm), (N/A)
[ 9.868059] cfg80211: (5170000 KHz - 5250000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (N/A)
[ 9.877550] cfg80211: (5250000 KHz - 5330000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (0 s)
[ 9.887057] cfg80211: (5490000 KHz - 5730000 KHz @ 160000 KHz), (N/A, 2000 mBm), (0 s)
[ 9.895167] cfg80211: (5735000 KHz - 5835000 KHz @ 80000 KHz), (N/A, 2000 mBm), (N/A)
[ 9.903182] cfg80211: (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 0 mBm), (N/A)
[ 10.671092] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 10.763495] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 10.932103] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 10.984605] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.018068] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.063481] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.071824] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.080787] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.120737] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.129083] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.148850] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 11.186197] Setting pll_a = 45158400 Hz clk_out = 11289600 Hz
[ 13.637089] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 14.549199] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 52.772777] nf_conntrack: automatic helper assignment is deprecated and it will be removed soon. Use the iptables CT target to attach helpers instead.
[ 54.745951] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 61.314019] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 68.172535] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 69.313281] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 78.228185] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 79.573457] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 97.844008] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
[ 127.032820] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000
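
To compare runs more easily, the per-interval lines of the iperf3 client output above can be condensed into a mean rate and a retransmit total. The following is only a minimal sketch: the `summarize` helper and its regular expression are illustrative assumptions, written against the sender-side lines (the ones with `Retr` and `Cwnd` columns) exactly as they appear in this post.

```python
import re

# Interval lines copied from the first test above
# (client on the J120, sending to tegra-ubuntu).
SAMPLE = """\
[ 4] 0.00-10.00 sec 874 MBytes 733 Mbits/sec 297 215 KBytes
[ 4] 10.00-20.00 sec 887 MBytes 744 Mbits/sec 211 283 KBytes
[ 4] 20.00-30.00 sec 882 MBytes 740 Mbits/sec 267 262 KBytes
[ 4] 30.00-40.00 sec 875 MBytes 734 Mbits/sec 258 249 KBytes
[ 4] 40.00-50.00 sec 877 MBytes 736 Mbits/sec 278 235 KBytes
[ 4] 50.00-60.00 sec 875 MBytes 734 Mbits/sec 300 256 KBytes
"""

# Groups: interval start, interval end, rate in Mbits/sec, Retr count.
# The trailing "KBytes" (Cwnd column) keeps the summary lines from matching.
INTERVAL = re.compile(
    r"\]\s+(\d+\.\d+)-(\d+\.\d+)\s+sec\s+\S+\s+\S+\s+"
    r"(\d+(?:\.\d+)?)\s+Mbits/sec\s+(\d+)\s+\S+\s+KBytes"
)


def summarize(text):
    """Return (mean Mbits/sec, total retransmits) over the interval lines."""
    rates, retr = [], 0
    for line in text.splitlines():
        m = INTERVAL.search(line)
        if m:
            rates.append(float(m.group(3)))
            retr += int(m.group(4))
    if not rates:
        raise ValueError("no interval lines with Retr/Cwnd columns found")
    return sum(rates) / len(rates), retr


mean_rate, total_retr = summarize(SAMPLE)
print(f"mean {mean_rate:.1f} Mbits/sec, {total_retr} retransmits")
```

The retransmit total agrees with the `1611` that iperf3 prints in its own summary line for that run.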

#5
Posted 02/15/2018 05:13 PM   
I see "ifconfig" with the "before". Is there any "ifconfig" which is after things have finished which show any kind of error?

EDIT: Forgot to ask, have you run the "~ubuntu/jetson_clocks.sh" program to bring up core speed before testing?

#6
Posted 02/15/2018 09:04 PM   
[quote]I see "ifconfig" with the "before". Is there any "ifconfig" which is after things have finished which show any kind of error?[/quote]

There is an ifconfig output from the beginning and one from after the test, directly behind the iperf3 output(s), for each of the two examples...

[quote]EDIT: Forgot to ask, have you run the "~ubuntu/jetson_clocks.sh" program to bring up core speed before testing?[/quote]

Oops, I definitely should have written that! The primary receiver, the Jetson TX1 on the Nvidia Development board, was running at full speed. The other two weren't.

#7
Posted 02/16/2018 01:53 PM   
I see no errors with the network itself. You will probably want everything running at full speed (unless you are profiling for some other mode).

If you run "dmesg --follow" do you see the interface going up and down during the test? I didn't see any sign of that in the dmesg output, but this is something which would probably occur only during the test, and I don't know if the dmesg was from after the test or before the test.
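
Since the kernel log on these boards is dominated by unrelated messages, it may help to filter it down to lines about the wired interface while the test runs. This is only a rough sketch: the keyword list in `PATTERN` is an assumption (interface name, the r8152 driver, netdev state changes, USB resets), and `network_events` is a made-up helper name.

```python
import re

# Keywords are an assumption: the interface name, the r8152 USB NIC driver,
# netdev state changes and USB device resets. Extend the list as needed.
PATTERN = re.compile(r"eth0|r8152|NETDEV|reset \w+ USB device")


def network_events(lines):
    """Keep only kernel log lines that mention the wired interface or its USB link."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


# Three lines taken from the dmesg output posted earlier in this thread.
SAMPLE = [
    "[ 8.928225] usb 2-1: reset SuperSpeed USB device number 2 using xhci-tegra",
    "[ 13.637089] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready",
    "[ 14.549199] tegra_soctherm 700e2000.soctherm: soctherm: trip temperature -2147483647 forced to -127000",
]

for event in network_events(SAMPLE):
    print(event)
```

On the device, the same function could be fed the lines coming from "dmesg --follow" (for example by iterating over `sys.stdin`), so that a link flap or USB reset during the iperf3 run stands out.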

#8
Posted 02/16/2018 03:06 PM   
The other two Jetsons don't have sufficient cooling at the moment, that's why I kept them at the slower speed yesterday.

The dmesg output attached to my post above is basically the output over the complete time I ran the tests. I just skipped the first 7 s of the startup, because the forum didn't like posts that long. But I copy-pasted it while writing my post.

The point that there is no indication whatsoever from the network stack itself is what puzzles me most. In my experience a bad cable, bogus negotiations (duplex, speed etc.) between the NICs involved, or misconfigurations almost always produce transmission errors on packet level. The only indication that something is off is the extremely high retry counts from iperf3. And that seems to be an indication of something being wrong between the network stack and userland (iperf3).

Two hosts, connected directly to the same switch with no further network load, running clean interfaces and nothing but housekeeping besides iperf3, almost always generate around 950MBits/sec with no retries on a GigE network.

I would really like to have something in the range of 900MBits/sec to 950MBits/sec on the Jetsons. The recorded speed won't necessarily break my application, but I fear the existence of a bigger problem buried deeper in the network that will come back to haunt me at the most inappropriate moment.

So I would really prefer if I could get down to the root of what is happening...

#9
Posted 02/16/2018 03:50 PM   
Normally I wouldn't think a local network would have retries, but I also don't know how iperf generates this...i.e., whether this is a TCP retry or if it is something part of iperf itself. Since there are no ifconfig errors it isn't obvious. Certainly a retry from TCP would make me very curious, but if this is some iperf code on top of UDP it would drastically change the questions (I am not an iperf guru). FYI, adding a reliability layer on top of UDP at each end point which is exactly like the nagle algorithm in TCP does not result in the same behavior as TCP...in TCP each hop on the route (including the switch) has an understanding of TCP and the rules of retransmit...for a UDP version intermediate hops will have no such knowledge. Perhaps if both ends are directly connected and have no switch in between reliability added to UDP would be equivalent to TCP/nagle.

Does anyone here know whether iperf has its own retry mechanism, or if instead it is simply monitoring TCP retries? If it is TCP, then there are parameters which can be changed in "/proc" to test against.

I am curious if you can post the output of "ethtool <interface>" on each of the involved interfaces? This might point out differences such as half-duplex/full-duplex.

Also, do you know if everything in the tests involved were purely IPv4 and none of the hosts/clients are using IPv6 in the communications chain? I suspect a "traceroute" from both ends would show purely IPv4 addresses, but I want to be sure.
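
For comparing the interfaces side by side, the relevant fields can be pulled out of the "ethtool <interface>" output with a few lines of script. This is a sketch only: `parse_link` and `link_settings` are hypothetical helper names, and the sample text is illustrative, not captured from any of the machines in this thread.

```python
import re
import subprocess


def parse_link(ethtool_text):
    """Extract speed, duplex and autonegotiation state from `ethtool <iface>` output."""
    fields = {}
    for key in ("Speed", "Duplex", "Auto-negotiation"):
        m = re.search(rf"^\s*{re.escape(key)}:\s*(.+?)\s*$", ethtool_text, re.MULTILINE)
        fields[key] = m.group(1) if m else None
    return fields


def link_settings(iface="eth0"):
    """Run ethtool on a live system (some fields may require root)."""
    out = subprocess.run(["ethtool", iface], capture_output=True, text=True, check=True)
    return parse_link(out.stdout)


# Illustrative sample, NOT taken from the boards discussed here.
SAMPLE = """\
Settings for eth0:
\tSpeed: 1000Mb/s
\tDuplex: Full
\tAuto-negotiation: on
\tLink detected: yes
"""

print(parse_link(SAMPLE))
```

Calling `link_settings("eth0")` on each host and comparing the dictionaries would show a speed or duplex mismatch at a glance. For the IPv4 question, iperf3 also accepts "-4" on the command line to force IPv4 only.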

#10
Posted 02/16/2018 05:01 PM   
iperf3 uses TCP the way I use it. You have to explicitly switch to UDP on the command line if you want to use it.

The retries are re-sent TCP segments which got lost due to congestion or corruption (see https://github.com/esnet/iperf/issues/343).

It is purely IPv4, but I will test again nevertheless! Currently I am out of the office, but I will follow up as soon as I am back on the weekend.
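
The retransmit count can also be cross-checked independently of iperf3, because the kernel keeps a system-wide `RetransSegs` counter in /proc/net/snmp. Below is a minimal sketch of reading that counter before and after a run. The function names are made up for illustration, and the counter values in `BEFORE` and `AFTER` are invented; only the header line follows the real file layout.

```python
def tcp_counters(snmp_text):
    """Parse the header and value 'Tcp:' lines of /proc/net/snmp into a dict."""
    rows = [line.split()[1:] for line in snmp_text.splitlines() if line.startswith("Tcp:")]
    if len(rows) != 2:
        raise ValueError("expected one header line and one value line starting with 'Tcp:'")
    names, values = rows
    return dict(zip(names, (int(v) for v in values)))


def read_tcp_counters(path="/proc/net/snmp"):
    """Read the live counters (Linux only)."""
    with open(path) as f:
        return tcp_counters(f.read())


HEADER = ("Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens "
          "AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs "
          "InErrs OutRsts InCsumErrors")
# Invented values, for illustration only.
BEFORE = HEADER + "\nTcp: 1 200 120000 -1 12 8 0 0 3 1000 900 40 0 5 0\n"
AFTER = HEADER + "\nTcp: 1 200 120000 -1 13 8 0 0 3 5000 4900 340 0 5 0\n"

before, after = tcp_counters(BEFORE), tcp_counters(AFTER)
sent = after["OutSegs"] - before["OutSegs"]
retrans = after["RetransSegs"] - before["RetransSegs"]
print(f"{retrans} retransmitted segments, {sent} segments sent")
```

On the sending host, calling `read_tcp_counters()` right before and right after the iperf3 run and taking the difference should give a number close to iperf3's `Retr` total, provided nothing else is using TCP heavily at the same time.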

#11
Posted 02/16/2018 05:18 PM   
If the issue was with corruption I think you'd see other errors. Congestion would not show up as an error.

I'm thinking you may be running into memory buffer limits set in "/proc/sys/net/ipv4/tcp_mem". For information on files there see:
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

In a terminal you can watch, while the perf test runs, how the queue sizes go up at moments when you think a retransmit might be happening:
sudo -s
watch -n 1 "ss -m -n | egrep '(tcp|Netid)'"
exit


If possible try to associate a particular line of the "ss" socket stat output to what the perf command is using. If it turns out that you are hitting memory constraints you might be able to increase the tcp_mem and at least lower the retries. I don't know of a better way to watch memory used by a particular socket, but it might still point to something which can be tuned.

#12
Posted 02/16/2018 07:50 PM   
I was out of office for a while and could only return to the problem right now.

I had a look at an incoming iperf3 connection using the following command:

ss -imn -A tcp -o state established '( sport = 5201 )'


Since I never have an iperf3 connection without retries, it is a bit hard to see what is happening. But the usage goes up on the receive side as soon as bigger retry numbers pass by. An example is:

Recv-Q Send-Q Local Address:Port               Peer Address:Port
1448 0 ::ffff:192.168.1.86:5201 ::ffff:192.168.1.37:58002
skmem:(r2304,rb4311000,t28,tb87040,f722688,w0,o0,bl0) ts sack cubic wscale:7,7 rto:204 rtt:1.913/0.956 ato:40 mss:1448 cwnd:10 bytes_received:4760066269 segs_out:1655256 segs_in:3287342 send 60.6Mbps lastsnd:60036 lastack:112 pacing_rate 121.1Mbps rcv_rtt:7.5 rcv_space:682008
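
To make the skmem field in output like this easier to read, a small helper can split it into one counter per line. This is only a sketch: the function name skmem_fields is made up for illustration, and the field meanings in the comments follow the ss(8) man page as I understand it.

```shell
#!/bin/sh
# skmem_fields LINE: split an "skmem:(...)" field from `ss -m` into one counter per line.
# Meanings per ss(8): r = receive memory allocated, rb = receive buffer limit,
# t = send memory allocated, tb = send buffer limit, f = forward-alloc cache,
# w = memory queued for sending, o = option memory, bl = backlog.
skmem_fields() {
    printf '%s\n' "$1" |
        sed -n 's/.*skmem:(\([^)]*\)).*/\1/p' |
        tr ',' '\n' |
        sed 's/^\([a-z]*\)\([0-9]*\)$/\1=\2/'
}

# Sample taken from the ss output in this post:
skmem_fields 'skmem:(r2304,rb4311000,t28,tb87040,f722688,w0,o0,bl0) ts sack cubic'
```

In the sample, the allocated receive memory (r) is tiny compared with the receive buffer limit (rb), which may suggest the socket was not close to its buffer limit at that instant. A single snapshot cannot prove that for the whole run.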


I significantly raised the memory available to the kernel for TCP networking:
sysctl -w net.core.rmem_max=8388608
sysctl -w net.core.wmem_max=8388608
sysctl -w net.core.rmem_default=65536
sysctl -w net.core.wmem_default=65536
sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608'
sysctl -w net.ipv4.tcp_wmem='4096 65536 8388608'
sysctl -w net.ipv4.tcp_mem='8388608 8388608 8388608'


But there is no perceivable change. I still "only" get 600MBits/s with around 300 retries for a 60s burst between two Jetsons.
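
One thing worth double-checking with the settings above: according to the kernel's ip-sysctl.txt, tcp_rmem and tcp_wmem are given in bytes, but tcp_mem is counted in memory pages. A minimal sketch of the conversion, assuming a 4096-byte page size (verify with "getconf PAGESIZE"); pages_to_mib is a hypothetical helper name:

```shell
#!/bin/sh
# pages_to_mib PAGES [PAGE_SIZE]: convert a tcp_mem value (in pages) to MiB.
# PAGE_SIZE defaults to 4096 bytes, which is an assumption; check `getconf PAGESIZE`.
pages_to_mib() {
    pages=$1
    page_size=${2:-4096}
    echo $(( pages * page_size / 1048576 ))
}

# The value written to net.ipv4.tcp_mem above:
pages_to_mib 8388608        # prints 32768 (MiB), i.e. 32 GiB
# Roughly what 8 MiB would be when expressed in pages:
echo $(( 8388608 / 4096 ))  # prints 2048
```

Under that assumption the tcp_mem setting is far above the TX1's 4 GB of RAM, so it effectively removes the global limit rather than setting it to 8 MB. That is probably harmless for this test, but it is not the value that was intended.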

#13
Posted 03/05/2018 05:01 PM   
Sorry, this is really long :P

I'm just adding some observations...not necessarily anything in particular or in any order. Not all tests seem consistent. There is no real conclusion in this, but there are a lot of tests which may be surprising. They show a few cases where the obvious possibilities are not the problem.

My TX1 is running a fully updated L4T R28.1.

I am running as root with "sudo -s".

Some of my testing leads to questions I can't answer. As a very basic test I ran a flood ping with jetson_clocks.sh holding the clocks at full speed ("sudo ping -f wherever") for 30 seconds from host to Jetson, and then from Jetson to host. Both resulted in no loss and right around 50000 packets. This does not involve TCP or UDP (it's ICMP) and is more closely related to the physical layer (and ARP) working correctly (without this being correct, TCP and UDP would both inherit a faulty environment). No error, drop, overrun, collision, etc., ever occurs from flood ping. This tends to place the issue with the higher-level protocol stacks (hardware drivers work at lower levels on CPU0, software drivers implement stacks on any CPU core...stacks are limited by the throughput of data feeding them or being consumed).

I see that the "-b 1G" argument to iperf3 is not actually listed as valid in the man page, but "iperf3 --help" does show this. I tried iperf3 with "-b 1X" just to see if it showed an error, and it does not (I consider it a bug that an invalid argument is not an error). This calls into question whether the 1G bitrate is really doing as expected. I don't have a network analyzer so I couldn't say. Probably 1G is supported...but then again, perhaps it is supported just on arm64 or just on x86_64. I don't know.

So I did something not yet done in order to isolate where limitations are coming from. I ran iperf3 as both server and client on the TX1 (which also implies the 1G speed is guaranteed to behave the same in both client and server mode...there is no arm64/x86_64 difference possible). I used localhost address 127.0.0.1. This avoids going through the Realtek driver and hardware. This still uses protocol stacks (keep in mind ping doesn't care much about protocol stacks, UDP and TCP do). I had a throughput of approximately 999Mbits/sec, with no retries (this verifies "-b 1G" works as expected). This loopback interface can bypass CPU0 since no hardware drivers are involved. I'd say the protocol stacks (iperf3 uses TCP here unless "-u" is given) and the purely software side are at full performance (at least when limited to 1G speed...cutting out hardware seems to reach the theoretical maximum). I tend to favor saying there is an issue with either the Realtek driver or the time the Realtek driver has available for running (the latency before a hardware IRQ begins service, or the time used during service of the IRQ, is implied as the limitation). It's hard to know without profiling, and I have no way to hardware profile.

I then ran a flood ping on the TX1 to 127.0.0.1 for 30 seconds (a ping going to itself without touching the NIC). Approximately 900000 packets were serviced without loss. The network software, when not going through network hardware, is about 18:1 faster.

Next I ran a 30 second flood ping to the address of the local NIC on the TX1 (both send and receive are serviced by the same NIC and driver...for me this is 192.168.2.30). I actually got about 910000 packets...slightly better throughput...and this is without jetson_clocks.sh. With jetson_clocks.sh the throughput did not seem to change. Unless network software is doing something smart and not actually routing through the NIC hardware (and it might actually be that smart, I don't know) this also implies that the driver, when running, does what it should if not talking to the protocol stacks (I don't consider the work of ICMP significant enough to compare to a TCP stack). Perhaps it is the throughput between Realtek driver and protocol stack which is bottlenecked.

Each of the following is between x86_64 host and TX1:

Here are some client/server side commands used (I reverse which side's address is involved if I reverse roles):
sudo iperf3 -c 192.168.2.2 -p 12345 -t 60 -i 10 -b 1G -R
sudo iperf3 -s -p 12345


No jetson_clocks.sh, server on host:
I see no retries and roughly 492 Mbits/sec.

With jetson_clocks.sh, server on host:
I see no retries and roughly 492 Mbits/sec.

Implies: jetson_clocks.sh makes no difference on speed, and no retries needed either way.

No jetson_clocks.sh, server on TX1:
I see lots of retries, and roughly 639Mbits/sec.

With jetson_clocks.sh, server on TX1:
I see lots of retries, and roughly 652Mbits/sec.

Implies: Marginal throughput improvement. Retries did not significantly change.

ifconfig errors:
When testing is done I see no errors, drops, overruns, etc., on the TX1 side. I see a very large number of dropped RX packets on the host, but no outright errors.

Note that a dropped packet is correct behavior for UDP during congestion, or just from sending faster than the packets can be used (this isn't a software error per se, but it is a weak link in the chain if something is bottlenecking). TCP also can have dropped packets, but it would retry. It doesn't mean something isn't wrong, but it does mean that within its abilities the network is behaving as it should if the retries were a case of congestion. iperf3 is essentially trying to congest the network and measure congestion.

I rebooted the TX1 and re-ran both client and server sides while monitoring ifconfig. No jetson_clocks.sh was used. I got no drops on the TX1.

I used jetson_clocks.sh on the TX1. I re-ran (no reboot) both client and server side again on the TX1. Still, the TX1 does not show any drops. Apparently it is only the host side which is seeing RX drops.
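
Rather than eyeballing ifconfig, the same counters can be snapshotted from sysfs before and after a run and compared. A minimal sketch, assuming the interface is eth0; read_counters is a hypothetical helper name, and it takes the statistics directory as an argument so the interface is easy to change:

```shell
#!/bin/sh
# read_counters DIR: print selected counters from a network statistics directory,
# e.g. /sys/class/net/eth0/statistics (the interface name is an assumption).
read_counters() {
    for c in rx_dropped rx_errors tx_dropped tx_errors; do
        if [ -r "$1/$c" ]; then
            printf '%s=%s\n' "$c" "$(cat "$1/$c")"
        fi
    done
}

# Snapshot before and after a test run, then compare:
#   read_counters /sys/class/net/eth0/statistics > before.txt
#   (run iperf3 or netcat here)
#   read_counters /sys/class/net/eth0/statistics > after.txt
#   diff before.txt after.txt
```

Running this on both the host and the TX1 for the same test would show which side's counters actually move.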

To bring things together from the real world I decided to copy data over the network via netcat. This will simply send as fast as it can and receive as fast as it can. I didn't want to depend on disk read speed or write speed, so I'm using other sources and destinations.

If you run this command it will read the rootfs partition and redirect it to "/dev/null" and show a time measurement:
# time dd if=/dev/mmcblk0p1 bs=512 > /dev/null
29859840+0 records in
29859840+0 records out
15288238080 bytes (15 GB, 14 GiB) copied, 70.6154 s, 216 MB/s

real 1m10.619s
user 0m6.728s
sys 0m28.024s

...216 MB/s (1728Mbit/s). This exceeds gigabit. The important thing is to know this contains 15288238080 bytes.

To simplify this:
# time cat /dev/mmcblk0p1 > /dev/null

real 1m9.582s
user 0m0.024s
sys 0m8.576s

...cutting out dd shows 219715416 bytes/s, or 209.5 MiB/s (roughly 1758 Mbit/s in decimal units...also exceeding gigabit). So however we read the raw mmcblk0p1, we get enough throughput to exceed gigabit.

To use netcat read from port 12345 I do this (it saves into "/dev/null"...in other words, it just discards bytes):
nc -p 12345 -l > /dev/null

...restart this after each send completes.

To send mmcblk0p1 over port 12345 without touching the Realtek NIC I use:
# time nc -q 0 127.0.0.1 12345 < /dev/mmcblk0p1

real 1m10.492s
user 0m3.196s
sys 0m38.904s

...this is only one second longer than without netcat. Everything associated with networking, when purely in software, is quite good.

Now let's do this again over the NIC (192.168.2.30 for me is the NIC):
# time nc -q 0 192.168.2.30 12345 < /dev/mmcblk0p1

real 1m10.976s
user 0m2.960s
sys 0m39.224s

...this appears to have almost no overhead when running through the NIC. Once again though, I do not know if the kernel is optimizing when it knows it is local traffic.

So I'll do this between host and TX1 where I send from TX1 to host (adjust addresses and where the listener runs as required):
# time nc -q 0 192.168.2.2 12345 < /dev/mmcblk0p1

real 3m16.854s
user 0m2.912s
sys 0m47.676s

...clearly, talking to the outside world has a dramatic penalty even when the two are connected through the same switch. The actual throughput here is approximately 77684136 bytes/s (around 593 Mibit/s, or roughly 621 Mbit/s in decimal units).
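
For anyone repeating these timings, the rate can be worked out from the byte count and the "real" time with a one-line awk call. This is just a sketch; rate is a hypothetical helper name, and it uses decimal units (10^6), so results can differ slightly from figures computed with 2^20-based units.

```shell
#!/bin/sh
# rate BYTES SECONDS: print throughput in bytes/s, MB/s and Mbit/s (decimal, 10^6).
rate() {
    awk -v b="$1" -v s="$2" 'BEGIN {
        bps = b / s
        printf "%.0f bytes/s, %.1f MB/s, %.1f Mbit/s\n", bps, bps / 1e6, bps * 8 / 1e6
    }'
}

# TX1 -> host netcat run: 15288238080 bytes in 3m16.854s = 196.854 s
rate 15288238080 196.854
```

For the run above this comes out at roughly 621 Mbit/s, still well short of the 900+ Mbit/s a gigabit link should sustain.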

It happens that there is another reason why I used mmcblk0p1. My host already has this file on it as the system.img.raw. So I can copy the same number of bytes back to the Jetson in the reverse process. Keep in mind that the first time you read a file on a system with lots of RAM it may cache it, and the second time would be faster. Regardless, the rate with or without cache will far exceed gigabit, so it should be a good repeatable test.

So I run the listener on the Jetson this time, and send system.img.raw to the Jetson (I don't use the "-q 0" on the host because it is Fedora and its netcat doesn't use this):
# time nc 192.168.2.30 12345 < system.img.raw

real 4m7.859s
user 0m9.581s
sys 0m52.392s

...clearly the TX1 receives slower than it sends when there is a remote host involved. The loss of throughput is real. The problem is that when doing all of this directly on the Jetson the same loss of throughput is not seen. The problem isn't with the Realtek driver, nor with the TCP stack, nor is it with how the driver is running. Something else is getting in the way and is perhaps an interaction between two parts of the software which does not show up on individual software or driver tests. An example is that additional ARP and negotiations go on between a remote host versus localhost or the NIC on the local machine.

In no case did the ifconfig on eth0 of the Jetson ever show any drops or errors of any kind. I suspect the previously seen drops were from UDP. So now I'll force UDP.

Listening on the TX1:
# nc -p 12345 -l -u > /dev/null


Sending on the host to the TX1:
# time nc -u 192.168.2.30 12345 < system.img.raw

real 3m44.611s
user 0m8.875s
sys 0m49.094s

...this works out to about 64.9 MiB/s, or 519 Mibit/s (roughly 545 Mbit/s in decimal units). Sending from host to Jetson is slower than the other direction, but the difference isn't as dramatic as what shows up under iperf3.

I believe someone needs to throw a network analyzer between the outside host and Jetson and run either netcat or iperf3 to see where any inefficiencies are. It gets too complicated without this and there is no clear single cause. Perhaps it is something simple like MTU/MRU behavior or an interaction from two things occurring simultaneously being an issue, yet not being an issue one at a time.

#14
Posted 03/05/2018 10:45 PM   
Hi guys,

Sorry for the late reply. This thread is so long that I may not have a clear picture of the current status.


Does this issue reproduce on the TX1 devkit? I saw that some Auvidea J120 carrier boards are involved.

#15
Posted 03/06/2018 03:50 AM   