Cannot achieve USB3.0 throughput
Hello!

My goal is to achieve the highest possible data throughput on the USB3 port of my TX1 board (stock JetPack 3.1, kernel 4.4.38).
After observing low performance with my controller, I decided to run a speed test using a regular OEM SSD.
First, I connected the SSD directly to that port and copied a large file (19 GB) from it.
The copy starts at ~100 MB/sec and drops to ~40 MB/sec after a few seconds, where it remains, fluctuating between 30 and 50 MB/sec. The same scenario produces a stable ~250 MB/sec on my desktop Linux machine.

Could you please suggest whether any special steps are required to optimize system/USB port performance for this?

* I tried disabling the autosuspend modes as suggested in this article ( http://www.jetsonhacks.com/2015/05/27/usb-autosuspend-nvidia-jetson-tk1/ )
* I found a TK1-related article suggesting a change to the ODMDATA value; however, I couldn't find anything similar for the TX1 ( https://devtalk.nvidia.com/default/topic/751916/embedded-systems/microsoft-kinect-with-jetson-tk1/post/4239031/ )
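
To confirm that the autosuspend change actually took effect, the current settings can be read back from sysfs. The following is a minimal sketch, not part of any NVIDIA tooling: the function name `show_usb_autosuspend` and its optional root argument are made up for illustration, while the `usbcore` `autosuspend` parameter and the per-device `power/control` files are the standard Linux interfaces.

```shell
# Print the USB autosuspend settings found under a sysfs tree.
# The optional first argument (default /sys) is a hypothetical convenience
# so the function can be pointed at a different directory.
show_usb_autosuspend() {
    root="${1:-/sys}"
    param="$root/module/usbcore/parameters/autosuspend"
    # Global default delay in seconds; -1 means autosuspend is disabled.
    if [ -r "$param" ]; then
        echo "usbcore.autosuspend=$(cat "$param")"
    fi
    # Per-device policy: "on" keeps the device awake, "auto" allows suspend.
    for ctl in "$root"/bus/usb/devices/*/power/control; do
        [ -r "$ctl" ] || continue
        dev="${ctl%/power/control}"
        echo "${dev##*/}: $(cat "$ctl")"
    done
}
```

Calling `show_usb_autosuspend` with no argument reads the live `/sys` tree; a global value of `-1` and `on` for the SSD's device entry would suggest autosuspend is out of the picture.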


Regards,
Leffe

#1
Posted 01/04/2018 04:16 PM   
I'd first make sure it isn't throttling back for energy savings. I'm assuming R28.1 (but this applies to many earlier releases):
sudo /home/ubuntu/jetson_clocks.sh

#2
Posted 01/04/2018 05:20 PM   
Thank you for the quick reply.
Yes, I already tried that before starting the test and here is the output:

nvidia@tegra-ubuntu:~$ sudo /home/nvidia/jetson_clocks.sh --show
SOC family:tegra210 Machine:jetson_tx1
Online CPUs: 0-3
CPU Cluster Switching: Disabled
cpu0: Gonvernor=interactive MinFreq=1734000 MaxFreq=1734000 CurrentFreq=1734000
cpu1: Gonvernor=interactive MinFreq=1734000 MaxFreq=1734000 CurrentFreq=1734000
cpu2: Gonvernor=interactive MinFreq=1734000 MaxFreq=1734000 CurrentFreq=1734000
cpu3: Gonvernor=interactive MinFreq=1734000 MaxFreq=1734000 CurrentFreq=1734000
GPU MinFreq=998400000 MaxFreq=998400000 CurrentFreq=998400000
EMC MinFreq=12750000 MaxFreq=1600000000 CurrentFreq=1600000000 FreqOverride=1
Fan: speed=255
nvidia@tegra-ubuntu:~$

#3
Posted 01/04/2018 06:26 PM   
This is just an experiment, it might or might not shed some light on the topic. Try installing "htop" ("sudo apt-get install htop"). Then run this as root and look only at one specific user's processes...I'm going to assume the copy is being done by user ubuntu:
sudo htop -u ubuntu


Notice in the lower line of htop there is a description for hot keys. One of those is the "F7" key accessing "nice" levels (process priorities). A normal process has "nice" of 0 (by default htop lists this as a column). A process with higher priority (not as nice to other processes) has a negative number...you do not want to go more negative than perhaps -2 unless you really know what you are doing (you can get priority inversions). Start a long copy as user ubuntu which will take quite some time, and when you see that process show up on htop, arrow up or down to the process, hit the F7 key, and renice to "-1". See if throughput goes up.

Although the "nice" command exists on the command line for launching a new command with an altered priority, you can only raise the priority if you operate as user root (which is why I said to use "sudo" when launching htop). If you "cat" a file to "/dev/null", the driver for "/dev/null" is also involved, so it isn't an ideal test; the same example would work with cp instead of cat. A sample for an entire partition:
# In the first terminal:
sudo htop -u root
# In a second terminal:
sudo nice --1 cat /dev/mmcblk0p1 > /dev/null
# NOTE: the nice argument adds to nice level, -1 is just "add 1"..."--1" is subtract 1


A similar test, perhaps better because it doesn't involve file system drivers in reading the drive (I am assuming your USB drive is "/dev/sda"):
# Copies the entire disk as raw bytes into nothing 4096 bytes at a time.
sudo dd if=/dev/sda of=/dev/null bs=4096
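
A 4096-byte block size issues many small reads, so it may be worth checking whether the block size itself is holding the number down. Below is a rough sketch under stated assumptions: the function name `compare_block_sizes` is made up, the 1 MiB size is an arbitrary comparison point, and each pass reads a different region of the source so that the second pass is not served from the page cache.

```shell
# Read $2 MiB (default 1024) of $1 once per block size and print dd's
# summary line for each pass. 4096 matches the dd command above; 1048576
# (1 MiB) is an arbitrary larger size chosen only for comparison.
compare_block_sizes() {
    src="$1"
    mib="${2:-1024}"
    i=0
    for bs in 4096 1048576; do
        count=$(( mib * 1048576 / bs ))
        # Skip the regions read by earlier passes to avoid cached data.
        skip=$(( i * count ))
        printf 'bs=%s: ' "$bs"
        # dd writes its statistics to stderr; the last line holds the rate.
        dd if="$src" of=/dev/null bs="$bs" count="$count" skip="$skip" 2>&1 | tail -n 1
        i=$(( i + 1 ))
    done
}
```

Run from a root shell as `compare_block_sizes /dev/sda` (still assuming the USB drive is "/dev/sda"). If the two rates are close, block size is not the limiting factor.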


There are also of course benchmarking tools (and you probably used one) which can be launched with "nice --1". I suspect a benchmark tool copies into RAM and drops the data...if it actually writes to the disk during a copy you're limited by the slower of USB, disk read speed, and disk write speed. Probably you'd see only disk write speed as the weak link. You might consider looking for a real benchmarking tool. Using "/dev/null" is probably better as a destination than a file, but not necessarily as good as it might seem.

If a process has been given one more level of priority (a nice of "-1"), you should see the speed increase right away if it was competing with other user space processes. You could try -2, but don't go any more negative. If -1 doesn't help, it is unlikely that -2 will change things much anyway.

#4
Posted 01/04/2018 06:53 PM   
Thank you for the suggestion!
I will try and report back as soon as I overcome a more severe issue, which I described in a separate topic.

#5
Posted 01/05/2018 06:25 PM   
There is an improvement after enabling the clocks: throughput now reaches ~132 MB/sec.
Unfortunately, changing the niceness level had almost no effect (please see the htop screenshot here: https://imgur.com/a/nyzNj).

However, the speed still remains well below the USB SuperSpeed limit, and I'd like to know how to achieve better results.
Any further guidance is highly appreciated.
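
For reference, the gap to the SuperSpeed limit can be put in numbers. This is a back-of-the-envelope sketch: it uses the 5 Gbit/s USB 3.0 signaling rate and its 8b/10b line encoding, and ignores protocol overhead, so the real achievable ceiling is somewhat lower than the figure it prints.

```shell
# USB 3.0 SuperSpeed signals at 5 Gbit/s and uses 8b/10b line encoding,
# so only 8 of every 10 bits on the wire carry data.
line_rate_bps=5000000000
payload_bytes_per_s=$(( line_rate_bps * 8 / 10 / 8 ))
ceiling_mb_s=$(( payload_bytes_per_s / 1000000 ))
measured_mb_s=132   # figure reported in this post
percent=$(( measured_mb_s * 100 / ceiling_mb_s ))
echo "ceiling: ${ceiling_mb_s} MB/s"
echo "measured: ${measured_mb_s} MB/s (${percent}% of ceiling)"
# prints:
#   ceiling: 500 MB/s
#   measured: 132 MB/s (26% of ceiling)
```

By the same arithmetic, the ~250 MB/sec seen on the desktop is about half of the raw ceiling.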

#6
Posted 01/13/2018 05:40 PM   
What was your test for this last run?

#7
Posted 01/13/2018 06:26 PM   
said:What was your test for this last run?

For this particular scenario I used "sudo dd if=/dev/sda of=/dev/null bs=4096" as suggested above, with different niceness levels.
I also tried several other methods, such as rsync and filling the USB drive with data from my controller, but the result is always about the same.
As mentioned in my initial post, exactly the same scenarios yield good results in Desktop Ubuntu.
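
One thing that could be compared between the two machines is the link speed the kernel negotiated for the drive. The sketch below reads the standard sysfs `speed` attribute (in Mbit/s: 480 for USB 2.0 High Speed, 5000 for SuperSpeed); the function name `list_usb_speeds` and its optional root argument are made up for illustration. Since ~132 MB/sec is already more than a 480 Mbit/s link can carry, the drive is presumably at 5000 on the TX1, so this is mainly a sanity check.

```shell
# List the negotiated speed of every USB device under a sysfs tree.
# The optional first argument (default /sys) is a hypothetical convenience
# so the function can be pointed at a different directory.
list_usb_speeds() {
    root="${1:-/sys}"
    for f in "$root"/bus/usb/devices/*/speed; do
        [ -r "$f" ] || continue
        dev="${f%/speed}"
        # The attribute holds the link speed in Mbit/s.
        echo "${dev##*/}: $(cat "$f") Mbit/s"
    done
}
```

Running `list_usb_speeds` on both the TX1 and the desktop with the SSD attached should show the same value for the drive's entry if the link itself is not the difference.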

#8
Posted 01/13/2018 06:39 PM   
Answer Accepted by Forum Admin
The trouble with this is that most likely the limit is from the drive itself. I used that as an example because I didn't have anything else on USB which might serve as a test. For example, a USB gigabit network card might be a better test...but this too would be the most likely weak link in the chain for USB3.

Does anyone have suggestions on what to use to test the actual USB3 throughput without spending a lot of money on it (if a USB3 10GBASE-T adapter exists, I'm sure it costs a small fortune)? A hard drive would not normally be fast enough to challenge USB3.

#9
Posted 01/13/2018 06:59 PM   