NeMo installation - done; performance to be evaluated

The two packages below seem impossible to install through pip; they have to be built from source.

pip3 install onnxruntime sentencepiece

So I excluded them from all requirements.txt files and then installed NeMo itself as a Python module.
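For anyone repeating this, here is a minimal sketch of how the two packages could be dropped from the requirements files automatically rather than by hand. The helper names (`package_name`, `strip_requirements`, `rewrite_all`) and the `requirements*.txt` glob are assumptions for illustration, not part of NeMo; check the file names in your own checkout first.

```python
from pathlib import Path

# Packages that pip could not install on Xavier in this thread.
EXCLUDED = {"onnxruntime", "sentencepiece"}

# Operators/characters that can follow a package name in a requirement line.
SEPARATORS = ("==", ">=", "<=", "~=", "!=", ">", "<", "[", ";", " ")


def package_name(line):
    """Return the lowercased bare package name from one requirements line."""
    name = line.split("#", 1)[0].strip()  # drop trailing comments
    for sep in SEPARATORS:
        name = name.split(sep, 1)[0]
    return name.strip().lower()


def strip_requirements(text, excluded=EXCLUDED):
    """Return the requirements text without lines naming an excluded package."""
    kept = [line for line in text.splitlines(keepends=True)
            if package_name(line) not in excluded]
    return "".join(kept)


def rewrite_all(root):
    """Rewrite every requirements*.txt under root in place (assumed layout)."""
    for path in Path(root).rglob("requirements*.txt"):
        path.write_text(strip_requirements(path.read_text()))
```

Calling `rewrite_all(".")` from the repository root would edit the files in place, so it is worth committing or backing up first. Only exact name matches are removed; a line such as `onnxruntime-gpu` would be kept.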

Is the topic below related?

Exactly!
It runs in a container, natively.
However, the container is for x86_64, as I understand it, isn't it?
I am trying to install it system-wide on Xavier.
Now I need to build the onnx wheel from source:

export ONNX_ML=1
python3 setup.py bdist_wheel
pip3 install --upgrade dist/*.whl

Please check this topic:

https://devtalk.nvidia.com/default/topic/1069993

One more topic found:
https://devtalk.nvidia.com/default/topic/1070793/other-tools/nemo-asr-fails-to-build-on-xavier-jetpack/

However, I will try to build it on the Xavier manually. From the thread you pointed to, it appears that the docker scenario won't work.
Building LLVM 9.0.1:

wget https://github.com/llvm/llvm-project/releases/download/llvmorg-9.0.1/llvm-9.0.1.src.tar.xz
# the archive is xz-compressed, so -z (gzip) fails; plain -xf lets tar detect the format
tar -xf llvm-9.0.1.src.tar.xz
cd llvm-9.0.1.src/
mkdir build
cd build
cmake ../ -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD="ARM;X86;AArch64"
make -j8
sudo make install
sudo ldconfig

To be continued.

pip3 install git+https://github.com/jschueller/llvmlite.git@patch-1

The patch removes the version check and installs llvmlite, which would not otherwise install because the LLVM version is 9.
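Since the patched llvmlite no longer checks the LLVM version at install time, it may be worth confirming afterwards which LLVM it actually loaded. A minimal sketch, assuming llvmlite exposes `llvm_version_info` in `llvmlite.binding` (it does in the releases I am aware of, but treat that as an assumption for the patched fork); the function name `llvmlite_llvm_version` is made up for this example.

```python
def llvmlite_llvm_version():
    """Return the (major, minor, patch) of the LLVM that llvmlite is bound to.

    Returns None when llvmlite is missing or its shared library fails to load.
    """
    try:
        import llvmlite.binding as llvm
    except (ImportError, OSError):
        return None
    return tuple(llvm.llvm_version_info)


if __name__ == "__main__":
    version = llvmlite_llvm_version()
    if version is None:
        print("llvmlite is not importable")
    else:
        print("llvmlite is bound to LLVM %d.%d.%d" % version)
```

With the build described above, one would hope to see LLVM 9.0.1 reported; anything else suggests a different LLVM was picked up at build time.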

Some references, in case others decide to try this:
https://github.com/microsoft/onnxruntime/blob/master/BUILD.md
https://github.com/microsoft/onnxruntime/blob/master/BUILD.md#ARM
https://github.com/google/sentencepiece/blob/master/python/README.md
https://github.com/google/sentencepiece#c-from-source
https://devtalk.nvidia.com/default/topic/1049071/jetson-nano/pytorch-for-jetson-nano-version-1-4-0-now-available/post/5428628/#5428628
https://github.com/microsoft/onnxruntime/issues/2684#issuecomment-568548387

Done? It still needs to be checked whether it works.
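A quick way to run that check is an import smoke test over the packages that were built from source or patched. This is only a sketch: the `PACKAGES` list is my reading of this thread, the `probe` helper is a made-up name, and a clean import says nothing about performance or GPU support.

```python
import importlib

# Packages built from source or patched in this thread, plus NeMo itself.
PACKAGES = ["onnx", "onnxruntime", "sentencepiece", "llvmlite", "numba", "nemo"]


def probe(names):
    """Map each module name to its version string, or None if the import fails."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:  # ImportError, or a loader error from a broken build
            report[name] = None
        else:
            # Not every module defines __version__, so fall back to "unknown".
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in probe(PACKAGES).items():
        status = "MISSING" if version is None else version
        print("%-15s %s" % (name, status))
```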

nvidia@nvidia-desktop:~/NeMo$ ./reinstall.sh 
Uninstalling stuff
WARNING: Skipping nemo-toolkit as it is not installed.
WARNING: Skipping nemo-asr as it is not installed.
WARNING: Skipping nemo-nlp as it is not installed.
WARNING: Skipping nemo-tts as it is not installed.
WARNING: Skipping nemo-simple-gan as it is not installed.
Installing stuff
Defaulting to user installation because normal site-packages is not writeable
Obtaining file:///home/nvidia/NeMo
Requirement already satisfied: onnx in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (1.6.0)
Requirement already satisfied: pandas in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (1.0.1)
Requirement already satisfied: python-dateutil in /usr/lib/python3/dist-packages (from nemo-toolkit==0.9.0) (2.6.1)
Requirement already satisfied: tensorboardX in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (2.0)
Requirement already satisfied: tensorboard in /usr/local/lib/python3.6/dist-packages (from nemo-toolkit==0.9.0) (2.0.2)
Requirement already satisfied: torch in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (1.4.0)
Requirement already satisfied: torchvision in /usr/local/lib/python3.6/dist-packages/torchvision-0.5.0a0+85b8fbf-py3.6-linux-aarch64.egg (from nemo-toolkit==0.9.0) (0.5.0a0+85b8fbf)
Requirement already satisfied: wget in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (3.2)
Requirement already satisfied: wrapt in /usr/local/lib/python3.6/dist-packages (from nemo-toolkit==0.9.0) (1.11.2)
Requirement already satisfied: boto3 in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (1.11.17)
Requirement already satisfied: frozendict in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (1.2)
Requirement already satisfied: h5py in /usr/local/lib/python3.6/dist-packages (from nemo-toolkit==0.9.0) (2.9.0)
Requirement already satisfied: html2text in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (2020.1.16)
Requirement already satisfied: inflect in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (4.1.0)
Requirement already satisfied: ipdb in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (0.12.3)
Requirement already satisfied: ipython[all] in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (7.12.0)
Requirement already satisfied: jupyterlab in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (1.2.6)
Requirement already satisfied: kaldi-io in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (0.9.1)
Processing /home/nvidia/.cache/pip/wheels/cb/1d/15/a479fa740849128d481333d2f354f97691be3e2c82480a3e00/librosa-0.7.2-py3-none-any.whl
Collecting marshmallow
  Using cached marshmallow-3.4.0-py2.py3-none-any.whl (45 kB)
Processing /home/nvidia/.cache/pip/wheels/13/89/ba/ad289fefcfcaa2cab16402f7dedfb96b7015bcb338fc8a8853/matplotlib-3.1.3-cp36-cp36m-linux_aarch64.whl
Processing /home/nvidia/.cache/pip/wheels/e3/c9/b0/ed26a73ef75a53145820825afa8e2d2c9b30fe9f6c10cd3202/nltk-3.4.5-py3-none-any.whl
Collecting num2words
  Using cached num2words-0.5.10-py3-none-any.whl (101 kB)
Requirement already satisfied: pillow>=4.3.0 in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (7.0.0)
Processing /home/nvidia/.cache/pip/wheels/bc/13/93/a9bf6b3d3966e4af014b0dbef027fdea47393faf47e990349f/progressbar-2.5-py3-none-any.whl
Requirement already satisfied: requests in /usr/local/lib/python3.6/dist-packages (from nemo-toolkit==0.9.0) (2.22.0)
Collecting ruamel.yaml
  Using cached ruamel.yaml-0.16.10-py2.py3-none-any.whl (111 kB)
Requirement already satisfied: six in /usr/lib/python3/dist-packages (from nemo-toolkit==0.9.0) (1.11.0)
Collecting sox
  Using cached sox-1.3.7-py2.py3-none-any.whl (34 kB)
Requirement already satisfied: tqdm in /home/nvidia/.local/lib/python3.6/site-packages (from nemo-toolkit==0.9.0) (4.41.1)
Collecting unidecode
  Using cached Unidecode-1.1.1-py2.py3-none-any.whl (238 kB)
Processing /home/nvidia/.cache/pip/wheels/12/12/87/7c733c4b31d9929b1ba04b70a4fbb73e7e026ac2364475c5d5/youtokentome-1.0.6-cp36-cp36m-linux_aarch64.whl
Collecting parameterized
  Using cached parameterized-0.7.1-py2.py3-none-any.whl (24 kB)
Collecting pytest
  Using cached pytest-5.3.5-py3-none-any.whl (235 kB)
Collecting pytest-runner
  Using cached pytest_runner-5.2-py2.py3-none-any.whl (6.8 kB)
Collecting black
  Using cached black-19.10b0-py36-none-any.whl (97 kB)
Collecting isort[requirements]
  Using cached isort-4.3.21-py2.py3-none-any.whl (42 kB)
Collecting soundfile
  Using cached SoundFile-0.10.3.post1-py2.py3-none-any.whl (21 kB)
Collecting torch-stft
  Using cached torch_stft-0.1.4-py3-none-any.whl (6.2 kB)
Collecting torchtext
  Using cached torchtext-0.5.0-py3-none-any.whl (73 kB)
Collecting transformers
  Using cached transformers-2.4.1-py3-none-any.whl (475 kB)
Collecting pypinyin
  Using cached pypinyin-0.37.0-py2.py3-none-any.whl (779 kB)
Requirement already satisfied: scipy in /home/nvidia/.local/lib/python3.6/site-packages/scipy-1.5.0.dev0+566ce19-py3.6-linux-aarch64.egg (from nemo-toolkit==0.9.0) (1.5.0.dev0+566ce19)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from onnx->nemo-toolkit==0.9.0) (1.16.1)
Requirement already satisfied: typing-extensions>=3.6.2.1 in /home/nvidia/.local/lib/python3.6/site-packages (from onnx->nemo-toolkit==0.9.0) (3.7.4.1)
Requirement already satisfied: protobuf in /usr/local/lib/python3.6/dist-packages (from onnx->nemo-toolkit==0.9.0) (3.11.2)
Requirement already satisfied: pytz>=2017.2 in /usr/lib/python3/dist-packages (from pandas->nemo-toolkit==0.9.0) (2018.3)
Requirement already satisfied: wheel>=0.26; python_version >= "3" in /home/nvidia/.local/lib/python3.6/site-packages (from tensorboard->nemo-toolkit==0.9.0) (0.34.2)
Requirement already satisfied: setuptools>=41.0.0 in /home/nvidia/.local/lib/python3.6/site-packages (from tensorboard->nemo-toolkit==0.9.0) (45.2.0)
Requirement already satisfied: absl-py>=0.4 in /usr/local/lib/python3.6/dist-packages (from tensorboard->nemo-toolkit==0.9.0) (0.9.0)
Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.6/dist-packages (from tensorboard->nemo-toolkit==0.9.0) (0.16.0)
Requirement already satisfied: grpcio>=1.24.3 in /usr/local/lib/python3.6/dist-packages (from tensorboard->nemo-toolkit==0.9.0) (1.26.0)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /usr/local/lib/python3.6/dist-packages (from tensorboard->nemo-toolkit==0.9.0) (0.4.1)
Requirement already satisfied: google-auth<2,>=1.6.3 in /usr/local/lib/python3.6/dist-packages (from tensorboard->nemo-toolkit==0.9.0) (1.10.1)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.6/dist-packages (from tensorboard->nemo-toolkit==0.9.0) (3.1.1)
Requirement already satisfied: botocore<1.15.0,>=1.14.17 in /home/nvidia/.local/lib/python3.6/site-packages (from boto3->nemo-toolkit==0.9.0) (1.14.17)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /home/nvidia/.local/lib/python3.6/site-packages (from boto3->nemo-toolkit==0.9.0) (0.9.4)
Requirement already satisfied: s3transfer<0.4.0,>=0.3.0 in /home/nvidia/.local/lib/python3.6/site-packages (from boto3->nemo-toolkit==0.9.0) (0.3.3)
Requirement already satisfied: importlib-metadata; python_version < "3.8" in /home/nvidia/.local/lib/python3.6/site-packages (from inflect->nemo-toolkit==0.9.0) (1.5.0)
Requirement already satisfied: traitlets>=4.2 in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (4.3.3)
Requirement already satisfied: jedi>=0.10 in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (0.16.0)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (3.0.3)
Requirement already satisfied: decorator in /usr/lib/python3/dist-packages (from ipython[all]->nemo-toolkit==0.9.0) (4.1.2)
Requirement already satisfied: pickleshare in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (0.7.5)
Requirement already satisfied: pygments in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (2.5.2)
Requirement already satisfied: pexpect; sys_platform != "win32" in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (4.8.0)
Requirement already satisfied: backcall in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (0.1.0)
Requirement already satisfied: ipyparallel; extra == "all" in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (6.2.4)
Requirement already satisfied: ipykernel; extra == "all" in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (5.1.4)
Requirement already satisfied: nbformat; extra == "all" in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (5.0.4)
Requirement already satisfied: nose>=0.10.1; extra == "all" in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (1.3.7)
Requirement already satisfied: nbconvert; extra == "all" in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (5.6.1)
Requirement already satisfied: testpath; extra == "all" in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (0.4.4)
Requirement already satisfied: qtconsole; extra == "all" in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (4.6.0)
Requirement already satisfied: notebook; extra == "all" in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (6.0.3)
Requirement already satisfied: Sphinx>=1.3; extra == "all" in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (2.4.1)
Requirement already satisfied: ipywidgets; extra == "all" in /home/nvidia/.local/lib/python3.6/site-packages (from ipython[all]->nemo-toolkit==0.9.0) (7.5.1)
Requirement already satisfied: tornado!=6.0.0,!=6.0.1,!=6.0.2 in /home/nvidia/.local/lib/python3.6/site-packages (from jupyterlab->nemo-toolkit==0.9.0) (6.0.3)
Requirement already satisfied: jinja2>=2.10 in /home/nvidia/.local/lib/python3.6/site-packages (from jupyterlab->nemo-toolkit==0.9.0) (2.11.1)
Requirement already satisfied: jupyterlab-server~=1.0.0 in /home/nvidia/.local/lib/python3.6/site-packages (from jupyterlab->nemo-toolkit==0.9.0) (1.0.6)
Requirement already satisfied: scikit-learn!=0.19.0,>=0.14.0 in /home/nvidia/.local/lib/python3.6/site-packages (from librosa->nemo-toolkit==0.9.0) (0.22.1)
Processing /home/nvidia/.cache/pip/wheels/ba/34/a2/cd1e28caa8e3eda2f77247fcca9c7cf7676800a8807dfe6468/numba-0.48.0-cp36-cp36m-linux_aarch64.whl
Requirement already satisfied: audioread>=2.0.0 in /home/nvidia/.local/lib/python3.6/site-packages (from librosa->nemo-toolkit==0.9.0) (2.1.8)
Requirement already satisfied: joblib>=0.12 in /home/nvidia/.local/lib/python3.6/site-packages (from librosa->nemo-toolkit==0.9.0) (0.14.1)
Processing /home/nvidia/.cache/pip/wheels/cf/d4/04/49d8824a42bd9f9b11d502727965b9997f0d41d2b22ae4f645/resampy-0.2.2-py3-none-any.whl
Processing /home/nvidia/.cache/pip/wheels/60/f6/85/a8b74867c7215481350c351dfe89ce9ee50ac603a37e4e10fa/kiwisolver-1.1.0-cp36-cp36m-linux_aarch64.whl
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/nvidia/.local/lib/python3.6/site-packages (from matplotlib->nemo-toolkit==0.9.0) (2.4.6)
Collecting cycler>=0.10
  Using cached cycler-0.10.0-py2.py3-none-any.whl (6.5 kB)
Processing /home/nvidia/.cache/pip/wheels/3f/2a/fa/4d7a888e69774d5e6e855d190a8a51b357d77cc05eb1c097c9/docopt-0.6.2-py2.py3-none-any.whl
Requirement already satisfied: idna<2.9,>=2.5 in /usr/lib/python3/dist-packages (from requests->nemo-toolkit==0.9.0) (2.6)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/lib/python3/dist-packages (from requests->nemo-toolkit==0.9.0) (1.22)
Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests->nemo-toolkit==0.9.0) (2018.1.18)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/lib/python3/dist-packages (from requests->nemo-toolkit==0.9.0) (3.0.4)
Processing /home/nvidia/.cache/pip/wheels/e2/b8/ea/62fbbd3e532ea67717f7b561bdb57717515f288f83cdf93065/ruamel.yaml.clib-0.2.0-cp36-cp36m-linux_aarch64.whl
Collecting Click>=7.0
  Using cached Click-7.0-py2.py3-none-any.whl (81 kB)
Collecting py>=1.5.0
  Using cached py-1.8.1-py2.py3-none-any.whl (83 kB)
Collecting pluggy<1.0,>=0.12
  Using cached pluggy-0.13.1-py2.py3-none-any.whl (18 kB)
Requirement already satisfied: packaging in /home/nvidia/.local/lib/python3.6/site-packages (from pytest->nemo-toolkit==0.9.0) (20.1)
Requirement already satisfied: attrs>=17.4.0 in /home/nvidia/.local/lib/python3.6/site-packages (from pytest->nemo-toolkit==0.9.0) (19.3.0)
Requirement already satisfied: wcwidth in /home/nvidia/.local/lib/python3.6/site-packages (from pytest->nemo-toolkit==0.9.0) (0.1.8)
Collecting more-itertools>=4.0.0
  Using cached more_itertools-8.2.0-py3-none-any.whl (43 kB)
Collecting appdirs
  Using cached appdirs-1.4.3-py2.py3-none-any.whl (12 kB)
Collecting toml>=0.9.4
  Using cached toml-0.10.0-py2.py3-none-any.whl (25 kB)
Processing /home/nvidia/.cache/pip/wheels/cc/53/f6/421b969c48a5d993073f7188983e8d26e73871f5ba8c7812d6/typed_ast-1.4.1-cp36-cp36m-linux_aarch64.whl
Processing /home/nvidia/.cache/pip/wheels/6d/21/14/8d92aa843fb7e938385fb3b4386a43b4b9b9375001f59ddd57/regex-2020.1.8-cp36-cp36m-linux_aarch64.whl
Collecting pathspec<1,>=0.6
  Using cached pathspec-0.7.0-py2.py3-none-any.whl (25 kB)
Collecting pip-api; extra == "requirements"
  Using cached pip_api-0.0.13-py3-none-any.whl (103 kB)
Collecting pipreqs; extra == "requirements"
  Using cached pipreqs-0.4.10-py2.py3-none-any.whl (25 kB)
Processing /home/nvidia/.cache/pip/wheels/6c/6b/5c/30ce64a958139dc1f16109f1393d258aa13057d266b58cb486/cffi-1.14.0-cp36-cp36m-linux_aarch64.whl
Requirement already satisfied: sentencepiece in /usr/local/lib/python3.6/dist-packages/sentencepiece-0.1.85-py3.6-linux-aarch64.egg (from torchtext->nemo-toolkit==0.9.0) (0.1.85)
Collecting filelock
  Using cached filelock-3.0.12-py3-none-any.whl (7.6 kB)
Processing /home/nvidia/.cache/pip/wheels/03/e9/be/8b52f6e7e8c333b56f9440575b4c5eb4d96d27b5d22df5a71e/sacremoses-0.0.38-py3-none-any.whl
Processing /home/nvidia/.cache/pip/wheels/b4/75/5e/3ea8687989c4678f5bdcc3b64fd832c93f36d9bc27dd6f9577/tokenizers-0.0.11-cp36-cp36m-linux_aarch64.whl
Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.6/dist-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard->nemo-toolkit==0.9.0) (1.3.0)
Requirement already satisfied: cachetools<5.0,>=2.0.0 in /usr/local/lib/python3.6/dist-packages (from google-auth<2,>=1.6.3->tensorboard->nemo-toolkit==0.9.0) (4.0.0)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.6/dist-packages (from google-auth<2,>=1.6.3->tensorboard->nemo-toolkit==0.9.0) (0.2.8)
Requirement already satisfied: rsa<4.1,>=3.1.4 in /usr/local/lib/python3.6/dist-packages (from google-auth<2,>=1.6.3->tensorboard->nemo-toolkit==0.9.0) (4.0)
Requirement already satisfied: docutils<0.16,>=0.10 in /home/nvidia/.local/lib/python3.6/site-packages (from botocore<1.15.0,>=1.14.17->boto3->nemo-toolkit==0.9.0) (0.15.2)
Requirement already satisfied: zipp>=0.5 in /home/nvidia/.local/lib/python3.6/site-packages (from importlib-metadata; python_version < "3.8"->inflect->nemo-toolkit==0.9.0) (2.2.0)
Requirement already satisfied: ipython-genutils in /home/nvidia/.local/lib/python3.6/site-packages (from traitlets>=4.2->ipython[all]->nemo-toolkit==0.9.0) (0.2.0)
Requirement already satisfied: parso>=0.5.2 in /home/nvidia/.local/lib/python3.6/site-packages (from jedi>=0.10->ipython[all]->nemo-toolkit==0.9.0) (0.6.1)
Requirement already satisfied: ptyprocess>=0.5 in /home/nvidia/.local/lib/python3.6/site-packages (from pexpect; sys_platform != "win32"->ipython[all]->nemo-toolkit==0.9.0) (0.6.0)
Requirement already satisfied: jupyter-client in /home/nvidia/.local/lib/python3.6/site-packages (from ipyparallel; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (5.3.4)
Requirement already satisfied: pyzmq>=13 in /home/nvidia/.local/lib/python3.6/site-packages (from ipyparallel; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (18.1.1)
Requirement already satisfied: jsonschema!=2.5.0,>=2.4 in /home/nvidia/.local/lib/python3.6/site-packages (from nbformat; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (3.2.0)
Requirement already satisfied: jupyter-core in /home/nvidia/.local/lib/python3.6/site-packages (from nbformat; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (4.6.2)
Requirement already satisfied: defusedxml in /home/nvidia/.local/lib/python3.6/site-packages (from nbconvert; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (0.6.0)
Requirement already satisfied: mistune<2,>=0.8.1 in /home/nvidia/.local/lib/python3.6/site-packages (from nbconvert; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (0.8.4)
Requirement already satisfied: bleach in /home/nvidia/.local/lib/python3.6/site-packages (from nbconvert; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (3.1.0)
Requirement already satisfied: entrypoints>=0.2.2 in /home/nvidia/.local/lib/python3.6/site-packages (from nbconvert; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (0.3)
Requirement already satisfied: pandocfilters>=1.4.1 in /home/nvidia/.local/lib/python3.6/site-packages (from nbconvert; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (1.4.2)
Requirement already satisfied: Send2Trash in /home/nvidia/.local/lib/python3.6/site-packages (from notebook; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (1.5.0)
Requirement already satisfied: terminado>=0.8.1 in /home/nvidia/.local/lib/python3.6/site-packages (from notebook; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (0.8.3)
Requirement already satisfied: prometheus-client in /home/nvidia/.local/lib/python3.6/site-packages (from notebook; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (0.7.1)
Requirement already satisfied: snowballstemmer>=1.1 in /home/nvidia/.local/lib/python3.6/site-packages (from Sphinx>=1.3; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (2.0.0)
Requirement already satisfied: sphinxcontrib-qthelp in /home/nvidia/.local/lib/python3.6/site-packages (from Sphinx>=1.3; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (1.0.2)
Requirement already satisfied: sphinxcontrib-devhelp in /home/nvidia/.local/lib/python3.6/site-packages (from Sphinx>=1.3; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (1.0.1)
Requirement already satisfied: sphinxcontrib-applehelp in /home/nvidia/.local/lib/python3.6/site-packages (from Sphinx>=1.3; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (1.0.1)
Requirement already satisfied: alabaster<0.8,>=0.7 in /home/nvidia/.local/lib/python3.6/site-packages (from Sphinx>=1.3; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (0.7.12)
Requirement already satisfied: imagesize in /home/nvidia/.local/lib/python3.6/site-packages (from Sphinx>=1.3; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (1.2.0)
Requirement already satisfied: babel!=2.0,>=1.3 in /home/nvidia/.local/lib/python3.6/site-packages (from Sphinx>=1.3; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (2.8.0)
Requirement already satisfied: sphinxcontrib-jsmath in /home/nvidia/.local/lib/python3.6/site-packages (from Sphinx>=1.3; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (1.0.1)
Requirement already satisfied: sphinxcontrib-serializinghtml in /home/nvidia/.local/lib/python3.6/site-packages (from Sphinx>=1.3; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (1.1.3)
Requirement already satisfied: sphinxcontrib-htmlhelp in /home/nvidia/.local/lib/python3.6/site-packages (from Sphinx>=1.3; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (1.0.2)
Requirement already satisfied: widgetsnbextension~=3.5.0 in /home/nvidia/.local/lib/python3.6/site-packages (from ipywidgets; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (3.5.1)
Requirement already satisfied: MarkupSafe>=0.23 in /usr/lib/python3/dist-packages (from jinja2>=2.10->jupyterlab->nemo-toolkit==0.9.0) (1.0)
Requirement already satisfied: json5 in /home/nvidia/.local/lib/python3.6/site-packages (from jupyterlab-server~=1.0.0->jupyterlab->nemo-toolkit==0.9.0) (0.9.1)
Requirement already satisfied: llvmlite<0.32.0,>=0.31.0dev0 in /home/nvidia/.local/lib/python3.6/site-packages (from numba>=0.43.0->librosa->nemo-toolkit==0.9.0) (0.31.0.dev0+1.g7c14ef0)
Requirement already satisfied: pip in /home/nvidia/.local/lib/python3.6/site-packages (from pip-api; extra == "requirements"->isort[requirements]->nemo-toolkit==0.9.0) (20.0.2)
Collecting yarg
  Using cached yarg-0.1.9-py2.py3-none-any.whl (19 kB)
Processing /home/nvidia/.cache/pip/wheels/c6/6b/83/2608afaa57ecfb0a66ac89191a8d9bad71c62ca55ee499c2d0/pycparser-2.19-py2.py3-none-any.whl
Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.6/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard->nemo-toolkit==0.9.0) (3.1.0)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.6/dist-packages (from pyasn1-modules>=0.2.1->google-auth<2,>=1.6.3->tensorboard->nemo-toolkit==0.9.0) (0.4.8)
Requirement already satisfied: pyrsistent>=0.14.0 in /home/nvidia/.local/lib/python3.6/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (0.15.7)
Requirement already satisfied: webencodings in /usr/lib/python3/dist-packages (from bleach->nbconvert; extra == "all"->ipython[all]->nemo-toolkit==0.9.0) (0.5)
Installing collected packages: numba, pycparser, cffi, soundfile, resampy, librosa, marshmallow, kiwisolver, cycler, matplotlib, nltk, docopt, num2words, progressbar, ruamel.yaml.clib, ruamel.yaml, sox, unidecode, Click, youtokentome, parameterized, py, pluggy, more-itertools, pytest, pytest-runner, appdirs, toml, typed-ast, regex, pathspec, black, pip-api, yarg, pipreqs, isort, torch-stft, torchtext, filelock, sacremoses, tokenizers, transformers, pypinyin, nemo-toolkit
  Running setup.py develop for nemo-toolkit
Successfully installed Click-7.0 appdirs-1.4.3 black-19.10b0 cffi-1.14.0 cycler-0.10.0 docopt-0.6.2 filelock-3.0.12 isort-4.3.21 kiwisolver-1.1.0 librosa-0.7.2 marshmallow-3.4.0 matplotlib-3.1.3 more-itertools-8.2.0 nemo-toolkit nltk-3.4.5 num2words-0.5.10 numba-0.48.0 parameterized-0.7.1 pathspec-0.7.0 pip-api-0.0.13 pipreqs-0.4.10 pluggy-0.13.1 progressbar-2.5 py-1.8.1 pycparser-2.19 pypinyin-0.37.0 pytest-5.3.5 pytest-runner-5.2 regex-2020.1.8 resampy-0.2.2 ruamel.yaml-0.16.10 ruamel.yaml.clib-0.2.0 sacremoses-0.0.38 soundfile-0.10.3.post1 sox-1.3.7 tokenizers-0.0.11 toml-0.10.0 torch-stft-0.1.4 torchtext-0.5.0 transformers-2.4.1 typed-ast-1.4.1 unidecode-1.1.1 yarg-0.1.9 youtokentome-1.0.6
All done!
python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nemo
>>>
~/NeMo$ python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nemo
>>> import torch
>>> import amp_C
>>>

Apex also seems to function:

python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nemo
>>> import torch
>>> import amp_C
>>> import argparse
>>> import os
>>> from apex import amp
>>> # FOR DISTRIBUTED: (can also use torch.nn.parallel.DistributedDataParallel instead)
... from apex.parallel import DistributedDataParallel
>>> 
>>> parser = argparse.ArgumentParser()
>>> # FOR DISTRIBUTED:  Parse for the local_rank argument, which will be supplied
... # automatically by torch.distributed.launch.
... parser.add_argument("--local_rank", default=0, type=int)
_StoreAction(option_strings=['--local_rank'], dest='local_rank', nargs=None, const=None, default=0, type=<class 'int'>, choices=None, help=None, metavar=None)
>>> args = parser.parse_args()
>>> 
>>> # FOR DISTRIBUTED:  If we are running under torch.distributed.launch,
... # the 'WORLD_SIZE' environment variable will also be set automatically.
... args.distributed = False
>>> if 'WORLD_SIZE' in os.environ:
...     args.distributed = int(os.environ['WORLD_SIZE']) > 1
... 
>>> if args.distributed:
...     # FOR DISTRIBUTED:  Set the device according to local_rank.
...     torch.cuda.set_device(args.local_rank)
... 
>>>     # FOR DISTRIBUTED:  Initialize the backend.  torch.distributed.launch will provide
...     # environment variables, and requires that you use init_method=`env://`.
...     torch.distributed.init_process_group(backend='nccl',
  File "<stdin>", line 3
    torch.distributed.init_process_group(backend='nccl',
    ^
IndentationError: unexpected indent
>>>                                          init_method='env://')
  File "<stdin>", line 1
    init_method='env://')
    ^
IndentationError: unexpected indent
>>> 
>>> torch.backends.cudnn.benchmark = True
>>> 
>>> N, D_in, D_out = 64, 1024, 16
>>> 
>>> # Each process receives its own batch of "fake input data" and "fake target data."
... # The "training loop" in each process just uses this fake batch over and over.
... # https://github.com/NVIDIA/apex/tree/master/examples/imagenet provides a more realistic
... # example of distributed data sampling for both training and validation.
... x = torch.randn(N, D_in, device='cuda')

>>> y = torch.randn(N, D_out, device='cuda')
>>> 
>>> model = torch.nn.Linear(D_in, D_out).cuda()
>>> optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
>>> 
>>> model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
>>> 
>>> if args.distributed:
...     # FOR DISTRIBUTED:  After amp.initialize, wrap the model with
...     # apex.parallel.DistributedDataParallel.
...     model = DistributedDataParallel(model)
...     # torch.nn.parallel.DistributedDataParallel is also fine, with some added args:
...     # model = torch.nn.parallel.DistributedDataParallel(model,
...     #                                                   device_ids=[args.local_rank],
...     #                                                   output_device=args.local_rank)
... 
>>> loss_fn = torch.nn.MSELoss()
>>> 
>>> for t in range(500):
...     optimizer.zero_grad()
...     y_pred = model(x)
...     loss = loss_fn(y_pred, y)
...     with amp.scale_loss(loss, optimizer) as scaled_loss:
...         scaled_loss.backward()
...     optimizer.step()
... 
>>> if args.local_rank == 0:
...     print("final loss = ", loss)
... 
final loss =  tensor(0.2006, device='cuda:0', grad_fn=<MseLossBackward>)
>>> 
>>> 
>>> 
>>>
~/NeMo/examples/start_here$ python3 simplest_example.py
[NeMo W 2020-02-14 08:28:16 deprecated:66] Function ``_get_trainer`` is deprecated. It is going to be removed in the future version.
[NeMo I 2020-02-14 08:28:18 callbacks:177] Starting .....
[NeMo I 2020-02-14 08:28:18 callbacks:188] Starting epoch 0
[NeMo I 2020-02-14 08:28:19 callbacks:212] Step: 0
[NeMo I 2020-02-14 08:28:19 simplest_example:23] Train Loss: 137.61636352539062
[NeMo I 2020-02-14 08:28:19 callbacks:227] Step time: 0.521796464920044 seconds
[NeMo I 2020-02-14 08:28:19 callbacks:212] Step: 25
[NeMo I 2020-02-14 08:28:19 simplest_example:23] Train Loss: 5.390845775604248
[NeMo I 2020-02-14 08:28:19 callbacks:227] Step time: 0.0047686100006103516 seconds
[NeMo I 2020-02-14 08:28:20 callbacks:212] Step: 50
[NeMo I 2020-02-14 08:28:20 simplest_example:23] Train Loss: 0.513081431388855
[NeMo I 2020-02-14 08:28:20 callbacks:227] Step time: 0.0037953853607177734 seconds
[NeMo I 2020-02-14 08:28:20 callbacks:212] Step: 75
[NeMo I 2020-02-14 08:28:20 simplest_example:23] Train Loss: 0.14851737022399902
[NeMo I 2020-02-14 08:28:20 callbacks:227] Step time: 0.004103422164916992 seconds
[NeMo I 2020-02-14 08:28:20 callbacks:195] Finished epoch 0 in 1.2237584590911865
[NeMo I 2020-02-14 08:28:20 callbacks:188] Starting epoch 1
[NeMo I 2020-02-14 08:28:20 callbacks:212] Step: 100
[NeMo I 2020-02-14 08:28:20 simplest_example:23] Train Loss: 0.09763915836811066
[NeMo I 2020-02-14 08:28:20 callbacks:227] Step time: 0.005318403244018555 seconds
[NeMo I 2020-02-14 08:28:20 callbacks:212] Step: 125
[NeMo I 2020-02-14 08:28:20 simplest_example:23] Train Loss: 0.06433030962944031
[NeMo I 2020-02-14 08:28:20 callbacks:227] Step time: 0.0040569305419921875 seconds
[NeMo I 2020-02-14 08:28:20 callbacks:212] Step: 150
[NeMo I 2020-02-14 08:28:20 simplest_example:23] Train Loss: 0.07112766802310944
[NeMo I 2020-02-14 08:28:20 callbacks:227] Step time: 0.003978252410888672 seconds
[NeMo I 2020-02-14 08:28:20 callbacks:195] Finished epoch 1 in 0.5387179851531982
[NeMo I 2020-02-14 08:28:20 callbacks:188] Starting epoch 2
[NeMo I 2020-02-14 08:28:20 callbacks:212] Step: 175
[NeMo I 2020-02-14 08:28:20 simplest_example:23] Train Loss: 0.05760573223233223
[NeMo I 2020-02-14 08:28:20 callbacks:227] Step time: 0.004065036773681641 seconds
[NeMo I 2020-02-14 08:28:21 callbacks:212] Step: 200
[NeMo I 2020-02-14 08:28:21 simplest_example:23] Train Loss: 0.04851408302783966
[NeMo I 2020-02-14 08:28:21 callbacks:227] Step time: 0.0040225982666015625 seconds
[NeMo I 2020-02-14 08:28:21 callbacks:212] Step: 225
[NeMo I 2020-02-14 08:28:21 simplest_example:23] Train Loss: 0.046431295573711395
[NeMo I 2020-02-14 08:28:21 callbacks:227] Step time: 0.004041910171508789 seconds
[NeMo I 2020-02-14 08:28:21 callbacks:195] Finished epoch 2 in 0.5276029109954834
[NeMo I 2020-02-14 08:28:21 callbacks:184] Done in 2.291908025741577

Thanks for providing this build recipe. Unfortunately, my attempts to install NeMo fail at the pip install of the package "tokenizers 0.0.11" with an error from rustc/cargo (the tools used internally to build tokenizers).

May I ask you to run "rustc -V" and "cargo -V" in your environment and post the output? Thanks in advance.

[rust] It first needs to be installed, and then the path to it needs to be maintained (kept on PATH).

Thanks for the reply. I already have Rust/Cargo installed. As I said, the Rust/Cargo compiler fails in the middle of the build process. After some googling, I suspect a version incompatibility. I have Rust/Cargo version 1.37.0.

rustc -V
rustc 1.41.0 (5e1a79984 2020-01-27)

How did you end up with version 1.37?

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

(from the "Install Rust - Rust Programming Language" page)

I had installed Rust/Cargo from the apt package. I uninstalled them and then ran rustup; now tokenizers builds without errors. Thanks for the help.
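
For anyone else hitting the same tokenizers build failure: a quick way to see which Rust toolchain the build will pick up is to check where rustc resolves on PATH and which version it reports. The following is only a minimal sketch in Python. The helper names (parse_rustc_version, find_rustc) are made up for illustration, and the hint about ~/.cargo/bin assumes rustup's default install location.

```python
import re
import shutil
import subprocess


def parse_rustc_version(output):
    """Extract (major, minor, patch) from `rustc -V` output, or None."""
    match = re.search(r"rustc (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def find_rustc():
    """Return (path, version) for the first rustc on PATH, or None."""
    path = shutil.which("rustc")
    if path is None:
        return None
    # stdout=PIPE / universal_newlines keeps this usable on Python 3.6.
    result = subprocess.run(
        [path, "-V"], stdout=subprocess.PIPE, universal_newlines=True
    )
    return path, parse_rustc_version(result.stdout)


if __name__ == "__main__":
    found = find_rustc()
    if found is None:
        print("rustc is not on PATH")
    else:
        path, version = found
        print("rustc found at", path, "version", version)
        # Assumption: rustup installs under ~/.cargo/bin by default, so a
        # path elsewhere (e.g. /usr/bin) suggests a distro package.
        if ".cargo" not in path:
            print("this rustc does not look like a rustup install")
```

In this thread the apt toolchain reported 1.37.0 and the rustup one 1.41.0, so comparing the parsed tuples is enough to tell them apart.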

So far I wasn't able to build the onnxruntime-gpu wheel with CUDA, cuDNN, and TensorRT support, only the CPU onnxruntime. However, folks have reported it working on the Jetson Nano and TX2, so in my opinion it should only be a matter of time before CUDA support on Xavier is sorted out.

I have built an onnxruntime wheel with GPU/CUDA and TensorRT support for Xavier: Release onnxruntime GPU TensorRT · domcross/Jetson-Xavier-AGX-stuff · GitHub

Thank you for sharing!

Did you use a command like the one below to build the wheel?

./build.sh --config Release --update --build --build_wheel --use_tensorrt --cuda_home /usr/local/cuda --cudnn_home /usr/lib/aarch64-linux-gnu --tensorrt_home /usr/lib/aarch64-linux-gnu

Unfortunately, I did not document how I built the wheel. I remember following the CUDA section of the build instructions.

So most likely it was something like

./build.sh --use_cuda --cudnn_home <cudnn home path> --cuda_home <cuda home path> --build_wheel

I don't remember whether I added other flags such as --config, --update, or --build.
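
Whichever flags were used, it may help to confirm after installing the wheel whether the build really exposes GPU support. The sketch below is one possible check, not the method used in this thread. The function names are hypothetical; it assumes a GPU build reports a device string starting with "GPU" from onnxruntime.get_device() or lists CUDAExecutionProvider / TensorrtExecutionProvider, and it looks up get_available_providers() defensively in case an older build does not have it.

```python
import importlib


def describe_onnxruntime():
    """Return a dict describing the installed onnxruntime, or None if absent."""
    try:
        ort = importlib.import_module("onnxruntime")
    except ImportError:
        return None
    info = {"version": ort.__version__, "device": ort.get_device()}
    # Older builds may not expose get_available_providers().
    list_providers = getattr(ort, "get_available_providers", None)
    info["providers"] = list_providers() if list_providers else []
    return info


def has_gpu_support(info):
    """True if the described build reports a GPU device or GPU provider."""
    if info is None:
        return False
    wanted = {"CUDAExecutionProvider", "TensorrtExecutionProvider"}
    return info["device"].startswith("GPU") or bool(wanted & set(info["providers"]))


if __name__ == "__main__":
    info = describe_onnxruntime()
    if info is None:
        print("onnxruntime is not installed for this interpreter")
    else:
        print(info)
        print("GPU support:", has_gpu_support(info))
```

A CPU-only wheel would be expected to print "GPU support: False", which matches the situation described earlier in the thread.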

Thanks!

Thank you both for this discussion. I'm making progress a little at a time. I'm trying to install NeMo on Xavier/JetPack. The failure occurs when I run NeMo/reinstall.sh. The associated discussions are:

  1. NeMo installation - done; performance to be evaluated - Jetson AGX Xavier - NVIDIA Developer Forums
  2. https://devtalk.nvidia.com/default/topic/1070793/other-tools/nemo-asr-fails-to-build-on-xavier-jetpack/ (I initiated this)

I first got around the onnxruntime issue by removing the requirements.txt entries, but was also able to build the wheel at Release onnxruntime GPU TensorRT · domcross/Jetson-Xavier-AGX-stuff · GitHub. I haven't hit any new onnx issue or the rustc/cargo issue yet, but I expect to hit rustc/cargo at some point.

My current issue is with sentencepiece. In discussion 1 above, it seems you were able to build sentencepiece. No matter how I try to build it, I get this error: FileNotFoundError: [Errno 2] No such file or directory: '…/VERSION'

GitHub topics seem to say there's some Python version compatibility problem. I think reinstall.sh is running 3.6, but I'm not sure; that version is installed. Is there any way to unblock me on that? Thanks!
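
One way to remove the guesswork about the interpreter is to have it report on itself. The sketch below does not address the VERSION error; it only prints which Python is running and whether a few of the packages discussed here are visible to it. The function name interpreter_report is hypothetical, and the script would need to be run with the same interpreter command that reinstall.sh invokes (not shown in this thread).

```python
import importlib.util
import sys


def interpreter_report(modules=("sentencepiece", "onnxruntime", "tokenizers")):
    """Return lines describing this interpreter and which modules it can see."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    lines = ["python {} at {}".format(version, sys.executable)]
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        spec = importlib.util.find_spec(name)
        if spec is None:
            lines.append("{}: not found".format(name))
        else:
            lines.append("{}: found at {}".format(name, spec.origin))
    return lines


if __name__ == "__main__":
    for line in interpreter_report():
        print(line)
```

If the first line does not show 3.6, or a package is reported as not found, the script and the manual build are probably using different Python installations.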