"Folder does not exist or is not reachable" on DIGITS AMI

Hi,

I have an instance of Ubuntu 16.04 running DIGITS 6.
I have downloaded and successfully expanded the MS-COCO dataset as detailed in the dusty-nv/jetson-inference guide on GitHub.

When I run DIGITS and enter the paths to reach the training and validation data folders, I get the error in DIGITS “Folder does not exist or is not reachable”.

  1. I have checked that the paths are correct and tried variants (from root, from home, etc.).
  2. I have followed instructions to make sure that www-data has read access, using guidance from https://github.com/NVIDIA/DIGITS/issues/1026 and from the Ask Ubuntu question "How do I give www-data user to a folder in my home folder?".

All help most welcome! Thanks!

Mike

Is this the DIGITS AMI, or is it the NGC AMI running the NGC DIGITS container? (The two are somewhat different.)

It’s the AWS AMI for DIGITS 6 - so I suspect the first of the two options above. It is selected in AWS from EC2 → AWS Marketplace → NVIDIA DIGITS 6 on Ubuntu 16.04

Is there a better place I can be doing this?

I am trying to get Object Detection working on my TX2, but do not have a suitable machine for building the models - hence working in the cloud to get DIGITS operational.

Thanks for the response!

I have now also looked again at the NGC Docker containers on NVIDIA NGC, and I’m afraid I find the instructions completely baffling.

For example, mention is made of an ‘Actions’ column, but I don’t think there is one.

It also says use ‘docker pull’, without explaining how to do that.

Apologies for being slow!

In fact, it was originally from this page that I opted to follow the links to AWS…

If I can be doing this a better way, please let me know, as I am spending a great deal of time trying to get a simple build environment working.

Cheers, Mike

Hi Cliff,

I am working through instructions for the NGC container now…

However, I’m still hoping to get the AWS AMI image working. Is there any reason it shouldn’t?

Thanks for your help.

Cheers

Mike

Ah, thanks for pointing out about the “actions” - looks like they renamed that column in the NGC UI to “Pull” after we wrote the text of the README. We’ll get that updated.

Beyond that, the instructions in the NGC README - and this forum, generally - assume you’re using the NGC AMI, not the DIGITS AMI. (I was working on finding you an alternative place to ask the question in the context of the DIGITS AMI before suggesting you start over in a different AMI.) The fact that we have two paths to getting DIGITS on AWS is a bit of a historical artifact we’re looking to reconcile in the future.

From first principles, the main thing will be to know what paths are involved, where and how those volumes are mounted, and what user account DIGITS is running as. I’m seeking answers to at least the latter of those questions internally, but any extra details you could offer as well would be helpful.

Hi Cliff,
Thanks for getting back.

I think the penny has dropped on the NGC container approach. However, it seems that the Volta instance is 5 times more expensive per hour, which is significant for a self-funded learner who is just getting started.

So as I ssh in, the prompt is $ ubuntu@ip-privateipaddress.

Following Dusty’s instructions, a simple ls from here shows that the directory ‘coco’ is listed correctly and, drilling down, all the subdirectories and data appear to be present, i.e.

Training image folder: coco/train/images/dog
Training label folder: coco/train/labels/dog
Validation image folder: coco/val/images/dog
Validation label folder: coco/val/labels/dog

ubuntu@ip:~$ cd coco
ubuntu@ip-:~/coco$ ls
coco2kitti.py train val
ubuntu@ip-:~/coco$ cd train
ubuntu@ip-:~/coco/train$ ls
images labels
ubuntu@ip-:~/coco/train$ cd images
ubuntu@ip-:~/coco/train/images$ ls
airplane bottle chair dog
ubuntu@ip-:~/coco/train/images$ cd dog
ubuntu@ip-:~/coco/train/images/dog$

Permissions are as follows (I have reloaded the datasets, so the permissions have not been modified in any way):

ubuntu@ip-:~/coco$ ls -l
total 12
-rw-rw-r-- 1 ubuntu ubuntu 2512 May 1 2017 coco2kitti.py
drwxrwxr-x 4 ubuntu ubuntu 4096 May 3 2017 train
drwxrwxr-x 4 ubuntu ubuntu 4096 May 2 2017 val

By way of accounts, does this help?
/home$ ls
digits ubuntu

Thanks again,

Mike

So one thought is that I am logged in as ubuntu and installing data owned by that account and in that path. However, maybe the server is accessing using the digits account? This might mean:

  1. the server is looking in the wrong account path.
  2. It doesn’t have permission to see the folders.

Could that be a possibility?

There are two possible issues here. One or both might apply, depending on which AMI you’re using.

(1) You’ve got your dataset in /home/ubuntu/coco . However, /home/ubuntu (and therefore subdirectories of the same) would normally only be accessible to the ‘ubuntu’ user, and DIGITS – whether containerized or not – usually runs as a different user, leading to possible permissions problems accessing the dataset. The solution is to move the dataset to some shared directory that all users on the system can access.

(2) When running in a containerized environment, the dataset is probably mounted at different paths as viewed from outside the container (say it’s /host/path/to/coco ) versus inside the container (normally just /data for DIGITS containers). It’s the -v argument to docker run that configures this; for example, if you’re doing something like docker run … -v /host/path/to/coco:/data …, then when you go into the DIGITS UI and set up the dataset, the path you use to tell DIGITS where to find the dataset needs to be /data, not /host/path/to/coco.
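To make point (2) concrete, here is a minimal sketch of how a host path translates to the path seen inside the container. The function name to_container_path is made up for illustration, and the mount values are just the placeholders from the docker run example above, not the settings of any particular AMI.

```python
from pathlib import PurePosixPath


def to_container_path(host_path, host_mount, container_mount):
    """Translate a host path into the path seen inside the container.

    host_mount and container_mount are the two halves of a
    `docker run -v host_mount:container_mount` argument.
    Returns None when the host path lies outside the mounted volume.
    """
    try:
        relative = PurePosixPath(host_path).relative_to(host_mount)
    except ValueError:
        # Not under the mounted directory, so the container cannot see it.
        return None
    return str(PurePosixPath(container_mount) / relative)


# Placeholder values taken from the example -v argument in this post.
print(to_container_path("/host/path/to/coco/train/images/dog",
                        "/host/path/to/coco", "/data"))
print(to_container_path("/home/ubuntu/coco", "/host/path/to/coco", "/data"))
```

The first call gives the path you would type into the DIGITS UI; the second returns None because that directory was never mounted into the container.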

Hope this helps,
Cliff

Hi Cliff,

I have managed to get it running through the combination of actions as suggested.

Thanks for your help!

Mike

Glad to hear it!

Hi! My question is almost exactly the same as the one above.

I think the problem probably has to do with giving www-data read access, but I’m having trouble figuring out the appropriate combination of commands to do so with my directory structure. I am logged in as ubuntu, and I am trying to put the ‘coco’ directory into /home/digits/data. Here are the current permissions:

ubuntu@ip:/home/digits$ ls -la
total 28
drwxr-xr-x 4 digits digits 4096 Oct 2 22:34 .
drwxr-xr-x 4 root root 4096 Oct 2 22:34 ..
-rw-r--r-- 1 digits digits 220 Oct 2 22:34 .bash_logout
-rw-r--r-- 1 digits digits 3771 Oct 2 22:34 .bashrc
drwxrwx--- 3 digits digits 4096 Nov 24 19:43 data
drwxr-xr-x 2 digits digits 4096 Nov 8 03:33 jobs
-rw-r--r-- 1 digits digits 655 Oct 2 22:34 .profile

Should I be trying to put the ‘coco’ directory into /data instead? Any help would be much appreciated!

Hi pineapple_flora,

So firstly, move all the data into the digits data folder: /home/digits/data/coco/…

In DIGITS, simply specify this path as /data/coco…

Secondly, set permissions on the entire data folder (recursively) so that it is readable by all. This will also allow www-data access.
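If it is unclear which level of the path is blocking access, a small check like the one below may help. It is only a sketch: the function name first_blocked_component is made up, and it looks solely at the "other" permission bits, which assumes the account DIGITS runs as is neither the owner nor in the group of the directories. It also checks only the chain of parent directories, not every file inside the dataset.

```python
import os
import stat


def first_blocked_component(path):
    """Return the first component on the way to `path` that users outside
    the owner and group cannot enter or read, or None if every level is open."""
    path = os.path.abspath(path)
    current = os.sep
    for part in path.strip(os.sep).split(os.sep):
        current = os.path.join(current, part)
        mode = os.stat(current).st_mode
        if stat.S_ISDIR(mode):
            # Directories need read (list) and execute (enter) for others.
            needed = stat.S_IROTH | stat.S_IXOTH
        else:
            # Regular files only need read for others.
            needed = stat.S_IROTH
        if mode & needed != needed:
            return current
    return None


# Example call (the path is the one discussed in this thread):
# print(first_blocked_component("/home/digits/data/coco"))
```

A directory with mode 770 (rwx for owner and group, nothing for others) would be reported by this check. Run it as a user who can already read the tree, since os.stat itself fails on paths that user cannot reach.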

If you are following the dusty-nv instructions, you may also find that the network prototxt will not work with the specified dataset. I learnt that after 14 hours of build time, only to find the model hadn’t been learning anything. That’s what I’m working on now…

Hi mikeisted,

Thanks for the suggestions! I followed them and am not seeing a “Folder does not exist or is not reachable” error any more, hooray!

I was able to get a little further in the dusty-nv tutorial, but I ran into another roadblock when trying to launch the DetectNet-COCO-Dog model. At a high level, the error that I’m seeing is: “ERROR: Check failed: error == cudaSuccess (8 vs. 0) invalid device function.” It might make sense for me to start another thread about this topic or to search through the issues at dusty-nv/jetson-inference – I was just curious if you had experienced anything similar with your DIGITS 6 AMI.

invalid device function usually means that the architecture a CUDA code was compiled for does not match the architecture of the GPU you are running on.

What sort of AWS instance type did you launch the DIGITS AMI on? And for clarity are we talking about NGC or not NGC?

NGC: GPU-optimized AI, Machine Learning, & HPC Software | NVIDIA NGC

Not NGC: AWS Marketplace → NVIDIA DIGITS 6

(I assume this is “not NGC” here)
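As a rough illustration of the architecture-mismatch point, the sketch below models CUDA's compatibility rules as I understand them: machine code built for one compute capability runs only on GPUs with the same major version and an equal or higher minor version, while embedded PTX can be JIT-compiled for any equal or newer GPU. The function name can_run and the architecture lists are hypothetical; nothing in this thread states which architectures the Caffe build in the AMI actually targets.

```python
def can_run(device_cc, cubin_archs, ptx_archs):
    """Rough model of whether a CUDA binary can run on a device.

    device_cc   -- (major, minor) compute capability of the GPU
    cubin_archs -- capabilities the binary contains machine code for
    ptx_archs   -- capabilities the binary embeds PTX for
    """
    # Machine code needs the same major version and a minor version
    # that is not newer than the device's.
    for major, minor in cubin_archs:
        if major == device_cc[0] and minor <= device_cc[1]:
            return True
    # PTX can be JIT-compiled for any device of equal or newer capability.
    return any(arch <= device_cc for arch in ptx_archs)


# Hypothetical build targets, for illustration only.
cubins = [(3, 5), (5, 0), (5, 2), (6, 0), (6, 1)]
ptx = [(6, 1)]

print(can_run((3, 0), cubins, ptx))  # a compute capability 3.0 device
print(can_run((7, 0), cubins, ptx))  # a newer device served by the PTX
```

With these made-up targets, a compute capability 3.0 GPU has neither matching machine code nor usable PTX, which is the kind of situation that produces "invalid device function".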

Hello txbob,

Thanks for the quick reply! I am using “Not NGC”: AWS Marketplace → NVIDIA DIGITS 6. The AWS instance type is g2.2xlarge.

Here are the first and last several lines of output from caffe_output.log for the job:

[snip]
libdc1394 error: Failed to initialize libdc1394
I1126 11:55:07.177713 579 upgrade_proto.cpp:1044] Attempting to upgrade input file specified using deprecated 'solver_type' field (enum)': /jobs/20171126-115505-4756/solver.prototxt
I1126 11:55:07.177882 579 upgrade_proto.cpp:1051] Successfully upgraded file specified using deprecated 'solver_type' field (enum) to 'type' field (string).
W1126 11:55:07.177889 579 upgrade_proto.cpp:1053] Note that future Caffe releases will only support 'type' field (string) for a solver's type.
I1126 11:55:07.234048 579 caffe.cpp:197] Using GPUs 0
I1126 11:55:07.234225 579 caffe.cpp:202] GPU 0: GRID K520
I1126 11:55:07.730933 579 solver.cpp:48] Initializing solver from parameters:
[/snip]

[snip]
F1126 11:55:09.247619 579 pooling_layer.cu:212] Check failed: error == cudaSuccess (8 vs. 0) invalid device function
*** Check failure stack trace: ***
@ 0x7f205acaddaa (unknown)
@ 0x7f205acadce4 (unknown)
@ 0x7f205acad6e6 (unknown)
@ 0x7f205acb0687 (unknown)
@ 0x7f205b4166a0 caffe::PoolingLayer<>::Forward_gpu()
@ 0x7f205b223358 caffe::Net<>::ForwardFromTo()
@ 0x7f205b2236d7 caffe::Net<>::Forward()
@ 0x7f205b2a4f8a caffe::Solver<>::Test()
@ 0x7f205b2a580e caffe::Solver<>::TestAll()
@ 0x7f205b2a7ddd caffe::Solver<>::Step()
@ 0x7f205b2a879e caffe::Solver<>::Solve()
@ 0x40af75 train()
@ 0x4086bc main
@ 0x7f20597a4f45 (unknown)
@ 0x408e8d (unknown)
@ (nil) (unknown)
[/snip]

Maybe somewhere I have to change a setting to accommodate the GPU architecture; just not sure where to look for it on the AWS DIGITS 6 instance?

Hello,
I have a similar issue.
I recently installed the DIGITS 6 AMI.
I’m working through the KITTI/DetectNet tutorial (DIGITS/examples/object-detection in the NVIDIA/DIGITS repository on GitHub).

All data has been created in kitti-data:
ubuntu@ip:/home/digits/data/kitti-data$ ls -al
total 24
drwxrwxr-x 6 ubuntu ubuntu 4096 May 12 16:53 .
drwxrwxrwx 3 digits digits 4096 May 12 16:50 ..
drwxrwxr-x 7 ubuntu ubuntu 4096 May 12 16:53 raw
drwxrwxr-x 4 ubuntu ubuntu 4096 May 12 16:53 train
drwxrwxr-x 4 ubuntu ubuntu 4096 May 12 16:53 val
drwxrwxr-x 4 ubuntu ubuntu 4096 May 12 16:53 video-split

When I create a new dataset, I use the following path (as suggested above): /data/kitti-data/train/images
The folders seem to be recognized, but then I get the following error:
libdc1394 error: Failed to initialize libdc1394
2018-05-12 17:03:00 [ERROR] IOError:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/digits/tools/create_generic_db.py", line 478, in <module>
    args['stage']
  File "/usr/local/lib/python2.7/dist-packages/digits/tools/create_generic_db.py", line 443, in create_generic_db
    force_same_shape)
  File "/usr/local/lib/python2.7/dist-packages/digits/tools/create_generic_db.py", line 296, in create_db
    entry_ids = extension.itemize_entries(stage)
  File "/usr/local/lib/python2.7/dist-packages/digits/extensions/data/objectDetection/data.py", line 183, in itemize_entries
    self.load_ground_truth(self.train_label_folder)
  File "/usr/local/lib/python2.7/dist-packages/digits/extensions/data/objectDetection/data.py", line 208, in load_ground_truth
    datasrc.load_gt_obj()
  File "/usr/local/lib/python2.7/dist-packages/digits/extensions/data/objectDetection/utils.py", line 180, in load_gt_obj
    with open(os.path.join(self.label_dir, label_file), 'rb') as flabel:
IOError: [Errno 2] No such file or directory: u'/data/kitti-data/train/labels/000815.txt'

I then ran:
ubuntu@ip:/home/digits$ sudo chmod -R 777 .

But I still get the same error…

Any help would be very much appreciated.

Thx
P.
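An Errno 2 on a label file, in a folder that DIGITS otherwise recognizes, can happen when the directory entry is a symbolic link whose target does not exist at the path DIGITS sees. Whether that is the cause here is an assumption; nothing in this thread confirms it. The sketch below, with a made-up function name audit_labels, lists dangling entries in a label folder and images that have no label file, assuming the KITTI-style convention that image 000815.png pairs with label 000815.txt.

```python
import os


def audit_labels(image_dir, label_dir):
    """Return (dangling, missing) for a KITTI-style dataset.

    dangling -- names listed in label_dir whose target cannot be reached,
                for example symbolic links pointing at a path that does not exist
    missing  -- image names with no same-named .txt entry in label_dir
    """
    label_names = sorted(os.listdir(label_dir))
    # os.path.exists follows symlinks, so it is False for a broken link
    # even though the name shows up in the directory listing.
    dangling = [name for name in label_names
                if not os.path.exists(os.path.join(label_dir, name))]
    label_stems = {os.path.splitext(name)[0] for name in label_names}
    missing = [name for name in sorted(os.listdir(image_dir))
               if os.path.splitext(name)[0] not in label_stems]
    return dangling, missing


# Example call (paths are the ones from this post):
# print(audit_labels("/data/kitti-data/train/images",
#                    "/data/kitti-data/train/labels"))
```

It is worth running the check against the same path that DIGITS itself is given, since a link that resolves under /home/digits/data may not resolve under /data.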

I have the same issue.
How can I use DIGITS with the AWS AMI “DIGITS 6 on Ubuntu 16.04”?

When creating a new dataset, I get the error message “Folder does not exist or is not reachable”.