Howto access external storage from Baobab

This content has been migrated to the user documentation: hpc:storage_on_hpc [eResearch Doc]

If you need to mount an external share (NAS) on Baobab from the command line, you can proceed as follows.

Launch a D-Bus session:

dbus-launch bash

Mount the share (SMB in this example):

gio mount smb://server_name/share_name

This will prompt for the username/password/domain. If you are connecting to a UNIGE network share such as the NASAC, use your ISIS credentials; the domain is ISIS.

The share will be mounted under

/run/user/your_uid/gvfs/

You can access the files using standard POSIX tools such as cp, ls, etc. If you face an error when accessing a specific file, you can use gio copy as a replacement for cp, which seems to work better; the same applies to the gio equivalents of ls, etc.
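
For example, once the share is mounted, to copy a single file (hypothetical file names):

gio copy smb://server_name/share_name/some_file.dat ~/some_file.dat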

When you don’t need to access the data anymore, you may unmount the share:

gio mount -u smb://server_name/share_name

:warning: The data are only available on the login2 node. If you need to access the data on the compute nodes, you need to mount the share there as well, from your sbatch script.
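
A rough sketch of what such an sbatch script could look like (untested; server_name, share_name and the target directory are placeholders, and it relies on the credentials file described below):

#!/bin/sh
#SBATCH --job-name=mount-and-copy
#SBATCH --time=00:15:00

# start a private D-Bus session and run everything inside it
dbus-launch bash <<'EOF'
# non-interactive mount, fed by the credentials file described below
gio mount smb://server_name/share_name < "$HOME/.credentials"
# copy what the job needs (the gvfs subdirectory name depends on server/share)
rsync -a "/run/user/$(id -u)/gvfs/"smb-share*/ "$HOME/scratch/input/"
gio mount -u smb://server_name/share_name
EOF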

If you need to script this, you can put your credentials in a file in your home directory.

Example content of .credentials:

username
domain
password
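
Since this file contains your password in clear text, it is a good idea to restrict its permissions first:

chmod 600 ~/.credentials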

Mount using credentials in a script:

gio mount smb://server_name/share_name < .credentials

update: do not use .netrc as the filename, as it may conflict with other software such as wget, which expects a different format.
update: corrected the credentials example.
update: added the domain name for the NASAC.
update: use gio copy.


Does this also work for SFTP?

I’d like to download a lot of files from http://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/human-gut/2019_09/all_genomes/

I'd also like to be sure that all files have been copied correctly. What is the best way to do that?
Mount and rsync?

Yes, it works. I tried:

dbus-launch bash
gio mount ftp://ftp.ebi.ac.uk
mkdir db
cd db
rsync -a ~/.gvfs/ftp\:host\=ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/human-gut/2019_09/all_genomes/ .

Please note that the mount point is in your home directory in a directory named .gvfs.

I'm not sure it's the most efficient way to proceed, but it seems to work.
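
To answer the verification question: re-running the same rsync as a checksum-based dry run lists every file that still differs between source and destination, so an empty output means the copy is complete (a sketch, using the same paths as above):

rsync -anc --itemize-changes ~/.gvfs/ftp\:host\=ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/human-gut/2019_09/all_genomes/ .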

I got gio: ftp://ftp.ebi.ac.uk: volume doesn’t implement mount

Could you download it and put it in my screen folder?

Did you follow my explanation, especially the dbus-launch bash command?
Then, in the same terminal in which you launched dbus, you run the gio mount.

I wasn't talking about a screen folder, I was talking about the screen command. You may as well use tmux… or nothing, and run the commands directly in your shell. TIMTOWTDI

I created a new screen and copy-pasted your commands, especially the first one.

I checked your .bashrc: you have a custom PATH, and this PATH contains another version of gio.

Please try like that:

dbus-launch bash
/usr/bin/gio mount ftp://ftp.ebi.ac.uk
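
To check which gio binary your shell actually resolves, you can run:

type -a gio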

This works now.

I mounted the FTP server, but I don't know where.

(base) [kiesers@login2 ~]$ /usr/bin/gio mount -l

Drive(0): INTEL SSDPEDMD400G4
  Type: GProxyDrive (GProxyVolumeMonitorUDisks2)
Drive(1): INTEL SSDPEDMD400G4
  Type: GProxyDrive (GProxyVolumeMonitorUDisks2)
Drive(2): WDC WD1003FBYX-01Y7B0
  Type: GProxyDrive (GProxyVolumeMonitorUDisks2)
Drive(3): TSSTcorp CDDVDW SN-208FB
  Type: GProxyDrive (GProxyVolumeMonitorUDisks2)
Mount(0): ftp.ebi.ac.uk -> ftp://ftp.ebi.ac.uk/
  Type: GDaemonMount

I searched on the internet and found that different systems and gio versions use different locations, but I can't find the mount point in ~/.gvfs nor in /run/user/327199/gvfs/.

Any idea how to find it?

Hi there,

The process that lets you access GVfs/Gio shares via the CLI is called gvfsd-fuse, and its first argument is the folder where the GVfs/Gio shares are exposed.

You can find that folder with the following command:

ps ux | grep -e '[g]vfsd-fuse'
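
If you need that folder programmatically, a quick sketch (it assumes the default ps ux column layout, where the mount folder is the 12th field):

ps ux | awk '/[g]vfsd-fuse/ {print $12}'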

However, your default environment is not clean at all, given that you initialize Conda in your .bashrc; thus, as in the case of the gio binary (cf. Howto access external storage from Baobab - #7 by Yann.Sagon), something else could interfere with the process. It is preferable not to mess with the shell startup and instead load everything you need once you are at the shell prompt.

Please try again after completely removing the Conda stuff from your .bashrc, and come back if the problem is still there.

Thx, bye,
Luca

Hello

I am wondering if I may intervene in the discussion; I have a closely related question.
I would like to mount a NAS, for example by FTP (but not necessarily).
The instructions work well on login2 (and I can also use sshfs, for example), but I've got trouble on the compute nodes.

I understood from the quote below that it is certainly supposed to work:

but it seems the nodes (I am trying on node001) do not have dbus-launch.
I can find it in the DBus/1.13.8 module, but it complains that XDG_RUNTIME_DIR is not set, and then gio mount ftp://isdcarc.unige.ch/arc/rev_3/ reports volume doesn't implement mount.

To be clear: the exact same procedure that works on login2 does not work on node001.

What am I doing wrong?

Also, is this the recommended way to mount storage on Baobab? I see there is a collection of seemingly project-specific mounts; would it be possible to go that way?

Cheers

Volodymyr

Hi there,

Given how tightly DBus is nowadays coupled to the underlying OS, it is better to have it installed via RPMs, which is done on all nodes.

This should be set up by systemd upon login (via SSH too), but it seems that this is not the case on the compute nodes; I need to investigate a bit more.

We usually add project-specific mounts when they are accessed by several users; we are not going to add a project-specific mount for a single user, sorry.

Thx, bye,
Luca


Hi Luca,

Thanks!

How should I best follow this issue? Is this thread enough, or does it work as a ticket system?

Sure, I see.
What are the requirements for a project to be adopted? How many users would justify the mount?
I am part of two projects and would speak with the project PIs about using the cluster more broadly (and hence bringing more users), but it is useful to know first what is feasible.

Is it in principle possible to mount NFS shares from hosts in the Department of Astronomy? Would performance suffer too much? Would they be mounted on all nodes?

Thanks

Volodymyr

Dear Volodymyr,

as we now have quite a big scratch storage, we really prefer that computation is done on data hosted on scratch rather than on a remote path. A straightforward workflow is to copy your remote data to Baobab's scratch space and then use it from the compute nodes.

Dear Luca,

This was my immediate plan.
There is the complication of having to synchronize the storage frequently, but I guess it is manageable.

A related issue: what do I do if I want to share data with some users but not all? Can there be project groups? Should I use ACLs?

Regards

Volodymyr

Dear,

I experience a lot of trouble with gio when transferring files from NASAC storage such as smb://nasac-m2.unige.ch/ or smb://nasac-evs4.unige.ch/ to ~/scratch.

One important limitation is that it is impossible to recursively copy a directory with gio copy. Also, if we don't use gio copy, the cp and tar commands systematically fail.

Do you have any usable alternative?
What would be your suggestion if we need to frequently analyse projects and data stored on such external storage?

Thank you for your help,

Julien

Hi,

cp isn't well suited to handling network errors and won't retry on failure, etc.

You should be fine using, for example, rsync.
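
A minimal sketch for the NASAC case (the paths are placeholders and the exact gvfs subdirectory name depends on the server and share):

dbus-launch bash
gio mount smb://nasac-m2.unige.ch/your_share < ~/.credentials
# rsync copies recursively and can simply be re-run to resume an interrupted transfer
rsync -av "/run/user/$(id -u)/gvfs/smb-share:server=nasac-m2.unige.ch,share=your_share/" ~/scratch/your_share/
gio mount -u smb://nasac-m2.unige.ch/your_share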

Hi @Luca.Capello, I found out I had the same issue as @Silas.Kieser.
My gio command was pointing to the one in Anaconda, so using /usr/bin/gio allowed me to fix the “doesn’t implement mount” issue.
Now I can mount the server, but I can't find where it is. The ps + grep command returns the path, but the directory is empty. There is also no ~/.gvfs directory. I cleaned my .bashrc of the Anaconda initialization as you suggested, but the gvfs directory is still empty. What am I missing?

(base) [savardg@login1.yggdrasil ~]$ nano ~/.bashrc
(base) [savardg@login1.yggdrasil ~]$ source ~/.bashrc
[savardg@login1.yggdrasil ~]$ dbus-launch bash
[savardg@login1.yggdrasil ~]$ /usr/bin/gio mount smb://scinas-s-detco.unige.ch/s-detco-crude/Share
Password required for share s-detco-crude on scinas-s-detco.unige.ch
User [savardg]: 
Domain [SAMBA]: ISIS
Password: 
[savardg@login1.yggdrasil ~]$ ls /run/user/398211/gvfs
[savardg@login1.yggdrasil ~]$ ps ux | grep -e '[g]vfsd-fuse'
savardg   60377  0.0  0.0 454828   280 ?        Sl   Jan31   0:00 /usr/libexec/gvfsd-fuse /run/user/398211/gvfs -f -o big_writes

Hi @Genevieve.Savard

It seems there are old processes from January 31. Could you kill/terminate all those processes and try again following the procedure?

(yggdrasil)-[root@login1 ~]$ ps aux | grep dbus | grep savar
savardg   59942  0.0  0.0   8936     0 ?        S    Jan31   0:00 /home/users/s/savardg/anaconda3/bin/dbus-run-session /etc/x2go/Xsession
savardg   59943  0.0  0.0  15972   772 ?        S    Jan31   0:06 dbus-daemon --nofork --print-address 4 --session
savardg   60941  0.0  0.0  60060   148 ?        S    Jan31   0:00 /usr/bin/dbus-daemon --config-file=/usr/share/defaults/at-spi2/accessibility.conf --nofork --print-address 3
savardg   91860  0.0  0.0  60060  1456 ?        Ss   May10   0:00 /usr/bin/dbus-daemon --fork --print-pid 4 --print-address 6 --session
savardg  252892  0.0  0.0  15628   256 ?        Ss   May06   0:00 /home/users/s/savardg/anaconda3/bin/dbus-daemon --syslog --fork --print-pid 4 --print-address 6 --session
savardg  264062  0.0  0.0  15604   892 ?        Ss   May10   0:00 /home/users/s/savardg/anaconda3/envs/noisepy/bin/dbus-daemon --syslog --fork --print-pid 4 --print-address 6 --session
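
For reference, a rough way to clean such leftovers up (not an official procedure; double-check with ps before killing anything):

# list your own leftover session daemons first
ps ux | grep -e 'dbus' -e 'gvfs'
# then terminate them and start fresh
pkill -u "$USER" dbus-daemon
pkill -u "$USER" -f gvfsd
dbus-launch bash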

That was the culprit! Now it works, thanks! :slight_smile:
No idea why this Jan 31 process was still running…