Mounting nasac on baobab: dbus-launch bash fails

I tried to mount a new NASAC share on baobab and, following the instructions in hpc:storage_on_hpc [eResearch Doc],
I ran:

$ dbus-launch bash

but this triggers the error message:

dbus-daemon[154590]: Failed to start message bus: Failed to bind socket “/tmp/eb-240s9w15/dbus-BhXJEhGkTK”: No such file or directory
EOF in dbus-launch reading address from bus daemon

This is happening on a cluster node on baobab.

Also, the documentation is not very clear about smb://server_name/share_name. Could you give a typical example in the documentation? I asked for a standard NASAC share and was told (in French) "I created the share \isis.unige.ch\nasac\gsem\name..." What are `server_name` and `share_name` here?

Thanks!

Dear @Matthieu.Stigler

this is unfortunately a known issue ([2024] Current issues on HPC Cluster - #15 by Adrien.Albert) that is still not solved.

If you want to copy data from NASAC to the cluster, you may want to try mounting the cluster storage on your desktop/server and copying from there. Not very efficient, I must admit :(

Best

Yann

Hello @Matthieu.Stigler

To briefly summarize, the issue is caused by a Mellanox driver update that disables the CIFS kernel module required for mounting SMB shares.

Here is the mention in the official Mellanox known issues (2657392):

Description: OFED installation caused CIFS to break in RHEL 8.4 and above. A dummy module was added so that CIFS would be disabled after the OFED installation in RHEL 8.4 and above.

We tried to apply a workaround based on the very limited information from Mellanox-NVIDIA, but it didn’t work. Mellanox-NVIDIA support has been mostly quiet on this issue.

I’m currently working on another clue, but I can’t guarantee a solution, as dealing with the kernel can be quite tricky.

Thanks for following up on this, I really appreciate it! I hope a workaround can be found, and especially that Mellanox responds to this!

Hi @Matthieu.Stigler

There is little chance of getting an answer from Mellanox, as they consider this out of their scope…

But the good news is :tada: my workaround is working; we still need to test the robustness of this patch.

If you are interested in testing this patch, please send an email to HPC.

PS: Note that the module was deactivated by Mellanox for a good reason, so we're not immune to unexpected behaviour.
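
One way to see whether the dummy module is in place on a given node is to inspect the kernel module state directly. This is a sketch, not an official diagnostic; it relies only on standard Linux tools (`/proc/modules` and `modinfo`):

```shell
# Check whether the CIFS kernel module is currently loaded
grep -w cifs /proc/modules && echo "cifs is loaded" || echo "cifs is not loaded"

# Show which module file modprobe would use; after an OFED install this
# may point to the dummy stub rather than the kernel's real cifs.ko
modinfo -n cifs 2>/dev/null || echo "cifs module metadata not available"
```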

Hi @Adrien.Albert,

Thanks for the update on addressing this pernicious issue.
That’s good news for all users of the HPC who also depend upon the NASAC for sharing data across our teams.
I’d like to try your patch, but it’s not clear which HPC email address you would like us to write to (hpc or hpc-admin…?)

More generally: given that this appears to be essential functionality, but you say it has been disabled by Mellanox for good reason, what alternative best practice can you propose for sharing data sustainably?

Thanks in advance!
Alexis

Hi All

here is the news about the workaround :wink:

great, thanks a lot Adrien!

Were you able to deploy the patch on the on-demand Docker images too, or is that still an issue? Thanks!

As the patch could be unstable, we have limited its deployment to the login node to ensure there is no impact on production.

I tried some basic adjustments, but unfortunately they were not successful.

As a workaround has been implemented, we won’t be exploring this fix any further. While I understand the convenience of directly mounting the share in Singularity images on compute nodes, the time investment required to avoid a simple data copy on the cluster is too significant. Furthermore, I was unable to find any relevant documentation on this approach.

As an alternative, we've discussed converting your CIFS share to NFS. This would allow us to easily mount your share across all compute nodes. Have you checked with the storage team which option is best?
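
For comparison, an NFS export is mounted with an ordinary fstab entry on each node, with no per-user daemon involved. The server, export path, and mount point below are purely hypothetical, for illustration of what the storage team would configure:

```
# Hypothetical /etc/fstab entry -- server, export path, and mount point
# are illustrative assumptions, not the actual NASAC configuration
isis.unige.ch:/export/share  /mnt/share  nfs  defaults,_netdev  0  0
```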

Hello @Matthieu.Stigler

Good News! :tada:

Dear HPC Team,

We are trying to mount the NASAC on bamboo using the same procedure we use on baobab. The procedure works on baobab (we can access our data in ~/.gvfs/), but we are unable to locate the mount point when the procedure is repeated on bamboo.

More specifically, we are using these commands:

dbus-launch bash
gio mount 'smb://ISIS;prados@nasac-m2.unige.ch/m-BioinfoSupport'

The authentication and mounting processes seem to have gone well, judging by the running processes:

(bamboo)-[prados@login1 ~]$ ps -u prados
    PID TTY          TIME CMD
3749519 ?        00:00:00 sshd
3749520 pts/13   00:00:00 bash
3764679 pts/13   00:00:00 bash
3764683 ?        00:00:00 dbus-daemon
3764753 ?        00:00:00 gvfsd
3764760 ?        00:00:00 gvfs-udisks2-vo
3764768 ?        00:00:00 gvfsd-smb
3765448 pts/13   00:00:00 ps

However, we are not able to find the mount point! The files are not accessible in the usual locations (~/.gvfs and /var/run/user/), and the mount point is not listed by findmnt.

Could you please help us locate the mount point?

Thanks a lot,
Julien

(bamboo)-[alberta@login1 ~]$ dbus-launch bash
(bamboo)-[alberta@login1 ~]$ gio  mount   smb://isis.unige.ch/nasac/hpc_exchange/backup < .credentials 
Authentication Required
Enter user and password for share “nasac” on “isis.unige.ch”:
User [alberta]: Domain [SAMBA]: Password: 

(bamboo)-[alberta@login1 ~]$ gio mount --list
Drive(0): SAMSUNG MZ7L3480HBLT-00A07
  Type: GProxyDrive (GProxyVolumeMonitorUDisks2)
Drive(1): SAMSUNG MZ7L3480HBLT-00A07
  Type: GProxyDrive (GProxyVolumeMonitorUDisks2)
Mount(0): nasac on isis.unige.ch -> smb://isis.unige.ch/nasac/
  Type: GDaemonMount

Following the documentation: hpc:storage_on_hpc [eResearch Doc]

(bamboo)-[alberta@login1 ~]$  ps ux | grep -e '[g]vfsd-fuse'
alberta   526981  0.0  0.0 349216  2048 ?        Sl   Dec05   0:00 /usr/libexec/gvfsd-fuse /run/user/401775/gvfs -f

(bamboo)-[alberta@login1 ~]$ ls /run/user/401775/gvfs/smb-share\:server\=isis.unige.ch\,share\=nasac/hpc_exchange/backup/
titi  toto
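
The lookup above can be generalized into a small sketch that works for any user; the path layout is taken from the gvfsd-fuse process arguments shown above:

```shell
# Locate the per-user gvfsd-fuse directory where SMB mounts appear
GVFS_DIR="/run/user/$(id -u)/gvfs"
if [ -d "$GVFS_DIR" ]; then
    ls "$GVFS_DIR"
else
    echo "no gvfs directory at $GVFS_DIR (is gvfsd-fuse running?)"
fi
```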


With the gio command:

(bamboo)-[alberta@login1 ~]$ gio mount --list smb://isis.unige.ch/nasac/ -i
[...]
Mount(0): nasac on isis.unige.ch -> smb://isis.unige.ch/nasac/
  Type: GDaemonMount
  default_location=smb://isis.unige.ch/nasac/hpc_exchange/backup <=== here is the relevant information about the mount point
  themed icons:  [folder-remote]  [folder]  [folder-remote-symbolic]  [folder-symbolic]
  symbolic themed icons:  [folder-remote-symbolic]  [folder-symbolic]  [folder-remote]  [folder]
  can_unmount=1
  can_eject=0
  is_shadowed=0

One more thing to check: be sure to use the correct gio, and not the one provided, for example, by Anaconda.

(bamboo)-[root@login1 ~]$ which gio
/usr/bin/gio
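
A quick sketch to confirm that the system gio is the one being picked up (Anaconda environments can shadow it on the PATH with their own build):

```shell
# Show which gio is first in PATH; the system one should be /usr/bin/gio
command -v gio || echo "gio not found in PATH"

# Calling the system binary explicitly side-steps any PATH shadowing
/usr/bin/gio version 2>/dev/null || echo "/usr/bin/gio not available"
```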

Dear all,
Now that bamboo and baobab have been updated, I observe the same behaviour on both.

(baobab)-[delislel@login1 ~]$ gio mount smb://nasac-m2.unige.ch/m-AndreyLab    
Authentication Required
Enter user and password for share “m-andreylab” on “nasac-m2.unige.ch”:
User [delislel]: 
Domain [SAMBA]: ISIS
Password: 
(baobab)-[delislel@login1 ~]$ ls .gvfs
ls: cannot access '.gvfs': No such file or directory
(baobab)-[delislel@login1 ~]$ ls /run/user/
240477

I cannot find where it is mounted but I see what is inside:

(baobab)-[delislel@login1 ~]$ gio list smb://nasac-m2.unige.ch/m-andreylab
#SHARE

With another NASAC share I can even go into subdirectories:

(baobab)-[delislel@login1 ~]$ gio list smb://nasac-m2.unige.ch/m-gherrera/GHerrera
directory1
directory2
LucilleDelisle

And I know that I can copy with:

(baobab)-[delislel@login1 ~]$ gio copy smb://nasac-m2.unige.ch/m-gherrera/LucilleDelisle/xxx/xxx/results_20241211/reports/X9_report-cutadapt.txt ./

But for the '#SHARE' entry, gio list is doing a very strange thing:

(baobab)-[delislel@login1 ~]$  gio list smb://nasac-m2.unige.ch/m-andreylab/\#SHARE/\#SHARE/\#SHARE/\#SHARE/
#SHARE

Do you have a solution, either to get a real mount point or at least to be able to list files where the directory is #SHARE?

Thanks

Hi @Lucille.Delisle1

Indeed, due to the maintenance on Baobab, the workaround was no longer effective. Can you try again?

also, did you check:

ls /run/user/${UID}/gvfs

I restarted from scratch:

(baobab)-[delislel@login1 ~]$ dbus-launch bash
(baobab)-[delislel@login1 ~]$ gio mount smb://nasac-m2.unige.ch/m-gherrera
Authentication Required
Enter user and password for share “m-gherrera” on “nasac-m2.unige.ch”:
User [delislel]: 
Domain [SAMBA]: ISIS
Password: 
(baobab)-[delislel@login1 ~]$ ls -alh .gvfs
ls: cannot access '.gvfs': No such file or directory
(baobab)-[delislel@login1 ~]$ ls -alh /run/user/$UID/gvfs
ls: cannot access '/run/user/313457/gvfs': No such file or directory
(baobab)-[delislel@login1 ~]$ 

All results are the same.

Hi @Lucille.Delisle1

Could you give the output of the following command:

$ echo $XDG_RUNTIME_DIR

If it is empty, set the variable with:

 $ export XDG_RUNTIME_DIR=/run/user/$(id -u)
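
The check-and-set above can be combined into one snippet (a sketch; the /run/user path follows the systemd-logind convention seen in the outputs above):

```shell
# dbus-daemon needs XDG_RUNTIME_DIR to create its transient service
# directory; set it from the numeric UID if the variable is empty
if [ -z "$XDG_RUNTIME_DIR" ]; then
    export XDG_RUNTIME_DIR="/run/user/$(id -u)"
fi
echo "XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR"
```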

It seems the issue only occurs on the login node.

Login node:
-------------
(baobab)-[alberta@login1 ~]$ dbus-launch bash
dbus[1533946]: Unable to set up transient service directory: XDG_RUNTIME_DIR "/run/user/401775" not available: No such file or directory
(baobab)-[alberta@login1 ~]$ 
exit

Compute node:
-------------
(baobab)-[alberta@login1 ~]$ salloc
salloc: Granted job allocation 15269387
salloc: Nodes cpu001 are ready for job
(baobab)-[alberta@cpu001 ~]$ dbus-launch bash
(baobab)-[alberta@cpu001 ~]$ gio mount smb://isis.unige.ch/nasac/faculty/alberta-smb
Authentication Required
Enter user and password for share “nasac” on “isis.unige.ch”:
User [alberta]: 
Domain [SAMBA]: ISIS
Password: 
(baobab)-[alberta@cpu001 ~]$ ls /run/user/401775/gvfs
'smb-share:server=isis.unige.ch,share=nasac'

XDG_RUNTIME_DIR does not initialise correctly on the login node.

XDG_RUNTIME_DIR "/run/user/401775" not available: No such file or directory