Local /share directory between jobs on compute nodes

Dear users,

We are pleased to inform you that a new feature has been deployed.

At each Job Prolog, Slurm creates a /share directory on the compute node's local disk. You can read from and write to this directory, and it is accessible to all of your jobs running on the same node. The /share directory is primarily located on SSD storage, which gives high I/O performance. Please note that this storage is cleaned up during the Job Prolog if there are no other files associated with your account.

The shared directory can be accessed using the following path: USR_DIR=/share/users/${SLURM_JOB_USER:0:1}/${SLURM_JOB_USER}
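To illustrate how this path is built: `${SLURM_JOB_USER:0:1}` is a bash substring expansion that keeps only the first character of the user name, so directories are grouped by initial letter. Below is a minimal sketch; the helper name `user_share_dir` and the user name `jdoe` are hypothetical and only serve the example.

```shell
#!/bin/bash

# Hypothetical helper: print the /share path for a given user name,
# following the layout /share/users/<first letter>/<user name>.
user_share_dir() {
  local user="$1"
  printf '/share/users/%s/%s\n' "${user:0:1}" "${user}"
}

user_share_dir "jdoe"   # prints /share/users/j/jdoe
```

Inside a job you would pass "${SLURM_JOB_USER}" instead of a fixed name.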

Here is an example bash script that transfers data to this shared space and ensures the copy is made only once. Other running jobs wait until the copy is completed.

#!/bin/bash

share_dir="/share/users/${SLURM_JOB_USER:0:1}/${SLURM_JOB_USER}"
start_lock="${share_dir}/start_lock"
end_lock="${share_dir}/end_lock"

# Check if copy is already completed
[[ -f "${end_lock}" ]] && exit 0

# Try to create the start lock atomically: with noclobber the redirection
# fails if the file already exists, so only one job can write its ID
( set -o noclobber; echo "$SLURM_JOB_ID" > "${start_lock}" ) 2>/dev/null

# Read the lock file: if it contains this job's ID, this job runs the copy,
# which guarantees that only one rsync runs at a time
if [[ "$(cat "${start_lock}")" == "$SLURM_JOB_ID" ]]; then
  # This job owns the lock, so the copy has not started yet
  # FIXME: insert your copy commands here (e.g. rsync)
  echo "$SLURM_JOB_ID" > "${end_lock}"
else
  # rsync has already started => wait until it's done
  while [[ ! -f "${end_lock}" ]]; do
    sleep 60
  done
fi
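
One limitation of the lock-file approach above is that if the copying job is killed before it writes end_lock, the other jobs keep waiting. A possible alternative, which is not part of the announced setup, is the `flock` command from util-linux: the lock is released automatically when the process holding it exits. The sketch below assumes `flock` is installed on the compute nodes; the function name `stage_once`, the file names `.copy_lock` and `.copy_done`, and the `data` subdirectory are all hypothetical.

```shell
#!/bin/bash

# stage_once SRC_DIR DEST_DIR
# Hypothetical helper: copy the contents of SRC_DIR into DEST_DIR/data
# at most once, even when several jobs call it at the same time.
stage_once() {
  local src="$1" dest="$2"
  mkdir -p "${dest}" || return 1
  (
    # Block until this process holds an exclusive lock on descriptor 9.
    flock -x 9 || exit 1
    # The first job to get the lock copies; later jobs see the marker
    # file and skip the copy.
    if [[ ! -f "${dest}/.copy_done" ]]; then
      mkdir -p "${dest}/data" &&
        cp -a "${src}/." "${dest}/data/" &&
        touch "${dest}/.copy_done"
    fi
  ) 9>"${dest}/.copy_lock"
}
```

In a job script this could be called as `stage_once "$SRC" "/share/users/${SLURM_JOB_USER:0:1}/${SLURM_JOB_USER}"`, where `SRC` is a variable you set to your own source directory. If the copy fails, the marker file is not created, so the next job that takes the lock tries the copy again.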

Best Regards,
