Dear HPC team,
Recently I have had problems with jobs on baobab running from the home storage.
My jobs run for some time without issues, but then seem to stop at random, so abruptly that both the job's own log files and the slurm out file end without any information as to why the job stopped.
As a result, the job does not write the files it needs in order to be continued.
Further dependency jobs within the same directory then fail within one second without generating a slurm out file at all.
Could you provide some information or help with this, please?
This has me stumped, as I cannot even determine whether the cause is the content/commands within the job or a cluster-side issue.
An example job would be: slurm-1098047.out
At: /home/users/h/hankea/folding_bh3/oneopes
Thank you for your help!
Best wishes,
Anton Hanke