We run calculations mainly with Q-Chem, which uses a scratch folder to store temporary files needed during the calculation. These files (the whole folder, actually) are usually deleted once the calculation is done. On a couple of occasions now, the files could not be found while the calculation was still running, which gave the following error:
FileMan error: Could not open file FILE_SOL_ENERGY
Path: /home/ricardi/scratch/qchem230284/686.0: Remote I/O error
FileMan error: Could not open file UNKNOWN FILE
Path: /home/ricardi/scratch/qchem230284/10.0: Remote I/O error
rm: No match.
Error: in the serial run
srun: error: cpu088: task 0: Exited with exit code 1
Do you think there was some failure on the scratch space, or is it a problem with the flow of information? I suggest the second possible cause because we also just had another problem where the program used the input from a file with the same name but in a different folder [i.e. path1/hf.in and path2/hf.in], probably with both jobs running at the same time but on different nodes. [I may create a separate ticket for this if the error persists.]
I am not sure how much scratch space we need. Small calculations like this usually take maybe hundreds of MB, so very little.
Yes, we could change the scratch path to use /scratch instead, which I guess would only affect the line:
Your sbatch script is probably incomplete, or you are overriding some parameters on the command line: the number of cores, partition, time limit, etc. are missing.
Did you launch more than one qchem job at a time?
I’m asking because you are specifying a non-dedicated QCSCRATCH directory, which is probably shared with other qchem instances, and according to the documentation, qchem cleans this directory at the end of a successful job.
By the way, if you want to use local scratch, the suggestion is to set another variable, QCLOCALSCR; see the documentation.
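As a rough sketch of what a dedicated QCSCRATCH directory per job could look like in an sbatch script: the helper name setup_job_scratch, the SCRATCH_ROOT variable, and the /tmp fallback are assumptions made for illustration and should be replaced with whatever path the cluster actually provides. SLURM_JOB_ID is the job ID that Slurm sets inside a running job. The same idea would presumably apply to QCLOCALSCR, but check the Q-Chem documentation for how that variable is meant to be used.

```shell
#!/bin/bash
# Hypothetical helper: give this job its own scratch directory so that
# concurrent qchem runs do not share (and clean up) the same folder.
setup_job_scratch() {
    # SCRATCH_ROOT is an assumed variable name; point it at the site's scratch area.
    local root="${SCRATCH_ROOT:-${TMPDIR:-/tmp}}"
    # Inside a Slurm job SLURM_JOB_ID is set; otherwise fall back to the shell PID.
    local job_id="${SLURM_JOB_ID:-$$}"
    QCSCRATCH="$root/qchem_job_$job_id"
    mkdir -p "$QCSCRATCH" || return 1
    export QCSCRATCH
}

setup_job_scratch || exit 1

# Remove only this job's directory when the script exits, even on failure.
trap 'rm -rf "$QCSCRATCH"' EXIT

# The actual calculation would follow here, for example:
# qchem hf.in hf.out
```

With this layout, each job writes to a directory named after its own job ID, so the cleanup at the end of one job should not remove files that another job is still using.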
Yes, the rest of the parameters are given on the command line.
And yes, we run many qchem jobs at the same time.
As you can see in my first message, I was aware of this. So you are suggesting using a different scratch folder for each calculation, and that the best option is to use QCLOCALSCR?
OK, yes, I believe that could solve the issue; we will try it.