Hi,
please see comments below about your sbatch script.
Line6: you can remove #SBATCH --nodes=1. It is only needed if you want more than one compute node (for a good reason).
Line10: be sure to only request this option if default memory isn’t enough. If you don’t know, you can remove it.
Line18: if you use our version of netCDF, why do you still need your virtualenv? Since you activate your virtualenv without any Python module loaded, you get the system's default Python, which is quite old. You probably need to remove this line.
Line21: you are loading GCC 13.2
Line22: you are loading GCC 11.3 => this will unload some of the modules loaded on the previous line!
Line27: you are loading netcdf4-python but without the required dependencies, thus it fails.
Rule of thumb: load all your modules for a single GCC version, i.e. choose one and stick to it.
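One way to catch a mixed toolchain before submitting is to count the distinct GCC versions among the modules you plan to load. This is only a minimal sketch: the function name count_gcc_versions is made up for illustration, the exact module names (GCC/13.2.0, GCC/11.3.0) are assumed from the versions mentioned above, and it only looks at names of the form GCC/<version>.

```shell
# Hypothetical helper (not a cluster tool): count distinct GCC/<version>
# entries in a list of module names given as arguments.
count_gcc_versions() {
    local n
    # One module per line, keep only GCC/<version>, de-duplicate, count lines.
    n=$(printf '%s\n' "$@" | { grep -E '^GCC/' || true; } | sort -u | wc -l)
    # Arithmetic expansion strips any padding that wc adds.
    echo "$((n))"
}

# Module names assumed from the two loads discussed above: prints 2 (conflict).
count_gcc_versions GCC/13.2.0 GCC/11.3.0

# A single toolchain: prints 1.
count_gcc_versions GCC/10.3.0 OpenMPI/4.1.1 netcdf4-python/1.5.7 xarray/0.19.0
```

Anything greater than 1 means a later load will swap the compiler and unload modules built for the earlier one.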
Pandas is now provided by the module SciPy-bundle. You can see what this module provides like this:
(baobab)-[sagon@login1 ~]$ ml spider SciPy-bundle/2023.07
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SciPy-bundle: SciPy-bundle/2023.07
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Description:
Bundle of Python packages for scientific software
You will need to load all module(s) on any one of the lines below before the "SciPy-bundle/2023.07" module is available to load.
GCC/12.3.0
Help:
Description
===========
Bundle of Python packages for scientific software
More information
================
- Homepage: https://python.org/
Included extensions
===================
beniget-0.4.1, Bottleneck-1.3.7, deap-1.4.0, gast-0.5.4, mpmath-1.3.0,
numexpr-2.8.4, numpy-1.25.1, pandas-2.0.3, ply-3.11, pythran-0.13.1,
scipy-1.11.1, tzdata-2023.3, versioneer-0.29
NumPy is included in the same module as well. SciPy-bundle is also a dependency of netcdf4-python/1.5.7, so there is no need to load it explicitly: it is loaded automatically. You just need to add xarray/0.19.0 and that’s all.
New sbatch:
#!/bin/bash
#SBATCH --job-name=python # Job name
#SBATCH --output=python_output_%j.log # Output file (log of your job)
#SBATCH --error=python_error_%j.log # Error file (log of errors)
#SBATCH --partition=private-gap-cpu # Partition (change if needed)
#SBATCH --ntasks=1 # Number of tasks (1 task for serial jobs)
#SBATCH --cpus-per-task=4 # Number of CPU cores per task
#SBATCH --time=00:10:00 # Maximum run time (hh:mm:ss)
# Load any necessary modules (adjust to your cluster's environment)
module purge # Clear any previously loaded modules
# load Numpy, xarray, netcdf4-python, pandas
ml GCC/10.3.0 OpenMPI/4.1.1 netcdf4-python/1.5.7 xarray/0.19.0
# Run your Python script
srun python3 /home/users/u/utkina1/scratch/test.py
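Before submitting, it may be worth checking in an interactive shell that the loaded modules really provide the packages your script imports. A minimal sketch, assuming python3 is on your PATH after the ml line; check_imports is a name made up for this example, and the import names are the usual ones for these packages (netcdf4-python is imported as netCDF4).

```shell
# Hypothetical helper: report which Python packages can be imported
# with the python3 currently on PATH.
check_imports() {
    local pkg
    for pkg in "$@"; do
        if python3 -c "import $pkg" 2>/dev/null; then
            echo "ok: $pkg"
        else
            echo "missing: $pkg"
        fi
    done
}

# Run this after the ml line from the script above.
check_imports numpy pandas xarray netCDF4
```

If any line says missing, the module set is incomplete or a different Python is being picked up.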