Hi all,
I get an error whenever my job is launched on node cpu116 of the shared-cpu partition on Yggdrasil.
I don’t know why, but it seems to happen only on cpu116: the same job runs without problems on the other nodes of the shared-cpu partition.
Here is the batch script I am using to launch my job:
#!/bin/sh
#SBATCH --partition=shared-cpu
#SBATCH --ntasks=10
#SBATCH --time=10:00:00
#SBATCH --mail-type=ALL
#SBATCH -o slurm_map.%j.out # STDOUT
#SBATCH -e slurm_map.%j.err # STDERR
echo "Starting at $(date)"
echo "Running on hosts: $SLURM_NODELIST"
echo "Running on $SLURM_NNODES nodes."
echo "Running on $SLURM_NPROCS processors."
echo "Current working directory is $(pwd)"
echo ""
echo "***** LAUNCHING *****"
date '+%F %H:%M:%S'
echo ""
# load Anaconda and OpenMPI
module load Anaconda3
module load foss
echo "Loaded Anaconda3 and foss"
echo ""
srun python3 -m cobaya run input_cmb_lensing_map_fullsky.yaml
echo ""
echo "***** DONE *****"
echo `date '+%F %H:%M:%S'`
echo ""
And here is the error I get when the job starts on cpu116:
[cpu116:135582:0:135582] Caught signal 4 (Illegal instruction: illegal operand)
[cpu116:135584:0:135584] Caught signal 4 (Illegal instruction: illegal operand)
[cpu116:135586:0:135586] Caught signal 4 (Illegal instruction: illegal operand)
[cpu116:135583:0:135583] Caught signal 4 (Illegal instruction: illegal operand)
[cpu116:135585:0:135585] Caught signal 4 (Illegal instruction: illegal operand)
[cpu116:135587:0:135587] Caught signal 4 (Illegal instruction: illegal operand)
[cpu116:135589:0:135589] Caught signal 4 (Illegal instruction: illegal operand)
[cpu116:135590:0:135590] Caught signal 4 (Illegal instruction: illegal operand)
[cpu116:135591:0:135591] Caught signal 4 (Illegal instruction: illegal operand)
[cpu116:135588:0:135588] Caught signal 4 (Illegal instruction: illegal operand)
==== backtrace (tid: 135582) ====
0 0x00000000000213e3 ucs_debug_print_backtrace() /dev/shm/ebbuild/UCX/1.10.0/GCCcore-10.3.0/ucx-1.10.0/src/ucs/debug/debug.c:656
1 0x0000000000104d80 __fileutils_MOD_tfilestream_openfile() ???:0
2 0x00000000000ff789 __fileutils_MOD_tfilestream_open() ???:0
3 0x00000000000feffd __fileutils_MOD_readnextcontentline() ???:0
4 0x0000000000027c60 __config_MOD_checkloadedhighltemplate() ???:0
5 0x00000000000069dd ffi_call_unix64() :0
6 0x0000000000006067 ffi_call_int() ffi64.c:0
7 0x000000000001097a _call_function_pointer() /usr/local/src/conda/python-3.8.3/Modules/_ctypes/callproc.c:871
8 0x000000000001097a _ctypes_callproc() /usr/local/src/conda/python-3.8.3/Modules/_ctypes/callproc.c:1199
9 0x00000000000110db PyCFuncPtr_call() /usr/local/src/conda/python-3.8.3/Modules/_ctypes/_ctypes.c:4201
10 0x000000000013d25f _PyObject_MakeTpCall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:159
11 0x00000000001c15e5 _PyObject_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Include/cpython/abstract.h:125
12 0x00000000001c15e5 _PyEval_EvalFrameDefault() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:3500
13 0x000000000020a04d function_code_fastcall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:283
14 0x00000000000ff819 _PyObject_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Include/cpython/abstract.h:127
15 0x00000000000ff819 call_function() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:4963
16 0x00000000000ff819 _PyEval_EvalFrameDefault() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:3500
17 0x000000000018a2a2 _PyEval_EvalCodeWithName() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:4298
18 0x000000000018b054 PyEval_EvalCodeEx() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:4327
19 0x00000000002195bc PyEval_EvalCode() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:718
20 0x000000000024e6f3 builtin_exec_impl.isra.14() /tmp/build/80754af9/python_1593706424329/work/Python/bltinmodule.c:1033
21 0x000000000024e6f3 builtin_exec() /tmp/build/80754af9/python_1593706424329/work/Python/clinic/bltinmodule.c.h:396
22 0x0000000000140039 cfunction_vectorcall_FASTCALL() /tmp/build/80754af9/python_1593706424329/work/Objects/methodobject.c:422
23 0x000000000013ca41 PyVectorcall_Call() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:199
24 0x00000000001c6611 do_call_core() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:4983
25 0x00000000001c6611 _PyEval_EvalFrameDefault() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:3559
26 0x000000000018a2a2 _PyEval_EvalCodeWithName() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:4298
27 0x000000000018b243 _PyFunction_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:435
28 0x00000000000ff58e _PyObject_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Include/cpython/abstract.h:127
29 0x00000000000ff58e call_function() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:4963
30 0x00000000000ff58e _PyEval_EvalFrameDefault() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:3469
31 0x000000000018b16b function_code_fastcall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:283
32 0x000000000018b16b _PyFunction_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:410
33 0x00000000000ff56d _PyObject_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Include/cpython/abstract.h:127
34 0x00000000000ff56d call_function() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:4963
35 0x00000000000ff56d _PyEval_EvalFrameDefault() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:3486
36 0x000000000018b16b function_code_fastcall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:283
37 0x000000000018b16b _PyFunction_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:410
38 0x00000000000ff819 _PyObject_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Include/cpython/abstract.h:127
39 0x00000000000ff819 call_function() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:4963
40 0x00000000000ff819 _PyEval_EvalFrameDefault() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:3500
41 0x000000000018b16b function_code_fastcall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:283
42 0x000000000018b16b _PyFunction_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:410
43 0x00000000000ff819 _PyObject_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Include/cpython/abstract.h:127
44 0x00000000000ff819 call_function() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:4963
45 0x00000000000ff819 _PyEval_EvalFrameDefault() /tmp/build/80754af9/python_1593706424329/work/Python/ceval.c:3500
46 0x000000000018b16b function_code_fastcall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:283
47 0x000000000018b16b _PyFunction_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:410
48 0x000000000007e299 _PyObject_Vectorcall() /tmp/build/80754af9/python_1593706424329/work/Include/cpython/abstract.h:127
49 0x000000000007e299 _PyObject_FastCall() /tmp/build/80754af9/python_1593706424329/work/Include/cpython/abstract.h:147
50 0x000000000007e299 object_vacall() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:1186
51 0x000000000017d397 _PyObject_CallMethodIdObjArgs() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:1244
52 0x000000000012f786 import_find_and_load() /tmp/build/80754af9/python_1593706424329/work/Python/import.c:1698
53 0x000000000012f786 PyImport_ImportModuleLevelObject() /tmp/build/80754af9/python_1593706424329/work/Python/import.c:1798
54 0x00000000001c7eda builtin___import__() /tmp/build/80754af9/python_1593706424329/work/Python/bltinmodule.c:279
55 0x000000000017f706 cfunction_call_varargs() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:742
56 0x000000000017f706 PyCFunction_Call() /tmp/build/80754af9/python_1593706424329/work/Objects/call.c:772
=================================
srun: error: cpu116: task 0: Illegal instruction
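In case it helps narrow this down: signal 4 (SIGILL) means the CPU was asked to execute an instruction it does not support, and the backtrace shows it being raised inside a compiled routine (`__fileutils_MOD_tfilestream_openfile`) called from Python through ctypes. One possible explanation, which is only an assumption at this point, is that the library was compiled on a machine with newer instruction-set extensions than cpu116 provides. Below is a minimal sketch for comparing the CPU flags of two machines. The helper names `cpu_flags` and `missing_flags` and the file names in the usage comments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for comparing CPU feature flags between two machines.

# Print the CPU feature flags found in a cpuinfo-style file
# (defaults to /proc/cpuinfo), one per line, sorted and de-duplicated.
cpu_flags() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" \
        | cut -d: -f2 \
        | tr ' ' '\n' \
        | sed '/^$/d' \
        | LC_ALL=C sort -u
}

# Print the flags present in the first cpuinfo file but absent from the second.
missing_flags() {
    flags_a=$(mktemp)
    flags_b=$(mktemp)
    cpu_flags "$1" > "$flags_a"
    cpu_flags "$2" > "$flags_b"
    # comm -23 keeps only the lines that are unique to the first file
    LC_ALL=C comm -23 "$flags_a" "$flags_b"
    rm -f "$flags_a" "$flags_b"
}

# Usage sketch (file names are placeholders):
#   cat /proc/cpuinfo > cpuinfo_build.txt      # on the machine where the code was compiled
#   srun --partition=shared-cpu --nodelist=cpu116 --ntasks=1 \
#        cat /proc/cpuinfo > cpuinfo_cpu116.txt
#   missing_flags cpuinfo_build.txt cpuinfo_cpu116.txt
```

If the last command lists extensions such as AVX2 or AVX-512 flags, that would be consistent with the assumption above, though it would not prove it. As a stopgap while this is investigated, adding `#SBATCH --exclude=cpu116` to the batch script should keep Slurm from scheduling the job on that node.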