Abuse of CPU resources
The request-cpus parameter in the ini file can result in request-cpus**2 CPUs being used in condor jobs, leading to accidental abuse of CPU resources.
Observed using the IGWN conda environments with dynesty for local CIT and non-local OSG jobs:
- Condor default behavior sets the environment variable OMP_NUM_THREADS to the value of the request_cpus directive in the submit file [1].
- Bilby/dynesty starts request_cpus processes in a multiprocessing pool.
- When OMP_NUM_THREADS is found in the environment, each of those processes spawns that many threads. (This is a known feature of numpy, but I have yet to confirm which library is actually multithreading.)
- Resulting CPU usage: request_cpus x OMP_NUM_THREADS = request_cpus**2, instead of the intended request_cpus.
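The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the multiplication, not code taken from Bilby or dynesty; the function name expected_thread_count and its arguments are hypothetical. It returns None when OMP_NUM_THREADS is unset, since in that case the threading library picks its own default.

```python
import os


def expected_thread_count(request_cpus, env=None):
    """Estimate total worker threads: pool processes times threads per process.

    Hypothetical helper for illustration only; not part of Bilby or dynesty.
    """
    env = os.environ if env is None else env
    omp_threads = env.get("OMP_NUM_THREADS")
    if omp_threads is None:
        # Unset: the threading library chooses its own default, so no estimate.
        return None
    # One pool process per requested CPU, each spawning OMP_NUM_THREADS threads.
    return request_cpus * int(omp_threads)


# Condor sets OMP_NUM_THREADS to request_cpus, so 4 requested CPUs give 16 threads.
print(expected_thread_count(4, {"OMP_NUM_THREADS": "4"}))
# With OMP_NUM_THREADS=1 the job stays within its request.
print(expected_thread_count(4, {"OMP_NUM_THREADS": "1"}))
```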
To fix: Bilby should set OMP_NUM_THREADS explicitly by including the following line in the submit files for any multi-process jobs:
environment = "OMP_NUM_THREADS=1"
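One possible way to generate that line when writing the submit file is sketched below. The helper name condor_environment_line and the extra_env argument are hypothetical and do not reflect how Bilby actually builds its submit files. The sketch assumes the space-separated, double-quoted form of the condor environment directive and rejects values containing whitespace or quotes rather than attempting to escape them.

```python
def condor_environment_line(extra_env=None):
    """Build a condor 'environment' directive that pins OMP_NUM_THREADS to 1.

    Hypothetical helper for illustration only; not Bilby's implementation.
    """
    env = dict(extra_env or {})
    # Force one thread per process, overriding any caller-supplied value.
    env["OMP_NUM_THREADS"] = "1"
    for key, value in env.items():
        # Values with whitespace or quotes need condor's quoting rules,
        # which this sketch does not implement.
        if any(ch.isspace() or ch in "'\"" for ch in str(value)):
            raise ValueError(f"unsupported characters in value for {key}")
    pairs = " ".join(f"{key}={value}" for key, value in env.items())
    return f'environment = "{pairs}"'


print(condor_environment_line())
print(condor_environment_line({"FOO": "bar"}))
```

Overriding any existing OMP_NUM_THREADS entry, rather than only filling it in when absent, keeps the multi-process case at one thread per process even if a user setting is passed through.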