skymap jobs being held due to excessive memory usage
We've been seeing skymap jobs getting held. As discussed in recent low latency alert dev calls, we think it's coming from the skymaps generated from posterior samples, because the jobs that produce them use Python multiprocessing instead of OpenMP like the other skymap jobs. We tried raising the memory request to 16GB in !1086 (merged), but even after that I saw a skymap job get put on hold with this message:
100850.3 emfollow-playg 3/22 18:17 Error from slot2@node2072.cluster.ldas.cit: Job has gone over memory limit of 16384 megabytes. Peak usage: 94068 megabytes
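For context, here is a minimal sketch of the failure mode we suspect (not the actual posterior-sample skymap code; the array sizes, `density_at` task, and pool layout are hypothetical): when each task handed to a `multiprocessing.Pool` carries the full sample array, every worker ends up holding its own private copy, so peak resident memory scales with the worker count rather than staying near the size of a single copy the way an OpenMP-threaded job would.

```python
import os
import numpy as np
from multiprocessing import Pool


def density_at(args):
    """Hypothetical per-pixel task that receives the *full* sample array."""
    samples, pixel = args
    # Stand-in for real per-pixel work over all posterior samples.
    return samples[:, 0].mean() + pixel


if __name__ == "__main__":
    # Hypothetical posterior-sample array, kept small here (~32 MB);
    # real posterior sample files can be far larger.
    samples = np.random.standard_normal((500_000, 8))

    n_workers = os.cpu_count() or 1
    # Each task tuple references the whole array, so the pool pickles it
    # once per task and every worker unpickles its own private copy.
    # Peak resident memory can therefore grow roughly like
    # (1 + n_workers) * sizeof(samples), and on a many-core cluster node
    # that can blow well past a fixed per-job memory request even though
    # the parent process alone fits comfortably.
    tasks = [(samples, pixel) for pixel in range(n_workers)]
    with Pool(processes=n_workers) as pool:
        densities = pool.map(density_at, tasks)
```

If this is what's happening, bumping `request_memory` will only chase the number of cores on whatever node the job lands on; capping the pool size or avoiding shipping the full sample array to every worker would attack the scaling itself.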