Suggestions for improvements to CI pipelines
@adam-mercer @duncanmmacleod I have a few suggestions for improvements to the CI pipelines. Admittedly, I don't understand much of how these now work (they've become a bit too complicated for me), so I'm not sure whether these are doable.
- I've noticed that jobs most often fail (apart from genuine LALSuite build/test failures) when they're trying to download packages, e.g. from a remote repository, and that repository happens to be down at the time. Would it be possible to run all CI jobs within Docker containers that already have all the prerequisite packages, etc. downloaded? That would mean jobs would not rely on internet access to remote servers in order to complete.
- If the above isn't possible, configure the CI jobs to be retried at least once through https://docs.gitlab.com/ee/ci/yaml/#retry. I would set this to retry on any failure, since I think with the deb/rpm/conda builds it's not easy to distinguish LALSuite build/test failures from other types of failure, e.g. failed downloads. While this might mean some jobs are re-run needlessly (a rerun wouldn't fix a genuine LALSuite build/test failure), it would make the CI pipeline more robust to transient failures (e.g. timed-out or inaccessible remote servers).
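
As a rough illustration of both suggestions, here is a minimal `.gitlab-ci.yml` sketch. It is not taken from the existing LALSuite pipeline: the image name `registry.example.com/lalsuite/ci-deps:latest`, the job name `build`, and the build commands are all hypothetical placeholders. Only the `default`, `image`, and `retry` keywords are standard GitLab CI.

```yaml
# Sketch only: the image name, job name, and build commands are hypothetical.
default:
  # Hypothetical prebuilt image with all prerequisite packages already
  # installed, so jobs do not need to download them at run time.
  image: registry.example.com/lalsuite/ci-deps:latest
  # Retry every job once on any failure (`always` is GitLab's default `when`).
  retry:
    max: 1
    when: always

build:
  stage: build
  script:
    # Placeholder build/test commands; the real pipeline's steps would go here.
    - ./configure
    - make
    - make check
```

GitLab allows `retry:max` to be at most 2. `retry:when` can also be restricted to specific failure types such as `runner_system_failure` or `stuck_or_timeout_failure`, but a failed download inside a job script would still be reported as `script_failure`, the same as a genuine build failure, which is why retrying on any failure seems the simpler option.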
Edited by Karl Wette