CI: increase default:retry:max to 2
Detailed Description
LALSuite CI jobs may run on Kubernetes runners at INFN-CNAF. These CI jobs run as "Pods" which appear to start, but then have to wait for resources, and often seem to time out while waiting; see here for a recent example. The CI jobs then fail and have to be manually restarted by the user.
Luckily, these failures register as a runner system failure and so can be retried automatically, without also needlessly retrying CI jobs that fail due to a script error (e.g. a bug). The LALSuite CI already retries jobs that fail for various system-related reasons (see here for the docs):
- unknown_failure
- stuck_or_timeout_failure
- runner_system_failure
- stale_schedule
- archived_failure
- scheduler_failure
Currently, jobs that fail for these reasons are retried once (default:retry:max = 1). Because the Kubernetes runners seem to fail more often, this MR increases default:retry:max to 2, which is the maximum GitLab allows. This may reduce the number of times users have to manually restart jobs that have exhausted their automatic retries.
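For reference, the resulting retry settings might look roughly like the fragment below. This is only a sketch: it assumes the settings live under the top-level `default:` key of `.gitlab-ci.yml`, and the actual layout of the LALSuite CI configuration may differ.

```yaml
# Sketch only; the real LALSuite .gitlab-ci.yml may be organised differently.
default:
  retry:
    # Previously 1. GitLab accepts 0, 1, or 2 here.
    max: 2
    # Retry only on system-related failures, never on script failures.
    when:
      - unknown_failure
      - stuck_or_timeout_failure
      - runner_system_failure
      - stale_schedule
      - archived_failure
      - scheduler_failure
```

Since `script_failure` is not in the `when` list, a job that fails because of a bug in its script is not retried automatically.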
API Changes
Please tick one of the following options:
- These changes do not modify the API.
- These changes do modify the API, and are backwards compatible.
- These changes do modify the API, and are backwards incompatible.
For examples of changes that do not modify the API and/or are considered backwards (in)compatible, please see the contributing guide.
Justification for Backwards Incompatible Changes
n/a
Review Status
n/a