RetryableHTTPError not always retried
Caused pipeline to fail for MS230525v and MS230525w. (Mattermost discussion)
Any task wrapped with `gracedb.task` instead of `app.task` should automatically retry if a `RetryableHTTPError` (e.g. a 502) is raised when it is called through Celery, but there are some places with bare calls to gracedb functions that are not wrapped at any level. For example, https://git.ligo.org/emfollow/gwcelery/-/blob/main/gwcelery/tasks/orchestrator.py#L211
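To illustrate the failure mode, here is a minimal pure-Python sketch. It does not use the real gwcelery or Celery code: `RetryableHTTPError`, `retrying`, `make_flaky_upload`, and the two `orchestrate_*` functions are all hypothetical stand-ins, and the retry loop only approximates what a retrying task wrapper would do.

```python
import functools


class RetryableHTTPError(Exception):
    """Hypothetical stand-in for the real exception class."""


def retrying(max_retries=3):
    """Hypothetical stand-in for the retry behaviour of a task wrapper."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except RetryableHTTPError:
                    # Give up only once the retry budget is exhausted.
                    if attempt == max_retries:
                        raise
        return wrapper
    return decorator


def make_flaky_upload():
    """Return a function that fails once with a 502, then succeeds."""
    state = {"calls": 0}

    def upload(payload):
        state["calls"] += 1
        if state["calls"] == 1:
            raise RetryableHTTPError("502 Bad Gateway")
        return f"uploaded {payload}"

    return upload


bare_upload = make_flaky_upload()
wrapped_upload = retrying()(make_flaky_upload())


def orchestrate_bare(payload):
    # Bare call: nothing on the call stack catches the 502.
    return bare_upload(payload)


def orchestrate_wrapped(payload):
    # Wrapped call: the transient 502 is absorbed by the retry loop.
    return wrapped_upload(payload)
```

In this sketch, `orchestrate_bare("x")` propagates the `RetryableHTTPError` on its first call, while `orchestrate_wrapped("x")` succeeds on the second attempt.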
I attached the output of a quick `git grep` for lines with bare calls. Some of these are wrapped at some stack level and will be retried, but I think some should be properly converted to canvases.
possible_bugs.txt
Maybe we can catch such bugs in the future by reusing the current unit tests, but with the GraceDB client mocked so that every call initially responds with a 502 error?
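A rough sketch of what such a mock could look like, using only `unittest.mock` from the standard library. Everything here is an assumption for illustration: `FlakyClient`, `flaky`, `call_with_retry`, and the `upload` method name are hypothetical and do not correspond to the real GraceDB client or to existing gwcelery test fixtures.

```python
from unittest import mock


class RetryableHTTPError(Exception):
    """Hypothetical stand-in for the real exception class."""


def flaky(return_value=None):
    """Build a Mock that raises a 502 on its first call, then succeeds."""
    calls = {"n": 0}

    def side_effect(*args, **kwargs):
        calls["n"] += 1
        if calls["n"] == 1:
            raise RetryableHTTPError("502 Bad Gateway")
        return return_value

    return mock.Mock(side_effect=side_effect)


class FlakyClient:
    """Hypothetical mock client: each method fails once, then succeeds."""

    def __init__(self):
        self._methods = {}

    def __getattr__(self, name):
        # Only reached for attributes not found normally, i.e. client methods.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._methods:
            self._methods[name] = flaky(return_value={"method": name})
        return self._methods[name]


def call_with_retry(func, *args, retries=3, **kwargs):
    """Hypothetical stand-in for code paths that do retry."""
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except RetryableHTTPError:
            if attempt == retries:
                raise
```

With a client like this patched into the existing tests, any code path that makes an unwrapped call would surface as a test failure with `RetryableHTTPError`, while code paths that retry, such as `call_with_retry(client.upload, "file")` in this sketch, would pass after the second attempt.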