Nagios doesn't flag email bootstep being down
@peter-shawhan pointed out on Mattermost (link) that superevents on playground were no longer showing the standard "email notice corresponding to ..." log message, and that the last superevent on playground to have the log message was S240319af, which was submitted to GraceDB at 02:57:24 UTC on Mar 19.
Digging into gwcelery-worker.log on playground, I found that the email bootstep in the worker shut down at 03:07:36 UTC on the 19th, which appears in the log as 20:07:36 on Mar 18. The traceback from the log is pasted below.
We need to add a check to the Icinga monitor to look for this, and investigate modifying the bootstep to restart itself if it shuts down.
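One possible shape for the monitoring check, as a rough sketch only: it assumes the email bootstep is changed to touch a heartbeat file periodically while its thread is alive, which it does not do today. The function name `check_heartbeat`, the file path, and the age threshold are all hypothetical; the only fixed convention used is the standard Nagios/Icinga plugin exit codes (0 for OK, 2 for CRITICAL).

```python
import os
import time

# Standard Nagios/Icinga plugin exit codes.
OK = 0
CRITICAL = 2


def check_heartbeat(path, max_age, now=None):
    """Return (status, message) based on how recently `path` was touched.

    Hypothetical helper: assumes the email bootstep updates the
    modification time of `path` while its client thread is running.
    """
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError:
        # A missing or unreadable file means no heartbeat was ever written.
        return CRITICAL, f"CRITICAL: heartbeat file {path} is missing"
    if age > max_age:
        return CRITICAL, f"CRITICAL: email bootstep heartbeat is {age:.0f}s old"
    return OK, f"OK: email bootstep heartbeat is {age:.0f}s old"
```

A monitor script could print the message and exit with the returned status. A heartbeat file is just one option; asking the worker directly whether the thread is alive would avoid the extra file.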
[2024-03-18 20:07:36,755: INFO/MainProcess/EmailClientThread] Connection closed
[2024-03-18 20:07:36,782: WARNING/MainProcess/EmailClientThread] Exception in thread
[2024-03-18 20:07:36,784: WARNING/MainProcess/EmailClientThread] EmailClientThread
[2024-03-18 20:07:36,785: WARNING/MainProcess/EmailClientThread] :
[2024-03-18 20:07:36,788: WARNING/MainProcess/EmailClientThread] Traceback (most recent call last):
[2024-03-18 20:07:36,789: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/threading.py", line 980, in _bootstrap_inner
[2024-03-18 20:07:36,791: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,792: WARNING/MainProcess/EmailClientThread] self.run()
[2024-03-18 20:07:36,793: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/sentry_sdk/integrations/threading.py", line 72, in run
[2024-03-18 20:07:36,795: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,796: WARNING/MainProcess/EmailClientThread] reraise(*_capture_exception())
[2024-03-18 20:07:36,797: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/sentry_sdk/_compat.py", line 127, in reraise
[2024-03-18 20:07:36,798: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,799: WARNING/MainProcess/EmailClientThread] raise value
[2024-03-18 20:07:36,800: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/sentry_sdk/integrations/threading.py", line 70, in run
[2024-03-18 20:07:36,801: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,802: WARNING/MainProcess/EmailClientThread] return old_run_func(self, *a, **kw)
[2024-03-18 20:07:36,803: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/threading.py", line 917, in run
[2024-03-18 20:07:36,803: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,804: WARNING/MainProcess/EmailClientThread] self._target(*self._args, **self._kwargs)
[2024-03-18 20:07:36,805: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/gwcelery/email/bootsteps.py", line 73, in _runloop
[2024-03-18 20:07:36,806: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,807: WARNING/MainProcess/EmailClientThread] conn.idle_done()
[2024-03-18 20:07:36,809: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/imapclient/imapclient.py", line 179, in wrapper
[2024-03-18 20:07:36,811: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,812: WARNING/MainProcess/EmailClientThread] return func(client, *args, **kwargs)
[2024-03-18 20:07:36,812: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/imapclient/imapclient.py", line 999, in idle_done
[2024-03-18 20:07:36,814: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,814: WARNING/MainProcess/EmailClientThread] return self._consume_until_tagged_response(self._idle_tag, "IDLE")
[2024-03-18 20:07:36,815: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/imapclient/imapclient.py", line 1644, in _consume_until_tagged_response
[2024-03-18 20:07:36,816: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,817: WARNING/MainProcess/EmailClientThread] line = self._imap._get_response()
[2024-03-18 20:07:36,818: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/imaplib.py", line 1075, in _get_response
[2024-03-18 20:07:36,819: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,820: WARNING/MainProcess/EmailClientThread] resp = self._get_line()
[2024-03-18 20:07:36,821: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/imaplib.py", line 1183, in _get_line
[2024-03-18 20:07:36,822: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,822: WARNING/MainProcess/EmailClientThread] line = self.readline()
[2024-03-18 20:07:36,823: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/imapclient/tls.py", line 62, in readline
[2024-03-18 20:07:36,825: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,825: WARNING/MainProcess/EmailClientThread] return self.file.readline()
[2024-03-18 20:07:36,826: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/socket.py", line 704, in readinto
[2024-03-18 20:07:36,828: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,828: WARNING/MainProcess/EmailClientThread] return self._sock.recv_into(b)
[2024-03-18 20:07:36,829: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/ssl.py", line 1275, in recv_into
[2024-03-18 20:07:36,830: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,831: WARNING/MainProcess/EmailClientThread] return self.read(nbytes, buffer)
[2024-03-18 20:07:36,832: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/ssl.py", line 1133, in read
[2024-03-18 20:07:36,833: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,834: WARNING/MainProcess/EmailClientThread] return self._sslobj.read(len, buffer)
[2024-03-18 20:07:36,835: WARNING/MainProcess/EmailClientThread] ConnectionResetError
[2024-03-18 20:07:36,835: WARNING/MainProcess/EmailClientThread] :
[2024-03-18 20:07:36,836: WARNING/MainProcess/EmailClientThread] [Errno 104] Connection reset by peer
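
For the self-restart idea, here is a minimal sketch of a retry wrapper, not the actual bootstep code. It assumes the body of `_runloop` can be factored into a callable (`session` below) that connects, idles, and returns or raises, and that the bootstep has some stop flag (`should_stop` below). Both names, and `run_with_restart` itself, are hypothetical. It catches `OSError`, which covers the `ConnectionResetError` in the traceback above, and `imaplib.IMAP4.abort`, which `imaplib` raises when the server connection breaks.

```python
import imaplib
import logging
import time

log = logging.getLogger(__name__)


def run_with_restart(session, should_stop, max_delay=60.0, sleep=time.sleep):
    """Call `session()` repeatedly, reconnecting after connection errors.

    `session` and `should_stop` are hypothetical stand-ins for the
    bootstep's connect-and-idle logic and its shutdown flag.
    """
    delay = 1.0
    while not should_stop():
        try:
            session()
            delay = 1.0  # a clean return resets the backoff
        except (OSError, imaplib.IMAP4.abort):
            # ConnectionResetError is a subclass of OSError.
            log.exception("Email connection lost; retrying in %.0f s", delay)
            sleep(delay)
            delay = min(2 * delay, max_delay)  # exponential backoff, capped
```

With something like this as the thread target, a dropped connection would be logged and retried instead of killing EmailClientThread silently. Other exception types would still propagate, so the monitoring check remains useful.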