Intermittent server gateway timeouts
Created on April 5, 2017. Copied from redmine (https://bugs.ligo.org/redmine/issues/5418)
Several follow-up processes (approval processor, event supervisor, probably others) and one search pipeline (cWB) have reported receiving 504 gateway timeout errors when attempting to write log messages to GraceDB. Peter S reports that it happens for approval processor about once per day. The log message appears to be written anyway, but no valid response is returned: the connection hangs for 2 minutes and then terminates.
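Because the log message seems to land on the server even when the client gets a 504, a blind retry on the client side could create duplicate log entries. A possible workaround, sketched below under assumptions, is to check whether the message already exists before retrying. The names `GatewayTimeout`, `post_log`, and `log_exists` are hypothetical placeholders supplied by the caller; they are not part of the GraceDB client API.

```python
import time


class GatewayTimeout(Exception):
    """Hypothetical exception standing in for an HTTP 504 response."""


def write_log_once(post_log, log_exists, message, max_attempts=3, delay=1.0):
    """Try to write a log message without creating duplicates.

    post_log(message) and log_exists(message) are hypothetical callables
    provided by the caller. post_log is assumed to raise GatewayTimeout
    when the server answers with a 504.
    Returns True if the message is confirmed to be on the server.
    """
    for _ in range(max_attempts):
        try:
            post_log(message)
            return True
        except GatewayTimeout:
            # The write may have succeeded even though the response was lost,
            # so check before retrying to avoid a duplicate entry.
            if log_exists(message):
                return True
            time.sleep(delay)
    return False
```

This only helps if the client has some way to look up existing log messages for the event; whether that lookup is cheap enough to run after every timeout is an open question.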
The server also accumulates lingering threads owned by the wsgi_daemon user over time. It's not clear if these two issues are related.
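To see whether the two issues are correlated, it might help to record the wsgi_daemon thread count over time and compare it against the times of the 504 errors. The sketch below is one way to do that, assuming a Linux server with the procps `ps` command; the function names are illustrative, and the `user:32` column width is there only so that long user names are not shortened in the output.

```python
import subprocess
from collections import Counter


def count_threads_by_user(ps_output):
    """Count threads per user from `ps -eLo user:32,lwp` style output.

    Each non-header line is assumed to start with the owning user name.
    """
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "USER":
            continue  # skip blank lines and the header row
        counts[fields[0]] += 1
    return counts


def current_thread_count(user="wsgi_daemon"):
    """Run ps (one line per thread) and return the thread count for one user."""
    output = subprocess.check_output(
        ["ps", "-eLo", "user:32,lwp"], universal_newlines=True
    )
    return count_threads_by_user(output)[user]
```

Calling `current_thread_count()` from a cron job and logging the result with a timestamp would give a simple time series to line up against the reported timeouts.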
I upgraded the production server to mod_wsgi 4.5.11 on 28 Mar 2017, hoping it would eliminate the lingering threads. Tests on the development servers had indicated that the upgrade cleared up lingering threads caused by overloading the server with log write processes. On the production server, however, the threads still accumulate.