Skip to content

Rewrite VOEvent sender/receiver as Celery bootstep

Leo P. Singer requested to merge leo-singer/gwcelery:comet-bootstep into master

Rewrite VOEvent sender/receiver as Celery bootstep

This is a major rewrite of the VOEvent subsystem to improve robustness and monitoring capabilites.

  • We used to start the Comet broker as a subprocess inside an eternal task. Instead, start it as a thread in a custom boot step, which is a sort of Celery plugin for adding extra functionality to a worker. This avoids unnecessarily tying up a thread from the worker thread pool and makes it easier to tell the broker when to start and stop.
  • Now that we are not relying on the celery beat periodic scheduler for keeping the VOEvent broker running, we can explore other forms of fault tolerance such as using the celery gossip feature to elect a new VOEvent broker by consensus among multiple Celery workers at physically separate sites.
  • Do a better job of meshing together Celery and Twisted in terms of logging and signal handling.
  • Because the Comet broker is running inside the same process as a worker, the send task can directly command the broker to send a VOEvent, and we don't have to maintain a dedicated local TCP port for this purpose. This makes it harder to contrive a scenario where a VOEvent could be dropped during the handoff from a Celery task to the broker.
  • The new send task will fail and automatically retry later if there are no peers connected to the broker.
  • Add monitoring functionality so that the addresses of all peers connected to the VOevent broker will appear in Flower.

Compare this ASCII art diagram of the old sending architecture:

+--|worker 1|---------------------------------+
|                                             |
|   +--|send task|------------------------+   |
|   |                                     |   |
|   |   +--|comet process|------------+   |   |
|   |   |                             |   |   |
|   |   |   +--|sender|-------+       |   |   |
|   |   |   |                 |       |   |   |
|   |   |   |                 ----+   |   |   |
|   |   |   |                 |   |   |   |   |
|   |   |   +-----------------+   |   |   |   |
|   |   |                         |   |   |   |
|   |   +------------------------|||--+   |   |
|   |                             |       |   |
|   +----------------------------|||------+   |
|                                 |           |
+--------------------------------|||----------+
                                  |
                                  |
                                  |TCP
                                  |port
                                  |
                                  |
+--|worker 2|--------------------|||----------+
|                                 |           |
|   +--|broker task|-------------|||------+   |
|   |                             |       |   |
|   |   +--|comet process|-------|||--+   |   |
|   |   |                         |   |   |   |
|   |   |   +--|receiver|-----+   |   |   |   |
|   |   |   |                 |   |   |   |   |
|   |   |   |                 <---+   |   |   |
|   |   |   |                 |       |   |   |
|   |   |   |                 ----+   |   |   |
|   |   |   |                 |   |   |   |   |
|   |   |   +-----------------+   |   |   |   |
|   |   |                         |   |   |   |
|   |   |   +--|broadcaster|--+   |   |   |   |
|   |   |   |                 |   |   |   |   |
|   |   |   |                 <---+   |   |   |
|   |   |   |                 |       |   |   |
|   |   |   |                 ----+   |   |   |
|   |   |   |                 |   |   |   |   |
|   |   |   +-----------------+   |   |   |   |
|   |   |                         |   |   |   |
|   |   +------------------------|||--+   |   |
|   |                             |       |   |
|   +----------------------------|||------+   |
|                                 |           |
+--------------------------------|||----------+
                                  |
                                  |
                                  |TCP
                                  |port
                                  |
                                  |
                                  +----> to GCN

to this diagram of the new architecture implemented by this patch:

+--|worker 1|-------------------------+
|                                     |
|   +--|send task|----------------+   |
|   |                             |   |
|   |                             |   |
|   |                             |   |
|   +-------------------------|---+   |
|                             |       |
|                             |       |
|   +--|twisted thread|------|||--+   |
|   |                         |   |   |
|   |   +--|receiver|-----+   |   |   |
|   |   |                 |   |   |   |
|   |   |                 <---+   |   |
|   |   |                 |       |   |
|   |   |                 ----+   |   |
|   |   |                 |   |   |   |
|   |   +-----------------+   |   |   |
|   |                         |   |   |
|   |   +--|broadcaster|--+   |   |   |
|   |   |                 |   |   |   |
|   |   |                 <---+   |   |
|   |   |                 |       |   |
|   |   |                 ----+   |   |
|   |   |                 |   |   |   |
|   |   +-----------------+   |   |   |
|   |                         |   |   |
|   +------------------------|||--+   |
|                             |       |
+----------------------------|||------+
                              |
                              |
                              |TCP
                              |port
                              |
                              |
                              +----> to GCN
Edited by Leo P. Singer

Merge request reports