gwcelery issues
https://git.ligo.org/emfollow/gwcelery/-/issues
2024-03-27T15:13:23Z

https://git.ligo.org/emfollow/gwcelery/-/issues/539
Ingest GRB candidates from SVOM
2024-03-27T15:13:23Z
Brandon Piotrzkowski
The Space Variable Objects Monitor (SVOM) will hopefully be launched in late 2023, making it an attractive mission to add to the RAVEN workflow. This requires the following:
- [x] Create/modify GCN or Kafka listener to ingest events from SVOM
- [x] Modify GCN VOEvent ingestion for new notice type (see https://git.ligo.org/computing/gracedb/server/-/issues/338)
- [x] Ensure SVOM events can create external events either in their current form or with some wrapper function (i.e. get example GCN notice or equivalent and see if can be uploaded directly)
- [x] Contact GraceDB developers to add `pipeline='SVOM'` if not already available (see https://git.ligo.org/computing/gracedb/server/-/issues/255)
- [ ] Modify GRB listener if similar enough to current workflow in order to fully include in RAVEN pipeline. Determine every notice type we would like to listen to.
- [ ] Determine expected rate (#/year) of additional detected GRBs (i.e. independent of Fermi-GBM, Swift-BAT, INTEGRAL, and AGILE-MCAL) and whether the individual significance (FAR?) of the GRB is available or relevant (e.g. is the significance so high that `FAR_GRB << FAR_GW`?). Increase rates in `ligo-raven` if needed (see https://git.ligo.org/lscsoft/raven/-/blob/7561aeb07c1071722d619398b37fc0f563ecd0b7/ligo/raven/search.py#L499)
- [x] Determine how the GRB sky localization could be downloaded from the experiment or created via existing tools in gwcelery
- [ ] Add relevant values to RAVEN pipeline (time window, `raven.trigger_raven_alert`) and add exceptions when different from standard GRB workflow
- [ ] Add testing via pytests, internal MDCs, and/or O3 replay to ensure system works as expected
Example of SVOM Eclairs notice: [sb23041100_eclairs-wakeup_2.xml](/uploads/786f205addb0045b18c96e8843f3af6f/sb23041100_eclairs-wakeup_2.xml)

O4a mid-run release
Naresh Adhikari

https://git.ligo.org/emfollow/gwcelery/-/issues/536
Create notices and circulars for medium-latency GRB detections
2023-06-05T13:26:55Z
Brandon Piotrzkowski
The GRB group has requested that we support both notices (GCN, I assume Kafka as well) and circulars concerning medium-latency pipelines that follow up GRBs (could be detection or non-detection).
Circulars:
- [x] There is currently code that produces these circulars, but it could be out of date, so we need to ensure these are ready for O4 or update them accordingly (see this issue: https://git.ligo.org/emfollow/ligo-followup-advocate/-/issues/68)
Notices:
- [x] Identify whether these notices should be created automatically or require human vetting (similar to how we send update notices via the dashboard: https://emfollow.ligo.caltech.edu/gwcelery/)
- [x] Identify the conditions for creating a notice (both detection and non-detection), i.e. what information/labels need to be present in an external event or superevent to indicate that a notice should be sent
- [ ] Identify any additional fields or info not currently included in our alerts (both GCN and Kafka) that need to be added: https://emfollow.docs.ligo.org/userguide/content.html
---
- [ ] Once we've identified the scope, we can start to work out between `external_triggers.py`, `alerts.py`, `views.py`, and GraceDB what needs to be developed.

O4a mid-run release
Brandon Piotrzkowski

https://git.ligo.org/emfollow/gwcelery/-/issues/72
Get Kamland pre-supernova alerts into gracedb
2023-06-05T13:26:55Z
Patrick Brady
Quoting Joe Giaime: "On Friday there was a KamLAND pre-supernova alert, with sigma > 3. LSC members at both LLO and LHO (together with Virgo) cut short commissioning and cajoled the detectors into observation mode at decent sensitivity until KamLAND revised the number down below the action level, staffed overnight by volunteers. It was very exciting! (see https://ldas-jobs.ligo.caltech.edu/~kats/KamLAND-watch/alert.html )"
Also from James Lough: "I am also interested in this for GEO. We have an alert monitor set up that looks at GraceDB. It’s based on one that I think is running at LIGO. But, I don’t see this alert in GraceDB. I think it would be straightforward for someone to pipe the KamLAND alerts into GraceDB. There are SNEWS alerts in GraceDB, but I noticed that the latest test alert didn’t seem to make it in ...."
We should also follow up with the burst group on how RAVEN should respond to these and other possible supernova alerts.

O4a mid-run release
Naresh Adhikari
Souradeep Pal (souradeep.pal@ligo.org)

https://git.ligo.org/emfollow/gwcelery/-/issues/784
Release Version 2.3.2 "Champ"
2024-03-23T00:10:23Z
Cody Messick
**Git ref**: 5a4b9417b4a24f22c891d486b6fc3ae015e933de
# Checklist
Skipping checklist as this release is identical to 2.3.1, except public alerts have been enabled. See #783 for acceptance checks.

https://git.ligo.org/emfollow/gwcelery/-/issues/783
Release Version 2.3.1 "Champ"
2024-03-23T00:05:38Z
Cody Messick
**Git ref**: 8b472edc0c9e4dafc36914677a47fa1989634087
# Checklist
## Basics
1. [x] The CI pipeline succeeded, including all unit tests and code quality checks. https://git.ligo.org/emfollow/gwcelery/-/pipelines/610547
2. [x] [CHANGES.rst](https://git.ligo.org/emfollow/gwcelery/-/blob/release/v2.3/CHANGES.rst) lists all significant changes since the last release. It is free from spelling and grammatical errors.
3. [x] The [latest Readthedocs documentation build](https://readthedocs.org/projects/gwcelery/builds/) passed and the [latest branch docs](https://rtd.igwn.org/projects/gwcelery/en/release-v2.3) are correctly rendered. Autodoc-generated API docs for tasks are shown.
4. [x] If there is a [milestone](https://git.ligo.org/emfollow/gwcelery/-/milestones) for this
release, then the list of issues and merge requests that have been
addressed is accurate. Any unaddressed issues and merge requests have been
moved to another milestone.
5. [x] Check that the versions of the following packages in the [`poetry.lock`](https://git.ligo.org/emfollow/gwcelery/-/blob/release/v2.3/poetry.lock) file have been approved by the SCCB (i.e. each has either the status:deploy or status:deployed label).
- [x] [`bilby`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=bilby&first_page_size=100)
- [x] [`bilby_pipe`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=bilby_pipe&first_page_size=100)
- [x] [`gracedb-sdk`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=gracedb-sdk&first_page_size=100)
- [x] [`gwdatafind`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=gwdatafind&first_page_size=100)
- [x] [`gwpy`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=gwpy&first_page_size=100)
- [x] [`gwskynet`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=gwskynet&first_page_size=100)
- [x] [`igwn-alert`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=igwn-alert&first_page_size=100)
- [x] [`igwn-gwalert-schema`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=igwn-gwalert-schema&first_page_size=20)
- [x] [`lalsuite`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=lalsuite&first_page_size=100)
- [x] [`ligo-followup-advocate`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo-followup-advocate&first_page_size=100)
- [x] [`ligo-gracedb`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo-gracedb&first_page_size=100)
- [x] [`ligo-raven`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo-raven&first_page_size=100)
- [x] [`ligo-segments`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo-segments&first_page_size=20)
- [x] [`ligo.em-bright`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo.em-bright&first_page_size=20)
- [x] [`ligo.skymap`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo.skymap&first_page_size=100)
- [x] [`lscsoft-glue`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=lscsoft-glue&first_page_size=100)
- [x] [`pesummary`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=pesummary&first_page_size=100)
- [x] [`python-ligo-lw`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=python-ligo-lw&first_page_size=100)
- [x] [`rapidpe`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=Rapidpe&first_page_size=20)
- [x] [`rapidpe-rift-pipe`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=RapidPE%20pipeline&first_page_size=20)
- [x] [`RIFT`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=rift&first_page_size=100)
## Test deployment
4. [x] Sentry does not show any new [unresolved issues on ~~test~~playground](https://sentry.io/organizations/ligo-caltech/issues/?environment=playground&groupStatsPeriod=14d&project=1425216&query=is%3Aunresolved&statsPeriod=14d) that indicate new bugs or regressions.
5. [x] The ~~test~~playground deployment has run for at least 10 minutes.
6. [x] The [Flower monitor](https://emfollow-playground.ligo.caltech.edu/flower) is reachable and shows no unexpected task failures.
7. [x] The [Flask dashboard](https://emfollow-playground.ligo.caltech.edu/gwcelery) is reachable.
8. [x] The ~~test~~playground deployment is [connected to IGWN Alert](https://emfollow-playground.ligo.caltech.edu/flower/worker/gwcelery-worker%40emfollow-playground.ligo.caltech.edu#tab-other) (in Flower, find the main gwcelery-worker, click Other, and look at the list of subscribed IGWN Alert topics).
9. [x] The ~~test~~playground deployment is [connected to GCN](https://emfollow-playground.ligo.caltech.edu/flower/worker/gwcelery-voevent-worker%40emfollow-playground.ligo.caltech.edu#tab-other) (in Flower, find the voevent gwcelery-worker, click Other, and look at the list of receiver peers).
## Mock events
10. [x] The ~~test~~playground deployment has [produced an MDC superevent](https://gracedb-playground.ligo.org/latest/?query=MDC&query_type=S).
11. [x] The MDC superevent has the following annotations.
- [x] `bayestar.multiorder.fits`
- [x] `bayestar.fits.gz`
- [x] `bayestar.png`
- [x] `bayestar.volume.png`
- [x] `bayestar.html`
- [x] `p_astro.json`
- [x] `p_astro.png`
- [x] `em_bright.json`
- [x] `em_bright.png`
12. [x] The MDC superevent has the following labels.
- [x] `EMBRIGHT_READY`
- [x] `GCN_PRELIM_SENT`
- [x] `PASTRO_READY`
- [x] `SKYMAP_READY`
13. [x] The MDC superevent has two automatic preliminary VOEvents, JSON packets, and Avro packets if `GCN_PRELIM_SENT` is applied.
- [x] 2 preliminary VOEvents
- [x] 2 preliminary JSON packets
- [x] 2 preliminary Avro packets
14. [x] Issuing a manual preliminary alert from the [Flask dashboard](https://emfollow-playground.ligo.caltech.edu/gwcelery) sends another preliminary alert.
- [ ] The alert **is sent** successfully if `ADVOK` or an `ADVNO` label is **not applied** this time.
- [x] Alternatively, a preliminary alert is **blocked** due to presence of `ADVOK` or `ADVNO`.
15. [x] `DQR_REQUEST` label is applied to the superevent. The application happens at the time of launching the second preliminary alert.
16. [x] The MDC superevent has either an `ADVOK` or an `ADVNO` label.
17. [x] Issuing an `ADVOK` signoff through GraceDB results in an initial VOEvent.
18. [x] Issuing an `ADVNO` signoff through GraceDB results in a retraction VOEvent.
19. [x] Requesting an update alert through the [Flask dashboard](https://emfollow-playground.ligo.caltech.edu/gwcelery) results in an update VOEvent.
20. [x] ~~Test~~Playground has recently [produced an MDC superevent with an external coincidence](https://gracedb-playground.ligo.org/latest/?query=MDC+EM_COINC&query_type=S), i.e. with an `EM_COINC` label. Use the [Flask dashboard](https://emfollow-playground.ligo.caltech.edu/gwcelery) to do this manually (note that joint events with Swift may not pass publishing conditions and/or have a combined sky map, indicated by the lack of the `RAVEN_ALERT` and `COMBINEDSKYMAP_READY` labels, respectively).
21. [x] The joint MDC superevent has the following annotations.
- [x] `coincidence_far.json`
- [x] `combined-ext.multiorder.fits` or `combined-ext.fits.gz`
- [x] `combined-ext.png`
- [x] `overlap_integral.png`
22. [x] The joint MDC superevent has the following labels.
- [x] `EM_COINC`
- [x] `RAVEN_ALERT`
- [x] `COMBINEDSKYMAP_READY`
- [x] `GCN_PRELIM_SENT`
23. [x] The joint MDC superevent is sending alerts with coincidence information.
- [x] At least one VOEvent with `<Group name="External Coincidence">`.
- [x] At least one Kafka JSON packet with an `external_coinc` field.
- [x] At least one circular w/ `-emcoinc-` in filename.
24. [x] Issue a manual RAVEN alert using the [Flask dashboard](https://emfollow-playground.ligo.caltech.edu/gwcelery) for a coincidence (i.e. one that has the `EM_COINC` label) that does not have the `RAVEN_ALERT` label yet. Choose a [recent joint coincidence that meets these criteria](https://gracedb-playground.ligo.org/latest/?query=MDC+%7ERAVEN_ALERT+%26+EM_COINC&query_type=S&get_neighbors=&results_format=) and ensure that a `RAVEN_ALERT` label is applied to the associated superevent, external event, and preferred event.
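
The annotation and label checks in items 11 and 12 could be partly automated. The following is a minimal sketch, not part of gwcelery: `missing_items` is a hypothetical helper that takes the file names and labels already fetched from GraceDB for one MDC superevent and reports what is absent. It assumes file names may carry a `,<version>` suffix, which is stripped before comparison.

```python
# Required items copied from checklist items 11 and 12 above.
REQUIRED_ANNOTATIONS = {
    "bayestar.multiorder.fits",
    "bayestar.fits.gz",
    "bayestar.png",
    "bayestar.volume.png",
    "bayestar.html",
    "p_astro.json",
    "p_astro.png",
    "em_bright.json",
    "em_bright.png",
}
REQUIRED_LABELS = {
    "EMBRIGHT_READY",
    "GCN_PRELIM_SENT",
    "PASTRO_READY",
    "SKYMAP_READY",
}


def missing_items(filenames, labels):
    """Return (missing_annotations, missing_labels) as sorted lists.

    Hypothetical helper: `filenames` and `labels` are plain iterables of
    strings that the caller has already retrieved for one superevent.
    """
    # Assumption: versioned names look like "bayestar.png,0"; keep the base name.
    present = {name.split(",")[0] for name in filenames}
    missing_annotations = sorted(REQUIRED_ANNOTATIONS - present)
    missing_labels = sorted(REQUIRED_LABELS - set(labels))
    return missing_annotations, missing_labels
```

An empty pair of lists means both checklist items pass for that superevent; anything else names exactly what still needs to be investigated by hand.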
## Replay events
24. [x] [A Production superevent labeled `GCN_PRELIM_SENT`](https://gracedb-playground.ligo.org/latest/?query=Production+GCN_PRELIM_SENT&query_type=S&get_neighbors=&results_format=) has the following parameter estimation annotations and the `PE_READY` label.
- [x] `bilby_config.ini`
- [x] `Bilby.posterior_samples.hdf5`
- [x] `Bilby.multiorder.fits`
- [x] `Bilby.html`
- [x] `Bilby.fits.gz`
- [x] `Bilby.png`
- [x] `Bilby.volume.png`
- [x] `PE_READY`
- [x] Link to PEsummary page (log message in parameter estimation section)

https://git.ligo.org/emfollow/gwcelery/-/issues/781
Nagios doesn't flag email bootstep being down
2024-03-26T17:03:02Z
Cody Messick
@peter-shawhan pointed out on Mattermost ([link](email notice corresponding to ...)) that superevents on playground were no longer showing the standard "email notice corresponding to ..." log message, and that the last superevent on playground to have the log message was [S240319af](https://gracedb-playground.ligo.org/superevents/S240319af/view/), which was submitted to GraceDB at 02:57:24 UTC on Mar 19.
Digging into the `gwcelery-worker.log` on playground, I found that the email bootstep in the worker shut down at 03:07:36 UTC on the 19th. The traceback from the log is pasted below.
We need to add a check to the icinga monitor to look for this and investigate modifying the bootstep to restart itself if it shuts down.
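
One possible shape for a self-restarting loop, sketched with the standard library only. All names here (`run_with_restart`, `connect`, `idle_once`, `should_stop`) are hypothetical and are not the existing bootstep API; the sketch only assumes that a dropped connection surfaces as an `OSError` subclass such as `ConnectionResetError`, as in the traceback in this issue.

```python
import logging
import time

log = logging.getLogger(__name__)


def run_with_restart(connect, idle_once, should_stop,
                     max_backoff=60.0, sleep=time.sleep):
    """Keep a client loop alive across dropped connections.

    Hypothetical sketch: `connect()` returns a fresh connection,
    `idle_once(conn)` waits for and handles one batch of messages, and
    `should_stop()` returns True when the worker is shutting down.
    """
    backoff = 1.0
    while not should_stop():
        try:
            conn = connect()
            backoff = 1.0  # a successful connection resets the delay
            while not should_stop():
                idle_once(conn)
        except OSError:
            # ConnectionResetError is a subclass of OSError, so the failure
            # seen in the traceback would be caught here instead of killing
            # the thread.
            log.exception("connection lost; reconnecting in %.0f s", backoff)
            sleep(backoff)
            backoff = min(2 * backoff, max_backoff)
```

The exponential backoff is a design choice to avoid hammering the mail server while it is unreachable. A monitoring check is still needed alongside this, since a loop that reconnects forever can hide a persistent outage.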
```
[2024-03-18 20:07:36,755: INFO/MainProcess/EmailClientThread] Connection closed
[2024-03-18 20:07:36,782: WARNING/MainProcess/EmailClientThread] Exception in thread
[2024-03-18 20:07:36,784: WARNING/MainProcess/EmailClientThread] EmailClientThread
[2024-03-18 20:07:36,785: WARNING/MainProcess/EmailClientThread] :
[2024-03-18 20:07:36,788: WARNING/MainProcess/EmailClientThread] Traceback (most recent call last):
[2024-03-18 20:07:36,789: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/threading.py", line 980, in _bootstrap_inner
[2024-03-18 20:07:36,791: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,792: WARNING/MainProcess/EmailClientThread] self.run()
[2024-03-18 20:07:36,793: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/sentry_sdk/integrations/threading.py", line 72, in run
[2024-03-18 20:07:36,795: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,796: WARNING/MainProcess/EmailClientThread] reraise(*_capture_exception())
[2024-03-18 20:07:36,797: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/sentry_sdk/_compat.py", line 127, in reraise
[2024-03-18 20:07:36,798: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,799: WARNING/MainProcess/EmailClientThread] raise value
[2024-03-18 20:07:36,800: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/sentry_sdk/integrations/threading.py", line 70, in run
[2024-03-18 20:07:36,801: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,802: WARNING/MainProcess/EmailClientThread] return old_run_func(self, *a, **kw)
[2024-03-18 20:07:36,803: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/threading.py", line 917, in run
[2024-03-18 20:07:36,803: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,804: WARNING/MainProcess/EmailClientThread] self._target(*self._args, **self._kwargs)
[2024-03-18 20:07:36,805: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/gwcelery/email/bootsteps.py", line 73, in _runloop
[2024-03-18 20:07:36,806: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,807: WARNING/MainProcess/EmailClientThread] conn.idle_done()
[2024-03-18 20:07:36,809: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/imapclient/imapclient.py", line 179, in wrapper
[2024-03-18 20:07:36,811: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,812: WARNING/MainProcess/EmailClientThread] return func(client, *args, **kwargs)
[2024-03-18 20:07:36,812: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/imapclient/imapclient.py", line 999, in idle_done
[2024-03-18 20:07:36,814: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,814: WARNING/MainProcess/EmailClientThread] return self._consume_until_tagged_response(self._idle_tag, "IDLE")
[2024-03-18 20:07:36,815: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/imapclient/imapclient.py", line 1644, in _consume_until_tagged_response
[2024-03-18 20:07:36,816: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,817: WARNING/MainProcess/EmailClientThread] line = self._imap._get_response()
[2024-03-18 20:07:36,818: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/imaplib.py", line 1075, in _get_response
[2024-03-18 20:07:36,819: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,820: WARNING/MainProcess/EmailClientThread] resp = self._get_line()
[2024-03-18 20:07:36,821: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/imaplib.py", line 1183, in _get_line
[2024-03-18 20:07:36,822: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,822: WARNING/MainProcess/EmailClientThread] line = self.readline()
[2024-03-18 20:07:36,823: WARNING/MainProcess/EmailClientThread] File "/home/emfollow-playground/.local/lib/python3.9/site-packages/imapclient/tls.py", line 62, in readline
[2024-03-18 20:07:36,825: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,825: WARNING/MainProcess/EmailClientThread] return self.file.readline()
[2024-03-18 20:07:36,826: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/socket.py", line 704, in readinto
[2024-03-18 20:07:36,828: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,828: WARNING/MainProcess/EmailClientThread] return self._sock.recv_into(b)
[2024-03-18 20:07:36,829: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/ssl.py", line 1275, in recv_into
[2024-03-18 20:07:36,830: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,831: WARNING/MainProcess/EmailClientThread] return self.read(nbytes, buffer)
[2024-03-18 20:07:36,832: WARNING/MainProcess/EmailClientThread] File "/usr/lib64/python3.9/ssl.py", line 1133, in read
[2024-03-18 20:07:36,833: WARNING/MainProcess/EmailClientThread]
[2024-03-18 20:07:36,834: WARNING/MainProcess/EmailClientThread] return self._sslobj.read(len, buffer)
[2024-03-18 20:07:36,835: WARNING/MainProcess/EmailClientThread] ConnectionResetError
[2024-03-18 20:07:36,835: WARNING/MainProcess/EmailClientThread] :
[2024-03-18 20:07:36,836: WARNING/MainProcess/EmailClientThread] [Errno 104] Connection reset by peer
```

GWCelery v2.4.1 Release

https://git.ligo.org/emfollow/gwcelery/-/issues/780
Follow-up from "Resolve "Update monitoring documentation""
2024-03-19T15:30:44Z
Deep Chatterjee (deep.chatterjee@ligo.org)
This is an issue to follow-up from !1284:
- A TODO item to add instructions to subscribe to Sentry,
- Any other items...

https://git.ligo.org/emfollow/gwcelery/-/issues/779
Check raven test coverage
2024-03-20T16:46:42Z
Cody Messick
#778 wasn't caught by unit tests; this issue is just a reminder to determine why and to fix it.

GWCelery v2.4.1 Release

https://git.ligo.org/emfollow/gwcelery/-/issues/777
Switch to listen to external events primarily over Kafka
2024-03-25T21:43:33Z
Brandon Piotrzkowski
Currently we listen to the majority of external events via GCN classic, which has numerous issues and will eventually be phased out and superseded by the GCN Kafka system. See: https://gcn.nasa.gov/
This could be done in two phases:
1. Switch from GCN classic to GCN classic over Kafka, which won't require any ingestion changes
2. Once GCN Kafka (JSON format) is ready, change ingestion methods over. This second phase will require the following changes:
- [ ] Ingest JSON natively in GraceDB: https://git.ligo.org/computing/gracedb/server/-/issues/297
- [ ] Change workflow in `external_triggers.py` to use JSON packets when ingesting

post-O4
Brandon Piotrzkowski
Naresh Adhikari

https://git.ligo.org/emfollow/gwcelery/-/issues/776
rapidpe won't run at all if first event is earlywarning
2024-03-15T17:19:59Z
Cody Messick
Currently, rapid-pe is only started when we receive a new-type igwn alert for a superevent. !1405 made it so that rapidpe is not started for early warning events, which essentially means rapidpe will never run for candidates that we see in early warning, even once we've seen a full bandwidth trigger.

GWCelery v2.4.1 Release
Cody Messick

https://git.ligo.org/emfollow/gwcelery/-/issues/775
Follow-up from "search based alert threshold"; Create test coverage for multiple superevent searches in RAVEN
2024-03-14T21:32:14Z
Brandon Piotrzkowski
The following discussion from !1384 should be addressed:
- [ ] @brandon.piotrzkowski started a [discussion](https://git.ligo.org/emfollow/gwcelery/-/merge_requests/1384#note_965009):
> Just adding a note we will need to add testing coverage in `test_tasks_raven.py` for different searches, will link to a new issue.

GWCelery v2.4.1 Release
Brandon Piotrzkowski

https://git.ligo.org/emfollow/gwcelery/-/issues/774
gwskynet unit test fails with new tensorflow release
2024-03-26T00:39:34Z
Cody Messick
Our CI tests that use bleeding edge dependencies started failing after the latest release of tensorflow. The gwskynet unit test throws `TypeError: Could not locate class 'Functional'.` followed by a very long message. An example of a failed job can be seen [here](https://git.ligo.org/cody.messick/gwcelery/-/jobs/3227642). Pinning tensorflow-cpu < 2.16 fixes the issue, but we should drop that pin once it's fixed on gwskynet's end.

ManLeong Chan

https://git.ligo.org/emfollow/gwcelery/-/issues/773
Follow-up from "search based alert threshold": Add BBH search to fix RAVEN trials factors or update trials factors
2024-03-14T18:20:59Z
Brandon Piotrzkowski
The following discussion from !1384 should be addressed:
- [ ] @brandon.piotrzkowski started a [discussion](https://git.ligo.org/emfollow/gwcelery/-/merge_requests/1384#note_960807): (+3 comments)
> Currently we aren't listening to `bbh` or `imbh` events in RAVEN, which will mess up the trials factors when called here. That could be fixed (in another MR) by adding an additional `raven.coincidence_search` with these `searches`.

GWCelery v2.4.1 Release
Brandon Piotrzkowski

https://git.ligo.org/emfollow/gwcelery/-/issues/772
Reduce maximum order for incoming external sky maps
2024-03-07T00:33:18Z
Brandon Piotrzkowski
We've been contacted by multiple experiments (Fermi, Swift, IPN) that would like to send us sky maps that are larger than we can send alerts with (>10MB in some cases).
We should provide a function that checks the size of incoming sky maps and, if needed, reduces the maximum order to an acceptable level by recursively combining groups of four pixels into one larger parent pixel.
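
A minimal sketch of the pixel-merging step, assuming a flat, full-sky HEALPix map in NESTED ordering that stores probability density. The function names are hypothetical, and a multi-order (UNIQ-indexed) map would need an extra step to group sibling pixels before merging.

```python
import numpy as np


def order_from_npix(npix):
    """Return the HEALPix order of a full-sky map with npix = 12 * 4**order."""
    order, n = 0, 12
    while n < npix:
        n *= 4
        order += 1
    if n != npix:
        raise ValueError(f"{npix} is not a valid HEALPix pixel count")
    return order


def reduce_max_order(probdensity, max_order):
    """Coarsen a flat NESTED probability-density map to at most max_order."""
    density = np.asarray(probdensity, dtype=float)
    order = order_from_npix(density.size)
    while order > max_order:
        # In NESTED ordering the four children of a parent pixel are
        # contiguous, so each row of the reshaped array is one sibling group.
        # The children have equal area, so the parent density is their mean.
        density = density.reshape(-1, 4).mean(axis=1)
        order -= 1
    return density
```

Averaging (rather than summing) matches the diagram below, where four pixels of density `1e-5` merge into one pixel of density `1e-5`, and it leaves the total probability unchanged because the parent pixel has four times the area. For flat maps, `healpy.ud_grade` performs a comparable degradation.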
```
PROBDENSITY
| ------ | ------ | | ------ |
| 1e-5 | 1e-5 | ====\ | 1e-5 |
| 1e-5 | 1e-5 | ====/ | ------ |
| ------ | ------ |
```

O4b
Brandon Piotrzkowski

https://git.ligo.org/emfollow/gwcelery/-/issues/771
cWB should upload as CBC pipeline instead of Burst pipeline
2024-03-15T17:19:19Z
Cody Messick
For the sake of expediency, cWB bbh is uploading to the Burst group. Sending CBC notices from the Burst group requires some hacky workarounds that we need to remove once cWB can upload to the CBC group.
For them to upload to the CBC group, we need to...(probably incomplete list)
- [ ] Disable bayestar for cwb.bbh uploads
- [ ] Revert (or close if not merged) !1419
- [ ] ..?

post-O4

https://git.ligo.org/emfollow/gwcelery/-/issues/767
Add test coverage for ignoring earlywarning events with rapidpe
2024-02-23T19:08:36Z
Cody Messick
Following up Brandon's comment on !1405, specifically https://git.ligo.org/emfollow/gwcelery/-/merge_requests/1405#note_951180

https://git.ligo.org/emfollow/gwcelery/-/issues/765
Release version 2.2.1 "Sheepsquatch"
2024-03-08T22:38:01Z
Cody Messick
**Git ref**: fed9d492bab4ef428bd704579fee773329ed10ce
# Checklist
## Basics
1. [x] The CI pipeline succeeded, including all unit tests and code quality checks. https://git.ligo.org/emfollow/gwcelery/-/pipelines/603703
2. [ ] [CHANGES.rst](https://git.ligo.org/emfollow/gwcelery/-/blob/release/v2.2/CHANGES.rst) lists all significant changes since the last release. It is free from spelling and grammatical errors.
3. [x] The [latest Readthedocs documentation build](https://readthedocs.org/projects/gwcelery/builds/) passed and the [latest branch docs](https://rtd.igwn.org/projects/gwcelery/en/release-v2.2) are correctly rendered. Autodoc-generated API docs for tasks are shown.
4. [x] If there is a [milestone](https://git.ligo.org/emfollow/gwcelery/-/milestones) for this
release, then the list of issues and merge requests that have been
addressed is accurate. Any unaddressed issues and merge requests have been
moved to another milestone.
5. [ ] Check that the versions of the following packages in the [`poetry.lock`](https://git.ligo.org/emfollow/gwcelery/-/blob/release/v2.2/poetry.lock) file have been approved by the SCCB (i.e. each has either the status:deploy or status:deployed label).
- [x] [`bilby`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=bilby&first_page_size=100)
- [x] [`bilby_pipe`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=bilby_pipe&first_page_size=100)
- [x] [`gracedb-sdk`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=gracedb-sdk&first_page_size=100)
- [x] [`gwdatafind`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=gwdatafind&first_page_size=100)
- [x] [`gwpy`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=gwpy&first_page_size=100)
- [ ] [`gwskynet`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=gwskynet&first_page_size=100)
- [x] [`igwn-alert`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=igwn-alert&first_page_size=100)
- [x] [`igwn-gwalert-schema`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=igwn-gwalert-schema&first_page_size=20)
- [x] [`lalsuite`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=lalsuite&first_page_size=100)
- [x] [`ligo-followup-advocate`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo-followup-advocate&first_page_size=100)
- [x] [`ligo-gracedb`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo-gracedb&first_page_size=100)
- [x] [`ligo-raven`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo-raven&first_page_size=100)
- [x] [`ligo-segments`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo-segments&first_page_size=20)
- [x] [`ligo.em-bright`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo.em-bright&first_page_size=20)
- [x] [`ligo.skymap`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=ligo.skymap&first_page_size=100)
- [x] [`lscsoft-glue`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=lscsoft-glue&first_page_size=100)
- [x] [`pesummary`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=pesummary&first_page_size=100)
- [x] [`python-ligo-lw`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=python-ligo-lw&first_page_size=100)
- [x] [`rapidpe`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=Rapidpe&first_page_size=20)
- [ ] [`rapidpe-rift-pipe`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=RapidPE%20pipeline&first_page_size=20)
- [x] [`RIFT`](https://git.ligo.org/computing/sccb/-/issues/?sort=updated_desc&state=all&search=rift&first_page_size=100)
## Test deployment
4. [x] Sentry does not show any new [unresolved issues on ~~test~~playground](https://sentry.io/organizations/ligo-caltech/issues/?environment=playground&groupStatsPeriod=14d&project=1425216&query=is%3Aunresolved&statsPeriod=14d) that indicate new bugs or regressions.
5. [x] The ~~test~~playground deployment has run for at least 10 minutes.
6. [x] The [Flower monitor](https://emfollow-playground.ligo.caltech.edu/flower) is reachable and shows no unexpected task failures.
7. [x] The [Flask dashboard](https://emfollow-playground.ligo.caltech.edu/gwcelery) is reachable.
8. [x] The ~~test~~playground deployment is [connected to IGWN Alert](https://emfollow-playground.ligo.caltech.edu/flower/worker/gwcelery-worker%40emfollow-playground.ligo.caltech.edu#tab-other) (in Flower, find the main gwcelery-worker, click Other, and look at the list of subscribed IGWN Alert topics).
9. [x] The ~~test~~playground deployment is [connected to GCN](https://emfollow-playground.ligo.caltech.edu/flower/worker/gwcelery-voevent-worker%40emfollow-playground.ligo.caltech.edu#tab-other) (in Flower, find the voevent gwcelery-worker, click Other, and look at the list of receiver peers).
## Mock events
10. [x] The ~~test~~playground deployment has [produced an MDC superevent](https://gracedb-playground.ligo.org/latest/?query=MDC&query_type=S).
11. [x] The MDC superevent has the following annotations.
- [x] `bayestar.multiorder.fits`
- [x] `bayestar.fits.gz`
- [x] `bayestar.png`
- [x] `bayestar.volume.png`
- [x] `bayestar.html`
- [x] `p_astro.json`
- [x] `p_astro.png`
- [x] `em_bright.json`
- [x] `em_bright.png`
12. [x] The MDC superevent has the following labels.
- [x] `EMBRIGHT_READY`
- [x] `GCN_PRELIM_SENT`
- [x] `PASTRO_READY`
- [x] `SKYMAP_READY`
13. [x] The MDC superevent has two automatic preliminary VOEvents, JSON packets, and Avro packets if `GCN_PRELIM_SENT` is applied.
- [x] 2 preliminary VOEvents
- [x] 2 preliminary JSON packets
- [x] 2 preliminary Avro packets
14. [x] Issuing a manual preliminary alert from the [Flask dashboard](https://emfollow-playground.ligo.caltech.edu/gwcelery) sends another preliminary alert.
- [ ] The alert **is sent** successfully if an `ADVOK` or `ADVNO` label is **not applied** this time.
- [x] Alternatively, a preliminary alert is **blocked** due to presence of `ADVOK` or `ADVNO`.
15. [x] `DQR_REQUEST` label is applied to the superevent. The application happens at the time of launching the second preliminary alert.
16. [x] The MDC superevent has either an `ADVOK` or an `ADVNO` label.
17. [x] Issuing an `ADVOK` signoff through GraceDB results in an initial VOEvent.
18. [x] Issuing an `ADVNO` signoff through GraceDB results in a retraction VOEvent.
19. [x] Requesting an update alert through the [Flask dashboard](https://emfollow-playground.ligo.caltech.edu/gwcelery) results in an update VOEvent.
20. [x] ~~Test~~Playground has recently [produced an MDC superevent with an external coincidence](https://gracedb-playground.ligo.org/latest/?query=MDC+EM_COINC&query_type=S), i.e. with an `EM_COINC` label. Use the [Flask dashboard](https://emfollow-playground.ligo.caltech.edu/gwcelery) to do this manually (note that joint events with Swift may not pass publishing conditions and/or may not have a combined sky map, as indicated by the lack of the `RAVEN_ALERT` and `COMBINEDSKYMAP_READY` labels, respectively).
21. [x] The joint MDC superevent has the following annotations.
- [x] `coincidence_far.json`
- [x] `combined-ext.multiorder.fits` or `combined-ext.fits.gz`
- [x] `combined-ext.png`
- [x] `overlap_integral.png`
22. [x] The joint MDC superevent has the following labels.
- [x] `EM_COINC`
- [x] `RAVEN_ALERT`
- [x] `COMBINEDSKYMAP_READY`
- [x] `GCN_PRELIM_SENT`
23. [x] The joint MDC superevent is sending alerts with coincidence information.
- [x] At least one VOEvent with `<Group name="External Coincidence">`.
- [x] At least one Kafka JSON packet with an `external_coinc` field.
- [x] At least one circular with `-emcoinc-` in the filename.
24. [x] Issue a manual RAVEN alert using the [Flask dashboard](https://emfollow-playground.ligo.caltech.edu/gwcelery) for a coincidence (i.e. one that has the `EM_COINC` label) that does not yet have the `RAVEN_ALERT` label. Choose a [recent joint coincidence that meets these criteria](https://gracedb-playground.ligo.org/latest/?query=MDC+%7ERAVEN_ALERT+%26+EM_COINC&query_type=S&get_neighbors=&results_format=) and ensure that a `RAVEN_ALERT` label is applied to the associated superevent, external event, and preferred event.
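The annotation and label checks for the MDC superevent (steps 11 and 12) amount to set differences between what is required and what is present. Below is a minimal sketch under stated assumptions: the file and label lists are taken from the checklist, while the function names and the example inputs are hypothetical. Fetching the actual file list and labels from GraceDB is deliberately left out, and stripping a trailing `,N` version suffix from file names is an assumption about how the listing is formatted.

```python
# Required annotations and labels, copied from steps 11 and 12 of the checklist.
REQUIRED_FILES = {
    "bayestar.multiorder.fits", "bayestar.fits.gz", "bayestar.png",
    "bayestar.volume.png", "bayestar.html", "p_astro.json", "p_astro.png",
    "em_bright.json", "em_bright.png",
}
REQUIRED_LABELS = {
    "EMBRIGHT_READY", "GCN_PRELIM_SENT", "PASTRO_READY", "SKYMAP_READY",
}


def base_name(filename):
    """Drop an assumed ',<version>' suffix, e.g. 'bayestar.png,0'."""
    return filename.split(",", 1)[0]


def missing_items(files, labels):
    """Return the required files and labels that are not present."""
    present_files = {base_name(f) for f in files}
    return {
        "files": sorted(REQUIRED_FILES - present_files),
        "labels": sorted(REQUIRED_LABELS - set(labels)),
    }


# Hypothetical inputs standing in for a superevent's file list and labels.
example_files = ["bayestar.multiorder.fits,0", "bayestar.png", "p_astro.json"]
example_labels = ["SKYMAP_READY", "PASTRO_READY"]

report = missing_items(example_files, example_labels)
print("missing files:", report["files"])
print("missing labels:", report["labels"])
```

The same pattern would extend to the joint superevent in steps 21 and 22 by swapping in the coincidence file and label names.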
## Replay events
24. [x] [A Production superevent labeled `GCN_PRELIM_SENT`](https://gracedb-playground.ligo.org/latest/?query=Production+GCN_PRELIM_SENT&query_type=S&get_neighbors=&results_format=) has the following parameter estimation annotations and the `PE_READY` label.
- [x] `bilby_config.ini`
- [x] `Bilby.posterior_samples.hdf5`
- [x] `Bilby.multiorder.fits`
- [x] `Bilby.html`
- [x] `Bilby.fits.gz`
- [x] `Bilby.png`
- [x] `Bilby.volume.png`
- [x] `PE_READY`
- [x] Link to PESummary page (log message in parameter estimation section)
GWCelery v2.2.1 Release
https://git.ligo.org/emfollow/gwcelery/-/issues/764Follow-up from "deployment release docs"2024-02-22T18:35:54ZDeep Chatterjeedeep.chatterjee@ligo.orgFollow-up from "deployment release docs"
The following discussions from !1363 should be addressed:
- [ ] @leo-singer started a [discussion](https://git.ligo.org/emfollow/gwcelery/-/merge_requests/1363#note_920068):
> The main branch must eventually also have a complete, linear change log history. How will you do that?
>
> I predict that we will have _many_ copy-paste errors in the change log if we try to do this manually. If you are really sure that you want to use this branching model, then there are automated changelog management tools that can help you do this accurately, such as https://github.com/changesets/changesets.
- [ ] @leo-singer started a [discussion](https://git.ligo.org/emfollow/gwcelery/-/merge_requests/1363#note_920069): (+1 comment)
> Are you going to preserve changelog entries for release candidates forever, or are you going to eventually merge together all of the RC changelogs once you have done a stable release?
- [ ] @leo-singer started a [discussion](https://git.ligo.org/emfollow/gwcelery/-/merge_requests/1363#note_920070):
> Each step should have a bold heading.
- [ ] @leo-singer started a [discussion](https://git.ligo.org/emfollow/gwcelery/-/merge_requests/1363#note_920072): (+1 comment)
> Yikes! You're going to create a release candidate every time the acceptance tests fail?
>
> Instead, how about you only create a release candidate once you are sure that the acceptance tests pass?
- [ ] @leo-singer started a [discussion](https://git.ligo.org/emfollow/gwcelery/-/merge_requests/1363#note_920074): (+1 comment)
> Easier said than done; the changelog will be a frequent source of merge conflicts. To follow this procedure, the person who is performing the release will need to be an expert in git merge conflict resolution.
>
> Instead, I suggest that you cherry-pick changelog entries from the _main_ branch onto the release branch.
- [ ] @leo-singer started a [discussion](https://git.ligo.org/emfollow/gwcelery/-/merge_requests/1363#note_920075): (+1 comment)
> On what branch(es)?
- [ ] @leo-singer started a [discussion](https://git.ligo.org/emfollow/gwcelery/-/merge_requests/1363#note_920076):
> Should codenames be for minor increments only? Or both minor and patch increments?
- [ ] @patrick.godwin started a [discussion](https://git.ligo.org/emfollow/gwcelery/-/merge_requests/1363#note_942761): (+5 comments)
> Looking at this MR and the relevant [issue](https://git.ligo.org/emfollow/gwcelery/-/issues/737), I have a few comments/suggestions on the release procedure. While not necessarily an issue, I personally think introducing release candidates and required release branches adds extra complexity without much value added.
>
> Regarding release candidates, I don't see what this adds compared to the previous procedure, except potentially not burning releases that don't pass acceptance tests or review. I'd argue that's a failure of the workflow, however. I'd expect that most of the time the release candidate will be functionally identical to the release, while adding an extra step along the way to change/consolidate the changelog and doubling (or more) the number of releases published on PyPI. Why not stick with the previous procedure and only make a release once the acceptance tests pass?
>
> Regarding the release branches, one thing that could simplify this is to only make a release branch if it's needed, and otherwise make a release directly off of the main branch (as was done previously). For example, feature releases can be made directly off of main. Bug fix releases might be made directly from main if they don't introduce new features. A release branch is really needed only when making a release from main would be an issue, in which case the release branch is created from the last release. One benefit of this is that main is not devoid of release tags, so the automatic versioning scheme that this project uses shouldn't cause issues on the main branch.
>
> Anywho, I'm not necessarily opposed to any of the big procedural changes proposed here. I'm pitching these ideas as a way to reduce friction in making new releases in the future.
O4b
https://git.ligo.org/emfollow/gwcelery/-/issues/763Update CBC O4b Trials Factor2024-02-27T11:11:00ZCody MessickUpdate CBC O4b Trials Factor
We need to confirm that our trials factors are correct once we know what pipelines are approved for O4b.
GWCelery v2.2.1 Release
https://git.ligo.org/emfollow/gwcelery/-/issues/760"Too many open files" errors issue in Sentry. Problem related to the skymaps....2024-02-27T11:08:14ZRoberto DePietri"Too many open files" errors issue in Sentry. Problem related to the skymaps.skymap_from_samples task multiprocessor queue.
"Too many open files" errors issue in Sentry https://ligo-caltech.sentry.io/issues/4432705056/?notification_uuid=4ba9419d-4605-4b32-9b4d-fe021e951965&project=1425216&referrer=weekly_report
It recurs at a rate of roughly 80 times a day.
* `grep skymaps.skymap_from_samples *log | grep succe | wc` gives 352 successes, the **last** at 2024-01-20 01:15:36.
* `grep skymaps.skymap_from_samples *log | grep "raised unexpected: OSError(" | wc` gives more than 1k failures, the **first** at 2024-01-20 01:24:10.
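The two `grep` pipelines above can be mirrored in Python when the counts need to be tracked over time. This is a rough sketch only: it uses plain substring matching (slightly stricter than `grep`, where `.` matches any character), and the sample log lines are invented, since the real log format is not shown in this report.

```python
TASK = "skymaps.skymap_from_samples"
SUCCESS = "succe"  # same fragment as in the grep command
FAILURE = "raised unexpected: OSError("


def count_outcomes(lines):
    """Count task log lines that match the success and OSError patterns."""
    counts = {"success": 0, "oserror": 0}
    for line in lines:
        if TASK not in line:
            continue
        # The two checks are independent, like the two separate pipelines.
        if SUCCESS in line:
            counts["success"] += 1
        if FAILURE in line:
            counts["oserror"] += 1
    return counts


# Invented log lines for illustration; the actual worker log format may differ.
SAMPLE_LINES = [
    "[2024-01-20 01:15:36] Task gwcelery.tasks.skymaps.skymap_from_samples[abc] succeeded in 12.3s",
    "[2024-01-20 01:24:10] Task gwcelery.tasks.skymaps.skymap_from_samples[def] "
    "raised unexpected: OSError(24, 'Too many open files')",
    "[2024-01-20 01:25:00] Task gwcelery.tasks.other[ghi] succeeded in 0.1s",
]

counts = count_outcomes(SAMPLE_LINES)
print(counts)
```

To process real logs, any iterable of lines works, for example `fileinput.input(glob.glob("*log"))` from the standard library.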
Affected version: GWCelery 2.1.10+12.ge218fc1e (ligo.skymap 1.1.2)
GWCelery v2.2.1 Release
Leo P. SingerCody MessickDeep Chatterjeedeep.chatterjee@ligo.orgLeo P. Singer