GraceDB Server issues
https://git.ligo.org/computing/gracedb/server/-/issues
2019-03-08T20:53:50Z

https://git.ligo.org/computing/gracedb/server/-/issues/117
GWCelery issues with AWS implementation of GraceDB
2019-03-08T20:53:50Z
Tanner Prestegard

GWCelery has been seeing a few new errors since we moved the service to the AWS cloud:
* `SSLEOFError: Problem establishing secure connection: EOF occurred in violation of protocol (_ssl.c:777)`: [link](https://emfollow.ligo.caltech.edu/sentry/gwcelery/issues/299/?query=is:unresolved)
* `SSLError [SSL: SSL_HANDSHAKE_FAILURE] ssl handshake failure (_ssl.c:2217)`: [link](https://emfollow.ligo.caltech.edu/sentry/gwcelery/issues/305/)
* `ConnectionResetError [Errno 104] Connection reset by peer`: [link](https://emfollow.ligo.caltech.edu/sentry/gwcelery/issues/306/)

https://git.ligo.org/computing/gracedb/server/-/issues/17
Unit tests
2022-08-03T18:05:38Z
Tanner Prestegard

The unit tests are really lacking and are absolutely needed, especially for authentication and permissions.

Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/344
Unsafe search response DB operations
2024-03-27T23:36:06Z
Daniel Wysocki

Sentry reported an `IndexError` in `search.response.event_datatables_response` [here](https://ligo-caltech.sentry.io/issues/5107058502/?alert_rule_id=710526&alert_timestamp=1711566827464&alert_type=email&environment=production&notification_uuid=abebba78-dd00-4d76-b6e1-5f05b8265faa&project=1456379&referrer=alert_email).
Looking into it, I've realized that [this call to `count()`](https://git.ligo.org/computing/gracedb/server/-/blob/77f15d0b34598612f347216aa0e323296b400fe3/gracedb/search/response.py#L348) performs a SQL [`SELECT COUNT(*)`](https://docs.djangoproject.com/en/4.2/ref/models/querysets/#django.db.models.query.QuerySet.count), which can then be outdated by the time we [iterate over a second query](https://git.ligo.org/computing/gracedb/server/-/blob/77f15d0b34598612f347216aa0e323296b400fe3/gracedb/search/response.py#L354). This all seems to be part of optimizations made in !163. I would consider reverting that MR, or, if avoiding `list.append` actually has a measurable performance benefit, using something like a pre-allocated 2D buffer array.

Alexander Pace, Daniel Wysocki, Alexander Pace

https://git.ligo.org/computing/gracedb/server/-/issues/342
Smooth deployment on Kubernetes
2024-03-15T14:57:45Z
Sara Vallero

This is to upstream all the patches currently implemented for the deployment of gracedb-test01.igwn.org or a sandboxed deployment on Minikube.
- [ ] unauthenticated access to hopskotch (https://git.ligo.org/computing/gracedb/server/-/merge_requests/205)
- [ ] generic site name (https://git.ligo.org/computing/gracedb/server/-/merge_requests/206)
- [ ] username/password auth

Sara Vallero, Sara Vallero

https://git.ligo.org/computing/gracedb/server/-/issues/341
only send igwn-alerts to the {group}_{pipeline} topic
2024-03-13T20:29:24Z
Alexander Pace

Historically, `igwn-alert` and `LVAlert` have sent out g-event and e-event alerts to topics with the `{group}_{pipeline}` and `{group}_{pipeline}_{search}` schema. As more pipelines and searches are added, topic management is becoming a pain across all the GraceDB tiers, especially with the lack of a scriptable API to interact with SCIMMA.
I'm proposing to change the way GraceDB issues alerts to only send to `{group}_{pipeline}` topics and have users filter on search, if need be. It would have the benefit of simplifying topic management, and also save some milliseconds in dispatching alerts. The alert contents would remain the same. Superevent topics would be unaffected by this change.
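On the listener side, filtering on search could look roughly like the sketch below. It assumes the alert payload is a dict whose `object` entry carries `group`, `pipeline`, and `search` fields; that layout, the helper names `topic_for` and `wants_alert`, and the example values are illustrative assumptions, not a confirmed `igwn-alert` schema.

```python
def topic_for(event):
    """Build the proposed `{group}_{pipeline}` topic name (lowercasing is an assumption)."""
    return f"{event['group']}_{event['pipeline']}".lower()


def wants_alert(payload, searches):
    """Return True if the alert's search is one this listener cares about.

    `payload` is assumed to look like
    {"object": {"group": ..., "pipeline": ..., "search": ...}}.
    """
    event = payload.get("object", {})
    return event.get("search") in searches


# Hypothetical alert received on a `{group}_{pipeline}` topic.
alert = {
    "alert_type": "new",
    "object": {"group": "CBC", "pipeline": "gstlal", "search": "AllSky"},
}

if wants_alert(alert, {"AllSky"}):
    print("handling alert from topic", topic_for(alert["object"]))
```

A listener that previously subscribed to a `{group}_{pipeline}_{search}` topic would subscribe to the broader topic and drop alerts whose search it does not handle.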
Putting out feelers to `igwn-alert` stakeholders... @deep.chatterjee @cody.messick @nicolas.arnaud @rebecca.ewing would that break your listening processes? If so, would adding an extra filter based on the search in the alert content be too much of a technical burden?

O4b
Alexander Pace, Alexander Pace

https://git.ligo.org/computing/gracedb/server/-/issues/340
Add support for validating read access against multiple token issuers
2024-02-20T15:51:31Z
Josh Willis

Currently a single gracedb instance can only accept tokens from one-and-only-one token issuer. This will need to change to support the HTCondor 'local issuer' that we will be rolling out as an alternative to vault-managed scitokens.

https://git.ligo.org/computing/gracedb/server/-/issues/338
Ingest GCN VOEvents from SVOM
2024-02-23T20:30:01Z
Brandon Piotrzkowski

Requested by @rachel.hamburg, we want to start ingesting VOEvents from SVOM as another `GRB` search.
Here's an example of a notice:
[sb23041100_eclairs-wakeup_2.xml](/uploads/ad0d71686252e09b6e623e1a11eaf0e2/sb23041100_eclairs-wakeup_2.xml)
This will require the following changes:
- [ ] Add `pipeline='SVOM'` as mentioned in https://git.ligo.org/computing/gracedb/server/-/issues/255
- [ ] Add `SVOM` external events to IGWN alert
- [ ] Modify [translator](https://git.ligo.org/computing/gracedb/server/-/blob/master/gracedb/events/translator.py) to ingest the particular notice values, such as adding `"Burst_Id"` to get the ID and `"Exposure"` to get the duration.

O4b

https://git.ligo.org/computing/gracedb/server/-/issues/337
Virgo O4b workflow
2024-01-26T14:28:44Z
Michael William Coughlin

There are ongoing discussions about how to use Virgo in LL for O4b.
See DAC issue here: https://git.ligo.org/dac/preparations-for-using-virgo-data-in-o4b-low-latency-analyses/-/issues/1
See gwcelery issue here: https://git.ligo.org/emfollow/gwcelery/-/issues/749
It is possible that some changes to GraceDB will be needed to enable this.

O4b

https://git.ligo.org/computing/gracedb/server/-/issues/336
service dies with out-of-date gpstime package
2024-01-10T12:05:13Z
Alexander Pace

I started a rolling restart of gracedb-playground and when it came back online, it 503'ed with the following error:
```
Traceback (most recent call last):
  File "/app/gracedb_project/manage.py", line 44, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.9/dist-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.9/dist-packages/django/core/management/__init__.py", line 395, in execute
    django.setup()
  File "/usr/local/lib/python3.9/dist-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/usr/local/lib/python3.9/dist-packages/django/apps/registry.py", line 114, in populate
    app_config.import_models()
  File "/usr/local/lib/python3.9/dist-packages/django/apps/config.py", line 301, in import_models
    self.models_module = import_module(models_module_name)
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 790, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/app/gracedb_project/gracedb/alerts/models.py", line 19, in <module>
    from .phone import get_twilio_from
  File "/app/gracedb_project/gracedb/alerts/phone.py", line 12, in <module>
    from events.permission_utils import is_external
  File "/app/gracedb_project/gracedb/events/permission_utils.py", line 9, in <module>
    from .models import Event
  File "/app/gracedb_project/gracedb/events/models.py", line 31, in <module>
    from gpstime import gpstime
  File "/usr/local/lib/python3.9/dist-packages/gpstime/__init__.py", line 41, in <module>
    from .leaps import LEAPDATA
  File "/usr/local/lib/python3.9/dist-packages/gpstime/leaps.py", line 187, in <module>
    LEAPDATA = LeapData()
  File "/usr/local/lib/python3.9/dist-packages/gpstime/leaps.py", line 126, in __init__
    self._load(fetch_ietf_leapfile, LEAPFILE_IETF_URL)
  File "/usr/local/lib/python3.9/dist-packages/gpstime/leaps.py", line 136, in _load
    raise RuntimeError(f"Error loading leap file {path}: {str(e)}")
RuntimeError: Error loading leap file https://www.ietf.org/timezones/data/leap-seconds.list: 404 Client Error: Not Found for url: https://www.ietf.org/timezones/data/leap-seconds.list
```
uhhhh does it not seem dangerous to anyone else to depend on an external file like that, which just breaks the package?
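One common way to soften this kind of failure is to fall back to a local copy when the remote fetch fails. The sketch below is not how `gpstime` works; `load_leap_file`, `fetch`, and `fallback_path` are hypothetical names used only to illustrate the pattern.

```python
from pathlib import Path


def load_leap_file(fetch, fallback_path):
    """Return leap-second text from `fetch()`, or from a local copy if that fails.

    `fetch` is any zero-argument callable that returns the file contents
    or raises on failure (for example, on an HTTP 404).
    """
    try:
        return fetch()
    except Exception:
        # The remote source is unavailable; use the copy kept on disk.
        return Path(fallback_path).read_text()
```

With this shape, a missing upstream file degrades to possibly stale data instead of an import-time crash.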
I brought playground down to one container, and manually upgraded `gpstime` (from 0.6.2) to the latest version (0.8.1, https://git.ligo.org/computing/sccb/-/issues/1397) and that fixed it.
What this means is that production is waiting to break if it restarts for any reason. I'm going to upgrade `gpstime`, build new containers and redeploy while the detectors are offline.

https://git.ligo.org/computing/gracedb/server/-/issues/334
Expanded API calls for analytics
2023-11-13T17:55:16Z
Alexander Pace

From an email chain with @andrew.toivonen, @michael-coughlin, @sushant.sharma-chaudhary:
```
Alex,
Following up on your email, we had a discussion as a group about what GraceDB API changes could be useful.
For some context, these are the scripts (and what they fetch) that we have used in the past to fetch from GraceDB/GraceDB Playground:
Playground:
All MDC events (from a range of gpstimes): https://git.ligo.org/emfollow/em-properties/mdc-analytics/-/blob/main/fetch_data/events_from_gracedb.py
MDC Skymaps: https://git.ligo.org/emfollow/em-properties/mdc-analytics/-/blob/main/fetch_data/fetch_skymaps.py
MDC Posterior Samples (from a range of gpstimes): https://git.ligo.org/emfollow/em-properties/mdc-analytics/-/blob/main/fetch_data/fetch_all_PE.py
GraceDB
All data products from a superevent: https://git.ligo.org/emfollow/em-properties/mdc-analytics/-/blob/main/fetch_data/fetch_superevent.py
Posterior Samples from a single event: https://git.ligo.org/emfollow/em-properties/mdc-analytics/-/blob/main/fetch_data/fetch_PE.py
GCN latencies: https://git.ligo.org/emfollow/em-properties/mdc-analytics/-/blob/main/fetch_data/fetch_O4_gcn.py
First off, if you feel any of these scripts are poorly optimized feel free to let us know. This brings me to my next thought, we know that bulk fetching from Playground for the MDC is very resource intensive and has caused issues in the past. I however think there will always be a need for bulk fetching when it comes to the MDC, simply due to the nature of the study and the numerous triggers. Part of the strain was also caused due to the fact that we did not fetch in an optimized manner (and maybe our method could be optimized even further), so one possible addition to the API would be adding a call to fetch a table of all event quantities as we did, yet done how you would optimize such a query. The same could be said for event data products, such as PE and skymaps. We were maybe wondering if there was a way to add a call that would simply download a file, without having to save it or a list of files as an object?
As for fetching from GraceDB, I think in general our studies will be focused on specific or a small subset of events. What could be most useful would be a call to download the latest skymap or latest posterior samples for a given event. Finally, I know latency was added to the GraceDB page, how is that latency defined? And is there an easy way to fetch that value? Fetching all the latencies for a range of gpstimes or just the entire observing run would be useful as well. Maybe it would also be good to include the ability to fetch all superevents, or just significant ones.
These were our initial thoughts without a great idea of which of these are most easily implemented and would make a difference.
Let us know what you think,
Andrew
```

https://git.ligo.org/computing/gracedb/server/-/issues/333
search feature not working
2023-11-13T06:25:19Z
Kipp Cannon

## Description of problem
Search feature is not working
## Expected behavior
Go to https://gracedb.ligo.org/search/ and enter "S231020", select "Superevent", click "Search". Only one entry appears, labelled "S231020a". But there were many events that day, including "S231020bw", which is the one I was trying to find. Entering "S231020bw" into the search term produces the desired entry. If the search is not implicitly a wild-card search, why does the "a" event appear? If it is a wild-card search, why doesn't the "bw" event appear?
## Steps to reproduce
See above.
## Context/environment
My web browser.
## Suggested solutions
Fix the search feature, or modify the Query Help page to explain how to properly do wild-card searches. Thanks.

https://git.ligo.org/computing/gracedb/server/-/issues/332
number of log annotations on S190412m causes browser requests to hit timeout
2023-10-16T15:00:31Z
Alexander Pace

Attempting to load the internal page for [S190412m](https://gracedb.ligo.org/superevents/S190412m/view/) results in a timeout because the time to retrieve the number of log entries on that event exceeds the 30 second timeout in gunicorn. Note that I had previously implemented a check for a maximum number of log messages to display for g-events (in response to RAVEN repeatedly annotating external events for years on end), but this check never got ported over to superevents. @roberto.depietri brought this up on the [emfollow dev call](https://git.ligo.org/emfollow/gwcelery/-/wikis/telcons/2023-10-16) this morning.
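A display cap of the kind described above can be sketched as follows. This is a plain-Python illustration, not GraceDB's actual check; `MAX_LOGS_DISPLAYED`, `logs_for_display`, and the limit of 500 are hypothetical.

```python
from itertools import islice

MAX_LOGS_DISPLAYED = 500  # hypothetical limit


def logs_for_display(logs, limit=MAX_LOGS_DISPLAYED):
    """Return at most `limit` log entries plus a flag saying whether more exist.

    Only `limit + 1` items are pulled from `logs`, so a very long log is
    never fully loaded just to render the page.
    """
    head = list(islice(logs, limit + 1))
    truncated = len(head) > limit
    return head[:limit], truncated
```

With a Django queryset, slicing (`queryset[:limit + 1]`) would play the same role by adding a `LIMIT` to the SQL query.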
Okay, so what is it about this superevent, and who's writing all those log messages? I went into the database console to see where all the annotations were coming from and I believe they were from the `detchar` user, who annotated the superevent 877 times:
```
In: m = Superevent.get_by_date_id('S190412m')
In: m.log_set.exclude(comment__contains='Tagged message').filter(issuer=detchar).count()
Out: 877
```
This user accessed GraceDB with one of the following certificate subjects back in 2019.
And it looks like there was a server error of some sort (not related to GraceDB as far as I can tell) that prevented some data products from being uploaded, because the `Detchar` log messages are mostly ones like these:
```
2019-04-12 05:33:01.508952+00:00 Attempted upload of 'L1ligocam-S190412m.json' failed due to server issues [message edited by administrator]
2019-04-12 05:33:00.555024+00:00 Attempted upload of 'L1ligocam-S190412m.json' failed due to server issues [message edited by administrator]
2019-04-12 05:32:59.357731+00:00 Attempted upload of 'L1ligocam-S190412m.json' failed due to server issues [message edited by administrator]
2019-04-12 05:32:58.161597+00:00 Attempted upload of 'L1ligocam-S190412m.json' failed due to server issues [message edited by administrator]
2019-04-12 05:32:57.160221+00:00 Attempted upload of 'L1ligocam-S190412m.json' failed due to server issues [message edited by administrator]
2019-04-12 05:32:56.204876+00:00 Attempted upload of 'L1ligocam-S190412m.json' failed due to server issues [message edited by administrator]
2019-04-12 05:32:55.161672+00:00 Attempted upload of 'L1ligocam-S190412m.json' failed due to server issues [message edited by administrator]
2019-04-12 05:32:54.276859+00:00 Attempted upload of 'L1ligocam-S190412m.json' failed due to server issues [message edited by administrator]
2019-04-12 05:32:06.365500+00:00 Attempted upload of 'L1ligocam-S190412m.json' failed due to server issues [message edited by administrator]
2019-04-12 05:32:04.341545+00:00 Attempted upload of 'L1ligocam-S190412m.json' failed due to server issues [message edited by administrator]
2019-04-12 05:32:03.398589+00:00 Attempted upload of 'L1ligocam-S190412m.json' failed due to server issues [message edited by administrator]
```
I've attached the timestamp and comment of each one of the detchar log messages to this issue. [S190412m-detchar-errors.txt](/uploads/3cf5e528934a4af0d2d5af6498531875/S190412m-detchar-errors.txt)
I'll go ahead and implement the maximum log messages error for superevents. @roberto.depietri, if there's anything else you need to help interrogate this 4-year-old superevent, please let me know.

Alexander Pace, Alexander Pace

https://git.ligo.org/computing/gracedb/server/-/issues/331
Unable to receive notifications
2023-10-23T01:37:15Z
Elenna Capote

I am signed up to receive email alerts from graceDB, and it worked reliably well throughout the summer. Now, for the past month or so, I haven't received any alerts, although I know we are detecting many events.
I am signed up for alerts tagged as "ADVREQ", which is what I was told was correct. I also tested my notifications and I receive the test email fine. The email address is my gmail address. I confirmed the test email arrives in my inbox, and I check my spam folder when event alerts go out; the alerts have not mistakenly landed there.

https://git.ligo.org/computing/gracedb/server/-/issues/330
Queries based on SNR?
2023-10-17T02:40:44Z
Keita Kawabe

It seems that [queries based on SNR are not supported](https://gracedb.ligo.org/documentation/queries.html). I learned this when I tried to quickly find [S230814ah aka snr>40 event](https://gracedb.ligo.org/superevents/S230814ah/view/), but I can imagine that this would be useful, even if that's just for satisfying my curiosity ;)

Daniel Wysocki, Daniel Wysocki

https://git.ligo.org/computing/gracedb/server/-/issues/329
GraceDB uploads from the lensing pipelines
2024-01-22T15:22:29Z
Ian Harry

I was asked to move this discussion here from an email thread. I'll copy/paste the emails into here one by one, starting with the top-level description/problem statement:
We've been discussing on the PyCBC end about the deployment of the O4a
lensing search pipeline, and there was a question about GraceDB
interaction, which I wanted to bring to some experts. I hope I'm
reaching the right GraceDB experts here (alongside the GstLAL lensing
search leads, and search chairs in CC), but let me know if I'm missing
anyone.
Just as an overview/reminder. The lensing searches are run as a
followup to known CBC triggers. They perform a focused search on a
narrow range of parameters around the values obtained from the known
event. Motivation is that there might be a lensed event which appears
as two "images" on Earth, one with SNR > 8 and one with SNR < ~8. The
first "image" can be found by our standard all-sky searches, but the
second might only be extracted if we use information from the first
image.
Practically this means that we will have a set of search triggers for
*every* CBC candidate at *all* times in O4, from both GstLAL and
PyCBC.
These searches will be recovering *other* known BBH events, and given
a bulk of events around mchirp ~ 40, we will likely have some events
recovered in *multiple* lensing searches.
The question is how would we process this in GraceDB? Would we be
uploading all triggers (above some threshold?) to GraceDB? This has
the potential to make some superevents quite confusing on internal
views if there are numerous lensed triggers alongside the numerous
online and offline all sky triggers. What search
tags/columns/names/whatever would be used? Has GstLAL already got a
plan in place for doing this? Any other thoughts?
Thanks!
Ian

https://git.ligo.org/computing/gracedb/server/-/issues/328
DQR link in the RRT view should be linked to DQR 5-minutes tier URL
2023-10-03T12:11:49Z
Keita Kawabe

In the RRT view for S-event, "Data quality report" is linked to https://ldas-jobs.ligo.caltech.edu/~dqr/o4dqr/online/events/YYYYMM/SYYMMDDabcd/, e.g. https://ldas-jobs.ligo.caltech.edu/~dqr/o4dqr/online/events/202309/S230927l/.
Responders have to open the link, click "tasks by tier" and click "5 min" to open a different URL in the form of https://ldas-jobs.ligo.caltech.edu/~dqr/o4dqr/online/events/YYYYMM/SYYMMDDabcd/5_min_tier_index.html, e.g. https://ldas-jobs.ligo.caltech.edu/~dqr/o4dqr/online/events/202309/S230927l/5_min_tier_index.html.
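The URL pattern above can be built directly from the superevent ID, as in this minimal sketch. The helper name `dqr_five_minute_url` is hypothetical, and it assumes the `YYYYMM` directory is always "20" followed by the `YYMM` digits of the ID, which matches the S230927l example but is not confirmed for every event.

```python
DQR_BASE = "https://ldas-jobs.ligo.caltech.edu/~dqr/o4dqr/online/events"


def dqr_five_minute_url(superevent_id):
    """Build the 5-minute tier URL for an ID such as 'S230927l'.

    Assumes the YYYYMM directory is "20" + the YYMM digits of the ID.
    """
    yymm = superevent_id[1:5]
    return f"{DQR_BASE}/20{yymm}/{superevent_id}/5_min_tier_index.html"


print(dqr_five_minute_url("S230927l"))
```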
Since the only tasks RRT shifters are interested in are under the 5-minutes tier anyway, the link should point to https://ldas-jobs.ligo.caltech.edu/~dqr/o4dqr/online/events/YYYYMM/SYYMMDDabcd/5_min_tier_index.html.

https://git.ligo.org/computing/gracedb/server/-/issues/327
"Public events" overview must show Bilby skymap graphics not Bilby.multiorder.fits files
2023-09-28T19:12:52Z
Keita Kawabe

["Public Events" overview page](https://gracedb.ligo.org/superevents/public/O4/) usually shows a thumbnail image of the skymap.
When a PE update is sent out(??), the Bilby.multiorder.fits file is linked instead, and it appears to users as if the link is broken (see attached). This should be changed to Bilby.png,0 etc.
![Screenshot_2023-09-10_at_13.49.53](/uploads/e605a14cfcfb0f2e358bb685874a1c01/Screenshot_2023-09-10_at_13.49.53.png)

https://git.ligo.org/computing/gracedb/server/-/issues/326
GraceDB popup before confirming the advocate signoff
2023-09-01T00:17:20Z
Keita Kawabe

Before making/changing/deleting an Advocate Signoff, an "are you really sure?" type dialog box should be displayed. See the comment of @nicolas.arnaud in https://git.ligo.org/emfollow/followup-advocate-guide/-/issues/91:
> Yesterday we had the case of a Lv0 shifter who used the advocate signoff interface on the wrong superevent... While what follows won't prevent this from happening again, I would suggest adding to the section https://emfollow.docs.ligo.org/followup-advocate-guide/procedures1.html#sign-off-okay-on-superevent (and possibly to https://emfollow.docs.ligo.org/followup-advocate-guide/procedures1.html#sign-off-not-okay-on-superevent as well) the fact that, when pressing the signoff button, the action is not immediate. Instead, a popup window appears with an appropriate message (depending on what the action will be):
> > You are attempting to create an Advocate Sign-Off, which will generate a public alert. Do you wish to continue?
> > You are attempting to update an Advocate Sign-Off. Do you wish to continue?
> > You are attempting to delete an Advocate Sign-Off. Do you wish to continue?
> And, at this stage, the shifter should really pause for a few seconds, review what they are about to do and (only) then press OK or cancel.

https://git.ligo.org/computing/gracedb/server/-/issues/325
Add S-event ID to the advocate signoff popup windows
2023-09-11T02:39:25Z
Nicolas Arnaud

Meaning: replacing the first sentence of the popup messages (three different ones I think)
> You are attempting to create/update/delete an Advocate Sign-Off (...)
by
> You are attempting to create/update/delete an Advocate Sign-Off **for Superevent SYYMMDD<abc>** (...)
That would give advocates one more chance to check they are about to sign off for the right event.

https://git.ligo.org/computing/gracedb/server/-/issues/324
Traefik returns 502 when Gunicorn restarts
2023-10-04T23:11:51Z
Alexander Pace

Here's the scenario: GraceDB will instantly (no 30 second timeout) return a 502 proxy error to the client, then the client code retries and everything works.
Further investigation will show that there are no errors in the gracedb (django/gunicorn) logs, but there will be one in the webgateway/traefik logs, e.g.:
```
# grep -n '" 502 ' *.log
gracedb-swarm-production-us-west-2c-docker-mgr-01.log:49112:Aug 1 13:45:34 gracedb-swarm-production-us-west-2c-docker-mgr-01 gracedb_docker_webgateway_webgateway.3.91ly1u1voskgcnyo78u2t0xq1: 131.215.113.150 - - [01/Aug/2023:13:45:34 +0000] "POST /api/events/ HTTP/1.1" 502 11 "-" "-" 613542 "gracedb@docker" "http://10.0.1.56:80" 1ms
```
At the same time there's a block in gracedb's logs like:
```
Aug 1 13:45:34 gracedb-swarm-production-us-west-2c-docker-mgr-01 gracedb_docker_gracedb_gracedb.1.we7ztc1fre4kje94k14lz3cy4: GUNICORN | [2023-08-01 13:45:34 +0000] [3589] [INFO] Autorestarting worker after current request.
Aug 1 13:45:34 gracedb-swarm-production-us-west-2c-docker-mgr-01 gracedb_docker_gracedb_gracedb.1.we7ztc1fre4kje94k14lz3cy4: GUNICORN | [2023-08-01 13:45:34 +0000] [3589] [INFO] Worker exiting (pid: 3589)
Aug 1 13:45:35 gracedb-swarm-production-us-west-2c-docker-mgr-01 gracedb_docker_gracedb_gracedb.1.we7ztc1fre4kje94k14lz3cy4: GUNICORN | [2023-08-01 13:45:35 +0000] [11186] [INFO] Booting worker with pid: 11186
Aug 1 13:45:35 gracedb-swarm-production-us-west-2c-docker-mgr-01 gracedb_docker_gracedb_gracedb.1.we7ztc1fre4kje94k14lz3cy4: GUNICORN | [2023-08-01 13:45:35 +0000] [11186] [INFO] Worker spawned (pid: 11186)
```
Automatic restarting is controlled [here](https://git.ligo.org/computing/gracedb/server/-/blob/81847bbf401c99dabd36d39d66aab5f95deae6d3/config/gunicorn_config.py#L74-86) and [here](https://git.ligo.org/computing/gracedb/deployment/-/blob/0cd096d8230e9a01dadeeed66609d8939dc1129c/swarm-stacks/gracedb-prod-stack.yml#L100-101), and it is used to avoid possible memory leaks. As far as I can tell, this restart/502 hasn't actually affected low latency operations, as gwcelery has retried and succeeded each time. `pycbclive` did ping about a 502 twice (2023-07-24 15:21:20 UTC on playground and 2023-07-29 10:48:19 on prod), but as far as I can tell the request was subsequently retried by the client code and succeeded.
Possible solutions could be...? One is to do nothing, since clients are retrying and succeeding. We could also try increasing the maximum number of requests and the jitter, to see if we can space the restarts out and make them less frequent.
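For the second option, the relevant Gunicorn settings are `max_requests` and `max_requests_jitter`. The fragment below is only a sketch: the numbers are hypothetical rather than the values in the linked production config, and `restart_window` is an illustrative helper, not part of Gunicorn.

```python
# gunicorn_config.py fragment (values are hypothetical, not the production settings)
max_requests = 10000         # restart a worker after this many requests
max_requests_jitter = 2000   # add randint(0, jitter) so workers do not restart together


def restart_window(base, jitter):
    """Range of request counts after which a given worker may restart."""
    return base, base + jitter


print(restart_window(max_requests, max_requests_jitter))
```

Raising `max_requests` makes restarts rarer, while a larger jitter spreads them across workers so fewer of them recycle at the same moment.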