GraceDB Server issues
https://git.ligo.org/computing/gracedb/server/-/issues
2022-08-03T18:04:27Z

https://git.ligo.org/computing/gracedb/server/-/issues/16
Refurbish events API
2022-08-03T18:04:27Z
Tanner Prestegard
The events API needs to be redone for a few reasons:
1. Incomplete validation and error handling
2. Difficult to implement permissions - redoing this would make #15 much easier
3. Many redundancies and inefficiencies
4. Doesn't make use of the builtin features in django-rest-framework
One possible difficulty is that some changes might require corresponding client changes, so we might run into yet another server-client incompatibility.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/17
Unit tests
2022-08-03T18:05:38Z
Tanner Prestegard
The unit tests are really lacking and are absolutely needed, especially for authentication and permissions.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/19
Improve server stability and performance
2022-08-03T18:10:24Z
Tanner Prestegard
Started on October 12, 2017, copied from redmine (https://bugs.ligo.org/redmine/issues/5946)
We have a generic goal of attempting to improve GraceDB's stability and performance. Ideally, it should be able to handle a significant load (lots of automated processes triggering and querying after an event is identified) and provide reasonably fast performance (page loading, API queries, etc.). But we really need a more precise definition of what we want out of the server. A specific issue that we would like to rectify is the gateway timeout issue - it's been reduced, but not removed.
Some ideas of things we can do:
Significant profiling and rewriting of code - reduce memory footprint and number of database queries. We should use select_related and prefetch_related wherever we can.
Improve web UI performance - web pages shouldn't take as long to load, should cache files, etc.
Switch to PostgreSQL
Use Gunicorn with Apache as a reverse proxy - allows us to eliminate the mod_wsgi plugin and hopefully boost performance
Possible issues:
We don't have a standard way of measuring performance. Note: unit tests might help with that.
We don't have a good way to imitate the production environment for load testing.
O4 Prep

https://git.ligo.org/computing/gracedb/server/-/issues/21
Introduce type-ahead- or tab-completion-like features to the GraceDB search
2022-08-03T18:14:45Z
Tanner Prestegard
Started on May 9, 2014 by Branson. Copied from redmine (https://bugs.ligo.org/redmine/issues/1337)
From a conversation with Fan, Erik, and Patrick on May 8, 2014.
Fan suggested a type-ahead feature, as in Google search. You start typing, and the event list is narrowed down as you go, before your very eyes.
I pointed out that this might be difficult, as we can't load huge numbers of events into a datastore in order to facilitate this.
Patrick suggested that even a keyword completion feature would be really useful. If you start typing 'Te...', GraceDB could fill in 'Test' by looking at her lexicon of keywords.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/22
Overhaul of search feature
2022-08-03T18:16:59Z
Tanner Prestegard
Started on April 15, 2017. Copied from redmine (https://bugs.ligo.org/redmine/issues/5432)
The search feature really needs to be redone. There are several requests for new features (#1337, #2175, #3543, #5052) and the code (gracedb/query.py) is really clunky. There is also a serious lack of consistency regarding when logical operators, quotes, keywords, etc. can/should be used.
Ideas from Patrick:
define a "language" for the search and STICK TO IT. Can get ideas from Google, other search syntaxes.
get feedback from users on any commonly used searches (primarily by automated systems) in order to make sure they don't break with the update (may have to break them, we'll see)
could be similar to natural language processing
expand search capabilities beyond what we have now, including the ability to search by mass, other parameters
improve overall architecture
think about design, understand uses, make a ~1 page write-up describing your plan
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/23
GraceDB search suggestions
2022-08-03T18:17:49Z
Tanner Prestegard
From Brian O'Reilly, started on January 24, 2017. Copied from redmine (https://bugs.ligo.org/redmine/issues/5052)
The search help does not reflect the actual functionality of the search. It may be that the search functions as expected and one can only use relational operators in restricted ways, but this isn't clear. Also, the layout of the search results returned from the "Search" tab is different from what you see if you do a search from the "Latest" tab.
For example, "far<1e-6 ~INJ" works but the help seems to indicate that the syntax should be "far<1e-6 & ~INJ"
"H1OK | L1OK & ~INJ & ~DQV" works but there doesn't seem to be any way to add a FAR cut, e.g. "far<1e-6", to this query.
You can see all results from a particular pipeline removing injections and adding a FAR cut: "cwb far<1e-7 ~inj", but from the help one expects the syntax to be "cwb & far<1e-7 & ~inj"
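One way to make the two spellings above equivalent is to treat whitespace between terms as an implicit `&`. The following is only a minimal sketch under that assumption: `normalize_query` is a hypothetical helper, not part of the existing search code, and it assumes a term such as `far<1e-6` contains no internal spaces.

```python
import re

# Hypothetical token pattern: an explicit operator, or any run of
# characters that is neither whitespace nor an operator
# (e.g. "far<1e-6", "~INJ", "cwb").
_TOKEN = re.compile(r"[&|]|[^\s&|]+")
_OPERATORS = {"&", "|"}


def normalize_query(query):
    """Return the query as a token list with an explicit '&' between adjacent terms."""
    out = []
    for tok in _TOKEN.findall(query):
        # Two terms in a row means the user relied on an implicit AND.
        if out and tok not in _OPERATORS and out[-1] not in _OPERATORS:
            out.append("&")
        out.append(tok)
    return out
```

With this normalization, `far<1e-6 ~INJ` and `far<1e-6 & ~INJ` produce the same token list, so the help text and the parser could agree on a single form. Operator precedence and parentheses are deliberately left out of the sketch.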
It would be nice to be able to remove a given pipeline from the search results. "~cwb" for example does not work.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/24
Search for hardware injection by id H208670 misinterpreted
2022-08-03T18:18:19Z
Tanner Prestegard
Started on February 2, 2016 by Branson. Copied from redmine (https://bugs.ligo.org/redmine/issues/3543)
The query parser strips off the 'H2' and interprets it as an instrument search. The remaining '08670' is interpreted as an integer, and therefore a gpstime query.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/25
Allow negative searches
2022-08-03T18:19:12Z
Tanner Prestegard
Started on June 3, 2015 by Branson. Copied from redmine (https://bugs.ligo.org/redmine/issues/2175)
> On Jun 3, 2015, at 10:42 AM, Salvatore Vitale <salvatore.vitale@ligo.mit.edu> wrote:
>
> Hi Branson,
>
> Is there a way to perform a negative query on graceDB. E.g. I tried
>
> search: !MDC pipeline: cwb
>
> but that doesn't work. I tried a few other syntaxes (not, !=, <>) but none seems to work
>
> Thanks,
> salvo

Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/32
Clean up reports page
2022-08-03T18:30:42Z
Tanner Prestegard
Created by Patrick Brady on July 22, 2015. Copied from redmine (https://bugs.ligo.org/redmine/issues/2313)
The GraceDB reports page needs to be cleaned up. This means removing some things, moving some things to other places, and/or providing access to some of the information via a simple query to GraceDB.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/37
Update front-end package manager configuration
2022-08-03T18:33:50Z
Tanner Prestegard
Created on December 14, 2017. Copied from redmine (https://bugs.ligo.org/redmine/issues/6053)
We currently use `bower` for managing a small set of front-end CSS and JS packages. We should fix this by creating a package.json or bower.json file in the server code repository so this is self-contained. There should also be some instructions (at least on Gitlab) for how to set up the repository, including running bower to install the packages.
We may also need to move to something other than bower. I get the following message when installing bower:
```
root@gracedb-test:~# npm install -g bower
npm WARN deprecated bower@1.8.2: ...psst! Your project can stop working at any moment because its dependencies can change. Prevent this by migrating to Yarn: https://bower.io/blog/2017/how-to-migrate-away-from-bower/
/usr/bin/bower -> /usr/lib/node_modules/bower/bin/bower
+ bower@1.8.2
updated 1 package in 3.776s
```
We can maybe move to yarn? Need to look into this more, see the above link.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/44
Unify how file versions are handled
2022-08-03T18:42:31Z
Tanner Prestegard
Some resources return files without a version even if a version was requested, and vice versa. This should be standardized so that it is the same everywhere!
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/47
Adding KAGRA events to GraceDB
2022-03-31T15:39:21Z
Tanner Prestegard
Created by Alex on January 28, 2017. Copied from redmine (https://bugs.ligo.org/redmine/issues/5071)
The purpose of this ticket is to track the future development of the uploading and sharing protocol for events from KAGRA. The below points came as a result of a face-to-face conversation at the University of Tokyo on January 28, 2017. Open issues include:
1. ~~How to compartmentalize KAGRA events from LIGO events? Options include (but are not limited to):
Standing up a separate KAGRA GraceDB instance. KAGRA will use separate authentication to restrict access, but access to coincident LIGO events would be unavailable.~~
~~Using the existing GraceDB infrastructure, but restricting access to events to users with a separate KAGRA authentication, e.g., only KAGRA users can have access to events uploaded by KAGRA.~~
2. ~~Event data-exchange between LIGO and KAGRA users. Scenarios include events that are determined to be significant due to LIGO-KAGRA coincidence; what data about the event would be available to LIGO/KAGRA members?~~
3. ~~If and how to restrict KAGRA uploads in the age of public LIGO data? This is an open issue for VIRGO events as well.~~
4. Event data format. What will be the format of events uploaded to GraceDB? Would there need to be modification to GraceDB's data parser to accept KAGRA events?
Please add any relevant parties to this conversation, as needed. Relevant watchers for the ticket should be:
* Nobuyuki Kanda <kanda@sci.osaka-cu.ac.jp>
* Hideyuki Tagoshi <tagoshi@sci.osaka-cu.ac.jp>
---
**Updating this ticket, September 15 2021:**
As of today, KAGRA members have:
1. Access to GraceDB via X509 certificates (https://git.ligo.org/computing/helpdesk/-/issues/506)
2. Access to GraceDB via Shibboleth (https://git.ligo.org/lscsoft/gracedb/-/issues/186)
To the best of my understanding of the MOU and how it's implemented now, KAGRA members have the same upload and access privileges as LSC members. So the previous discussion regarding separation of GraceDB instances and restricting data access seems to be moot. That leaves the event data format.
This is just a matter of getting an example event upload that has entries for KAGRA's contribution. So, part of the `instruments` column, an additional `sngl_inspiral` table, etc. Once pipelines have an example event upload ("Sample Event?" in the table below), then I can upload and fix GraceDB's upload parser. The things I need to test are:
- Does the event file get ingested into GraceDB without error?
- Are the various event properties and table entries parsed and input into the database? Are the KAGRA-relevant columns ingested? For example, does it recognize a `K1` instrument column; is KAGRA's `sngl_inspiral` table in the db?
- Are the tables legible and formatted correctly on the event's landing page? Is the data visible?
- Is the KAGRA data returned as part of the LVAlert and event `HTTPResponse` packet?
I started the table below to track the progress.
| Pipeline | Sample Event? | Upload Correctly? | Parse Correctly? | View Correctly? | LVAlert Contents | Link |
| --- | --- | --- | --- | --- | --- | --- |
|`CWB` | :x: | :x: | :x: | :x: | :x: | |
|`gstlal` | :white_check_mark: | :white_check_mark: | :white_check_mark:| :white_check_mark: | :white_check_mark: | [G153205](https://gracedb-test.ligo.org/events/G153205/view/) |
|`MBTAOnline` | :x: | :x: | :x: | :x: | :x: | |
|`oLIB` | :x: | :x: | :x: | :x: | :x: | |
|`pycbc` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | [G153215](https://gracedb-test.ligo.org/events/G153215/view/) |
|`spiir` | :x: | :x: | :x: | :x: | :x: | |
Ahead of O4, I will also need a list of KAGRA members who will need to upload new events, and to what pipelines.
O4 Infrastructure Improvements
Alexander Pace

https://git.ligo.org/computing/gracedb/server/-/issues/49
Add CSRF protection
2022-08-03T18:43:49Z
Tanner Prestegard
Created by Alex on April 18, 2016. Copied from redmine (https://bugs.ligo.org/redmine/issues/4038)
There has been interest expressed in implementing cross-site request forgery (CSRF) protection on GraceDB:
https://docs.djangoproject.com/ja/1.9/ref/csrf/
This isn't a bug or an urgent feature request; I'm just documenting this for later.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/63
Fix the way instruments are stored for events
2022-08-03T18:49:25Z
Tanner Prestegard
Created August 16, 2017. Copied from redmine (https://bugs.ligo.org/redmine/issues/5694)
Instruments are currently associated with events by a string like "H1,L1" or "H1,L1,V1". This is an ineffective way of doing it and prevents efficient instrument-based queries.
We should create an instruments model and just have a many-to-many relationship with events (may need to create a go-between like "labelling").
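To illustrate why a many-to-many relationship makes instrument-based queries cheaper than matching on a comma-separated string, here is a minimal in-memory sketch. It is not the Django model itself: `parse_instruments` and `InstrumentIndex` are hypothetical names, and the event IDs in the usage note are made up for the example.

```python
from collections import defaultdict


def parse_instruments(raw):
    """Split a stored string such as "H1,L1,V1" into a set of instrument names."""
    return frozenset(part.strip() for part in raw.split(",") if part.strip())


class InstrumentIndex:
    """In-memory stand-in for a many-to-many relation between events and instruments."""

    def __init__(self):
        self._events_by_instrument = defaultdict(set)
        self._instruments_by_event = {}

    def add_event(self, graceid, raw_instruments):
        """Register an event under each instrument named in its stored string."""
        instruments = parse_instruments(raw_instruments)
        self._instruments_by_event[graceid] = instruments
        for name in instruments:
            self._events_by_instrument[name].add(graceid)

    def events_with(self, *names):
        """Return the events that include every named instrument."""
        sets = [self._events_by_instrument.get(name, set()) for name in names]
        return set.intersection(*sets) if sets else set()
```

For example, after `add_event("G1", "H1,L1")` and `add_event("G2", "H1,L1,V1")`, calling `events_with("V1")` looks up one set instead of scanning every event's string.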
I think there is also an 'ifos' variable: we should resolve the redundancy issue if that's the case.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/65
Don't reapply labels that already exist
2022-08-03T18:59:03Z
Tanner Prestegard
Created on August 25, 2017. Copied from redmine (https://bugs.ligo.org/redmine/issues/5714)
Labels that already exist on an event should not be able to be applied. Annoying when ADVOK is applied multiple times and you get multiple phone/text alerts.
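The intended behaviour can be sketched as an idempotent "apply" operation: the label is added and an alert is sent only when the label is not already present. This is a simplified illustration, assuming labels are held in a plain set and that `send_alert` is a hypothetical callback standing in for the phone/text/XMPP alert machinery.

```python
def apply_label(event_labels, label, send_alert):
    """Add `label` to the set `event_labels`, alerting only if it is new.

    Returns True if the label was newly applied, False if it already existed.
    """
    if label in event_labels:
        # Already applied: do nothing, so no duplicate alert goes out.
        return False
    event_labels.add(label)
    send_alert(label)
    return True
```

Returning a boolean lets the caller decide how to respond to a repeat request, for instance by reporting that nothing changed.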
Points to consider:
* Should an XMPP alert be sent out if, for example, the advocate signoff label was already ADVOK, but the comment is just changed? Should it still be an "alert for label" or just an update?
* We definitely shouldn't send out any alerts for "normal" labels like INJ, DQV, etc. if they're reapplied.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/66
Load testing
2022-08-03T18:59:50Z
Tanner Prestegard
Created on February 5, 2018. Copied from redmine (https://bugs.ligo.org/redmine/issues/6087)
Starting a ticket to define a procedure for load testing. General overview:
* Collect/design some utilities for load testing and monitoring the server
* Define how we will evaluate the performance of the server
* Discuss procedure with CGCA admins and EM follow-up group and iterate
O4 Prep

https://git.ligo.org/computing/gracedb/server/-/issues/75
Moving old detections to superevents and making them public
2022-08-03T19:01:51Z
Tanner Prestegard
We want to make old events like GW170817 into superevents and available publicly.
The steps will be:
1. Aggregate the events into superevents
2. Copy over relevant content (logs, files, etc.)
* Metadata: do we need log submitters, timestamps, etc. to match that of the original event? Patrick: No
3. Add new information like samples files which are included for these events on GWOSC.
4. Internal vetting and approval.
5. Public release
Steps 1 and 2 will likely be handled by superevent_manager/gwcelery.
Questions are:
* What kind of timescale are we working on?
* Catalog paper and engineering run are expected near the end of November. Can we get this done on a similar timescale?
* Maybe - need to talk to Tom about AWS/public implementation plan
O4 CBC Improvements

https://git.ligo.org/computing/gracedb/server/-/issues/88
Fully implement public "switch"
2022-08-03T19:03:28Z
Tanner Prestegard
Fully implement the settings variable which will allow or not allow unauthenticated access. Should also have unit tests which test this capability.
Backlog
2018-11-30

https://git.ligo.org/computing/gracedb/server/-/issues/97
Rolling deletion of Test events and superevents
2022-08-03T19:04:41Z
Tanner Prestegard
Test events currently make up 46% of the events in the database and take up 25% of the storage. We should not be preserving Test events indefinitely (on principle) and it could help speed things up a bit. I would like to establish a nightly cron job that deletes Test events and superevents older than 3 months or so.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/111
Handling "Page not found (404)"
2022-08-03T19:08:55Z
Stuart Anderson
Currently an attempt to access a deep link, e.g., https://gracedb-playground.ligo.org/superevents/S181203f/view/, without an active login session returns 404 "Page not found" and the message "No Superevent matches the given query."
Please consider enhancing the 404 page to conditionally indicate (if there is no active Shibboleth session) that authorized users should first try logging in, and provide a login hyperlink to do so.
For bonus points, see if there is an easy way for users that select the login link (and successfully authenticate) to automatically have their browsers reload the originally requested page.
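One common way to get the "reload the originally requested page" behaviour is to carry the requested path in the login link as a query parameter. The sketch below assumes a hypothetical `/login/` path and a `next` parameter (the name Django's own auth views use by default); how this would interact with the Shibboleth login flow is not covered here.

```python
from urllib.parse import urlencode


def login_url(login_path, requested_path):
    """Build a login link that carries the originally requested path."""
    # Only accept site-relative paths, so the link cannot be abused to
    # redirect a user to another host after login.
    if not requested_path.startswith("/") or requested_path.startswith("//"):
        requested_path = "/"
    return f"{login_path}?{urlencode({'next': requested_path})}"
```

The 404 page would render this link only when there is no active session, and the login handler would redirect to the `next` value after successful authentication.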
Note, for users who have a valid Shibboleth session and still land at a URL with no valid page, please also consider changing messages like "No Superevent matches the given query." to "No Superevent matches the given query or you are not authorized to view it."
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/112
Filenames can't contain commas
2022-08-04T01:22:48Z
Tanner Prestegard
Because the versioning method adds `,#` to the end and we rely on splitting on a comma. So we need to check this on all models with filenames and everywhere files can be uploaded (event creation/replacement specifically). It might be automatically handled in the superevents API, but I'm not sure.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/116
Add log filtering by tag
2022-08-04T01:24:41Z
Tanner Prestegard
Add a filter to the logs API where only logs with certain tags applied will be retrieved. Would need a client update as well.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/122
Open alerts should be public in GraceDb
2022-08-04T01:25:41Z
Leo P. Singer
It's useful to be able to download old VOEvents. We should make this possible for anonymous GraceDb users.
Please add the `public` tag to any VOEvent that is created with `internal=False`.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/128
Require FAR to be non-negative
2022-08-04T01:28:20Z
Tanner Prestegard
Pipelines submitting with negative FAR is probably a sign that something is wrong. It is also triggering alerts with FAR thresholds since a negative FAR is obviously less than any reasonable FAR threshold.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/132
tooltip time conversion
2022-08-04T01:31:07Z
Stuart Anderson
Consider having the tooltip window convert GPS to UTC and vice versa, i.e., if a user hovers their cursor over a table element that contains a GPS time have the pop-up window show that same time in UTC (and vice versa).
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/136
Failed to receive phone call for S190408an
2022-08-04T01:40:28Z
Brian O'Reilly
I received a text message for this event but my phone did not ring. I have my alert set for call and text based on the ADVREQ label.
My number is in the US, 225 area code.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/148
Visually distinguish private vs public information
2022-08-04T01:43:26Z
Stuart Anderson
Consider adding an option to visually distinguish private vs public information for privileged users while they are logged in, e.g., different background color or a water mark overlay. Note, if that is too visually distracting there could be a toggle button, e.g., "highlight public" or "highlight private", to enable an inline comparison of public vs private information.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/152
Mutually exclusive labels
2022-08-04T01:44:26Z
Stuart Anderson
As discussed in https://git.ligo.org/emfollow/gwcelery/merge_requests/500 consider making some sets of GraceDB labels mutually exclusive so that clients (e.g., gwcelery) can change which one is set in a single transaction, e.g., setting DQOK would automatically unset DQV.
Backlog

https://git.ligo.org/computing/gracedb/server/-/issues/158
Gracedb should parse FITS files
2019-06-07T15:04:34Z
Jonah Kanner
## Description of feature request
Add some parsing of FITS files included in VOEvents into gracedb. This could be done at the time a voevent is created, or later by a cronjob. There are several outcomes that people have suggested for this:
* Create PNG of 2-D skymaps
* Create PNG of 3-D skymaps
* Create PNG of p-astro info
* Ingest distance info into schema of each voevent
## Use cases
* Could display PNG of skymaps with predictable and reliable filename
* Could enhance views of gracedb events to show these PNGs in a structured way
* Could include distance info in tables listing sets of events, so that people could sort or filter based on this property
## Benefits
Currently, the PNG files are not uploaded with predictable names, and so are not useful. This would allow PNG files to be more reliable and useful in views. Adding distance info to schema would allow people to search and sort by distance.
## Drawbacks
* Distance information includes 2 numbers, so represents a significant change to the schema for VOEvents
* If this ingestion is done at the time a new VOEvent is created, it could slow down the process of sending out GCNs, which is a serious drawback. For this reason, it may be better to create the VOEvent in the database without this, and then 'back-fill' with a cronjob or similar.
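The 'back-fill' idea can be sketched as a loop over records whose distance is still unset. Everything named here is an assumption for illustration: the record keys (`skymap_path`, `distance`, `distance_error`) are hypothetical, and `read_distance` is an injected callable standing in for whatever reads the two distance numbers from a FITS header.

```python
def backfill_distances(voevents, read_distance):
    """Fill in distance fields for records where they are still None.

    voevents: list of dicts with hypothetical keys 'skymap_path',
        'distance' and 'distance_error'.
    read_distance: callable taking a path and returning (distance, error).
    Returns the number of records updated.
    """
    updated = 0
    for record in voevents:
        if record["distance"] is not None:
            continue  # already filled in by an earlier run
        try:
            distance, error = read_distance(record["skymap_path"])
        except (OSError, KeyError, ValueError):
            # Leave the fields null so the next scheduled run retries.
            continue
        record["distance"] = distance
        record["distance_error"] = error
        updated += 1
    return updated
```

Because the job only touches records that are still null, running it every few minutes is safe to repeat and keeps the VOEvent creation path free of FITS parsing.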
## Suggested solutions
If it were me, I would:
* Add a distance and distance error field to the VOEvent schema. These fields would be 'null' at the time of VOEvent creation, and would not appear in the VOEvent XML files (as now).
* Set up a cronjob that runs every 10 minutes or so, and does the following:
* Gets a list of all VOEvents where the distance column is 'null'
* For each of these, generate PNG images of skymaps, and save to disk in a predictable location
* Read out distance information from FITS header, and ingest into database.

https://git.ligo.org/computing/gracedb/server/-/issues/161
User's favourites or Followed Events list
2019-07-09T22:07:10Z
Nicola De Lillo
nicola.delillo@ligo.org
## Description of feature request
Insert a tab between the existing LATEST and ALERTS tabs. This tab could be called "FOLLOWED EVENTS" (not PREFERRED, which would be confused with the "preferred event" of a superevent). The tab would link to a "FOLLOWED EVENTS" page that looks just like the current LATEST page, except that it lists only the events or superevents flagged as "FOLLOWED" by the user.
Events should be flag-able either from the SEARCH page or from the LATEST page.
## Use cases
1) Internal use for LIGO: useful for expert rota members, advocates, or PE rota members, who can easily keep track of the events they have been assigned.
2) In general, any scientist interested in a particular event can follow it. For example, if I am interested in tracking the analysis of binary neutron star systems only, I would flag only those events as FOLLOWED.
## Benefits
I think it really helps with tracking the events a scientist wants to follow.
## Drawbacks
No drawbacks that I can see at the moment, apart from having to redesign the toolbar to add the "FOLLOWED EVENTS" tab.
## Suggested solutions
The attached figure shows how the flag boxes in the FOLLOW column could be displayed (the example is for the LATEST page, but please consider doing the same for the SEARCH page). All flagged events (marked by a flag, a dot, or a fill color) will go in the FOLLOWED EVENTS page. Of course, remember to add an 'X' symbol in the FOLLOWED EVENTS page so that one can remove an event.
![LATEST_examp](/uploads/4b8a8e328fff42644b539cc6ee63d061/LATEST_examp.png)

https://git.ligo.org/computing/gracedb/server/-/issues/162
Show component masses in Superevent preview
2019-07-09T23:22:30Z
Nicola De Lillo
nicola.delillo@ligo.org
## Description of feature request
It is a web interface change. It would require to add both in the LATEST both in the SEARCH page, the column "MASS1" and "MASS2" (or simply M1 M2) for CBC events/superevent raws showed in both lists. I think the values should come from the preferred event analyses and then update every time the analysts upload a new parameter estimation result that they consider more appropriate.
## Use cases
Every time one looks at the LATEST page or the database.
## Benefits
It helps any scientist in the field who is scrolling through the DB to quickly spot interesting and appealing events.
## Drawbacks
None in mind at the moment.
## Suggested solutions
It could simply be one more column in the rows shown in both the LATEST and the SEARCH lists.https://git.ligo.org/computing/gracedb/server/-/issues/163What's on? - Most viewed / trending event page2019-07-10T01:54:43ZNicola De Lillonicola.delillo@ligo.orgWhat's on? - Most viewed / trending event page## Description of feature request
Add a page (reachable from a tab in the tab bar) that lists every detected gravitational-wave event, sorted by number of views (i.e. counting each time a user loads the event page). The page could look like a list with simple and clean entries that might be of interest to less regular visitors:
SUPEREVENT_ID PREFERRED_EVENT DATE_TIME FAR MASS1 MASS2 VISUALIZED
Of course one must implement a view counter to produce the VISUALIZED column.
It would also be nice if this page had a switch button at the top of the list, between "MOST VISUALIZED" and "TRENDING" events. While "MOST VISUALIZED" would refer to the above list, the "TRENDING" list would show the most viewed events over a recent period (the last week, month, two months, or six months; even better if the user can select the period).
The tab leading to this page could have any name. At the moment I can suggest "What's on", "Hall of fame", or "Most viewed", but further research and ideas could come later.
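To make the counter idea concrete, here is a minimal sketch of how the two rankings could be derived from one log of page views. Everything in it is hypothetical: the `ViewCounter` class, its method names, and the placeholder IDs are illustrations only, and a real version would presumably store views in the database rather than in memory.

```python
from collections import Counter
from datetime import datetime, timedelta


class ViewCounter:
    """Hypothetical in-memory view log (not part of GraceDB)."""

    def __init__(self):
        self._views = []  # list of (superevent_id, timestamp) pairs

    def record_view(self, superevent_id, when):
        """Call once each time an event page is loaded."""
        self._views.append((superevent_id, when))

    def most_visualized(self, limit=10):
        """All-time ranking as (superevent_id, view_count), most viewed first."""
        counts = Counter(sid for sid, _ in self._views)
        return counts.most_common(limit)

    def trending(self, now, window=timedelta(weeks=1), limit=10):
        """Same ranking, restricted to views within `window` before `now`."""
        cutoff = now - window
        counts = Counter(sid for sid, when in self._views if when >= cutoff)
        return counts.most_common(limit)


now = datetime(2019, 7, 10)
counter = ViewCounter()
for _ in range(3):
    counter.record_view("EVENT_A", now - timedelta(days=60))  # old but popular
for _ in range(2):
    counter.record_view("EVENT_B", now - timedelta(days=1))   # recent

print(counter.most_visualized())
print(counter.trending(now))
```

The `window` argument is what the "select the period" option would feed; the all-time list and the trending list differ only in that filter.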
## Use cases
Not everybody wants to follow or receive all of the alerts. The DB could appear overwhelming to people outside the collaboration or the GW field. Some scientists (both outside and inside the gravitational-wave field) might only be casually interested in what is going on with detections. This way they can follow which events were the most interesting ever or which ones get all the hype in a specific period.
## Benefits
It would keep people inside and outside the gravitational-wave field engaged and make GraceDB more user-friendly and appealing for casual visitors.
## Drawbacks
None that I can think of at the moment.
## Acknowledgement
Thank you to Jennifer Wright, trending event idea was from her.https://git.ligo.org/computing/gracedb/server/-/issues/165Add source frame masses and redshift2019-07-15T14:47:51ZPatrick BradyAdd source frame masses and redshift## Description of feature request
<!--
Describe your feature request!
Is it a web interface change? Some underlying feature? An API resource?
The more detail you can provide, the better.
-->
Albert Lazzarini asked: Is it possible to add information to the internal GraceDB pages? The additional information that would be helpful is to clearly indicate that the masses listed are in the DETECTOR frame (see screenshot). Also, using the Standard Model, the redshift Z associated with the luminosity distance could also be provided.
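For illustration, a rough sketch of the conversion being asked for. It assumes a flat Lambda-CDM cosmology (my reading of "Standard Model" in the request) with placeholder values for H0 and Omega_m; the function names are hypothetical, and a production version would more likely use an established cosmology library and the posterior on distance rather than a point estimate.

```python
import math

# Assumed flat Lambda-CDM parameters, for illustration only.
H0_KM_S_MPC = 67.7
OMEGA_M = 0.31
C_KM_S = 299792.458


def luminosity_distance_mpc(z, steps=1000):
    """d_L(z) = (1 + z) * (c / H0) * integral_0^z dz' / E(z'), flat universe."""
    if z == 0:
        return 0.0

    def inv_e(x):
        return 1.0 / math.sqrt(OMEGA_M * (1 + x) ** 3 + (1 - OMEGA_M))

    # Composite Simpson's rule; `steps` must be even.
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return (1 + z) * (C_KM_S / H0_KM_S_MPC) * total * h / 3


def redshift_from_distance(d_l_mpc, z_max=20.0, tol=1e-8):
    """Invert d_L(z) by bisection; d_L increases monotonically with z."""
    lo, hi = 0.0, z_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if luminosity_distance_mpc(mid) < d_l_mpc:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def source_frame_mass(detector_frame_mass, z):
    """Detector-frame masses are redshifted: m_det = (1 + z) * m_source."""
    return detector_frame_mass / (1 + z)
```

With a redshift in hand, each displayed mass could be shown in both frames, e.g. `source_frame_mass(m1_detector, z)`.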
## Use cases
<!-- List some specific cases where this feature will be useful -->
Would be useful in both superevents and regular events
## Benefits
<!-- Describe the benefits of adding this feature -->
Would make it easier to know source frame masses now that we are seeing mergers to cosmological distances.
## Drawbacks
<!--
Are there any drawbacks to adding this feature?
Can you think of any ways in which this will negatively affect the service for any set of users?
-->
Needs reasonable parameter estimation.
## Suggested solutions
<!-- Do you have any ideas for how to implement this feature? -->https://git.ligo.org/computing/gracedb/server/-/issues/166Some notifications missing for S190718y2019-07-23T00:46:37ZNicolas ArnaudSome notifications missing for S190718yHi @tanner.prestegard
I made a quick poll this morning during the weekly Virgo DetChar meeting about the receiving of S190718y notifications. There were about 15 people attending that meeting. The results are
* Almost nobody received an e-mail notification.
* About half of the people did not receive the call/text notification they had subscribed to.
Thanks in advance for investigating that issue -- I don't think we had this problem earlier in the run.
Nicolashttps://git.ligo.org/computing/gracedb/server/-/issues/169firefox fails to download S180814bv bayestar.fits.gz, but OK with bayestar.fi...2019-08-15T19:31:16ZKeita Kawabefirefox fails to download S180814bv bayestar.fits.gz, but OK with bayestar.fits.gz,1I cannot imagine that astronomers are using firefox for followup purposes so this might not be an urgent issue, but I'd appreciate if somebody could investigate.
During RRT for S190814bv, I reported that the size of bayestar.fits.gz VS bayestar.fits.gz,1 were totally different, and Geoffrey Mo concurred.
After one day the problem still persists for me, but it seems like firefox just fails to download the whole bayestar.fits.gz (it can download bayestar.fits.gz,1).
wget, curl and chrome all worked fine.
The symptom is that using firefox,
https://gracedb.ligo.org/api/superevents/S190814bv/files/bayestar.fits.gz
gives me a smaller file than it should be.
When I restart firefox and download again the file size changes a bit, but it's always ~710000 bytes when the files downloaded by curl, wget or chrome are all identical and are 4563601 bytes.
Combinations of restarting firefox, clearing browser cache, making a new user profile, going into private browser mode, deleting ./mozilla folder and using different computers (one osx, one linux) didn't help.
In the example below, 'Bayestar (1).fits.gz' was downloaded by chrome, everything else was downloaded using firefox. I can read bayestar.fits.gz,1 and 'bayestar (1).fits.gz' using astropy.
```
(base) ~/Downloads$ ls -l bayestar*
-rw-r--r--@ 1 keita.kawabe staff 4563601 Aug 15 11:03 bayestar (1).fits.gz
-rw-rw-rw-@ 1 keita.kawabe staff 711742 Aug 15 11:02 bayestar.fits(1).gz
-rw-rw-rw-@ 1 keita.kawabe staff 711742 Aug 15 11:04 bayestar.fits(2).gz
-rw-rw-rw-@ 1 keita.kawabe staff 711870 Aug 15 11:05 bayestar.fits(3).gz
-rw-rw-rw-@ 1 keita.kawabe staff 711772 Aug 15 11:07 bayestar.fits(4).gz
-rw-rw-rw-@ 1 keita.kawabe staff 711698 Aug 15 11:29 bayestar.fits(5).gz
-rw-rw-rw-@ 1 keita.kawabe staff 711698 Aug 15 11:30 bayestar.fits(6).gz
-rw-rw-rw-@ 1 keita.kawabe staff 711771 Aug 15 11:31 bayestar.fits(7).gz
-rw-rw-rw-@ 1 keita.kawabe staff 711731 Aug 15 08:50 bayestar.fits.gz
-rw-rw-rw-@ 1 keita.kawabe staff 4563601 Aug 15 08:51 bayestar.fits.gz,1
(base) ~/Downloads$ diff bayestar.fits.gz,1 bayestar\ \(1\).fits.gz
(base) ~/Downloads$
```
https://git.ligo.org/computing/gracedb/server/-/issues/171Bad file versioning for event replacement2019-09-18T15:28:42ZTanner PrestegardBad file versioning for event replacementWhen replacing an event, there are a few problems with file versioning:
* The file version is *not* passed from the API view (`api.v1.events.EventList.put`) to `events.translator.handle_uploaded_data` (see [here](https://git.ligo.org/lscsoft/gracedb/blob/master/gracedb/api/v1/events/views.py#L622)). This isn't a problem **only** if the new event file has a different filename.
* `handle_uploaded_data` assumes a file version of 0 for both the new event file **and** the generated files (`event.log`, `coinc.xml`), which is just plain wrong - version 0 of these files was already generated when the event was initially created.https://git.ligo.org/computing/gracedb/server/-/issues/172Remove `skymap_type` for VOEvents2019-09-18T17:51:14ZTanner PrestegardRemove `skymap_type` for VOEventsThe `python3` branch includes a few minor changes to the VOEvent file format, including the fact that there won't be any need for the `skymap_type` any longer. We should remove it from the API endpoints for creating VOEvents. I'm not s...The `python3` branch includes a few minor changes to the VOEvent file format, including the fact that there won't be any need for the `skymap_type` any longer. We should remove it from the API endpoints for creating VOEvents. I'm not sure if you want to remove it from the `VOEvent` models - that would remove it for past VOEvents, which might be non-ideal. At least put a comment on the model noting that it's a legacy field.https://git.ligo.org/computing/gracedb/server/-/issues/174Potential character set issue2019-09-20T15:35:58ZTanner PrestegardPotential character set issueThe development and playground databases should have the correct character sets and collations due to how they were created by Puppet. But the production database was created so long ago that it looks like it has the `latin1` character ...The development and playground databases should have the correct character sets and collations due to how they were created by Puppet. But the production database was created so long ago that it looks like it has the `latin1` character set by default.
It's not posing a problem at present since we have a migration which manually sets the `auth_user` table to use utf8, but I think it would be a good idea to set the database default character set and collation when an opportunity arises.
We'll have to get MySQL command-line access to the production database, then run:
```
ALTER DATABASE <dbname> CHARACTER SET utf8 COLLATE utf8_general_ci;
```
Might be worth testing this (I haven't) and taking a snapshot before doing so.https://git.ligo.org/computing/gracedb/server/-/issues/178Implement server-side copying and linking.2019-10-04T17:49:45ZAlexander PaceImplement server-side copying and linking.## Description of feature request
Currently the workflow for adding files (such as skymaps) from `G` events to superevents involves downloading the file from GraceDB, and then re-uploading it to the superevent. This workflow evolved out of the inability to make items attached to G-events public. Short of drastically changing the server permissions infrastructure, it would be a lot easier to make the copy on the server side. This might involve symlinking to the existing file, since they're the same anyway? I'll dig into that.
## Use cases
These files are copied by the orchestrator from the preferred event into the superevent before the preliminary GCN is sent out.
## Benefits
This reduces the network traffic back-and-forth between GWCelery and GraceDB. It also reduces the overall number of API calls to GraceDB.
## Drawbacks
I have to be vigilant about respecting GraceDB's versioning system. I also wonder whether the access controls will be respected if I implement symlinking. GWCelery needs to decide on a filename scheme (like including the gid in filenames in the case where identical files are copied from different events).
## Suggested solutions
After discussing with the group, I think the implementation should look something like:
```
gracedb.migrateFiles(origin_id, destination_id, file_list)
* origin_id (string) - the event or superevent ID of the file origin
* destination_id (string) - the event or superevent ID of the file destination
* file_list (tuple or list of tuples) - tuple of strings where the format is (origin_file_name.ext, version, destination_file_name.ext)
```
Other things to keep in mind:
* there should be logging on the side of the destination event that says the filename (with version), and the origin event.
* there should be logging on the origin event like, "file XXXX copied to SXXXXX"
* This should use GraceDB's existing file ingestion mechanism so that the current versioning is respected.
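To pin down the intended behavior, here is a toy sketch of a server-side copy that goes through a single ingestion path, so the destination gets its own version numbers, and that logs on both sides. `FileStore`, `add_file`, and `migrate_files` are hypothetical stand-ins, not GraceDB's actual storage or API, and the log wording is only an example.

```python
from collections import defaultdict


class FileStore:
    """Hypothetical stand-in for versioned file storage (not GraceDB code)."""

    def __init__(self):
        # files[object_id][filename] -> list of contents; list index = version
        self.files = defaultdict(dict)
        self.logs = defaultdict(list)

    def add_file(self, object_id, filename, content):
        """Single ingestion path: append a new version and return its number."""
        versions = self.files[object_id].setdefault(filename, [])
        versions.append(content)
        return len(versions) - 1

    def migrate_files(self, origin_id, destination_id, file_list):
        """Copy (origin_name, version, destination_name) entries server-side."""
        if isinstance(file_list, tuple):
            file_list = [file_list]  # accept a single tuple as well as a list
        for origin_name, version, destination_name in file_list:
            content = self.files[origin_id][origin_name][version]
            new_version = self.add_file(destination_id, destination_name, content)
            self.logs[destination_id].append(
                f"file {destination_name},{new_version} copied from "
                f"{origin_id} ({origin_name},{version})"
            )
            self.logs[origin_id].append(
                f"file {origin_name},{version} copied to {destination_id}"
            )


store = FileStore()
store.add_file("G1", "skymap.fits.gz", b"v0")
store.add_file("G1", "skymap.fits.gz", b"v1")
store.migrate_files("G1", "S1", ("skymap.fits.gz", 1, "G1-skymap.fits.gz"))
print(store.logs["S1"])
```

Copying the same file a second time would create version 1 at the destination rather than overwriting version 0, which is the versioning behavior the list above asks to preserve.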
This is (obviously) going to require a corresponding change to the API. Also make sure that LVAlerts are sent out for file copies. There should be an LVAlert sent out for the sending/receiving events to the corresponding (event/superevent) nodes. Alexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/179Remove Classification and Inference sections for burst VOEvents2019-10-25T13:49:19ZLeo P. SingerRemove Classification and Inference sections for burst VOEventsSee emfollow/gwcelery#246.See emfollow/gwcelery#246.https://git.ligo.org/computing/gracedb/server/-/issues/180Collect better usage statistics2019-10-28T17:00:42ZAlexander PaceCollect better usage statisticsI had a request from Chad if it was reasonable to collect better usage statistics from gracedb and lvalert. This seems totally doable, but would just require some parsing/archiving from the gunicorn access logs. The archiving schema is c...I had a request from Chad if it was reasonable to collect better usage statistics from gracedb and lvalert. This seems totally doable, but would just require some parsing/archiving from the gunicorn access logs. The archiving schema is changing with the new deployment, but it shouldn't be that hard to implement after docker-swarm comes up and running.
Note that this ticket has some overlap with #176, so maybe this can be a "two-birds" thing once I get it running.Alexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/181Inconsistent URLs for logs, tags, and voevents, for events and superevents2019-11-15T00:56:19ZLeo P. SingerInconsistent URLs for logs, tags, and voevents, for events and supereventsEvents and superevents have inconsistent URLs for tags and logs. This is confusing and reduces the opportunities for code reuse in client code.
Events use `log`, `tag`, and `voevent`, **singular**:
* `https://gracedb-test.ligo.org/api/events/{graceid}/log/`
* `https://gracedb-test.ligo.org/api/events/{graceid}/log/{N}/tag/`
* `https://gracedb-test.ligo.org/api/events/{graceid}/voevent/`
Whereas superevents use `logs`, `tags`, and `voevents`, **plural**:
* `https://gracedb-test.ligo.org/api/superevents/{superevent_id}/logs/{N}`
* `https://gracedb-test.ligo.org/api/superevents/{superevent_id}/logs/{N}/tags/`
* `https://gracedb-test.ligo.org/api/superevents/{superevent_id}/voevents/`https://git.ligo.org/computing/gracedb/server/-/issues/182Ongoing problem of missing notifications from GraceDB2019-11-15T19:24:58ZLeo P. SingerOngoing problem of missing notifications from GraceDBSee original issue, [emfollow/O3break#32](https://git.ligo.org/emfollow/o3break/issues/32).https://git.ligo.org/computing/gracedb/server/-/issues/183Pause button for the notifications in https://gracedb.ligo.org/alerts/2019-11-17T07:11:23ZNicolas ArnaudPause button for the notifications in https://gracedb.ligo.org/alerts/## Description of feature request
It would be great if there were a "Pause" button in the "Notifications" section of https://gracedb.ligo.org/alerts/ in addition to "Edit" and "Delete". Currently, to deactivate a notification one has to delete it. Setting it to "pause" would keep it deactivated for a while and make it easy to re-enable later.
## Use cases
Debug/test/engineering run notifications that one would like to only keep active for a given time period but that could be activated again at a later time.
People on rota may want to have particular notifications enabled only when they are on duty.
## Benefits
No need to set up notification(s) again, with the risk of making typos.
Have examples of notifications that worked in the past and could be reused, either as such or slightly modified.
## Drawbacks
People could click the "Pause" button by mistake or forget it had been set for a given notification. So clicking on "Pause" should trigger an "Are you sure?" popup and deactivated notifications should be clearly identified on the Alerts GraceDB page: different font/color/etc.
## Suggested solutions
Condition blocks to be added to the existing code!?https://git.ligo.org/computing/gracedb/server/-/issues/187likelihood value is truncated for cwb events2020-02-06T02:19:08ZAlexander Pacelikelihood value is truncated for cwb eventsI got an email this morning from Edoardo Milotti:
```
Hi Alex,
I noticed a small problem in the events uploaded by cWB on the playground. The field that is
identified as “likelihood” in the text files (a floating point number) is used by GraceDB
to calculate the “SNR” field in the event page: SNR = sqrt(likelihood). It seems that the
underlying code treats “likelihood” as an integer and the SNR result differs from the expected
one, can you please check? A quick check shows that the same problem exists for real events in GraceDB.
Thank you.
Best, Edoardo
```
I dug into it on production and playground, and it would appear that he's (kind of) right. Likelihood is still a float value, but it's being truncated when the event file is read in.
For example, [this event](https://gracedb-playground.ligo.org/events/G240121/view/) on playground. The uploaded file has the following value for likelihood:
```
...
...
likelihood: 1.503621e+02
...
...
```
But in the database, it shows it as:
```
In [1]: from events.models import Event
In [2]: a = Event.getByGraceid('G240121')
In [3]: a
Out[3]: <MultiBurstEvent: G240121>
In [4]: a.likelihood
Out[4]: 150.0
```
The result is that the SNR reported on the event page differs slightly from the expected value, since GraceDB computes snr=sqrt(likelihood) for CWB events ([code is here](https://git.ligo.org/lscsoft/gracedb/blob/master/gracedb/events/translator.py#L473)).
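A quick way to see the size of the effect for this event. The integer truncation below is only an assumed mechanism that happens to reproduce the stored `150.0`; where the truncation actually occurs in the ingestion code has not been established here.

```python
import math

# Line as it appears in the uploaded cWB file for G240121.
line = "likelihood: 1.503621e+02"
raw = line.split(":", 1)[1].strip()

as_float = float(raw)             # value in the uploaded file
truncated = float(int(as_float))  # assumed mechanism; gives the stored 150.0

snr_expected = math.sqrt(as_float)
snr_stored = math.sqrt(truncated)
print(f"expected SNR {snr_expected:.4f}, stored SNR {snr_stored:.4f}")
```

The gap is small for this event but grows in relative terms for low-likelihood events, where dropping the fractional part removes a larger share of the value.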
Also, as far as I can tell, this behavior has been in place since before even O1. For instance, here's the first cwb event of O1: https://gracedb.ligo.org/events/G185420/view/
I've spot checked other values for cwb and coincinspiral events, and the data looks right but I have to inspect it more thoroughly.Alexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/188cWB skymap not found on the "public alerts" GraceDB page2020-01-15T12:13:58ZNicolas ArnaudcWB skymap not found on the "public alerts" GraceDB pageThe root cause of this is probably outside GraceDB and I apologize if this has already been discussed elsewhere -- or is being addressed at the moment. For the recent cWB public alert, https://gracedb.ligo.org/superevents/public/O3/ disp...The root cause of this is probably outside GraceDB and I apologize if this has already been discussed elsewhere -- or is being addressed at the moment. For the recent cWB public alert, https://gracedb.ligo.org/superevents/public/O3/ displays
> S200114f (...) No public skymap image found.
likely because the Bayestar sky maps are labelled "cWB" instead of "bayestar": https://gracedb.ligo.org/apiweb/superevents/S200114f/files/cWB.fits.gz. While fixing the naming convention for the cWB skymaps (that may be on purpose), a test could be added to GraceDB to look for cWB.fits.gz if bayestar.fits.gz is not found.https://git.ligo.org/computing/gracedb/server/-/issues/189Improve caching in django2020-06-17T02:45:49ZAlexander PaceImprove caching in djangoThe settings exist (https://git.ligo.org/lscsoft/gracedb/blob/master/config/settings/base.py#L242) in gracedb's configuration for memcached caching, which is a quick and easy way to cache webpage views. This will be particularly helpful ...The settings exist (https://git.ligo.org/lscsoft/gracedb/blob/master/config/settings/base.py#L242) in gracedb's configuration for memcached caching, which is a quick and easy way to cache webpage views. This will be particularly helpful when new events come in and hundreds of users start hitting the website. Even some modest caching (I don't know what that means yet? 10 seconds? 30 seconds?) would greatly reduce server load and prevent django from making db queries and rendering new templates every time someone goes to the website or hits "reload".
A couple of issues:
* The configuration is set up for `memcached` caching in memory, but the `memcached` daemon isn't actually running or installed on any of the development machines or AWS containers. I installed it manually on `gracedb-dev2` and it started memcaching almost immediately. Almost.
* The `MIDDLEWARE` section of `config/settings/base.py` needs to be edited to look like this:
```
# List of middleware classes to use.
MIDDLEWARE = [
    'core.middleware.maintenance.MaintenanceModeMiddleware',
    'events.middleware.PerformanceMiddleware',
    'core.middleware.accept.AcceptMiddleware',
    'core.middleware.api.ClientVersionMiddleware',
    'core.middleware.api.CliExceptionMiddleware',
    'django.middleware.cache.UpdateCacheMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
    'core.middleware.proxy.XForwardedForMiddleware',
    'user_sessions.middleware.SessionMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'ligoauth.middleware.ShibbolethWebAuthMiddleware',
    'ligoauth.middleware.ControlRoomMiddleware',
]
```
Note that the order apparently matters.
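On the ordering: Django runs middleware top-down on the request and bottom-up on the response, so `UpdateCacheMiddleware` (response phase) has to be listed before `FetchFromCacheMiddleware` (request phase). The Django cache docs also ask that the update middleware sit above, and the fetch middleware below, anything that changes the `Vary` header (session middleware, for example), so the placement relative to the session and auth entries in the list above probably deserves a test. A small sanity check that could go in a unit test; the helper name is hypothetical:

```python
UPDATE = "django.middleware.cache.UpdateCacheMiddleware"
FETCH = "django.middleware.cache.FetchFromCacheMiddleware"


def cache_middleware_order_ok(middleware):
    """True if both cache middlewares are present and UPDATE precedes FETCH."""
    if UPDATE not in middleware or FETCH not in middleware:
        return False
    return middleware.index(UPDATE) < middleware.index(FETCH)


# Abbreviated version of the list above, for illustration.
example = [
    "core.middleware.maintenance.MaintenanceModeMiddleware",
    UPDATE,
    "django.middleware.common.CommonMiddleware",
    FETCH,
    "user_sessions.middleware.SessionMiddleware",
]
print(cache_middleware_order_ok(example))
```

This only checks the relative order of the two cache entries; it does not check their position relative to `Vary`-modifying middleware.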
One other complication: it occurred to me that we're running a docker swarm of nodes, and yeah, each one has plenty of memory. However, they won't be able to access each other's local memory cache. Hmmm. I can run some tests and monitor the memory and caching on each node, but it doesn't seem efficient.
Last thing:
Amazon offers something called memcached "Elasticache", which appears to be a shared memory cache for different nodes:
* https://aws.amazon.com/elasticache/memcached/
It seems to be what we're looking for. Also this requires a new django backend:
* https://pypi.org/project/django-elasticache/
So I'm guessing the process is going to look like:
1) Log into AWS and find out how to make a new elasticache partition for each one of the different tiers. This can probably be automated with ansible, but at first I'll just click through the web interface like a caveman.
2) Modify `requirements.txt` to install `django-elasticache`.
3) Modify `config/settings/container/base.py` to include the elasticache stuff under `CACHES`. The address is going to be different, but that can be automated with a deployment environment variable in the docker swarm deployment yml.
4) Modify `MIDDLEWARE` to include the django-elasticache middleware. I'm not sure what this will look like exactly, but it should probably model the block that I pasted up there.
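Step 3 might end up looking roughly like the sketch below. Treat the details as assumptions: `GRACEDB_CACHE_LOCATION` is a made-up environment variable name, the 30-second timeout is just one of the values floated above, and the backend path is the one I recall from the `django-elasticache` README, so it should be checked against the package before use.

```python
import os


def build_cache_settings(environ=os.environ):
    """Return (CACHES, CACHE_MIDDLEWARE_SECONDS) for a settings module."""
    # Hypothetical variable, set per tier in the swarm deployment yml.
    location = environ.get("GRACEDB_CACHE_LOCATION", "127.0.0.1:11211")
    caches = {
        "default": {
            # Backend path as recalled from the django-elasticache README;
            # verify before relying on it.
            "BACKEND": "django_elasticache.memcached.ElastiCache",
            "LOCATION": location,
        }
    }
    # How long the per-site cache middleware keeps a rendered page.
    cache_middleware_seconds = 30
    return caches, cache_middleware_seconds


CACHES, CACHE_MIDDLEWARE_SECONDS = build_cache_settings()
```

`CACHES` and `CACHE_MIDDLEWARE_SECONDS` are standard Django settings; only the backend string and the environment variable are specific to this sketch.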
Other useful links:
https://docs.djangoproject.com/en/3.0/topics/cache/
https://www.tutorialspoint.com/django/django_caching.htm
https://devcenter.heroku.com/articles/django-memcacheAlexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/191apache improvements2020-01-29T17:52:33ZAlexander Paceapache improvementsMaking notes here about ways to optimize connections via apache. I was reading through some sources about apache worker concurrency (since CPU usage seems to be a bottleneck? at least that's an operative theory), here's an informative on...Making notes here about ways to optimize connections via apache. I was reading through some sources about apache worker concurrency (since CPU usage seems to be a bottleneck? at least that's an operative theory), here's an informative one:
https://serverfault.com/questions/775855/how-to-configure-apache-workers-for-maximum-concurrency
So I need to find out which modules are being loaded and what settings are being used. Here's the setup for `dev2` and `playground` (AWS).
`gracedb-dev2`:
```
root@gracedb-dev2:/etc/apache2# apachectl -M | grep mpm
mpm_worker_module (shared)
```
```
root@gracedb-dev2:/etc/apache2# cat mods-enabled/worker.conf
<IfModule mpm_worker_module>
ServerLimit 25
StartServers 2
ThreadLimit 64
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
ListenBacklog 511
</IfModule>
```
`gracedb-playground`:
```
root@17b0c88a4c4f:/etc/apache2# apachectl -M | grep mpm
mpm_event_module (shared)
```
```
root@17b0c88a4c4f:/etc/apache2# cat mods-enabled/mpm_event.conf
# event MPM
# StartServers: initial number of server processes to start
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestWorkers: maximum number of worker threads
# MaxConnectionsPerChild: maximum number of requests a server process serves
<IfModule mpm_event_module>
StartServers 2
MinSpareThreads 25
MaxSpareThreads 75
ThreadLimit 64
ThreadsPerChild 25
MaxRequestWorkers 150
MaxConnectionsPerChild 0
</IfModule>
# vim: syntax=apache ts=4 sw=4 sts=4 sr noet
```
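As a rough cross-check of the two configs above: for the worker and event MPMs, the number of child processes Apache can run is `MaxRequestWorkers` (called `MaxClients` in older configs) divided by `ThreadsPerChild`. The helper names below are hypothetical and the parser only handles simple `Name value` lines:

```python
def parse_mpm_directives(conf_text):
    """Collect 'Name <integer>' directives; comments and tags are skipped."""
    directives = {}
    for line in conf_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            directives[parts[0]] = int(parts[1])
    return directives


def max_child_processes(directives):
    """Worker/event MPM: total worker threads / threads per child process."""
    workers = directives.get("MaxRequestWorkers", directives.get("MaxClients"))
    return workers // directives["ThreadsPerChild"]


dev2_conf = """
<IfModule mpm_worker_module>
ServerLimit 25
ThreadsPerChild 25
MaxClients 150
</IfModule>
"""

playground_conf = """
# event MPM
<IfModule mpm_event_module>
ThreadsPerChild 25
MaxRequestWorkers 150
</IfModule>
"""

for name, conf in [("dev2", dev2_conf), ("playground", playground_conf)]:
    print(name, max_child_processes(parse_mpm_directives(conf)))
```

Both machines come out with the same ceiling of 150 simultaneous worker threads spread over 6 processes, which is consistent with these being stock defaults.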
So that's a start at least. They look to be the same (I'm assuming the default...), but i have some knobs to tweak potentially. I'll look into doing some testing with [ab](https://httpd.apache.org/docs/2.4/programs/ab.html) and [siege](https://www.joedog.org/siege-home/) and see what i can come up with.https://git.ligo.org/computing/gracedb/server/-/issues/192Two suggestions for the "neighbors" table on G event pages2020-02-26T21:53:15ZTito Dal CantonTwo suggestions for the "neighbors" table on G event pagesThe "neighbors" table would be much more useful with the following features:
* A column showing the network SNR for CBC events
* The ability to click on a column to sort the table according to that columnhttps://git.ligo.org/computing/gracedb/server/-/issues/198Comments on GraceDB's Visual Interface2020-09-15T09:52:26ZAlexander PaceComments on GraceDB's Visual InterfaceHi All:
I'm opening up the forum to comments regarding GraceDB's visual interface. There are probably many items that I haven't caught yet, so I would appreciate any constructive feedback about anything you have encountered since I pushed the interface to [Playground](https://gracedb-playground.ligo.org/) two months ago.
Mostly I'm looking for:
* Lack of functionality that was present in the old site that isn't obviously available.
* Obviously broken visual interface elements (links and screenshots would be appreciated).
* Constructive feedback regarding the interface. I'm looking for comments along the lines of "visual element X should display Y information" or "it would be more clear if this element showed XYZ". Comments like "doesn't look good" do not give me much to work with and are largely subjective.
This ticket should be long-lasting, but I'm going to close the "official" comment period in one week (on June 25) so I can push changes into production with the next version of the [server code](https://git.ligo.org/lscsoft/gracedb/tree/gracedb-2.10.0). I'm not expecting it to be perfect, but I'd rather get an 85% done product out and then polish things as they come up since we're not in observation.
Thanks.https://git.ligo.org/computing/gracedb/server/-/issues/200Proposals and comments for curated event pages2021-09-15T16:11:46ZAlexander PaceProposals and comments for curated event pagesAs the public-facing GWTC-* (curated) event pages are developed, I'll be taking feedback on this ticket before going live. Further background on the curation process can be found here: https://dcc.ligo.org/LIGO-T2000569.
In particular, I'll be looking for feedback as to how to properly distill all the information that's already in GraceDB in such a way to be digestible and clear for readers outside of the analyst community.
The related OpenProject charge can be found here: https://cbcprojects.ligo.org/projects/gracedb-event-curation/Alexander PaceAlexander Pace2020-11-24https://git.ligo.org/computing/gracedb/server/-/issues/201Update O3 public alerts page to point to published catalog2020-11-23T16:55:28ZJonah KannerUpdate O3 public alerts page to point to published catalog@alexander.pace
Here's a suggestion from Beverly Berger to add a pointer to GWTC-2 on the O3 public alerts page.
-jonah
Hi, Jonah.
I had a thought about how to add information on the final resolution of alert triggers to https://gracedb.ligo.org/superevents/public/O3/. I think it would be sufficient to say something like see (relevant table of events in GWOSC) and (section in the catalog paper where the reasons for the final set of events are given).
Beverlyhttps://git.ligo.org/computing/gracedb/server/-/issues/204Uploading results of targeted GRB/FRB followup searches2023-07-31T15:01:40ZTito Dal CantonUploading results of targeted GRB/FRB followup searchesIn at least two occasions, people have recently requested PyGRB candidates to be uploaded to GraceDB for detchar and PE followup. One difficulty with this is that PyGRB, being a targeted followup search of an existing transient, reports ...In at least two occasions, people have recently requested PyGRB candidates to be uploaded to GraceDB for detchar and PE followup. One difficulty with this is that PyGRB, being a targeted followup search of an existing transient, reports a p-value (false-alarm probability) associated with data around that particular transient, instead of a false-alarm rate as commonly understood in GraceDB land. PyGRB is also a coherent search, which may create some more impedance mismatch in terms of LIGOLW tables. A similar issue would also arise for any X-pipeline candidate (also from targeted GRB/FRB followup searches), although we have not had any particularly interesting X-pipeline candidates yet.
@alexander.pace suggested opening this issue, but I am not entirely sure if this is a PyGRB/X-pipeline problem, or a GraceDB problem, or maybe more of a schema problem. If we had an interesting continuous-wave or stochastic candidate, for example, would people want to see that in GraceDB as well, and what would the solution be?
Tagging @derek.davis, @francesco-pannarale and @ian-harry.https://git.ligo.org/computing/gracedb/server/-/issues/206Proposals and comments for public release of data products2022-08-03T18:46:59ZAlexander PaceProposals and comments for public release of data productsTicket to track discussion regarding O4 release of low-latency data products.
Please refer to the minutes from day-one of the low-latency virtual face-to-face on September 15, 2021 [here](https://docs.google.com/document/d/1L0hLw1A3H20Xphjj4vhuN7aLWd2roTmpwDav9PUzNXM/edit#).
In particular, the way public exposure of data products works currently is:
- Only information on a _superevent_'s page is made public. This includes the web front-end and queries from the API.
- G-_event_ pages are not made public. The result of the F2F discussion on 2021/09/15 seemed to indicate that it should remain that way.
- _Superevent_ properties (such as FAR) are inherited from the superevent's preferred event. For reference: a superevent's preferred event is a one-to-one relationship with an event in the database. See here: (https://git.ligo.org/lscsoft/gracedb/-/blob/master/gracedb/superevents/models.py#L92). As such, a superevent's FAR is just a lookup through its relationship with its preferred event, for example: https://git.ligo.org/lscsoft/gracedb/-/blob/master/gracedb/superevents/models.py#L431
- Data products, such as skymaps, p_astro and such, are uploaded for a superevent and given a couple of tags, such as `skyloc`, which gives the file its own section on a superevent landing page:
![Screen_Shot_2021-09-15_at_12.57.48_PM](/uploads/b6001318e20cb5d5f97973b5a40b632e/Screen_Shot_2021-09-15_at_12.57.48_PM.png)
- Adding a `public` tag will, understandably, make that file (log message) public by having the Django template generator filter based on that public tag (example: https://git.ligo.org/lscsoft/gracedb/-/blob/master/gracedb/templates/superevents/preferred_event_info_table_public.html)
- Similarly, there is a filter of the `public` tag, as well as an authentication check that takes place that selectively returns information via API calls.
- A superevent is a super-set of G-events. This is defined by a ForeignKey database relation (here: https://git.ligo.org/lscsoft/gracedb/-/blob/master/gracedb/events/models.py#L189). This relationship can be created or destroyed by `superevent_manager`s. There's no internal logic deciding this relationship.
Okay. So that's how it works right now. What I propose to _guide the discussion_ is this:
1. What information or products do you want from what events to be made public? Are these event properties, such as FAR and SNR, or are they skymaps or other data products?
2. Are these G-event properties coming from G-events that are not already part of a superevent's event relationship? In other words, outside events that are not part of a superevent's event list? If the superevent and event are already linked, then from a technical standpoint, the superevent's landing page and API call can return the information to the public. This is a policy limitation, not a technical limitation. I do not dictate policy.
3. For data products such as skymaps, omega scans, and whatnot, the files are attached to log message objects in the database. Currently, the log message objects are linked via a one-to-one relationship in the database to either a superevent or a g-event.
Elaborating on the last point: concerning data products and files. What I propose from a technical standpoint would be expanding the log-message (and by extension, file product) relationship to include a many-to-many relationship. What this would mean is, a log message and file would be created for a g-event. That's relationship number one. A superevent manager process would determine if that product should be displayed on the superevent page. A new log message is created for the superevent, and that log message would have a linked relationship with the g-event's log message. The set of tags that are usually applied to a superevent's log message (`skyloc`, `public`, etc) are applied to the inherited log message, then boom, it's public.
I think this approach is a way to start and guide the discussion. I believe it has the advantage of:
1. Allowing superevent managers to selectively choose what information and what products are inherited from what g-events.
2. The mechanism for making data products public is unchanged.
3. Adding data products to a superevent (not just public products) is reduced to one API call instead of the convoluted data-transfer bonanza that took place in O3. This is analogous to the "server-side copy" that was discussed in O3b but never brought to fruition.
4. The existing superevent pages, log messages, public views, and event relationships will remain unchanged. Unless a superevent manager changes them.
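To make the log-message proposal above a bit more concrete, here is a minimal sketch in plain Python (not the actual Django models). The `LogMessage` class, `inherit_log`, `public_logs`, and the example IDs and filename are all hypothetical names used for illustration. It only shows the idea that a superevent log points back at a g-event log rather than copying the file, and that the existing `public` tag rule is what decides exposure. Several superevent logs could point at the same g-event log, which is the "many" side of the proposed relationship.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class LogMessage:
    """Hypothetical stand-in for a GraceDB log message with an attached file."""
    owner: str                               # graceid or superevent_id (made-up values below)
    filename: str
    tags: Set[str] = field(default_factory=set)
    source: Optional["LogMessage"] = None    # g-event log this one was inherited from


def inherit_log(event_log: LogMessage, superevent_id: str, tags: Set[str]) -> LogMessage:
    """Create a superevent log that links back to a g-event log (no file copy)."""
    return LogMessage(owner=superevent_id, filename=event_log.filename,
                      tags=set(tags), source=event_log)


def public_logs(logs: List[LogMessage]) -> List[LogMessage]:
    """Unchanged rule: only logs carrying the `public` tag are exposed."""
    return [log for log in logs if "public" in log.tags]


# Relationship number one: the product is logged on the g-event.
event_log = LogMessage(owner="G123456", filename="bayestar.fits.gz")

# A superevent manager decides to inherit it and applies the usual tags.
superevent_log = inherit_log(event_log, "S230101a", {"skyloc", "public"})

visible = public_logs([event_log, superevent_log])
```

In this sketch the g-event log itself stays untagged and private; only the inherited superevent log becomes visible.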
I see this as a sound technical approach, but I am open to other suggestions. From a policy standpoint, it is up to @erik-katsavounidis and @shaon.ghosh to determine what information from what events and at what time should be linked to a superevent.O4 CBC Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/207GraceDB considerations before Oct. 2021 MDC2021-10-05T01:37:24ZAlexander PaceGraceDB considerations before Oct. 2021 MDCI would like to be on the same page and tie up some loose ends before October 2021's MDC event. Specifically, please confirm that pipelines and GWCelery will be using the `gracedb-playground` and `lvalert-playground` infrastructure.
Additionally, could pipeline heads please confirm which `search`es they will be using for event uploads? I will then confirm that the appropriate LVAlert nodes are in place to send messages.
Thanks. I'll add more to this ticket as items come up.
`CWB`, `gstlal`, `MBTAOnline`, `oLIB`, `pycbc`, `spiir`https://git.ligo.org/computing/gracedb/server/-/issues/208Request to change sname format2021-11-01T15:17:31ZGregory Ashtongregory.ashton@ligo.orgRequest to change sname format## Description of feature request
Change the sname format from `SYYMMDDabc` to `SYYMMDD_HHMMSSabc`
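For illustration, a small sketch of what the requested format would look like and how the current short form could still be recovered from it. The helper names `long_sname` and `short_sname` are hypothetical, and the sketch assumes the time stamp is UTC and that the trailing letters are the existing suffix, neither of which is specified in this request.

```python
from datetime import datetime, timezone


def long_sname(t: datetime, suffix: str) -> str:
    """Format SYYMMDD_HHMMSSabc from a time (assumed UTC) and the letter suffix."""
    return f"S{t:%y%m%d}_{t:%H%M%S}{suffix}"


def short_sname(name: str) -> str:
    """Recover the current SYYMMDDabc form by dropping the _HHMMSS part."""
    date_part, rest = name.split("_", 1)
    return date_part + rest[6:]


# Made-up time and suffix, purely as an example.
example = long_sname(datetime(2021, 11, 1, 15, 17, 31, tzinfo=timezone.utc), "ab")
legacy = short_sname(example)
```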
## Use cases
## Benefits
<!-- Describe the benefits of adding this feature -->
For anyone relating events from papers to GraceDB, this removes ambiguity and the need to remember which sname maps to which GW name. Or, at least it reduces the number of cases (I think to zero with good confidence, though there could be edge cases).
## Drawbacks
<!--
Are there any drawbacks to adding this feature?
Can you think of any ways in which this will negatively affect the service for any set of users?
-->
snames become longer and "uglier". However, the choice for GW names to be long and ugly has already been made. So this just provides better consistency.
## Suggested solutions
<!-- Do you have any ideas for how to implement this feature? -->
For O4 onwards only. Though, it could also be applied retroactively to GW events not in GraceDB already.Alexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/209Re-enable X-Pipeline2023-02-22T15:36:48ZAlexander PaceRe-enable X-Pipeline**Background:** `X-Pipeline` or just `X` has been floating around in GraceDB since way before I've been on this project, but has [never uploaded an event](https://gracedb.ligo.org/search/?query=X&query_type=E&results_format=S). I was attempting to clean up event logic back in mid 2020, and so I added `X`, `Q`, and `Omega` pipelines to a list of pipelines that were being phased out, and [returned a warning message](https://git.ligo.org/lscsoft/gracedb/-/blob/ca14eb8e8eb0c4111ecac38d6e879472fea1b111/gracedb/api/v1/events/views.py#L524-L525) if a user attempted to upload an event to that pipeline. Not that it would have worked anyway, because the [logic](https://git.ligo.org/lscsoft/gracedb/-/blob/master/gracedb/events/view_logic.py#L66-L67) to ingest X-pipeline event files had never actually been implemented and likely would have returned an error.
I received a request over [mattermost](https://chat.ligo.org/ligo/pl/t1zgpxrm8fdz8qk4xrk4jiueby) to revive the pipeline.
Before proceeding with this, I need from @amber-stuver:
1) An example event upload. I don't know the output file format (xml? json?) or what fields that are in the file should be stored in the database. I can look it over as a first step to compare to other event types that are in GraceDB, but I need the file first and foremost. It can be attached to this ticket.
2) What kind of search type is it? Right now, GraceDB ingests events from `CoincInspiral` searches, `GRB` searches, `Multiburst` searches, etc. If `X` fits into one of those categories, storing event data and constructing the view and `REST` response is simpler. But this will make more sense when I get the example upload.
3) Who is going to be uploading and populating the pipeline? If it's individual users, I need just your `@LIGO.org` email address. Or, if there is a robot account that is uploading, please apply for a cert from https://robots.ligo.org/ and then I'll add it as an uploader.
Once I get the example upload and the other information that I need, the steps I need to do are:
1) Remove `X` from the deprecated pipelines list.
2) Add logic to `view_logic.py` to read in `X` events.
3) Determine views for `X` events, add it to settings.
4) Add uploader permissions for the pipeline
5) Test event uploads, ingestion into the database, and webpage views.
6) Make appropriate LVAlert topics for `X-pipeline`
Then I'll have to push a server code change and deploy it.
Drop any questions or sample files you have here and I'll get back to you.Alexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/210Reducing queries by packing igwn-alert2022-03-18T01:03:58ZAlexander PaceReducing queries by packing igwn-alertAs discussed on low-latency call, December 8 2021. The purpose of this ticket is to solicit feedback and suggestions for what information to include in `igwn-alert` packets with the goal of reducing costly queries to GraceDB.
Relevant past commits:
* https://git.ligo.org/lscsoft/gracedb/-/commit/e52b0c2ea248efbbb221ed51c58d55a3e5c4a3de
* https://git.ligo.org/lscsoft/gracedb/-/commit/2402e914dd6afd28035f6d06086bd6519f8018a9
Current MR's:
* https://git.ligo.org/lscsoft/gracedb/-/merge_requests/52/O4 Infrastructure ImprovementsAlexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/211Add SciTokens support2023-07-17T15:07:39ZDuncan Macleodduncan.macleod@ligo.orgAdd SciTokens supportThe GraceDB server needs to accept a SciToken as an authorisation option.O4 AdvanceDuncan MeacherDuncan Meacherhttps://git.ligo.org/computing/gracedb/server/-/issues/212Metadata for triggers and candidate events2022-03-22T13:38:22ZAlexander PaceMetadata for triggers and candidate eventsThe [charge](https://dcc.ligo.org/LIGO-T2100502) for O4 data product management states:
> Metadata: We define metadata as a set of lightweight data products or links to data products
associated with a given trigger. For example, lightweight data products such as the FAR and SNR or
paths links to parameter estimation posteriors.
I think this can be accomplished with a new table that's linked with a 1:1 foreign key to a g-event. Also from the charge, a proposed format for the metadata is below. Inserted as a screenshot since copy/paste from pdf totally gnarled up the formatting.
![Screen_Shot_2022-02-28_at_9.29.27_PM](/uploads/92a3e044fab88bcb533741c5f9604d15/Screen_Shot_2022-02-28_at_9.29.27_PM.png)
I think a first cut would involve taking advantage of postgres' [json datatype](https://www.postgresql.org/docs/9.4/datatype-json.html).
Querying for metadata would have to be implemented too.O4 CBC Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/213Superevent "flattening"2022-03-18T01:03:59ZAlexander PaceSuperevent "flattening"I was really bad about documenting commits on this branch: https://git.ligo.org/computing/gracedb/server/-/tree/new_event_superevent_types
But basically it entailed "flattening" the table structure for superevents, such that the `superevent_id` was no longer a python property constructed from the date id and such. This will go a LONG way to improve page load times and superevent queries. It also cuts down on a bunch of regex's throughout the code that decomposed the superevent_id back into dateids.
I had used the [django-computedfields](https://pypi.org/project/django-computedfields/) package that worked pretty well. But maybe there's a more postgres-y way to do this.
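As a rough illustration of what "flattening" saves, here is a plain-Python sketch (not the code on that branch). It assumes a superevent ID is a prefix, a six-digit date, and a letter suffix derived from a per-day counter; the function names, the regex, and the example date are assumptions made for this sketch. A computed property does the work in `build_superevent_id` on every access, and code elsewhere has to undo it with something like `decompose`; a stored column does the build once and makes the regex round trip unnecessary.

```python
import re
from datetime import date


def int_to_letters(n: int) -> str:
    """1 -> 'a', 26 -> 'z', 27 -> 'aa' (bijective base 26)."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("a") + rem) + letters
    return letters


def build_superevent_id(day: date, number: int, prefix: str = "S") -> str:
    """What a computed property recomputes on each access."""
    return f"{prefix}{day:%y%m%d}{int_to_letters(number)}"


ID_PATTERN = re.compile(r"^(?P<prefix>[A-Z]+)(?P<date>\d{6})(?P<letters>[a-z]+)$")


def decompose(superevent_id: str) -> dict:
    """The kind of regex round trip that a stored superevent_id column avoids."""
    match = ID_PATTERN.match(superevent_id)
    if match is None:
        raise ValueError(f"not a superevent id: {superevent_id}")
    return match.groupdict()


sid = build_superevent_id(date(2022, 3, 18), 27)
parts = decompose(sid)
```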
Also since events' GIDs are constructed from one letter already in the database along with a row id, I think we basically get graceid's in the database for free as well.O4 CBC Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/218Validate the CBC meta data2022-03-15T10:31:20ZGregory Ashtongregory.ashton@ligo.orgValidate the CBC meta dataThe uploaded meta data #217 can be validated by the [JSON schema](https://git.ligo.org/cbc/meta-data/-/blob/main/cbc-meta-data.schema). Note this is not yet complete. A v1 draft is in preparation.https://git.ligo.org/computing/gracedb/server/-/issues/219Port labels from catalog-dev2023-07-21T15:10:39ZAlexander PacePort labels from catalog-devPlease prepare a list of labels that are on `catalog-dev.ligo.org` that will need to be in place on GraceDB to transfer events and shut down `catalog-dev`. As a first cut, I came up with:
```
In [1]: from ligo.gracedb.rest import GraceDb
In [2]: cdev = GraceDb('https://catalog-dev.ligo.org/api/')
In [3]: gdb = GraceDb('https://gracedb.ligo.org/api')
In [4]: catalog_dev_labels = cdev.allowed_labels
In [5]: gracedb_labels = gdb.allowed_labels
In [6]: set(catalog_dev_labels) - set(gracedb_labels)
Out[6]:
{'CHUNK_1',
'CHUNK_10',
'CHUNK_11',
'CHUNK_12',
'CHUNK_13',
'CHUNK_14',
'CHUNK_15',
'CHUNK_16',
'CHUNK_17',
'CHUNK_18',
'CHUNK_19',
'CHUNK_2',
'CHUNK_20',
'CHUNK_21',
'CHUNK_22',
'CHUNK_23',
'CHUNK_24',
'CHUNK_25',
'CHUNK_26',
'CHUNK_27',
'CHUNK_28',
'CHUNK_29',
'CHUNK_3',
'CHUNK_30',
'CHUNK_31',
'CHUNK_32',
'CHUNK_33',
'CHUNK_34',
'CHUNK_35',
'CHUNK_36',
'CHUNK_37',
'CHUNK_38',
'CHUNK_39',
'CHUNK_4',
'CHUNK_40',
'CHUNK_5',
'CHUNK_6',
'CHUNK_7',
'CHUNK_8',
'CHUNK_9',
'CHUNK_UNKNOWN',
'DETCHAR_NO',
'DETCHAR_YES',
'DQ_NO',
'DQ_YES',
'FINAL',
'GDB_NO',
'GDB_YES',
'NONE',
'O3A_CAT_NO',
'O3A_CAT_YES',
'O3A_CBC_CATALOG',
'O3A_CBC_FINAL',
'O3A_CBC_SUBTHRESHOLD',
'O3A_CWB_FINAL',
'O3A_CWB_ONLY',
'O3A_EVENT_FOR_O3B',
'O3A_SSM',
'O3B_CBC_CATALOG',
'O3B_CBC_SUBTHRESHOLD',
'O3B_CWB_ONLY',
'O3B_SSM',
'PE_NO',
'PE_YES',
'PRELIM'}
```
I think some thought needs to be given in terms of which ones to retain and which ones aren't necessary. For instance at first glance `NONE`, `GDB_NO`, `GDB_YES` probably don't make sense in the context of using GraceDB as the final event repository.
Once I get the list, I can add them to gracedb's deployment for testing.
@gregory.ashton @surabhi.sachdev @rebecca.ewingO4 CBC ImprovementsAlexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/220"terminating connection due to administrator command"2022-04-04T14:37:45ZAlexander Pace"terminating connection due to administrator command"Over the weekend (April 3-4), I woke up to about ~30 emails from `gracedb-test/playground` and then the following day, from `gracedb` (production) with the following error message:
```
Internal Server Error: /some/api/path
OperationalError at /some/api/path
terminating connection due to administrator command
SSL connection has been closed unexpectedly
```
Okay? I had never seen that before. So it appears to be a thing with postgres/RDS. ex: https://old.reddit.com/r/aws/comments/b5l3ha/rds_giving_terminating_connection_due_to/
I went into the management console and saw messages like this for "recent events":
![Screen_Shot_2022-04-04_at_10.28.21_AM](/uploads/6eafa63e416711b680f8db37bf87f369/Screen_Shot_2022-04-04_at_10.28.21_AM.png)
So the best I can gather from that and from the maintenance settings is that RDS triggered a minor version update and shutdown and restarted the databases automatically. Client connections were closed, and that's what caused the errors. So as a first cut, I disabled automatic updates, so that's something to keep an eye on for maintenance windows.
I also ducked into sentry and saw that gwcelery recorded the 500 httperror messages, so the clients saw it as well. Hopefully this doesn't pop up again. But I'm recording it here just in case.
Also, note the line `SSL connection has been closed unexpectedly`.
For some reason by default postgres asks for an SSL connection? All the communication between the database and the EC2 nodes is behind the cloud and constrained to security groups, so I think we could get away with disabling it and reducing the connection overhead: https://www.postgresql.org/docs/current/libpq-ssl.htmlhttps://git.ligo.org/computing/gracedb/server/-/issues/222Asimov, Lensing support during O4 MDC2023-02-08T19:01:55ZAlexander PaceAsimov, Lensing support during O4 MDCThis ticket is to track changes and requests to support the Lensing Group during the ongoing O4 MDC.
@surabhi.sachdev @alvin.liO4 CBC Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/223Twilio SMS improvements2023-02-08T19:44:38ZAlexander PaceTwilio SMS improvementsThis ticket is intended as an information dump to make an informed decision about Twilio messaging from GraceDB in O4. Unfortunately, the SMS logs in Twilio's console only go back one year. So the fine-grained messaging logs for O3 are gone, but the billing rates are available. GraceDB writes a log message with a "`Texting...`" keyword, and those logs are gzipped and archived from O3. So it would be possible-- if need be-- to scrape a month's worth of logs to get the absolute number of messages sent and then extrapolate that to O4. But, you know, effort.
## Number of users and types of alerts
This is the result of digging around in the database to get some idea of the number of users and alert types that are live in GraceDB. Note that in the following nomenclature, a "Notification" object contains a unique set of parameters that dictates when and how an alert goes out to a user. Using @chad-hanna as an example (phone number redacted):
```
> chad=User.objects.get(username='chad.hanna@LIGO.org')
> Notification.objects.filter(user=chad)
<QuerySet [<Notification: chad.hanna@LIGO.ORG: Superevent created or updated & FAR < 1.92901235e-07 -> Call and text +1814XXXXXXX>, <Notification: chad.hanna@LIGO.ORG: Event created or updated & group=CBC & pipeline=gstlal & search=AllSky & FAR < 1.92901235e-07 -> Call and text +1814XXXXXXX>, <Notification: chad.hanna@LIGO.ORG: Event labeled with EM_COINC & any group & any pipeline & any search -> Call and text +1814XXXXXXX>]>
```
So in this example, he signed up for calls/texts for 1) low-far superevent creation, 2) low-far gstlal event uploads, and 3) event uploads with an EM_COINC label applied. In this scenario, there is one user, but three distinct "Notifications". That being said:
* Number of distinct notifications: 649
* Superevent notifications: 586
* Event notifications: 63
* Number of distinct users: 364
* Distinct users signed up for Event notifications: 60 (includes emails)
* Distinct users with Event call and/or sms notifications: 11
* Number of distinct Event call and/or sms notifications: 16
* Event notifications/user: 16/11=**1.5**
* Distinct users signed up for Superevent notifications: 345 (includes emails)
* Distinct users with Superevent call and/or sms notifications: 241
* Number of distinct Superevent call and/or sms notifications: 413
* Superevent notifications/user: 413/241=**1.7**
* Number of unique SMS notifications: 304
* Number of unique phone call notifications: 30
* Number of call+text notifications: 95
I'll dump more stats in here later on, but that should be a start. I think a first cut should be to take the unique number of users for superevent/event notifications and scale that up by how much the collaboration has grown from O3 to O4. This assumes a constant percentage of collaboration members signed up for texts and calls. Then scale that by the expected increase in superevent/event rates in O4, and then use the call/sms per superevent/event per user to find out an expected messaging rate. For now that is left as an exercise for the reader.
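A sketch of that scaling, mostly to pin down which numbers are needed. The user and notification counts are the superevent call/SMS figures listed above; the collaboration growth factor and the number of alerts are placeholders I made up, not measurements or forecasts. The function name `expected_messages` is also just for this sketch.

```python
def expected_messages(users: float, notifications_per_user: float,
                      collaboration_growth: float, alerts: float) -> float:
    """Messages = users * growth * notifications per user * triggering alerts."""
    return users * collaboration_growth * notifications_per_user * alerts


# From the counts above: 241 users with superevent call/SMS notifications,
# 413 such notifications in total (about 1.7 per user).
per_user = 413 / 241

# Placeholder inputs, NOT measurements: 20% collaboration growth, 100 alerts.
estimate = expected_messages(241, per_user, collaboration_growth=1.2, alerts=100)
```

This treats every notification as firing on every alert, so it reads as an upper bound: notifications with FAR thresholds or label conditions will only match a subset of alerts.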
## Batch processing of calls and SMS alerts
Getting this operation down to a single database query (which is totally doable), and then a single API call to twilio would save **loads** of time in generating alerts. Via the [documentation](https://support.twilio.com/hc/en-us/articles/223181548-Can-I-set-up-one-API-call-to-send-messages-to-a-list-of-people-):
> Each new SMS message from Twilio must be sent with a separate REST API request. To initiate messages to a list of recipients, you must make a request for each number to which you would like to send a message. The best way to do this is to build an array of the recipients and iterate through each phone number.
Weak.
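So the Twilio side stays one request per number, and the saving has to come from building the recipient list once and then looping. A minimal sketch of that loop follows; `send_to_all` and `fake_send` are hypothetical names, the phone numbers are made up, and the `send` callable stands in for whatever actually issues the Twilio request (it is injected here so the sketch runs without any Twilio account).

```python
from typing import Callable, Dict, Iterable, List


def send_to_all(numbers: Iterable[str], body: str,
                send: Callable[[str, str], None]) -> Dict[str, List[str]]:
    """Issue one request per number; keep going if a single request fails."""
    result: Dict[str, List[str]] = {"sent": [], "failed": []}
    for number in numbers:
        try:
            send(number, body)
            result["sent"].append(number)
        except Exception:
            # One bad number should not block everyone behind it in the list.
            result["failed"].append(number)
    return result


def fake_send(number: str, body: str) -> None:
    """Stand-in for the real request; rejects one number on purpose."""
    if number == "+15550000002":
        raise RuntimeError("simulated delivery error")


outcome = send_to_all(["+15550000001", "+15550000002", "+15550000003"],
                      "Superevent S230101a created", fake_send)
```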
## Prioritizing recipients
This is also a way to get messages out to the people who need them faster. At the end of O3 it was decided to pare down the number of G-Event recipients to a fixed list of pipeline and followup and control room recipients. I propose to formalize that list, and then have it as a community entity in LIGO's LDAP (similar to how `Communities:LVC:GraceDB:GraceDBAdvocates` exists for EM advocates). Then GraceDB will assign a priority to each group (so pipeline experts=2, em advocates=1, everyone else=0), then sort the list of SMS recipients via this priority and then start the messaging loop.
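The priority ordering could be as simple as the following sketch. The group keys in `GROUP_PRIORITY` are hypothetical labels (the real ones would come from the LDAP groups), the priority values follow the example above, and the phone numbers are made up.

```python
from typing import Dict, List, Tuple

# Hypothetical group names; anything not listed falls back to priority 0.
GROUP_PRIORITY: Dict[str, int] = {"pipeline_experts": 2, "em_advocates": 1}


def order_recipients(recipients: List[Tuple[str, str]]) -> List[str]:
    """Sort (number, group) pairs so higher-priority groups are messaged first."""
    ranked = sorted(recipients,
                    key=lambda item: GROUP_PRIORITY.get(item[1], 0),
                    reverse=True)
    return [number for number, _group in ranked]


queue = order_recipients([
    ("+15550000001", "everyone"),
    ("+15550000002", "em_advocates"),
    ("+15550000003", "pipeline_experts"),
    ("+15550000004", "everyone"),
])
```

Python's sort is stable, so recipients within the same priority keep the order they had coming out of the database query.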
I'm open to other ideas, but this is a start of a strategy for defining twilio account messaging rates and prices.Critical Path O4 DevelopmentAlexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/224Study of missing notifications during O3 (not attempted by Twilio)2023-05-03T14:48:43ZPeter ShawhanStudy of missing notifications during O3 (not attempted by Twilio)During O3, some people reported not receiving notifications according to how they had configured GraceDB to send them notifications. A small fraction of people reported this, but it seemed to be consistent, i.e. not sporadic. I spent some time looking into it in the summer of 2019. For the record, here is a copy of some email messages I sent to a few people (principally Tanner) at that time.
## Email on July 25, 2019:
I have gotten input from a number of people and cross-checked with the Twilio logs. I have not figured out what is happening, but I have learned some things so I thought I would distill my notes and share them with you.
* The problems people are having are with Call and Text notifications, not Email notifications. Well, I haven't paid much attention to what people mentioned about email notifications, so there could be problems there too, but anyway the problems are not ALL with Email notifications. The people who have communicated with me are primarily relying on calls and/or texts.
* The Twilio logs corroborate what people have told me. e.g. if they said they haven't gotten text messages and phone calls recently, the Twilio logs agree: it really looks like Twilio was not asked to call/text them. (Well, occasionally a phone call will fail and that will be shown in the Twilio log, but that is not common. It's not the explanation for people's reports of missing notifications.)
* Lots of people ARE being notified of relevant events. For instance, when S190718y was marked by ADVREQ, the Twilio logs list 98 text messages and 41 voice calls to people to notify them. When S190720a was labeled with ADVREQ, I see 112 text messages; I didn't count the voice calls in that case. When S190724g was labeled with EM_COINC, I see 67 text messages delivered and about 42 voice calls, most of which went through and were answered.
* Some people are receiving notifications reliably, while others are not receiving any. Some people used to receive notifications but have not been receiving them recently. A few people have observed that it seems like people who set up notifications a long time ago are receiving them, while people who set up notifications recently tend not to be receiving them.
So I think there are two general types of possible reasons: either (1) some call/text requests passed to Twilio are getting lost before Twilio attempts them, or (2) there is something funny in the software that the gracedb server is using to construct the list of contacts to call or text, leading it to omit some. (e.g., before I started looking into this, I had a hypothesis that a database query was being used to get the list of contacts and there was a maximum number of records returned by the query. But having looked at the code, that doesn't fit.)
I know you mentioned that logging is not working reliably on AWS; that's too bad, because from gracedb/alerts/phone.py I can see that every call/text attempt passed to Twilio is being logged. If you have a log file that you believe to be complete for some time that includes an event, I could compare it against the Twilio logs (which I have now exported into spreadsheets, cumulative since January).
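The comparison I have in mind would look roughly like this. It is only a sketch: the `(phone number, kind)` record shape, the function names, and the example numbers are all made up, and the real gracedb log lines and Twilio spreadsheet rows would need parsing into that shape first. The point is that the two set differences separate the two hypotheses above.

```python
from typing import Iterable, Set, Tuple

Attempt = Tuple[str, str]  # (phone number, "sms" or "call"); hypothetical shape


def missing_from_twilio(gracedb_attempts: Iterable[Attempt],
                        twilio_records: Iterable[Attempt]) -> Set[Attempt]:
    """Attempts gracedb logged that never show up in the Twilio export (hypothesis 1)."""
    return set(gracedb_attempts) - set(twilio_records)


def never_attempted(expected: Iterable[Attempt],
                    gracedb_attempts: Iterable[Attempt]) -> Set[Attempt]:
    """Configured notifications that gracedb never passed to Twilio (hypothesis 2)."""
    return set(expected) - set(gracedb_attempts)


# Made-up data for one event.
expected = [("+15550000001", "sms"), ("+15550000002", "sms"), ("+15550000002", "call")]
gracedb_log = [("+15550000001", "sms"), ("+15550000002", "sms")]
twilio_log = [("+15550000001", "sms")]

lost_in_transit = missing_from_twilio(gracedb_log, twilio_log)
omitted_by_server = never_attempted(expected, gracedb_log)
```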
There is a note here that "You can send messages to Twilio at a rapid rate as long as the requests do not reach Twilio's API concurrency limit which is at 100", but I don't THINK we would be running into that since call/text requests are made serially and I'm positive that Twilio is designed to queue requests and feed them out at the appropriate rate.
In terms of the software in the gracedb server, I spent some time studying the code in the gracedb/alerts directory, but it is complex enough that I can't trace it by inspection to check how it filters to get a list of matching notifications and then looks up the contact information from the notifications.
If you want a case to try debugging, look at Giacomo Ciani. His notification settings are:
~~~
Notifications
Once per year | Superevent created or updated & FAR < 3e-08 -> Text +393476487948, Email giacomo.ciani@unipd.it
Advocate request | Superevent labeled with ADVREQ -> Email giacomo.ciani@unipd.it, Call and text +393476487948
~~~
For S190720a he received several "A superevent with GraceDB ID S190720a was updated" text messages and emails (nobody received a "superevent created" message for that because the initial preferred event had too high a FAR), but did not receive any ADVREQ label messages for either S190720a or S190718y, either by text or by voice call. So it seems that the first of his two notifications was acted on but the second was not.
## Email on July 26, 2019:
I've taken some more time to digest the input I've received (and collected notes in a Google doc: https://docs.google.com/document/d/1QzDS-JWxi2EAXgYP64sKJaNde29x7MN7v0JtA5IGxl8/edit). Here are my high-level findings:
* For some people, all of their notifications are working.
* Working or not working seems to be associated with specific "lines" in a user's notifications configuration. For some people, SOME of their notification lines are working while others are not. For instance, Jenne Driggers has four notification lines configured, but only the first of them is working (i.e. generating text messages logged by Twilio); the other three lines are having no effect. Looking back through the logs, it seems that has been the case since she created the first two lines in early April, and added two more lines around July 18 or 19: her first notification line (which is interesting because it includes the NS candidate condition) has been working reliably, while none of the other lines has produced any notification through Twilio. Similarly, for Giacomo Ciani and Andrea Miani, their first notification line has been working while their second has not. Marco Bazzan's SECOND line has been working while his first has not.
* At least one user -- Daniel Sigg -- has two notification lines and neither is working. Daniel has in the past received phone calls through Twilio (on July 1 and 6), but he updated his alerts configuration and has not gotten any notifications since July 6.
* Giacomo Ciani, whom I mentioned above, added another line to his configuration today with voice call notifications, and it worked.
So my best picture of this is that some notification lines work and others don't. I can't tell what determines which ones work and which ones don't. It does seem to be the case that notification lines established a long time ago, or first in a person's list, are more likely to work; but that does not seem to be universal. And anyway, it's pretty clear to me that this is a GraceDB bookkeeping issue of some sort, not a problem with Twilio or individual users' phones or cell providers.
Oh, and for people who are not getting notifications according to their configuration, when they use the Test buttons to send test notifications, those work. (In most cases... Deep does not seem to be able to receive calls or text messages from Twilio on his phone.)O4 Debugging and Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/225Uploading sky-maps with the MLy pipeline2023-02-08T19:49:34Zkyle willettsUploading sky-maps with the MLy pipelineWe (@mly) would like to be able to upload sky-maps to GraceDB in low-latency, ideally when publishing an event. Would it be possible to modify the upload file format we are currently using, to include a sky-map file?We (@mly) would like to be able to upload sky-maps to GraceDB in low-latency, ideally when publishing an event. Would it be possible to modify the upload file format we are currently using, to include a sky-map file?O4 Debugging and Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/226Request for voevent IVORNs in superevent dictionary2023-07-05T19:49:00ZCody MessickRequest for voevent IVORNs in superevent dictionaryWould it be possible to include IVORNs for all VOEvents on a given superevent in the superevent dictionary? Doing so would allow the GWCelery team to populate two VOEvent fields without additional gracedb queries, specifically the citati...Would it be possible to include IVORNs for all VOEvents on a given superevent in the superevent dictionary? Doing so would allow the GWCelery team to populate two VOEvent fields without additional gracedb queries, specifically the citations sections and the `Pkg_Ser_Num` field.
Do non-LVK generated VOEvents ever end up in gracedb (e.g. from an external observation that is coincident with the GW)? I ask because my current mental model for determining `Pkg_Ser_Num` is just to count the IVORNs, i.e. if there are no IVORNs we assume the VOEvent we're generating is the first, if there's one IVORN we assume it's the second, etc. If VOEvents could show up from other events, that logic might need some additional checks.
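The counting rule described above can be sketched as follows. This is only an illustration of the mental model, not GWCelery code: the `voevent_ivorns` key is a hypothetical name for wherever the IVORNs would live in the superevent dictionary, and the sample IVORN strings are made up.

```python
def next_pkg_ser_num(superevent):
    """Guess the Pkg_Ser_Num for the next VOEvent by counting existing IVORNs.

    `voevent_ivorns` is a hypothetical key; the real superevent dictionary
    may expose the IVORNs under a different name or structure.
    """
    ivorns = superevent.get("voevent_ivorns", [])
    # No IVORNs yet -> this is the first VOEvent; one IVORN -> the second; etc.
    return len(ivorns) + 1


# Made-up sample data for illustration only.
fresh = {"superevent_id": "S000000a"}
with_one = {"superevent_id": "S000000a", "voevent_ivorns": ["ivo://example/1"]}
first_number = next_pkg_ser_num(fresh)
second_number = next_pkg_ser_num(with_one)
```

If VOEvents from other sources can be attached to the superevent, the list would need to be filtered before counting.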
Related to https://git.ligo.org/emfollow/gwcelery/-/merge_requests/857Critical Path O4 Developmenthttps://git.ligo.org/computing/gracedb/server/-/issues/229Migrate from ConcurrentLogHandler to concurrent-log-handler2022-08-11T23:24:35ZDaniel WysockiMigrate from ConcurrentLogHandler to concurrent-log-handler`requirements.txt` lists `ConcurrentLogHandler==0.9.1`, which is a package which was [last updated in 2013](https://pypi.org/project/ConcurrentLogHandler/), and makes use of the `use_2to3` feature of `setuptools<58`. We will be stuck wi...`requirements.txt` lists `ConcurrentLogHandler==0.9.1`, which is a package which was [last updated in 2013](https://pypi.org/project/ConcurrentLogHandler/), and makes use of the `use_2to3` feature of `setuptools<58`. We will be stuck with older versions of `setuptools` until this dependency is replaced, which may eventually become a problem.
Fortunately, one of the two maintainers forked the project as [`concurrent-log-handler`](https://pypi.org/project/concurrent-log-handler/), and has updated it as recently as this year. Changing our requirement to `concurrent-log-handler==0.9.20` gets me past the build issue on newer `setuptools` versions. It's also necessary to change the import from `cloghandler` to `concurrent_log_handler`. Beyond that I have not done further testing, so it may not be a drop-in replacement.BacklogDaniel WysockiDaniel Wysockihttps://git.ligo.org/computing/gracedb/server/-/issues/232Request to add external event info to igwn-alert2023-02-13T16:25:45ZCody MessickRequest to add external event info to igwn-alertCurrently both emfollow/gwcelery!857 and emfollow/gwcelery!852 download external events from gracedb to populate public alerts. Could the external event info just be included in the IGWN-Alert? The only catch that I see is that we need t...Currently both emfollow/gwcelery!857 and emfollow/gwcelery!852 download external events from gracedb to populate public alerts. Could the external event info just be included in the IGWN-Alert? 
The only catch that I see is that we need to be able to tell which event to use; @brandon.piotrzkowski said this information should be in the `em_type` field, so all we'd need is some way to identify the event that would be mentioned in that field.Critical Path O4 Developmenthttps://git.ligo.org/computing/gracedb/server/-/issues/233AWS resources for non-production GraceDB2023-02-08T19:04:20ZErik KatsavounidisAWS resources for non-production GraceDBGiven the heavy development currently in progress for the low latency alerts pipeline and the use of non-production GraceDB tiers, we will need to bring such tiers up to the same level of hardware resources under AWS with the production ...Given the heavy development currently in progress for the low latency alerts pipeline and the use of non-production GraceDB tiers, we will need to bring such tiers up to the same level of hardware resources under AWS with the production system.O4 Debugging and Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/235Occasional 500 error when reading files2024-03-19T00:41:56ZAlexander PaceOccasional 500 error when reading filesThere is an occasional 500 error returned by the cloud instances when attempting to read files. It occurs infrequently and randomly enough that I'm not able to reproduce it, but it does hit gwcelery's workflow on occasion (~2 times per we...There is an occasional 500 error returned by the cloud instances when attempting to read files. It occurs infrequently and randomly enough that I'm not able to reproduce it, but it does hit gwcelery's workflow on occasion (~2 times per week). An example error traceback looks like:
```
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/django/core/handlers/exception.py", line 47, in inner
response = get_response(request)
File "/usr/local/lib/python3.7/dist-packages/django/core/handlers/base.py", line 181, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/usr/local/lib/python3.7/dist-packages/django/views/decorators/cache.py", line 44, in _wrapped_view_func
response = view_func(request, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
return view_func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/rest_framework/viewsets.py", line 125, in view
return self.dispatch(request, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/rest_framework/views.py", line 509, in dispatch
response = self.handle_exception(exc)
File "/usr/local/lib/python3.7/dist-packages/rest_framework/views.py", line 469, in handle_exception
self.raise_uncaught_exception(exc)
File "/usr/local/lib/python3.7/dist-packages/rest_framework/views.py", line 480, in raise_uncaught_exception
raise exc
File "/usr/local/lib/python3.7/dist-packages/rest_framework/views.py", line 506, in dispatch
response = handler(request, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "/usr/local/lib/python3.7/dist-packages/retry/api.py", line 74, in retry_decorator
logger)
File "/usr/local/lib/python3.7/dist-packages/retry/api.py", line 33, in __retry_internal
return f()
File "/app/gracedb_project/gracedb/api/v1/superevents/views.py", line 321, in list
file_list = get_file_list(viewable_logs, parent_superevent.datadir)
File "/app/gracedb_project/gracedb/core/file_utils.py", line 32, in get_file_list
pointed_to = os.path.basename(os.path.realpath(full_path))
File "/usr/lib/python3.7/posixpath.py", line 395, in realpath
path, ok = _joinrealpath(filename[:0], filename, {})
File "/usr/lib/python3.7/posixpath.py", line 443, in _joinrealpath
path, ok = _joinrealpath(path, os.readlink(newpath), seen)
Exception Type: OSError at /api/superevents/MS220919n/files/
Exception Value: [Errno 5] Input/output error: '/app/db_data/9a/6/8ac9f1720d59940bed2d8e384d57c98049c82/bayestar.multiorder.coherence.png'
```
It appears to be triggering the [retrying](https://git.ligo.org/computing/gracedb/server/-/commit/71daf97148ef21e858039343ba4dc6c60eb6f208) hook that I put in, but the hook doesn't seem to help: it retries four times to get the file, sleeping one second between attempts, and every attempt fails:
```
gracedb-swarm-test-us-west-2a-docker-mgr-01.log:Sep 19 13:38:13 gracedb-swarm-test-us-west-2a-docker-mgr-01 gracedb_docker_gracedb_gracedb.2.o400wqmzk6yutaoaz1cd8mjyt: DJANGO | 2022-09-19 13:38:13.591 | e459e5951d2a | 10.0.2.51 | api.v1.superevents.views | WARNING | api.py, line 40 | [Errno 5] Input/output error: '/app/db_data/9a/6/8ac9f1720d59940bed2d8e384d57c98049c82/bayestar.multiorder.coherence.png', retrying in 1.0 seconds...
gracedb-swarm-test-us-west-2a-docker-mgr-01.log:Sep 19 13:38:14 gracedb-swarm-test-us-west-2a-docker-mgr-01 gracedb_docker_gracedb_gracedb.2.o400wqmzk6yutaoaz1cd8mjyt: DJANGO | 2022-09-19 13:38:14.608 | e459e5951d2a | 10.0.2.51 | api.v1.superevents.views | WARNING | api.py, line 40 | [Errno 5] Input/output error: '/app/db_data/9a/6/8ac9f1720d59940bed2d8e384d57c98049c82/bayestar.multiorder.coherence.png', retrying in 1.0 seconds...
gracedb-swarm-test-us-west-2a-docker-mgr-01.log:Sep 19 13:38:15 gracedb-swarm-test-us-west-2a-docker-mgr-01 gracedb_docker_gracedb_gracedb.2.o400wqmzk6yutaoaz1cd8mjyt: DJANGO | 2022-09-19 13:38:15.622 | e459e5951d2a | 10.0.2.51 | api.v1.superevents.views | WARNING | api.py, line 40 | [Errno 5] Input/output error: '/app/db_data/9a/6/8ac9f1720d59940bed2d8e384d57c98049c82/bayestar.multiorder.coherence.png', retrying in 1.0 seconds...
gracedb-swarm-test-us-west-2a-docker-mgr-01.log:Sep 19 13:38:16 gracedb-swarm-test-us-west-2a-docker-mgr-01 gracedb_docker_gracedb_gracedb.2.o400wqmzk6yutaoaz1cd8mjyt: DJANGO | 2022-09-19 13:38:16.636 | e459e5951d2a | 10.0.2.51 | api.v1.superevents.views | WARNING | api.py, line 40 | [Errno 5] Input/output error: '/app/db_data/9a/6/8ac9f1720d59940bed2d8e384d57c98049c82/bayestar.multiorder.coherence.png', retrying in 1.0 seconds...
```
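The log lines above are consistent with a fixed-delay retry loop: four warnings one second apart, then a final attempt whose failure becomes the 500. A minimal stdlib sketch of that behaviour follows. It is not the `retry` package's actual code; `call_with_retries` is a hypothetical helper, and `tries=5` is an assumption inferred from the four warnings.

```python
import time


def call_with_retries(func, tries=5, delay=1.0, sleep=time.sleep):
    """Call func(), retrying on OSError with a fixed delay between attempts.

    Hypothetical helper for illustration. With tries=5 there are four
    "retrying" warnings before the last failure is re-raised.
    """
    for attempt in range(1, tries + 1):
        try:
            return func()
        except OSError as exc:
            if attempt == tries:
                raise  # out of attempts: the caller sees the error
            print(f"{exc}, retrying in {delay} seconds...")
            sleep(delay)


# Simulate a path that fails on every attempt, as in the logs.
attempts = []
slept = []


def always_eio():
    attempts.append(1)
    raise OSError(5, "Input/output error")


try:
    call_with_retries(always_eio, sleep=slept.append)
    failed = False
except OSError:
    failed = True
```

When the failure persists across every attempt, the retries only add latency (about four seconds here) before the same error surfaces.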
`Traefik` is showing that the request is returning a 500 error and is taking almost five seconds because of the retries:
```
Sep 19 13:38:18 gracedb-swarm-test-us-west-2a-docker-mgr-01 gracedb_docker_webgateway_webgateway.1.l4j2u8hibrrtgelvsfhiubxfh: 131.215.113.198 - - [19/Sep/2022:13:38:13 +0000] "GET /api/superevents/MS220919n/files/ HTTP/1.1" 500 10472 "-" "-" 174967 "gracedb@docker" "http://10.0.2.51:80" 4815ms
```
For reference the nfs mounts are mounted with: `nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport,_netdev`O4 Debugging and Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/237More flexible queries for text/email alerts2023-02-08T16:56:01ZRebecca EwingMore flexible queries for text/email alerts## Description of feature request
<!--
Describe your feature request!
Is it a web interface change? Some underlying feature? An API resource?
The more detail you can provide, the better.
-->
For text/email alerts there are only a few op...## Description of feature request
<!--
Describe your feature request!
Is it a web interface change? Some underlying feature? An API resource?
The more detail you can provide, the better.
-->
For text/email alerts there are only a few options available, mostly just to choose a FAR threshold and set of labels. It would be useful if we could filter by additional parameters.
In general, if it's possible to support arbitrary queries for the alert rules that would be great.
## Use cases
<!-- List some specific cases where this feature will be useful -->
Getting alerted for public events while ignoring events from injection channels, so we don't get flooded with unnecessary alerts.
A query for this would be like `si.channel != "GDS-CALIB_STRAIN_INJ1_O3Replay" & si.channel != "Hrec_hoft_16384Hz_INJ1_O3Replay"` (I'm not sure exactly what the right syntax is.)
## Benefits
<!-- Describe the benefits of adding this feature -->
Adding this feature would make the alerts more general / flexible which should be a good thing.
## Drawbacks
<!--
Are there any drawbacks to adding this feature?
Can you think of any ways in which this will negatively affect the service for any set of users?
-->
As long as the old method stays in place and people can just optionally specify a more complicated/specific query I can't think of any drawbacks.
## Suggested solutions
<!-- Do you have any ideas for how to implement this feature? -->O4 Debugging and Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/239Remove query parsers' dependence on database state2023-02-08T16:08:49ZDaniel WysockiRemove query parsers' dependence on database stateThere are several database querying mini-languages written using the `pyparsing` module. The very bad decision was made to have the languages depend on the state of the database, by having things like labels and pipeline names be reserv...There are several database querying mini-languages written using the `pyparsing` module. The very bad decision was made to have the languages depend on the state of the database, by having things like labels and pipeline names be reserved words. This means any addition to the set of these values will require recompiling the parser, so as a result it's recompiled for _every query_. Speed considerations aside, this adds some serious complexity to the parsers, and means it's possible to break the parser by adding a badly named or non-unique value into one of the tables.
A much better approach would be to add a generic "identifier" token to the language. Then at code-generation time it would be resolved based on the database state.
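A minimal sketch of that split, using only the standard library rather than `pyparsing`, and a toy grammar (identifiers joined by `&`) that is far smaller than the real query languages. All names here (`parse`, `resolve`, the exception classes, the sample labels) are hypothetical.

```python
import re

# Any well-formed name is syntactically valid; the parser has no label list.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


class QuerySyntaxError(Exception):
    """The query text is malformed."""


class UnknownNameError(Exception):
    """The query is well-formed but names something not in the database."""


def parse(query):
    """Phase 1: purely syntactic, independent of database state."""
    terms = [term.strip() for term in query.split("&")]
    for term in terms:
        if not IDENT.fullmatch(term):
            raise QuerySyntaxError(f"invalid identifier: {term!r}")
    return terms


def resolve(terms, known_labels):
    """Phase 2: bind identifiers against the current database state."""
    unknown = [term for term in terms if term not in known_labels]
    if unknown:
        raise UnknownNameError(f"unknown label(s): {', '.join(unknown)}")
    return [("label", term) for term in terms]


# Stand-in for label names that would be read from the database.
known = {"EM_READY", "ADVOK"}
parsed = parse("EM_READY & NOT_A_LABEL")  # parses fine
resolved = resolve(parse("EM_READY & ADVOK"), known)
```

With this shape the grammar is built once, and adding a label to the database changes only the set passed to `resolve`.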
To use Python as an analogy, consider what happens if one tries accessing an undefined variable
```python
>>> foo
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'foo' is not defined
```
note that this isn't a `SyntaxError`, as Python knows `foo` is a valid identifier, but is unbound. The parser read my statement without issue, but the code generation phase correctly identified the missing name. This should be how our query language works as well.O4 Debugging and ImprovementsDaniel WysockiDaniel Wysockihttps://git.ligo.org/computing/gracedb/server/-/issues/240Generate railroad diagrams for query parsing language2023-02-08T16:51:58ZDaniel WysockiGenerate railroad diagrams for query parsing language`pyparsing>=3.0.0` introduces the ability to generate ["railroad diagrams"](https://pyparsing-docs.readthedocs.io/en/latest/whats_new_in_3_0_0.html#id4), which are a concise way of visualizing a language. These would be very nice to hav...`pyparsing>=3.0.0` introduces the ability to generate ["railroad diagrams"](https://pyparsing-docs.readthedocs.io/en/latest/whats_new_in_3_0_0.html#id4), which are a concise way of visualizing a language. These would be very nice to have for our documentation, but more importantly would be helpful for making improvements to the query language without breaking anything.O4 Debugging and ImprovementsDaniel WysockiDaniel Wysockihttps://git.ligo.org/computing/gracedb/server/-/issues/242Revamp HardwareInjection event uploads.2023-02-08T19:49:48ZAlexander PaceRevamp HardwareInjection event uploads.This is to track work to bring back HardwareInjection events.
TODO:
- [x] provide sample json (?) upload
- [x] make data model
- [x] validate uploads
- [x] create page view
- [ ] determine what scenarios and alert contents should be
-...This is to track work to bring back HardwareInjection events.
TODO:
- [x] provide sample json (?) upload
- [x] make data model
- [x] validate uploads
- [x] create page view
- [ ] determine what scenarios and alert contents should be
- [ ] ????Critical Path O4 Developmenthttps://git.ligo.org/computing/gracedb/server/-/issues/244drop the banhammer on rogue processes2023-06-07T14:18:09ZAlexander Pacedrop the banhammer on rogue processesI (@alexander.pace) was trawling through production GraceDB's logs today (22-11-02) to sanity check that nothing was up with yesterday's deployment of the latest server code (https://git.ligo.org/computing/sccb/-/issues/1005), when i not...I (@alexander.pace) was trawling through production GraceDB's logs today (22-11-02) to sanity check that nothing was up with yesterday's deployment of the latest server code (https://git.ligo.org/computing/sccb/-/issues/1005), when i noticed a lot of traffic mostly performing `GET`s on (seemingly?) random `api/superevent/` paths. Okay? For example:
```
gracedb-swarm-production-us-west-2a-docker-mgr-01.log:Nov 2 00:00:04 gracedb-swarm-production-us-west-2a-docker-mgr-01 gracedb_docker_gracedb_gracedb.3.0wxlfddskqz0sxdcvafywkpv7: GUNICORN | 134.79.120.214 - - [02/Nov/2022:00:00:04 +0000] "GET /superevents/SIMS190408an_0p4_128/view/ HTTP/1.1" 404 5775 "-" "Python-urllib/2.7"
gracedb-swarm-production-us-west-2a-docker-mgr-01.log:Nov 2 00:00:05 gracedb-swarm-production-us-west-2a-docker-mgr-01 gracedb_docker_gracedb_gracedb.3.0wxlfddskqz0sxdcvafywkpv7: GUNICORN | 134.79.120.214 - - [02/Nov/2022:00:00:05 +0000] "GET /superevents/SIMS190408anC0p9N128/view/ HTTP/1.1" 404 5775 "-" "Python-urllib/2.7"
...
...
```
They were all `404`ing like they should, but it was a LOT of requests. For example, today, there were **15078** requests coming from the `134.79.120.*` subnet alone before I put the kibosh on that (more on that). Yesterday there were 18594 `GET`s. I say from that subnet because I saw requests coming from `134.79.120.214`, `134.79.120.195`, `134.79.120.165`...
I `traceroute`'ed the IPs back to this group at Stanford (https://www6.slac.stanford.edu/).
I saw similar 404'ed `GET`s from a computer in Tokyo (`133.40.62.22`) that was trying to get files with wget?
```
gracedb-swarm-production-us-west-2c-docker-mgr-01.log:Nov 2 19:10:25 gracedb-swarm-production-us-west-2c-docker-mgr-01 gracedb_docker_gracedb_gracedb.1.j9sj8bcpdvddfn4g0ss05kq6e: GUNICORN | 133.40.62.22 - - [02/Nov/2022:19:10:25 +0000] "GET /apiweb/superevents/IC136985_60401984/files/bayestar.fits.gz HTTP/1.1" 404 23 "-" "Wget/1.13.4 (linux-gnu)"
gracedb-swarm-production-us-west-2c-docker-mgr-01.log:Nov 2 19:10:28 gracedb-swarm-production-us-west-2c-docker-mgr-01 gracedb_docker_gracedb_gracedb.1.j9sj8bcpdvddfn4g0ss05kq6e: GUNICORN | 133.40.62.22 - - [02/Nov/2022:19:10:28 +0000] "GET /api/superevents/IC137019_70165712/files/p_astro.json HTTP/1.1" 404 23 "-" "Wget/1.13.4 (linux-gnu)"
gracedb-swarm-production-us-west-2c-docker-mgr-01.log:Nov 2 19:10:29 gracedb-swarm-production-us-west-2c-docker-mgr-01 gracedb_docker_gracedb_gracedb.1.j9sj8bcpdvddfn4g0ss05kq6e: GUNICORN | 133.40.62.22 - - [02/Nov/2022:19:10:29 +0000] "GET /apiweb/superevents/IC137019_70165712/files/bayestar.fits.gz HTTP/1.1" 404 23 "-" "Wget/1.13.4 (linux-gnu)"
gracedb-swarm-production-us-west-2c-docker-mgr-01.log:Nov 2 19:10:30 gracedb-swarm-production-us-west-2c-docker-mgr-01 gracedb_docker_gracedb_gracedb.1.j9sj8bcpdvddfn4g0ss05kq6e: GUNICORN | 133.40.62.22 - - [02/Nov/2022:19:10:30 +0000] "GET /api/superevents/IC137065_22012496/files/p_astro.json HTTP/1.1" 404 23 "-" "Wget/1.13.4 (linux-gnu)"
gracedb-swarm-production-us-west-2c-docker-mgr-01.log:Nov 2 19:10:31 gracedb-swarm-production-us-west-2c-docker-mgr-01 gracedb_docker_gracedb_gracedb.1.j9sj8bcpdvddfn4g0ss05kq6e: GUNICORN | 133.40.62.22 - - [02/Nov/2022:19:10:31 +0000] "GET /apiweb/superevents/IC137065_22012496/files/bayestar.fits.gz HTTP/1.1" 404 23 "-" "Wget/1.13.4 (linux-gnu)"
```
They were all `404`'ed, but I'm concerned about the increased traffic especially when we go into observation. So, I made the executive decision to block traffic from the offending IPs/ranges. And if and when people start to complain, then we can push on a technical justification of what they were doing. And this doesn't apply to all robot processes of course. There are plenty of queries from IPs originating from Caltech that are using the real client code, so those are obviously legit. But this ticket will be used to track which sources have been blocked from inbound traffic into gracedb's VPC.
| Date Blocked | IP Ranges | Reason | Status |
| ------ | ------ | ------ | ------ |
| 2022-11-02 | 134.79.120.0/24 | Excessive (15,000+/day) `GET`s | |
| 2022-11-02 | 133.40.62.22/32 | Excessive (10,000+/day) `GET`s | [Lifted](https://git.ligo.org/computing/helpdesk/-/issues/3943) 23/05/12|O4 Debugging and ImprovementsAlexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/249figure out why event queries are so convoluted2023-07-17T20:02:33ZAlexander Pacefigure out why event queries are so convolutedthere's something going on with how gracedb handles event searches, in particular when there are bulk searches with lots of results. so, for example if a user searches for all events for a given pipeline during an MDC period.
Example: ...there's something going on with how gracedb handles event searches, in particular when there are bulk searches with lots of results. so, for example if a user searches for all events for a given pipeline during an MDC period.
Example: There's [this line](https://git.ligo.org/computing/gracedb/server/-/blob/master/gracedb/api/v1/events/views.py#L404) that gets called when a user does an event query. GraceDB by default returns event results in batches of 10, and so in addition to pulling results from the database, it does that `count()` every time it collects a batch of 10 events.
That `count()` for a sample query gets translated into the following SQL:
```
SELECT COUNT(*) FROM (SELECT DISTINCT "events_event"."id" AS Col1, "events_event"."submitter_id" AS Col2, "events_event"."created" AS Col3, "events_event"."group_id" AS Col4, "events_event"."superevent_id" AS Col5, "events_event"."pipeline_preferred_id" AS Col6, "events_event"."pipeline_id" AS Col7, "events_event"."search_id" AS Col8, "events_event"."instruments" AS Col9, "events_event"."nevents" AS Col10, "events_event"."far" AS Col11, "events_event"."likelihood" AS Col12, "events_event"."gpstime" AS Col13, "events_event"."perms" AS Col14, "events_event"."offline" AS Col15, "events_event"."graceid" AS Col16, "events_event"."reporting_latency" AS Col17 FROM "events_event" INNER JOIN "events_group" ON ("events_event"."group_id" = "events_group"."id") INNER JOIN "events_pipeline" ON ("events_event"."pipeline_id" = "events_pipeline"."id") LEFT OUTER JOIN "events_search" ON ("events_event"."search_id" = "events_search"."id") WHERE ("events_group"."name" IN ('CBC') AND NOT ("events_group"."name" = 'Test') AND "events_pipeline"."name" IN ('pycbc') AND NOT ("events_search"."name" = 'MDC' AND "events_search"."name" IS NOT NULL) AND ("events_event"."id" IN (SELECT CAST(U0."object_pk" AS bigint) AS "obj_pk" FROM "guardian_userobjectpermission" U0 INNER JOIN "auth_permission" U2 ON (U0."permission_id" = U2."id") WHERE (U0."user_id" = 3901 AND U2."content_type_id" = 3 AND U2."codename" IN ('view_event'))) OR "events_event"."id" IN (SELECT CAST(U0."object_pk" AS bigint) AS "obj_pk" FROM "guardian_groupobjectpermission" U0 INNER JOIN "auth_group" U1 ON (U0."group_id" = U1."id") INNER JOIN "auth_user_groups" U2 ON (U1."id" = U2."group_id") INNER JOIN "auth_permission" U4 ON (U0."permission_id" = U4."id") WHERE (U2."user_id" = 3901 AND U4."codename" IN ('view_event') AND U4."content_type_id" = 3))))) subquery
```
Which, on gracedb-playground, takes 1682.343ms, which is way too long to begin with. Further, since it's doing it once for every 10 events, in this scenario where there were 80,000 events in the query, that's 80,000/10 = 8000 counts, and at 1.7 seconds per count, that's about 13,600 seconds where the database is doing needless work and the user is just sitting there. Crazy.
So, I would start by:
1) Figure out why the ORM is turning a simple query (ref https://git.ligo.org/computing/gracedb/server/-/blob/8dcbbbfeff28ad195b8bf6128aec726d971ef227/gracedb/api/v1/events/views.py#L404) into that monstrosity. I've attached as a file an example of what it looks like. [D26C0A10006C1BF220AA6B90D05B0611391D9431.txt](/uploads/63c4c00ad03ede10d07aaf4246b770c4/D26C0A10006C1BF220AA6B90D05B0611391D9431.txt)
2) Figure out why that `count()` takes so long
3) Reverse engineer the query response to see if we can move that `count()` outside of the iteration loop so it only does it once, stores the value, and then loops over the batches of 10 events.
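Step 3 can be sketched in plain Python. `CountingQuery` is a hypothetical stand-in for a Django queryset that records how often `count()` runs; it is not GraceDB code. If each batch is served by a separate HTTP request, the stored value would presumably have to live in a cache rather than a local variable.

```python
class CountingQuery:
    """Hypothetical stand-in for a queryset; tracks calls to count()."""

    def __init__(self, rows):
        self.rows = rows
        self.count_calls = 0

    def count(self):
        self.count_calls += 1  # in the real system this is the slow SQL COUNT
        return len(self.rows)

    def page(self, start, size):
        return self.rows[start:start + size]


def iter_batches(query, batch_size=10):
    """Yield (total, batch) pairs, computing the total once up front."""
    total = query.count()  # done once, outside the loop
    for start in range(0, total, batch_size):
        yield total, query.page(start, batch_size)


query = CountingQuery(list(range(95)))
batches = list(iter_batches(query))
```

Here 95 rows produce ten batches but only a single `count()` call.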
I'm hoping that reverse engineering the `count()` will elucidate why the event query ends up being so taxing to the database.O4 Debugging and Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/250add ability to add multiple events to superevent in one call2023-02-08T21:36:55ZAlexander Paceadd ability to add multiple events to superevent in one callThe serializer will have to be modified around here https://git.ligo.org/computing/gracedb/server/-/blob/c6575e59e650449422dd65c87f4ffdcaa7bb4adb/gracedb/api/v1/superevents/serializers.py#L330-335 to accept `event` as a string (for backw...The serializer will have to be modified around here https://git.ligo.org/computing/gracedb/server/-/blob/c6575e59e650449422dd65c87f4ffdcaa7bb4adb/gracedb/api/v1/superevents/serializers.py#L330-335 to accept `event` as a string (for backwards compatibility), or a list and then loop over `add_event_to_superevent`, or alternatively modify `add_event_to_superevent` (see below).
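A minimal sketch of the backwards-compatible input handling, assuming the field keeps the name `event`. `normalize_events` is a hypothetical helper, not part of the existing serializer, and the graceids are sample values.

```python
def normalize_events(event):
    """Accept a single graceid string or a list of them; return a list.

    Hypothetical helper: a lone string keeps working for existing clients,
    while newer clients can send several graceids in one call.
    """
    if isinstance(event, str):
        return [event]
    if isinstance(event, (list, tuple)):
        if not all(isinstance(item, str) for item in event):
            raise TypeError("every entry in 'event' must be a string")
        return list(event)
    raise TypeError("'event' must be a string or a list of strings")


single = normalize_events("G123456")
several = normalize_events(["G123456", "G123457"])
```

The caller can then loop over the returned list regardless of which form the client sent.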
Some considerations or questions that I don't have a good feel for yet:
1) Logging. If we were to just loop over `add_event_to_superevent`, then there would be a log message on the superevent for every event that gets added. There should still be a log message on each individual event, but maybe for superevents, there can be a single "Added GXXX GYYY GZZZ" log message on the superevent.
2) Alerts. There are alerts that get sent out to event and superevent topics (https://gracedb-playground.ligo.org/documentation/igwn_alert.html#event-alerts) when events are added. The superevent alert (that contains the superevent packet) should be modified to show the events that were added. The question is event alerts. To remain consistent with the existing setup, there should be alerts for every event. Though if we took out the event alert, it would make the response a lot faster and would make coding this up a lot easier. I wonder if any groups actually use that?
We should think about modifying `remove_event_from_superevent` as well (https://git.ligo.org/computing/gracedb/server/-/blob/c6575e59e650449422dd65c87f4ffdcaa7bb4adb/gracedb/api/v1/superevents/views.py#L171-174)Critical Path O4 DevelopmentDuncan MeacherDuncan Meacherhttps://git.ligo.org/computing/gracedb/server/-/issues/253Unique tag sets for inherited logs2023-02-14T18:39:34ZAlexander PaceUnique tag sets for inherited logsMy initial idea for tags for InheritedLogs was to have two sets of tags (the original `event.tags` set from the EventLog, and an additional `inheritedlog.superevent.tags` set) that are combined into one set that is queryable and filtera...My initial idea for tags for InheritedLogs was to have two sets of tags (the original `event.tags` set from the EventLog, and an additional `inheritedlog.superevent.tags` set) that are combined into one set that is queryable and filterable. Which is [easy enough](https://git.ligo.org/computing/gracedb/server/-/blob/fbc5e15e5564af51aad9506ca56b39d98a688129/gracedb/superevents/models.py#L587-589).
The problem arises from combining [`ManyToMany` sets](https://docs.djangoproject.com/en/3.2/ref/models/querysets/#values):
> Because ManyToManyField attributes and reverse relations can have multiple related rows, including these can have a multiplier effect on the size of your result set. This will be especially pronounced if you include multiple such fields in your values() query, in which case all possible combinations will be returned.
So in practice, on `gracedb-dev1` for a test InheritedLog:
```
In [23]: il
Out[23]: <InheritedLog: G414269 -> S230213b>
In [24]: il.source_event_log.tags.all()
Out[24]: <QuerySet [<Tag: Sky Localization>]>
In [25]: il.superevent_tags.all()
Out[25]: <QuerySet [<Tag: Public>]>
In [26]: combined_set = il.source_event_log.tags.all() | il.superevent_tags.all()
In [27]: combined_set.count()
Out[27]: 616735
In [28]: %time combined_set.count()
CPU times: user 0 ns, sys: 2.11 ms, total: 2.11 ms
Wall time: 1.35 s
Out[28]: 616735
```
By combining the querysets, the database is constructing a set of all the possible combinations of those tags, which is 600,000+ on dev1's small set of events and superevents, and it still takes over a second of wall time to count or construct a `.distinct()` set. I suspect on playground's massive database, it would absolutely destroy querying and rendering the superevent page.
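The multiplier effect can be reproduced on a toy schema with the standard-library `sqlite3` module. This assumes that OR-ing the two querysets becomes a pair of `LEFT OUTER JOIN`s with an `OR` in the `WHERE` clause, which has not been checked against the SQL Django actually generates here. The table names, column names, and IDs are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE eventlog_tags (log_id INTEGER, tag_id INTEGER);
CREATE TABLE inheritedlog_tags (il_id INTEGER, tag_id INTEGER);
INSERT INTO tag VALUES (1, 'Sky Localization'), (2, 'Public');
-- Other logs use the same tags, as they would in a real database.
INSERT INTO eventlog_tags VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO inheritedlog_tags VALUES (20, 2), (21, 2), (22, 1), (23, 1);
""")

# Assumed shape of the combined query: both many-to-many tables are joined,
# so each tag row is repeated once per pairing of rows from the two tables.
JOINED = """
FROM tag
LEFT OUTER JOIN eventlog_tags e ON e.tag_id = tag.id
LEFT OUTER JOIN inheritedlog_tags i ON i.tag_id = tag.id
WHERE e.log_id = 10 OR i.il_id = 20
"""

joined_count = conn.execute("SELECT COUNT(*) " + JOINED).fetchone()[0]
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT tag.id) " + JOINED
).fetchone()[0]
```

Even with two tags and seven link rows, the joined count (3) exceeds the number of distinct tags (2). The gap grows with the number of other logs sharing those tags.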
I also tried the `*.union()` method to combine the tag sets, which is nearly instantaneous, but it [kills](https://stackoverflow.com/questions/50638442/django-queryset-union-return-broken-queryset-filter-and-get-return-every) the ability to `*.filter()`, or `*.get()` tags in the set... so that's a dealbreaker for querying and rendering the view.
So, I'm going to give up on adding new tags to InheritedLogs from the superevent (`InheritedLog.superevent_tags`) and ONLY have it inherit EventLog tags. We can revisit this if it becomes a dealbreaker, but it at least [looks like](https://gracedb-playground.ligo.org/events/G890530/view/) GWCelery is adding the `public` tag to the EventLog anyway, so it might all just work out.
![Screen_Shot_2023-02-14_at_10.58.03_AM](/uploads/42beee3be227fc8262ab5db942004265/Screen_Shot_2023-02-14_at_10.58.03_AM.png)
The `public` tag doesn't do anything on a g-event page, but I... think.... it might just work for exposing a superevent inherited log to the public.Critical Path O4 Developmenthttps://git.ligo.org/computing/gracedb/server/-/issues/255Add additional pipelines/searches for external events2024-03-10T16:42:58ZBrandon PiotrzkowskiAdd additional pipelines/searches for external eventsThere are a couple new types of external events that could be potentially ingested by gwcelery in O4, requiring the following tasks in advance:
- [ ] Add pipeline=`KamLAND` as part of https://git.ligo.org/emfollow/gwcelery/-/issues/72 (...There are a couple new types of external events that could be potentially ingested by gwcelery in O4, requiring the following tasks in advance:
- [ ] Add pipeline=`KamLAND` as part of https://git.ligo.org/emfollow/gwcelery/-/issues/72 (should use a new field search=`PreSN` since these should be treated distinctly from `Supernova`)
- [x] Add pipeline=`CHIME` and search=`FRB` (fast radio burst) as part of https://git.ligo.org/emfollow/gwcelery/-/issues/519, needed as well by the GRB/FRB/Magnetar group
- [x] Add pipeline=`SVOM` as part of https://git.ligo.org/emfollow/gwcelery/-/issues/539, which should use the already existing `search='GRB'`
- [x] Add pipeline=`IceCube` as part of https://git.ligo.org/emfollow/gwcelery/-/issues/750 and would need to add a new field such as `search='HEN'`
At the moment I don't have templates for ingesting any of these alert types, so that will likely have to be solved in a separate issue (in gwcelery temporarily and GraceDB in the long run).
**Edit, Feb 20, 2024**
@brandon.piotrzkowski @andrew.toivonen @michael-coughlin I'd like to get a little more organized in handling these issues since there are a couple of requests floating around on this one ticket. Could you please complete the table below with the information that is missing. In particular, what is the input file upload format for each of these new pipelines, are modifications to existing upload formats needed, and then provide a link to a sample file (even if it means uploading it manually to this issue).
Also, could you work amongst yourselves to determine what the priorities are for these (1, 2, 3, etc.)? This should be based on your assessment of each of the new pipelines' technical readiness, i.e., the file format is settled and gwcelery is ready to test but you're just waiting on GraceDB changes.
| Is Complete? | Pipeline Name | Search(es) Name | Input File Format | Changes needed? | Link to file | Priority |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| :x: | `KamLAND` | `PreSN` | unknown | unknown | unknown | 3 |
| :white_check_mark: | `CHIME` | `FRB` | `VOEvent` | no | n/a | n/a |
| :white_check_mark: | `SVOM` | `GRB` | `JSON` for now (can convert to `VOEvent` in `gwcelery` until they send in that format) | Ingestion via GraceDB (`VOEvent` for now but eventually `JSON`) | [`VOEvent` available](https://git.ligo.org/emfollow/gwcelery/uploads/786f205addb0045b18c96e8843f3af6f/sb23041100_eclairs-wakeup_2.xml) but no `JSON` yet | 2 |
| :x: | `IceCube` | `HEN` | `VOEvent` | Ingestion via GraceDB | [`Bronze`](https://git.ligo.org/emfollow/gwcelery/-/blob/0bcd7e638d220629d504dc66c7d5fd43cdd34d1c/gwcelery/tests/data/icecube_bronze_neutrino.xml) [`Gold`](https://git.ligo.org/emfollow/gwcelery/-/blob/0bcd7e638d220629d504dc66c7d5fd43cdd34d1c/gwcelery/tests/data/icecube_gold_neutrino.xml) | 1 |O4bhttps://git.ligo.org/computing/gracedb/server/-/issues/256Plan for archiving MDC data at CIT2023-04-13T12:12:08ZAlexander PacePlan for archiving MDC data at CIT**Context:**
Before the MDCs started, it was a policy on `gracedb-playground` to remove events and the associated data after 21 days. After some pushback from the low-latency chairs, that operation was suspended, and the whole of the MDC remains to be archived on `gracedb-playground`, in the cloud. The costs of storage in Amazon EFS notwithstanding, this has been a useful exercise from a GraceDB development and optimization standpoint: having multiple users and pipelines interact with a database that is stuffed with test events (there are approximately 3x more events and superevents in `gracedb-playground` than in the production system) has been invaluable to identify and fix some fundamental low-level performance bottlenecks (see: https://git.ligo.org/computing/gracedb/server/-/issues/249, https://git.ligo.org/computing/gracedb/server/-/merge_requests/95, https://git.ligo.org/computing/gracedb/server/-/merge_requests/96).
That being said, in the past two weeks, I have received three private communications over email and mattermost (@roberto.depietri, @shaon.ghosh, @geoffrey.mo, @gaurav.waratkar) regarding bulk-data transfers of MDC data from AWS to CIT. In debugging and optimizing low-latency operations over the past months, I have observed other periods of increased download and query activity as well, where users are moving large numbers of files (O(1,000)-O(10,000)) from AWS to various user accounts and headnodes at CIT. These periods of activity correlate with the beginning of new rounds of MDC, as I suspect users are analyzing data from the previous round.
There hasn't been a clear definition of what constitutes "fair use" of resources; GraceDB is sort-of just there for the collaboration to use so no individual user is at "fault" in this situation. That being said, these ad hoc data transfers do affect the performance of low-latency operations, and result in redundant storage and network traffic at CIT.
**Action Required:**
I am requesting that the low-latency chairs who initially requested that MDC data be retained (again, a worthwhile effort) coordinate with the admins at CIT for a permanent and organized transfer and archive of MDC data from AWS to CIT. This would involve (and I'm thinking off the top of my head):
1) deciding on a namespace on where to store the data (other than random users' home directories)
2) deciding on a system and folder hierarchy (GraceDB uses its own system which is obtuse to someone not using the database)
3) communicating to users in the various working groups that the MDC data is locally available on the LDG, to use instead of making tens of thousands of requests over the internet
When it comes time to do the actual transfer, I can coordinate with the CIT admins to open up a security group to directly mount the EFS partition at CIT for a bulk rsync, if need be. There might be a better idea, I dunno.
@roberto.depietri, @shaon.ghosh: as we move into O4 low-latency operations, please coordinate with @stuart.anderson and @philippe.grassia to get MDC data out of the cloud and onto an LDG resource. If anyone tagged on this ticket has other proposals, please chime in.O4 Prephttps://git.ligo.org/computing/gracedb/server/-/issues/261Addition of Search tag for event uploads from low-latency sub-solar mass sear...2023-03-15T16:13:15ZDivya SinghAddition of Search tag for event uploads from low-latency sub-solar mass searches## Description of feature request
<!--
Describe your feature request!
Is it a web interface change? Some underlying feature? An API resource?
The more detail you can provide, the better.
-->
GstLAL and MBTA will run low-latency sub-solar mass searches in O4, which require a new search tag to differentiate events uploaded by these searches from the full-bandwidth events, i.e. `AllSky`. We propose a new tag, `Search: SSM`, which has not been used by any pipeline in past searches. Currently, both pipelines are using `Search:LowMass`, e.g. [GstLAL uploads here](https://gracedb-test.ligo.org/search/?query=gstlal+far+%3C+1+created%3A+2023-03-06+12%3A30%3A00+..+2023-03-08+20%3A40%3A00&query_type=E&results_format=S).
## Use cases
<!-- List some specific cases where this feature will be useful -->
- Differentiate between events uploaded from AllSky searches and SSM searches.
- Apply different thresholds on the GWCelery/LL pipelines side to send out alerts based on the search tag alone.
## Benefits
<!-- Describe the benefits of adding this feature -->
- This will allow specifying different alert thresholds in the simplest way.
## Drawbacks
<!--
Are there any drawbacks to adding this feature?
Can you think of any ways in which this will negatively affect the service for any set of users?
-->
## Suggested solutions
<!-- Do you have any ideas for how to implement this feature? -->
We propose adding a new tag `Search: SSM` for uploads from the low-latency sub-solar mass searches on GraceDB.O4 Debugging and Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/262Intermittent connection issues on gracedb-playground2023-05-09T17:51:26ZAlexander PaceIntermittent connection issues on gracedb-playground@rebecca.ewing reported some connection issues for the `gstlalcbc` user. The errors and timestamps are below:
```
“[Errno 111] Connection refused”
Feb 28 22:04 EST
“[Errno 110]”
March 4 20:31 PST
March 2 21:13 PST
March 2 20:53 PST
Mar 2 20:04 PST
Mar 2 19:49 PST
Mar 2 18:42 PST
Mar 2 18:02 PST
Mar 2 16:42 PST
Mar 2 14:44 PST
Mar 2 13:56 PST
Mar 2 12:43 PST
Mar 2 13:56 PST
“HTTPSConnectionPool(host='gracedb-playground.ligo.org', port=443): Read timed out”
Mar 4 20:31 PST
Mar 2 13:56 PST
```O4 Debugging and Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/263Docker instructions2023-03-14T14:33:21ZMichael William CoughlinDocker instructionsJust building the Dockerfile on my M1 Mac:
docker build .
and then starting the container
docker run sha256:fe76c7602760c2ab2b1f7bd6ed59ed5ae4f641130765e
```
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
/bin/chown: cannot access '/app/db_data': No such file or directory
/bin/chmod: cannot access '/app/db_data': No such file or directory
kinit: Unsupported key table format version number while getting initial credentials
Error: Format string '%(ENV_ENABLE_IGWN_OVERSEER)s' for 'program:igwn_alert_overseer.autostart' contains names ('ENV_ENABLE_IGWN_OVERSEER') which cannot be expanded. Available names: ENV_DEBIAN_FRONTEND, ENV_ENABLE_OVERSEER, ENV_ENABLE_SHIBD, ENV_HOME, ENV_HOSTNAME, ENV_LC_CTYPE, ENV_LVALERT_OVERSEER_RESOURCE, ENV_PATH, ENV_PWD, ENV_PYTHONPATH, ENV_SHLVL, ENV_VIRTUAL_ENV, ENV_XDG_CACHE_HOME, group_name, here, host_node_name, program_name in section 'program:igwn_alert_overseer' (file: '/etc/supervisor/conf.d/igwn-overseer.conf')
For help, use /usr/local/bin/supervisord -h
```
I would guess the issue is the M1, which is fine, but I think a very short quickstart would be useful.BacklogMichael William CoughlinMichael William Coughlinhttps://git.ligo.org/computing/gracedb/server/-/issues/265query results depend on order of inputs2023-03-15T20:48:51ZRebecca Ewingquery results depend on order of inputs## Description of problem
<!--
Describe in detail what you are trying to do and what the result is.
Exact timestamps, error tracebacks, and screenshots (if applicable) are very helpful.
-->
I want to make fairly complex queries to GraceDB: for example, search for events with certain single-inspiral attributes while also specifying the pipeline, search, creation time, and FAR. The following query should return all gstlal AllSky injection uploads from MDC11 below a FAR threshold:
```
si.channel = "GDS-CALIB_STRAIN_O3Replay" | si.channel = "Hrec_hoft_16384Hz_O3Replay" pipeline: gstlal created: 2023-02-17 00:00:00 .. 2023-03-28 00:00:00 far < 1e-8 search: AllSky
```
But this query returns seemingly any gstlal upload regardless of FAR and single inspiral attributes. If I change the order, then it works as expected:
```
far <= 1e-8 & (si.channel = "GDS-CALIB_STRAIN_O3Replay" | si.channel = "Hrec_hoft_16384Hz_O3Replay") pipeline: gstlal search: AllSky created: 2023-02-17 00:00:00 .. 2023-03-28 00:00:00
```
## Expected behavior
<!-- What do you expect to happen instead? -->
I would expect these queries to be order-independent. And when multiple inputs are given (i.e. FAR, pipeline, single-inspiral attributes), I would expect them all to be implicitly joined by "AND" instead of "OR".
## Steps to reproduce
<!-- Step-by-step procedure for reproducing the issue -->
Query with "wrong order": [here](https://gracedb-playground.ligo.org/search/?query=si.channel+%3D+%22GDS-CALIB_STRAIN_O3Replay%22+%7C+si.channel+%3D+%22Hrec_hoft_16384Hz_O3Replay%22+pipeline%3A+gstlal+created%3A+2023-02-17+00%3A00%3A00+..+2023-03-28+00%3A00%3A00+far+%3C+1e-8+search%3A+AllSky&query_type=E&results_format=S)
Query with "right order": [here](https://gracedb-playground.ligo.org/search/?query=far+%3C%3D+1e-8+%26+%28si.channel+%3D+%22GDS-CALIB_STRAIN_O3Replay%22+%7C+si.channel+%3D+%22Hrec_hoft_16384Hz_O3Replay%22%29++pipeline%3A+gstlal+search%3A+AllSky+created%3A+2023-02-17+00%3A00%3A00+..+2023-03-28+00%3A00%3A00+&query_type=E&results_format=S)
## Context/environment
<!--
Describe the environment you are working in:
* If using the ligo-gracedb client package, which version?
* Your operating system
* Your browser (web interface issues only)
* If you are experiencing this problem while working on a LIGO or Virgo computing cluster, which cluster are you using?
-->
## Suggested solutions
<!-- Any ideas for how to resolve this problem? -->
Even if it's not possible/easy to make the queries more flexible in terms of order of options, it would be nice if the "rules" were documented so that users can look up how to write queries to get the expected results.O4 Debugging and ImprovementsDaniel WysockiDaniel Wysockihttps://git.ligo.org/computing/gracedb/server/-/issues/266Requests for reports page2023-04-04T07:57:03ZAlexander PaceRequests for reports pageA while back I set up the [beta reports page](https://gracedb-playground.ligo.org/reports/) on gracedb-playground that just showed the upload latency for all g-events uploaded to GraceDB over the last seven days. I had it up and running and had thrown it out in various telecons, but never received feedback from it. I saw @rebecca.ewing using it during this morning's gstlal review call! I'm going to use this ticket to solicit requests for new features. I'm thinking:
- [ ] Variable date range (specify start and end times to make the query)
- [ ] Filter Early Warning events on/off or on a separate plot
- [ ] For superevents, plot the time when the `GCN_PRELIM_SENT` label was applied, less the `t_0` of the superevent. Come up with a name for this parameter? `time_to_alert`?
- [ ] Plot `time_to_alert` vs number of g-events for a superevent? Something like mean and stddev.O4 Debugging and ImprovementsAlexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/271Detchar group for HardwareInjections2023-12-01T17:04:49ZSiddharth SoniDetchar group for HardwareInjectionsAlex pointed out that Detchar is not one of the groups for HardwareInjection pipeline. The currently available groups are `'CBC', 'Stochastic', 'Burst', 'Coherent', 'Test', 'External'`. I would like to request a `Detchar` group for the Detchar safety Injections in ER15 and O4.https://git.ligo.org/computing/gracedb/server/-/issues/274Out of range float values are not JSON compliant2023-04-27T08:53:41ZAlexander PaceOut of range float values are not JSON compliantYesterday (April 5) and today (April 6) I got notified about json encoding errors for a couple of pycbc test events:
```
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/django/core/handlers/exception.py", line 47, in inner
response = get_response(request)
File "/usr/local/lib/python3.7/dist-packages/django/core/handlers/base.py", line 204, in _get_response
response = response.render()
File "/usr/local/lib/python3.7/dist-packages/django/template/response.py", line 105, in render
self.content = self.rendered_content
File "/usr/local/lib/python3.7/dist-packages/rest_framework/response.py", line 70, in rendered_content
ret = renderer.render(self.data, accepted_media_type, context)
File "/usr/local/lib/python3.7/dist-packages/rest_framework/renderers.py", line 103, in render
allow_nan=not self.strict, separators=separators
File "/usr/local/lib/python3.7/dist-packages/rest_framework/utils/json.py", line 25, in dumps
return json.dumps(*args, **kwargs)
File "/usr/lib/python3.7/json/__init__.py", line 238, in dumps
**kw).encode(obj)
File "/usr/lib/python3.7/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python3.7/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
Exception Type: ValueError at /api/events/
Exception Value: Out of range float values are not JSON compliant
Request information:
USER: prasia.p@ligo.org
GET: No GET data
POST:
search = 'AllSky'
pipeline = 'pycbc'
eventFile = 'text/xml'
group = 'Test'
offline = 'True'
```
An example of the event is here: https://gracedb-playground.ligo.org/events/T979324/view/
There are a couple of `nan`s in the coinc upload that are causing problems with json serialization.
I think gracedb is robust enough to catch the `nan` and then store it in the database, but when the event gets serialized to json for alerts and http responses, the error pops up. For example, if one were to look at the `api` view for that event, the user would get the error instead of a json serialization (plz don't).
Also it would cause errors in nagios/dashboard, because it tries to pull the json packet of the latest event, but if one of these is the latest event, then it gets an error instead.
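The failure mode can be reproduced with the standard library alone. The sketch below is not GraceDB code: `sanitize_floats` is a hypothetical helper, the event payload is made up, and mapping non-finite values to `null` is an assumed policy rather than a decision recorded in this ticket. It shows that `json.dumps` with `allow_nan=False` (which the traceback suggests the renderer passes in strict mode) raises this `ValueError`, and that replacing NaN/inf before encoding avoids it.

```python
import json
import math


def sanitize_floats(obj):
    """Recursively replace NaN and +/-inf with None so strict JSON encoding succeeds.

    Hypothetical helper for illustration; not part of GraceDB or django-rest-framework.
    """
    if isinstance(obj, float) and not math.isfinite(obj):
        return None
    if isinstance(obj, dict):
        return {key: sanitize_floats(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [sanitize_floats(value) for value in obj]
    return obj


# Made-up payload standing in for a serialized event that contains a NaN.
event = {"graceid": "T000000", "far": 1e-9, "extra_attributes": {"snr": float("nan")}}

error_message = None
try:
    json.dumps(event, allow_nan=False)  # strict encoding rejects NaN
except ValueError as err:
    error_message = str(err)

# After sanitizing, the same strict encoder succeeds and NaN becomes null.
encoded = json.dumps(sanitize_floats(event), allow_nan=False)
```

Where such a step would live (at ingestion, in the serializer, or in the renderer) is a separate design question; doing it at serialization time would leave the stored values untouched.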
@prasia.p @tito-cantonO4 Debugging and Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/278Ability to upload multiple files when creating an event2023-04-12T14:01:11ZTito Dal CantonAbility to upload multiple files when creating an event## Description of feature request
It would be useful to have the ability to upload multiple files when creating a G event.
## Use cases
CBC searches currently upload a LIGOLW XML file at event creation, followed by two JSON files containing the EM-bright and p_astro information. At least some of the searches also upload a few other files, for example diagnostic plots.
## Benefits
The first benefit is convenience: GraceDB products are uploaded with a smaller number of REST/API calls, ideally just one, making the code simpler.
I suspect there would be two other benefits, though I do not know enough about the server code to judge if they are realistic or not:
* Robustness: reducing the number of HTTP requests might reduce the probability of a failure (e.g. due to a network glitch) and make the event creation more "atomic", in the sense of guaranteeing that if an event is created, it will have all the necessary files.
* Latency: currently each file upload adds order 1 s of latency, and occasionally much more. Transferring everything in a single request might help with that.
## Drawbacks
Apart from the obvious implementation burden, I cannot see any at the moment.
I suppose an alternative to an API change would be to design a file format which could actually communicate all the search information in a single file. There has been discussion in the past about storing the p_astro information in the LIGOLW file, for example, though that idea appears to have been shelved. Given how complicated it is to change established file formats, though, I think this feature request is still reasonable.
## Suggested solutions
I forget at the moment if HTTP requests support multiple files. If so, the feature seems easy to implement. Otherwise, one could come up with a simple data structure (e.g. JSON) that encodes the list of (file name, file content) pairs, and upload that as a file, though that may require some post-processing to "expand" the JSON back into the list of individual files on the server side.O4 Debugging and Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/279Partially uploaded MDC events2023-07-20T15:51:46ZTito Dal CantonPartially uploaded MDC events@max.trevor reports a "partially uploaded" PyCBC event from the MDC:
https://gracedb-playground.ligo.org/events/G967313/view/
This is a PyCBC Live full-bandwidth event resulting from the SNR optimization of the initial discovery event (G967295). If we expand the full event log, we can see that the search uploaded the initial LIGOLW data, and nothing else. The event is missing the EM-bright and p_astro JSON files, the diagnostic plots, and various comments that explain the event in a human-friendly way. Compare for example with the full event log of
https://gracedb-playground.ligo.org/events/G998023/view/
Unfortunately, we discovered G967313 too late, and the error log has since been rotated away, so we cannot diagnose exactly what happened on the search side.https://git.ligo.org/computing/gracedb/server/-/issues/283Alert notification form silently fails if label query invalid2023-04-20T22:22:08ZDaniel WysockiAlert notification form silently fails if label query invalid## Description of problem
<!--
Describe in detail what you are trying to do and what the result is.
Exact timestamps, error tracebacks, and screenshots (if applicable) are very helpful.
-->
When entering a `Label query` in the Notification create/edit forms, the validation step works correctly in stopping you from entering an invalid query. However, it does not provide any message about what validation failed, or even that it failed; the page just flickers for a second while it reloads.
## Expected behavior
<!-- What do you expect to happen instead? -->
There should be a message indicating why it failed to validate, e.g., `Invalid label query`. Ideally we'd also link to the [docs on creating label queries](https://gracedb.ligo.org/documentation/notifications.html#creating-a-notification)
## Steps to reproduce
<!-- Step-by-step procedure for reproducing the issue -->
- Go to https://gracedb.ligo.org/alerts/notification/create/
- Fill out the description, select a contact, and then enter anything invalid in `Label query` (e.g., `f00b@r`)
- Click submit
## Context/environment
<!--
Describe the environment you are working in:
* If using the ligo-gracedb client package, which version?
* Your operating system
* Your browser (web interface issues only)
* If you are experiencing this problem while working on a LIGO or Virgo computing cluster, which cluster are you using?
-->
* OS: Arch Linux
* Browser: Firefox 112.0.1 (64-bit)
* Note: this was tested on gracedb-dev.ligo.org
## Suggested solutions
<!-- Any ideas for how to resolve this problem? -->
We already seem to have validation messages for the `Contacts` field on that page. We should just do whatever we did for that.https://git.ligo.org/computing/gracedb/server/-/issues/286Creating Test External Events on GraceDB-Playground2023-07-28T14:05:10ZRyan FisherCreating Test External Events on GraceDB-PlaygroundI would like to either create test External GRB events on GraceDB-Playground or set up a system where I am able to ask for some to be created, if needed.
The need for these events is that the search I am running requires External short GRB events to appear in the database, such that they overlap with observing mode data. This triggers the medium latency PyGRB search to run. I would replicate previous GRB events, with updated event times such that the events overlap with the observing mode data.
I am not attempting to submit new GW events. Group would be External, Pipeline would be Fermi. Search would not be applicable, etc. It would be exactly like the External event here: https://gracedb-playground.ligo.org/documentation/models.html
I would like to learn how to submit these events (just a pointer to the correct instructions would be fine) and how to get authorization (and authentication) to do so.
If there is already a guarantee that new short GRB events will appear in GraceDB-Playground overlapping with ER15 at a rate of at least 1 per day, then this request can be closed.
Thank you!Alexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/289Proposal to hide exposed hourly MDC superevents on production2023-05-23T21:06:54ZAlexander PaceProposal to hide exposed hourly MDC superevents on production**Description:** Moving into O4, I've been monitoring the load on the production database, and I noticed that the highest load on the database (more than two orders of magnitude more CPU usage than other requests) occurs under a very specific circumstance: when an _unauthenticated_ user makes a request to view _public_ data products. For example, when a member of the public views a public superevent page, or a script scrapes for public skymaps, etc.
I traced this down to the SQL that's generated by a `django-guardian` function called `get_objects_for_user`. There has to be an underlying bug with GraceDB's public `viewexposed` permission, but I haven't been able to find it yet.
That being said, there are a couple of [stackoverflow](https://stackoverflow.com/a/19444128) posts and github issues about this function and this statement is accurate to me:
> Also, if possible, i suggest you don't use get_objects_for_user shortcut when project gets bigger. Its VERY slow query once you get more objects/permissions in the database.
:arrow_up: that seems consistent with some [testing](https://git.ligo.org/computing/gracedb/server/-/issues/249#note_689232) that i've seen this week.
So why wasn't this an issue before? At the end of O3, there were 80 exposed (public) superevents. That's a trivial number of items from a database standpoint. But in the three years since O3 ended, the hourly first-two-years MDC uploads have been exposed to the public. Multiply 24 daily superevents by three years and all of a sudden....
```
In [11]: Superevent.objects.filter(is_exposed=True).filter(category='M').count()
Out[11]: 35354
```
There are over 35,000 exposed superevents, and the number is growing by the hour.
A quick test can be to open this file list: https://gracedb.ligo.org/superevents/S200316bj/files/
as an authenticated user (243ms):
![Screen_Shot_2023-05-03_at_11.37.54_AM](/uploads/538d6706b93b2e0b10593b03e72b5c0d/Screen_Shot_2023-05-03_at_11.37.54_AM.png)
and in incog (13.5s :sob:):
![Screen_Shot_2023-05-03_at_11.39.53_AM](/uploads/9d0d798392f395b6467e2a88670b50a9/Screen_Shot_2023-05-03_at_11.39.53_AM.png)
**Proposal:**
1) Unless there are objections, I'm going to hide exposed MDC uploads and see the performance impact.
2) If it works, then I'm going to set up a tool to hide all (or a subset..?) of MDC superevents (which is a bandaid)
3) Figure out what's wrong with the permissions, because finding the bug might have other wider-ranging performance implications
4) Unless there is the desire to have the test uploads public, then modify GWCelery not to expose the test uploads. We can revisit this request based on the results of 1-3.Critical Path O4 DevelopmentAlexander PaceAlexander Pacehttps://git.ligo.org/computing/gracedb/server/-/issues/291VersionedFile symlink inconsistencyWhen a user uploads multiple files that have the same filename within an exceeeeedingly small time window, there's a chance that the [block of code](https://git.ligo.org/computing/gracedb/server/-/blob/d32071c941c905a13f043dbec16fa41d0fd9bfb4/gracedb/core/vfile.py#L102-110) that creates a symlinked version file can hit a race condition.
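For reference while investigating: one common way to avoid a window in which a symlink is missing or half-updated is to create the new link under a temporary name and rename it over the old one, since `os.replace` is atomic on POSIX filesystems. This is a generic sketch and makes no claim about what the linked `vfile.py` block actually does; `atomic_symlink` and the file names are made up for illustration.

```python
import os
import tempfile
import uuid


def atomic_symlink(target, link_path):
    """Point link_path at target without a moment where link_path does not exist."""
    link_dir = os.path.dirname(os.path.abspath(link_path))
    # Create the new link under a unique temporary name in the same directory,
    # so the final rename stays on one filesystem.
    tmp_path = os.path.join(link_dir, f".{os.path.basename(link_path)}.{uuid.uuid4().hex}.tmp")
    os.symlink(target, tmp_path)
    try:
        os.replace(tmp_path, link_path)  # atomic rename on POSIX
    except OSError:
        os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    # Two versions of the same (hypothetical) file, then repoint the link twice.
    for version in (0, 1):
        with open(os.path.join(workdir, f"circular.txt,{version}"), "w") as handle:
            handle.write(f"version {version}\n")
    link = os.path.join(workdir, "circular.txt")
    atomic_symlink("circular.txt,0", link)
    atomic_symlink("circular.txt,1", link)
    final_target = os.readlink(link)
    with open(link) as handle:
        final_content = handle.read()
```

Note that this only makes each individual link update atomic. If two uploads race, it does not by itself decide which version should win; that ordering would still need a lock or a database-level sequence.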
This happens pretty rarely, but whenever it does, it's always from gwcelery uploading multiple circular templates, which is a [known](https://git.ligo.org/emfollow/gwcelery/-/issues/480) [bug](https://git.ligo.org/emfollow/gwcelery/-/issues/616) that's being addressed.
That being said, examining the files in question in this superevent [S230504an](https://gracedb-playground.ligo.org/superevents/S230504an/view/):
![Screen_Shot_2023-05-04_at_3.31.12_PM](/uploads/50bfe26f2ba327d66f5969e70f0b4d38/Screen_Shot_2023-05-04_at_3.31.12_PM.png)
The file versioning seems to have worked like it should have? And the symlink seems to be pointing at the right file? But honestly it's difficult to tell when there are so many duplicates of the same file. So I don't know if the Error that Brian Moe raised in that routine is correct.... or if there was a brief moment in that superevent's timeline when the symlink was inconsistent with the intended file, or if that broken symlink was fixed the next time a new file came in, or if it's still broken and just pointing to the wrong file (which happens to be the same?).
Given that, and that it only occurs during the gwcelery bug that's going to get fixed, I'm kind of afraid to touch it without knowing what's really going on and having a good way to test it.O4 Debugging and Improvementshttps://git.ligo.org/computing/gracedb/server/-/issues/293Allow an easy deployment on k8s infrastructure (gracedb-test01.igwn.org/minik...2023-05-16T12:45:53ZRoberto DePietriAllow an easy deployment on k8s infrastructure (gracedb-test01.igwn.org/minikube)To follow the advice of the LLAI reviewer, we should allow easy deployment of gracedb server code on k8s using the created docker container.
- Brainstorming on LLAI tiers and local development. [DCC](https://dcc.ligo.org/LIGO-G2300724)
- Telecon technical call [2023 05 01](https://git.ligo.org/emfollow/gwcelery/-/wikis/telecons/2023-05-01)
- Standalone GraceDB test instance with Minikube [dcc](https://dcc.ligo.org/LIGO-G2201921)
- old merge request [link](https://git.ligo.org/computing/gracedb/server/-/merge_requests/61)
-- To be completed with the requirements ----
Associated merge requests:
1. https://git.ligo.org/computing/gracedb/server/-/merge_requests/130
1. https://git.ligo.org/computing/igwn-alert/overseer/-/merge_requests/3https://git.ligo.org/computing/gracedb/server/-/issues/297Support other file formats (other than XML VOEvent) to ingest external events...2024-03-15T17:08:38ZBrandon PiotrzkowskiSupport other file formats (other than XML VOEvent) to ingest external events fromCurrently we can only ingest VOEvent XML files to create external events via `gracedb.create_event`. There is already a need to ingest events delivered via Kafka that have a `.json` format, for which we are currently creating a workaround by converting to a VOEvent packet here (note that this code has not been merged yet and is subject to change):
https://git.ligo.org/emfollow/gwcelery/-/blob/dfdd84a97dec60257c4d7bd91d6c0c9442ec3de6/gwcelery/tasks/external_triggers.py#L561-626
Example alert:
https://git.ligo.org/emfollow/gwcelery/-/blob/a35b3ba998ab4726f90d5fb3cdf87d365cccbc65/gwcelery/tests/data/kafka_alert_fermi.json
In general, we should build a more flexible system for ingesting external events as GCN moves towards Kafka, one that can take on new notice types/formats as needed (e.g. KamLAND notices also have a different format).
I assume we need to write additional parser functions such as [`populateGrbEventFromVOEventFile`](https://git.ligo.org/computing/gracedb/server/-/blob/79c9b1ead0086fc4789a32c597b84c7abaee9513/gracedb/events/translator.py#L646), add options to use them, and determine which schema is being used [here](https://git.ligo.org/computing/gracedb/server/-/blob/79c9b1ead0086fc4789a32c597b84c7abaee9513/gracedb/events/translator.py#L352).
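As a rough illustration of the "determine which schema is being used" step, here is a minimal sketch of a parser registry that dispatches on the detected format. This is not the existing `translator.py` code: `PARSERS`, `register_parser`, `detect_format`, `parse_external_event`, and the `"id"` JSON key are all hypothetical names chosen for the example, and real notices would need proper field extraction.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical registry: format name -> parser function.
PARSERS = {}


def register_parser(fmt):
    """Decorator that registers a parser under a format name."""
    def decorator(func):
        PARSERS[fmt] = func
        return func
    return decorator


def detect_format(payload: bytes) -> str:
    """Guess the format from the first non-whitespace byte (an assumption)."""
    stripped = payload.lstrip()
    if stripped.startswith(b"<"):
        return "voevent"
    if stripped.startswith(b"{"):
        return "json"
    raise ValueError("unrecognized external event format")


@register_parser("voevent")
def parse_voevent(payload: bytes) -> dict:
    root = ET.fromstring(payload)
    # Only the root attribute is read here; a real parser would walk the packet.
    return {"format": "voevent", "ivorn": root.attrib.get("ivorn")}


@register_parser("json")
def parse_json(payload: bytes) -> dict:
    data = json.loads(payload)
    # Field names differ between notice types; "id" is a placeholder key.
    return {"format": "json", "id": data.get("id")}


def parse_external_event(payload: bytes) -> dict:
    """Pick the parser matching the payload's format and run it."""
    return PARSERS[detect_format(payload)](payload)
```

With this shape, supporting a new notice format would mean adding one detection rule and one registered parser, without touching the existing VOEvent path.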
I can personally help with this development if needed, especially after the start of O4.
https://git.ligo.org/computing/gracedb/server/-/issues/301
What is the plan to anticipate and mitigate future problems like those seen with S230518h (DB being overwhelmed by internal LVK or external public page loads)?
2023-06-07T14:18:08Z
Peter Couvares
First, thanks @alexander.pace for the quick id and fix of this problem. (If there is a postmortem, or relevant tickets, or even a LIGO Chat URL of the debugging to link to, please add them to this ticket for context.)
**What is the plan to anticipate and mitigate future problems like those seen with S230518h (GraceDB being overwhelmed by internal LVK or external public page loads)?**
This plan should probably include:
1. Load specification & testing
- Define a spec / requirements for how many simultaneous public & private requests (and of what sort) a production GraceDB instance should be able to respond to with a certain latency.
- As part of the CI process, perform synthetic load testing w/simulated public & private users up to the spec and ensure it passes.
- Outside of CI, perform synthetic load testing w/simulated public & private users _beyond_ the spec to understand where it fails and why.
- Do some cost/benefit analysis of preemptive fixes to the bottlenecks identified in the beyond-spec load testing, so we can decide whether to fix them now or wait until it's known to be necessary.
2. Document an emergency procedure for an overwhelmed DB that someone other than Alex can execute.
- How to identify when the DB is overwhelmed (vs. other problems – AWS down, network down, auth down, etc.)
- How to temporarily turn off public access (and turn it back on again, and how/when to decide)
- Can/should we give privileged access to certain MM partners in such an emergency, so they can follow up?
- Anything else that can temporarily speed things up in a pinch.
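The load-testing item above could start from something like the following sketch, which fires concurrent requests and checks a latency percentile against a limit. Everything here is an assumption for illustration: the function names are made up, `request_fn` stands in for whatever public or private request is being simulated, and the percentile and threshold would come from the spec once it is defined.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(request_fn):
    """Run one simulated request and return its latency in seconds."""
    start = time.perf_counter()
    request_fn()
    return time.perf_counter() - start


def run_load_test(request_fn, n_requests, concurrency):
    """Issue n_requests calls using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda _: timed_call(request_fn), range(n_requests)))


def percentile(latencies, pct):
    """Nearest-rank percentile of a non-empty list of latencies."""
    ordered = sorted(latencies)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def meets_spec(latencies, pct, max_seconds):
    """True if the pct-th percentile latency is within the limit."""
    return percentile(latencies, pct) <= max_seconds
```

In CI, `request_fn` would wrap a request to a test instance and the job would fail when `meets_spec` returns False; outside CI, the same loop can be run with increasing `concurrency` to find where latency starts to degrade.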
I would expect this to be part of Phase II of the O4 LLAI review. (If there is a milestone or tag you'd like to use for such LLAI review tasks, please add it to this ticket so we can find them all easily in the future.)
Alexander Pace
https://git.ligo.org/computing/gracedb/server/-/issues/302
investigation of unauthorized (public) queries (get_objects_for_user)
2024-03-29T15:45:32Z
Alexander Pace
It's been established here: https://git.ligo.org/computing/gracedb/server/-/issues/249#note_689232 that unauthorized queries. Context: there's one call coming from `django-guardian` called `get_objects_for_user` that takes in a user, a permission (like "view log"), and a list of objects, and it returns a subset of those objects that a user can actually see. Please see this ticket: https://git.ligo.org/computing/gracedb/server/-/issues/289
I'm going to document the process for making this call faster. I think it's going to be two steps:
1) Mitigation: reducing the number of objects that this function has to filter. Also see the above ticket.
2) Optimization: we very well might be calling this function sub-optimally. So after the first step, see what we might be doing wrong.
Alexander Pace
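To make the mitigation step concrete, here is a small stand-alone sketch of narrowing the candidate set before running a per-object permission check. It does not call `django-guardian`; `LogEntry`, `filter_visible`, `has_perm`, and `prefilter` are hypothetical stand-ins, and the returned check count is only there to show how much work the narrowing saves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical stand-in for an object subject to a permission check."""
    event_id: str
    public: bool


def filter_visible(user, candidates, has_perm, prefilter=None):
    """Return (visible objects, number of permission checks performed).

    `prefilter` is a cheap predicate applied first, so that the expensive
    `has_perm(user, obj)` check only runs on the objects that survive it.
    """
    if prefilter is not None:
        candidates = [obj for obj in candidates if prefilter(obj)]
    checked = 0
    visible = []
    for obj in candidates:
        checked += 1
        if has_perm(user, obj):
            visible.append(obj)
    return visible, checked
```

In the real server the narrowing would presumably happen in the queryset passed to the permission call rather than in Python lists, but the accounting is the same: fewer candidates in, fewer permission lookups.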