Proposals and comments for public release of data products
Ticket to track discussion regarding O4 release of low-latency data products.
Please refer to the minutes from day-one of the low-latency virtual face-to-face on September 15, 2021 here.
In particular, the way public exposure of data products works currently is:
- Only information on a superevent's page is made public. The includes the web front-end and queries from the API.
- G-event pages are not made public. The result of the F2F discussion on 2021/09/15 seemed to indicate that it should remain that way.
- Superevent properties (such as FAR) are inherited from the superevent's preferred event. For reference: a superevent's preferred event is a one-to-one relationship with an event in the database. See here: (https://git.ligo.org/lscsoft/gracedb/-/blob/master/gracedb/superevents/models.py#L92). As such, when a superevent defines its FAR is just a relationship with its preferred event, for example: https://git.ligo.org/lscsoft/gracedb/-/blob/master/gracedb/superevents/models.py#L431
- Data products, such as skymaps, p_astro and such, are uploaded for a superevent and given a couple of tags, such as
skyloc
, which gives the file its own section on a superevent landing page:
- Adding a
public
tag will, understandably, make that file (log message) public by having the Django template generator filter based on that public tag (example: https://git.ligo.org/lscsoft/gracedb/-/blob/master/gracedb/templates/superevents/preferred_event_info_table_public.html) - Similarly, there is a filter of the
public
tag, as well as an authentication check that takes place that selectively returns information via API calls. - A superevent is a super-set of G-events. This is defined by a ForeignKey database relation (here: https://git.ligo.org/lscsoft/gracedb/-/blob/master/gracedb/events/models.py#L189). This relationship can be created or destroyed by
superevent_manager
s. There's no internal logic deciding this relationship.
Okay. So that's how it works right now. What I propose to guide the discussion is this:
- What information or products do you want from what events to be made public? Are these event properties, such as FAR and SNR, are there skymaps or other data products?
- Are these G-event properties coming from G-events are are not already part of a superevent's event relationship? In other words, outside events that are not part of a superevent's event list? If the superevent and event are already linked, then from a technical standpoint, the superevent's landing page and API call can return the information to the public. This is a policy limitation, not a technical limitation. I do not dictate policy.
- For data products such as skymaps, omega scans, and whatnot, the files are attached to log message objects in the database. Currently, the log message objects are linked via a one-to-one relationship in the database to either a superevent or a g-event.
Elaborating on the last point: concerning data products and files. What I propose from a technical standpoint would be expanding the log-message (and by extension, file product) relationship to include a many-to-many relationship. What this would mean is, a log message and file would be created for a g-event. That's relationship number one. A superevent manager process would determine if that product should be displayed on the superevent page. A new log message is created for the superevent, and that log message would have a linked relationship with the g-event's log message. The set of tags that are usually applied to a superevent's log message (skyloc
, public
, etc) are applied to the inherited log message, then boom, it's public.
I think this approach is a way to start and guide the discussion. I believe it has the advantage of:
- Allowing superevent managers to selectively choose what information and what products are inherited from what g-events.
- The mechanism for making data products public is unchanged.
- Adding data products to a superevent (not just public products) is reduced to one API call instead of the convoluted data-transfer bonanza that took place in O3. This is analogous to the "server-side copy" that was discussed in O3b but never brought to fruition.
- The existing superevent pages, log messages, public views, and event relationships will remain unchanged. Unless a superevent manager changes them.
I see this as a sound technical approach, but I am open to other suggestions. From a policy standpoint, it is up to @erik-katsavounidis and @shaon.ghosh to determine what information from what events and at what time should be linked to a superevent.