KAGRA updated some already-published flags/segments
From Takahiro Yamamoto, in a 2023.08.15 email ("Correction of KAGRA's O4a segments"):
We checked KAGRA’s O4a segments one by one and found some mistakes in the auxiliary segments, so we are now regenerating some of the segments from May 24 - Jun 21, when KAGRA was in observing mode. We expect to finish regenerating them soon.
Do you have any good ideas for updating KAGRA's segment information in DQSegDB? Should we prepare the corrected segments as new files on our server (http://gwdet.icrr.u-tokyo.ac.jp/), or can we overwrite the old, incorrect segment files?
Robert came up with a response, consulted with Ryan, then replied to Takahiro (2023.10.17):
We can definitely find a way to handle these updated segments. How we handle it is partly up to you. A few thoughts:
- As a policy, we never delete segments from DQSegDB, so that is not an option. The first question is whether some published segments were wrong and need to be removed, or whether the published segments are correct but incomplete.
- If there are published known and/or active segments that need to be removed, there are a few options:
  - Publish a new version of the flag (e.g., K1:GRD_UNLOCKED:2), with all of the old segments that are correct republished, the incorrect segments left out, and corrected segments replacing the incorrect ones. All new segments will be published only to version 2 after this, and all analyses will have to query version 2 of the flag.
  - Publish a new version of the flag (e.g., K1:GRD_UNLOCKED:2), but only publish corrected segments for the times when they're needed. All new segments will still be published to version 1. Virgo does something like this (e.g., V1:ITF_LOCKED:{1,2}). All analyses will need to either query both versions of the flag and combine them, or do a cascading query, which takes all versions of the flag and, for each time, uses the highest version available (so version 2 overrides version 1 wherever version 2 exists).
  - Publish an entirely new flag at version 1, either republishing all segments of the old flag under the new flag or publishing only segments from a certain point forward, then stop producing the old flag and produce only the new flag in the future.
- If some time periods simply have additional segments for a flag, so that none of the old segments need to be deleted, there are a few options:
  - Publish a new version of the flag (e.g., K1:GRD_UNLOCKED:2), just like the first (or second) option above.
  - We could consider that "gap filling" for the flag (as discussed here: https://wiki.ligo.org/Computing/DQSegDBSegmentGaps#Avoidable_gaps) and simply publish the new segments to the current version of the flag. New query results will just be the union (total combination) of all published segments, regardless of whether they were published at the time or later on. This would require a little extra work: queries to that flag before and after the "gap filling" publishing would produce different results, so a notification would need to be sent to anyone who might be running analyses, and we would probably want to create a flag to mark times when new segments were published. It would, however, leave the flag versions unchanged.
- In any of those cases, we strongly recommend not updating existing files in existing directories: those files are a permanent record of the source of the published segments, and updates to them would not trigger publishing of the updated files. It would be better to create a new directory for each flag that needs to be updated, and we will run some special tasks to publish those segments, whether they're updated segments for an existing flag and version ("gap filling"), a new version for an existing flag, or an entirely new flag.
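The "gap filling" and cascading-query options above both reduce to simple interval arithmetic. The following is only a minimal sketch in plain Python, not the DQSegDB client API. The function names (`union`, `intersect`, `subtract`, `cascade`) and the dictionary layout are hypothetical. It assumes half-open [start, end) GPS segments, and it assumes that a cascading query lets the highest version win wherever that version has known segments.

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, int]  # half-open [start, end), GPS seconds (assumed convention)


def union(segs: List[Segment]) -> List[Segment]:
    """Merge overlapping or touching segments into a sorted, disjoint list."""
    merged: List[Segment] = []
    for start, end in sorted(s for s in segs if s[1] > s[0]):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Segment], b: List[Segment]) -> List[Segment]:
    """Times covered by both a and b."""
    a, b = union(a), union(b)
    out: List[Segment] = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever segment ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def subtract(a: List[Segment], b: List[Segment]) -> List[Segment]:
    """Times covered by a but not by b."""
    b = union(b)
    out: List[Segment] = []
    for start, end in union(a):
        cur = start
        for bs, be in b:
            if be <= cur:
                continue
            if bs >= end:
                break
            if bs > cur:
                out.append((cur, bs))
            cur = max(cur, be)
        if cur < end:
            out.append((cur, end))
    return out


def cascade(versions: Dict[int, Dict[str, List[Segment]]]) -> Dict[str, List[Segment]]:
    """Combine flag versions, highest version first (assumed cascade semantics)."""
    known: List[Segment] = []
    active: List[Segment] = []
    for v in sorted(versions, reverse=True):
        # Times this version knows about that no higher version has already decided.
        fresh = subtract(versions[v]["known"], known)
        active = union(active + intersect(versions[v]["active"], fresh))
        known = union(known + fresh)
    return {"known": known, "active": active}


# Gap filling: late-published segments are simply unioned into the same version.
print(union([(0, 10)] + [(10, 15), (30, 40)]))  # [(0, 15), (30, 40)]

# Cascade: version 2 only covers [30, 70) and overrides version 1 there.
flag = {
    1: {"known": [(0, 100)], "active": [(10, 20), (40, 60)]},
    2: {"known": [(30, 70)], "active": [(50, 55)]},
}
print(cascade(flag))  # {'known': [(0, 100)], 'active': [(10, 20), (50, 55)]}
```

In the cascade example, the version 1 active segment (40, 60) disappears from the result because version 2 is known over [30, 70) and reports only (50, 55) as active there. This is the kind of correction a partial version 2 would express without deleting anything from version 1.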
and
Second, I noticed that several new directories showed up in the daily rsync transfers on 26 Sept. 2023 (27 Sept. in JST):
- K1-DAQ-IPC_ERROR/2023/
- K1-DET_FRAME_AVAILABLE/2023/
- K1-ETMX_OVERFLOW_OK/2023/
- K1-ETMX_OVERFLOW_VETO/2023/
- K1-ETMY_OVERFLOW_OK/2023/
- K1-ETMY_OVERFLOW_VETO/2023/

in addition to the normal 6 directories:
- K1-GRD_LOCKED/2023/
- K1-GRD_PEM_EARTHQUAKE/2023/
- K1-GRD_SCIENCE_MODE/2023/
- K1-GRD_UNLOCKED/2023/
- K1-OMC_OVERFLOW_OK/2023/
- K1-OMC_OVERFLOW_VETO/2023/

I also noticed that on that day, there were files transferred for 2023.05.24-2023.06.20 + 2023.09.27 in each of the 12 directories. Does that mean that the original files from 2023.05.24-2023.06.20 were replaced by corrected files? Note that the publisher detected more new files than expected on that date (12, instead of 6), so it didn't publish any of the new files. New files have been transferred every day since that day, but no new segments have been published, so we still have time to sort this out.
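The publisher's refusal to publish amounts to a sanity check on each day's transfer. The sketch below is hypothetical and is not the actual publisher code. The function `check_transfer`, the rule of one new file per expected directory, and the example file names are all assumptions made for illustration. Only the directory names come from the email above.

```python
from collections import Counter
from pathlib import PurePosixPath

# The 6 directories normally seen in the daily transfer (from the email above).
EXPECTED_DIRS = {
    "K1-GRD_LOCKED", "K1-GRD_PEM_EARTHQUAKE", "K1-GRD_SCIENCE_MODE",
    "K1-GRD_UNLOCKED", "K1-OMC_OVERFLOW_OK", "K1-OMC_OVERFLOW_VETO",
}


def check_transfer(new_paths, expected_dirs=EXPECTED_DIRS):
    """Return (ok, unexpected_dirs, files_per_dir) for one day's new files.

    Assumed rule: publishing is allowed only if every new file lies in an
    expected directory and the file count equals the number of expected
    directories (one new file per flag per day).
    """
    counts = Counter(PurePosixPath(p).parts[0] for p in new_paths)
    unexpected = sorted(set(counts) - set(expected_dirs))
    ok = not unexpected and len(new_paths) == len(expected_dirs)
    return ok, unexpected, dict(counts)


# Hypothetical file names; only the directory layout follows the email.
normal_day = [f"{d}/2023/segments.xml" for d in sorted(EXPECTED_DIRS)]
odd_day = normal_day + ["K1-DAQ-IPC_ERROR/2023/segments.xml"]

print(check_transfer(normal_day)[0])  # True
print(check_transfer(odd_day)[:2])    # (False, ['K1-DAQ-IPC_ERROR'])
```

Reporting which directories were unexpected, instead of only the mismatched count, would make it quicker to tell a new flag apart from a batch of replaced files.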
There were plans to set up a telecon to discuss the issue, but that hasn't happened yet.