Refactor and improve manual_rsync scripts
The manual_rsync_{GEO,LHO,LLO,VGO}.sh scripts (in https://git.ligo.org/computing/dqsegdb/server/-/tree/master/bin) were originally written as temporary stop-gaps in mid-March 2019. At that time, CIT admins had shut down the dqxml_pull_from_obs service in preparation for changing to a new system (which ended up being pushing files from the IFOs), and no DQXML files were going to be delivered for several days. Without that service or a replacement, GEO and Virgo files weren't going to be delivered at all, so the Virgo script continued to be used through the end of O3 and was restarted for ER15, and the GEO script has been in continuous use since mid-March 2019. In April 2023, it was decided to change from pushing files from the IFOs (LHO and LLO) to pulling files by rsync, so the LHO and LLO scripts have been put back into service. [add link to ticket regarding this change]
The scripts were written to be dependable, but they have several shortcomings. The code is a bit clunky. There is a small chance of missing one or a few files at the end of a DQXML file dir (the probability is very low, but not 0, and another script would catch any such gap and fill it in). New dirs might not be detected for up to 5 minutes after they are created on the remote filesystem. Finally, there are 4 separate scripts, so bug fixes and improvements have to be replicated 4 times.
The scripts should be replaced with a single script that refactors the code for efficiency, handles all 4 IFOs and is easily extensible to more IFOs, and includes some improvements, as listed below. Note that with a single script, it is much easier to make improvements one at a time.
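As an illustration of how a single script could stay extensible, the per-IFO differences might be isolated in one lookup function. This is only a sketch: the function name `ifo_config`, the variable names, the `example.org` hosts and paths, and the 60-second interval are placeholders rather than the real settings. Only the 30-minute GEO cadence reflects something stated in this ticket.

```shell
# Hypothetical per-IFO settings for a single, shared script.
# Hosts, paths, and the 60 s interval are placeholders, not real values.
ifo_config() {
    # Sets REMOTE_SRC and POLL_SECONDS for the IFO named in $1.
    case "$1" in
        GEO) REMOTE_SRC="geo.example.org:/dqxml/"; POLL_SECONDS=1800 ;;  # new files every 30 min
        LHO) REMOTE_SRC="lho.example.org:/dqxml/"; POLL_SECONDS=60 ;;
        LLO) REMOTE_SRC="llo.example.org:/dqxml/"; POLL_SECONDS=60 ;;
        VGO) REMOTE_SRC="vgo.example.org:/dqxml/"; POLL_SECONDS=60 ;;
        *)   echo "unknown IFO: $1" >&2; return 1 ;;
    esac
}
```

With a layout like this, supporting another IFO would mean adding one `case` branch, and the rest of the script would be shared by all IFOs.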
Improvements and bug fixes:
- look for a new DQXML dir whenever an rsync pull finds no new files, rather than counting rsync cycles and checking only every N cycles (N is currently set to 5); GEO is an exception, since it only has new files every 30 min
- after a new dir is detected, pull to the old dir first, then create and pull to the new dir. Currently the script pulls to the current dir, then checks for a new dir, and if one is found, pulls to that dir. That leaves a tiny chance of a new file being saved to the old dir after the rsync, followed by a new dir being created and found by the check, so that the new file in the old dir is never rsync'd by this script. [update: this can't happen, because a bug causes the script to delay for 5 cycles before pulling from the new dir. That delay is a minor problem in itself: the first files in a new dir are not transferred for ~5 minutes. For GEO this caused only a very minor occasional log note, and it hasn't been noticed for Virgo, which only restarted DQXML file production after the start of ER15, in April 2023, but with LHO and LLO using their scripts again, it has become a noticeable issue]
- if an error is encountered, wait for some amount of time (20 sec?) before trying again, rather than retrying immediately; or perhaps retry a few times immediately and only then add delays. The goal is to avoid excess traffic to the remote server, which might be having trouble
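
The three items above could be sketched roughly as follows. This is an assumption-laden outline, not the existing scripts' code: `pull_dir` and `latest_remote_dir` are hypothetical helpers that the real script would have to supply (for example, wrappers around rsync), and `FAST_TRIES`, `MAX_TRIES`, `RETRY_DELAY`, and `CURRENT_DIR` are made-up names.

```shell
# Hypothetical tuning knobs; 20 s is the delay floated in the list above.
FAST_TRIES=${FAST_TRIES:-3}     # attempts made without any delay
MAX_TRIES=${MAX_TRIES:-6}       # total attempts before giving up
RETRY_DELAY=${RETRY_DELAY:-20}  # seconds to wait between later attempts

# Run "$@"; on failure retry a few times immediately, then with delays.
retry_with_delay() {
    attempt=1
    while :; do
        "$@" && return 0
        if [ "$attempt" -ge "$MAX_TRIES" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        if [ "$attempt" -ge "$FAST_TRIES" ]; then
            sleep "$RETRY_DELAY"
        fi
        attempt=$((attempt + 1))
    done
}

# One polling cycle. Assumed helpers, to be supplied by the caller:
#   pull_dir DIR        pulls DIR (creating it locally if needed) and
#                       prints the number of newly transferred files
#   latest_remote_dir   prints the name of the newest remote DQXML dir
poll_once() {
    new_files=$(pull_dir "$CURRENT_DIR") || return 1
    if [ "$new_files" -gt 0 ]; then
        return 0                  # still receiving files; no dir check needed
    fi
    newest=$(latest_remote_dir) || return 1
    if [ "$newest" != "$CURRENT_DIR" ]; then
        # Sweep the old dir once more before switching, so a file written
        # just before the rollover is not left behind.
        pull_dir "$CURRENT_DIR" >/dev/null || return 1
        CURRENT_DIR=$newest
        pull_dir "$CURRENT_DIR" >/dev/null || return 1
    fi
}

# Possible main loop (commented out so the sketch has no side effects):
# while :; do retry_with_delay poll_once || exit 1; sleep "$POLL_SECONDS"; done
```

Checking for a new dir only when a pull comes back empty removes the cycle counter entirely. The extra pull of the old dir costs one additional rsync per rollover.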