Dataset consistency / gap-checker
In the event of an endpoint outage or k8s cluster issue, online frame file registration has to catch up with the frames produced during the outage. Working through that backlog with the online registration deployments delays the registration of data arriving in real time.
To solve this, implement standalone utilities that check for gaps in the registered datasets with respect to diskcache dumps. These can also be used to register bulk datasets offline, and can even be run periodically to ensure complete coverage.
Basic architecture / usage:
- Reconcile frame file lists from the diskcache dump and the Rucio catalog to find missing files / datasets
- Construct k8s and/or HTCondor jobs to register the missing files efficiently (e.g. N jobs, each registering 100 frame files)
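
The two steps above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: it assumes both file lists have already been loaded as plain lists of file names, and the function names `find_missing` and `batch` are hypothetical. Reading the diskcache dump, querying the Rucio catalog, and submitting the k8s or HTCondor jobs are all left out.

```python
from typing import Iterable, Iterator, List, Set


def find_missing(diskcache_files: Iterable[str],
                 registered_files: Iterable[str]) -> List[str]:
    """Return files listed in the diskcache dump but absent from the catalog."""
    registered: Set[str] = set(registered_files)
    # Deduplicate the diskcache side and sort for a reproducible job layout.
    return sorted(f for f in set(diskcache_files) if f not in registered)


def batch(files: List[str], size: int = 100) -> Iterator[List[str]]:
    """Split the missing-file list into chunks, one chunk per registration job."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(files), size):
        yield files[start:start + size]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    on_disk = [f"frame-{i:04d}.gwf" for i in range(250)]
    in_catalog = on_disk[:30]

    missing = find_missing(on_disk, in_catalog)
    jobs = list(batch(missing, size=100))
    print(f"{len(missing)} missing files in {len(jobs)} jobs")
```

Each chunk yielded by `batch` would correspond to the argument list of one registration job. The chunk size of 100 follows the example in the list above and would likely need tuning against the catalog's throughput.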