File backend improvements
This merge request is intended to mitigate NFS issues with running iDQ in production by changing how we access files. The two main changes are:
- Change the file access pattern:
  - previous: write to final location -> copy to tmp file -> move to /dev/shm
  - now: write to temp directory -> copy to /dev/shm -> move to final location
The idea here is that the heavy write operation happens on a local drive. The copy (if we request it) is also local disk to local disk, into /dev/shm. The only operation that interacts with NFS is a move, and that should be effectively instantaneous on our end.
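The new pattern can be sketched as follows. This is an illustrative stand-in, not the actual `DiskReporter` code; the function name `write_report` and the `shm_dir` parameter are hypothetical:

```python
import os
import shutil
import tempfile

def write_report(data: bytes, final_path: str, shm_dir: str = "/dev/shm") -> None:
    """Illustrative sketch of the new access pattern: write locally,
    copy to /dev/shm, then move into the (possibly NFS) final location."""
    tmpdir = os.environ.get("TMPDIR", "/tmp")
    # 1. the heavy write happens on a local drive
    fd, tmp_path = tempfile.mkstemp(dir=tmpdir)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    # 2. optional copy, local disk to local disk (/dev/shm)
    shutil.copy(tmp_path, os.path.join(shm_dir, os.path.basename(final_path)))
    # 3. the only interaction with the final (NFS) location is a move
    shutil.move(tmp_path, final_path)
```

Note that `shutil.move` degrades to a copy-plus-delete when the temp directory and the final location sit on different filesystems, so keeping the move cheap depends on where the temp directory lives.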
- Remove the nested directory structure used to store data products.
  - previous: train-gstlal_run5/START-12382/1238290518_1238292973/ANN1train-1238290518-2455.pkl
  - after: train-gstlal_run5/START-12382/ANN1train-1238290518-2455.pkl
The point of doing this is to avoid the massive number of directories created by idq-timeseries. The directory structure is somewhat redundant, since the filename contains the same information as the directory `1238290518_1238292973`. For timeseries, this avoids creating a directory for every frame file, which also sidesteps some of the NFS issues we've been having.
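To make the redundancy concrete, here is a hypothetical path helper (the function name and signature are illustrative, not the iDQ API) showing that the filename already encodes the start time and duration, so the `<start>_<end>` subdirectory adds nothing:

```python
import os

def report_path(rootdir: str, tag: str, start: int, dur: int, ext: str = "pkl") -> str:
    """Illustrative helper for the flattened layout: a START-<prefix>
    directory plus a filename that already encodes start and duration."""
    basename = "%s-%d-%d.%s" % (tag, start, dur, ext)
    # START-12382 style directories group files by the leading GPS digits
    return os.path.join(rootdir, "START-%d" % (start // 100000), basename)
```

For the example above, `report_path("train-gstlal_run5", "ANN1train", 1238290518, 2455)` reproduces the flattened path.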
To support this, I had to make a few modifications:
- Add a `tmpdir` property to `DiskReporter`, which takes an optional `tmpdir` option from the INI file. If not set, it looks for a `TMPDIR` environment variable, and otherwise defaults to `/tmp`.
- Modify `DiskReporter.report()` to implement the file access pattern change.
- Modify `DiskReporter.directory()` to point at the new directory structure.
- Modify `DiskReporter.glob()` to access files correctly under the new layout.
- Add `glob2path` in `names.py`, used in `DiskReporter.glob()`.
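The `tmpdir` fallback chain described above (INI option, then `TMPDIR`, then `/tmp`) can be sketched like this; the class below is a minimal stand-in, not the actual `DiskReporter`:

```python
import os

class DiskReporter:
    """Minimal sketch of the tmpdir lookup order only:
    INI option first, then the TMPDIR environment variable, then /tmp."""

    def __init__(self, tmpdir=None):
        self._tmpdir = tmpdir  # value parsed from the INI file, if any

    @property
    def tmpdir(self):
        if self._tmpdir is not None:
            return self._tmpdir
        return os.environ.get("TMPDIR", "/tmp")
```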
Also some house cleaning:
- Remove a duplicate `report()` in `SegmentReporter`.
- Remove the unused reporters `JSONMetricReporter` and `DruidReporter`.
This has been tested in batch and streaming jobs; I found no issues with operation or with creating reports.