New file format for PyCBC
Created on July 3, 2018. Copied from redmine (https://bugs.ligo.org/redmine/issues/6186)
Discussion with Alex Nitz and Tito Dal Canton on June 22, 2018:
PyCBC wants to move away from .xml files to something like JSON or HDF5. They also want to be able to do everything in a single upload, including PSDs, metadata, etc. (I'm not sure why, or whether we should support that). It sounds like HDF5 would be preferred.
They also want some way to upload dynamic attributes from their analysis, i.e. fields whose names are not known in advance. This sounds nearly impossible with a relational database that has a fixed schema. One option I thought of was to add two columns like "dynamic_field_name" and "dynamic_field_value" whose contents could change. But we would still have to specify the column type of the value (char, float, etc.), which would restrict things quite a bit. I promised to do some research into this.
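To make the two-column idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration, not a proposal for the production schema: the table name, the extra "field_type" column, the helper functions, and the attribute names in the usage note are all assumptions that were not part of the discussion. The type restriction is worked around by storing every value as text next to a type tag and casting back on read.

```python
import sqlite3

# Hypothetical schema: one row per (event, attribute name). The value is
# always stored as text; field_type records how to cast it back on read.
SCHEMA = """
CREATE TABLE dynamic_attribute (
    event_id    TEXT NOT NULL,
    field_name  TEXT NOT NULL,
    field_type  TEXT NOT NULL CHECK (field_type IN ('int', 'float', 'str')),
    field_value TEXT NOT NULL,
    PRIMARY KEY (event_id, field_name)
)
"""

# Supported types and the callable used to restore each one.
CASTS = {"int": int, "float": float, "str": str}


def store_attributes(conn, event_id, attrs):
    """Insert a dict of dynamic attributes for one event."""
    rows = []
    for name, value in attrs.items():
        type_name = type(value).__name__
        if type_name not in CASTS:
            raise TypeError(f"unsupported type for {name!r}: {type_name}")
        rows.append((event_id, name, type_name, str(value)))
    conn.executemany(
        "INSERT INTO dynamic_attribute VALUES (?, ?, ?, ?)", rows
    )


def load_attributes(conn, event_id):
    """Return the dynamic attributes of one event with their original types."""
    cur = conn.execute(
        "SELECT field_name, field_type, field_value "
        "FROM dynamic_attribute WHERE event_id = ?",
        (event_id,),
    )
    return {name: CASTS[type_name](value) for name, type_name, value in cur}
```

Usage would look like `conn = sqlite3.connect(":memory:")`, `conn.execute(SCHEMA)`, then `store_attributes(conn, "E1", {"example_stat": 9.1, "example_label": "foo"})`, where the event ID and attribute names are made up. The trade-off is that values stored as text cannot be compared numerically in SQL without a cast, so this works for attributes we only display, not ones we need to search on.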
This work would be broken up into two parts:
- Determining current requirements for PyCBC uploads, communicating them to the PyCBC group, and determining the structure of the new file format
- Writing translator code for the new file format. We should do something like what has been done for CWB (but much nicer), where there is a "translator" class which reads the data and assigns event attributes. Alex thought that they may be able to help with this part if it would be useful for us and would help speed things up.
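As a rough sketch of what the "translator" class in the second part could look like: it takes the data read from an upload and assigns event attributes. Everything here is hypothetical, including the Event stand-in, the class name, and the field names "gpstime" and "far"; the real required fields still have to come out of the requirements work in the first part. The input is a plain mapping standing in for whatever the reader of the new file format returns, so the sketch does not depend on the format choice.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping, Optional


@dataclass
class Event:
    """Hypothetical stand-in for the database event record."""
    gpstime: Optional[float] = None
    far: Optional[float] = None
    extra: Dict[str, Any] = field(default_factory=dict)


class PyCBCTranslator:
    """Reads uploaded data and assigns event attributes (sketch only)."""

    # Required input keys and the cast applied to each one.
    # These names are assumptions, not an agreed format.
    REQUIRED = {"gpstime": float, "far": float}

    def __init__(self, data: Mapping[str, Any]):
        self.data = data

    def populate(self, event: Event) -> Event:
        # Fail early with a clear message if the upload is incomplete.
        missing = [key for key in self.REQUIRED if key not in self.data]
        if missing:
            raise ValueError(f"upload is missing required fields: {missing}")
        # Assign the fixed-schema attributes, casting to the expected type.
        for key, cast in self.REQUIRED.items():
            setattr(event, key, cast(self.data[key]))
        # Anything else is kept aside as a dynamic attribute.
        event.extra = {
            key: value
            for key, value in self.data.items()
            if key not in self.REQUIRED
        }
        return event
```

For example, `PyCBCTranslator({"gpstime": 1214000000.5, "far": 1e-7, "example_stat": 9.1}).populate(Event())` would fill in the two fixed attributes and leave `example_stat` in `extra`. Keeping the required-field table in one place would also make it easy to support a second input format later by writing another reader that produces the same mapping.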
Questions:
- Should they still be allowed to submit .xml files? Or will we only accept HDF5 going forward?