
Gstlal features schema change

Patrick Godwin requested to merge gstlal_feature_api_change into master

This merge request handles the schema change to the gstlal features to comply with expectations from gwtrigfind. In particular, this means:

  • Changing time column: trigger_time -> time
  • Removing rows with nans in them

Renaming the trigger_time column is trivial: it just means changing the name throughout the codebase and tests. Removing the NaN rows is ultimately a good thing, but it requires some modifications to how _retrieve_triggers() decides how much space to preallocate for the triggers.
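For reference, the two schema changes can be sketched on a NumPy structured array as follows. This is only an illustration, not the code in this merge request: the helper name conform_features and the snr column are made up for the example, and it assumes the features are held in a structured array with floating-point columns.

```python
import numpy as np
from numpy.lib import recfunctions as rfn


def conform_features(rows):
    """Hypothetical helper: rename trigger_time -> time and drop NaN rows."""
    renamed = rfn.rename_fields(rows, {"trigger_time": "time"})
    # Keep a row only if none of its floating-point columns is NaN.
    keep = np.ones(len(renamed), dtype=bool)
    for name in renamed.dtype.names:
        column = renamed[name]
        if np.issubdtype(column.dtype, np.floating):
            keep &= ~np.isnan(column)
    return renamed[keep]


# Toy input: the middle row is an all-NaN placeholder row.
rows = np.array(
    [(1.0, 8.5), (np.nan, np.nan), (3.0, 9.1)],
    dtype=[("trigger_time", "f8"), ("snr", "f8")],
)
clean = conform_features(rows)
```

Because the NaN rows are dropped, the number of rows per dataset is no longer fixed, which is why the preallocation logic has to change.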

Instead of always assuming that the number of rows is fixed, we now grab all the datasets contained in a single channel group (dataset names are formatted as start_duration), keep only the ones that intersect with the requested segments (a rough way of reducing the number of rows to be read in), then calculate the livetime associated with the dataset spans to figure out how many rows to preallocate.
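A minimal sketch of that selection and livetime calculation is below. It is not the actual body of _retrieve_triggers(): the function names, the plain (start, end) tuples for segments, and the sample_rate argument (rows per second, used to turn livetime into a row count) are all assumptions made for illustration. It also assumes the dataset spans within a channel group do not overlap.

```python
import math


def parse_span(name, type_=float):
    """Convert a 'start_duration' dataset name into a (start, end) span."""
    start, duration = name.split("_")
    start = type_(start)
    return start, start + type_(duration)


def intersects(span, segment):
    """True if two half-open intervals [start, end) overlap."""
    return span[0] < segment[1] and segment[0] < span[1]


def estimate_rows(dataset_names, segments, sample_rate):
    """Keep spans touching any requested segment and size the buffer.

    sample_rate is a hypothetical parameter; the result is an upper
    bound, since rows containing NaNs are no longer stored.
    """
    spans = [parse_span(name) for name in dataset_names]
    kept = [s for s in spans if any(intersects(s, seg) for seg in segments)]
    livetime = sum(end - start for start, end in kept)
    return kept, math.ceil(livetime * sample_rate)


kept, nrows = estimate_rows(
    ["1000_20", "1020_20", "1040_20"],  # dataset names in one channel group
    [(1015, 1030)],                     # requested segments
    sample_rate=16,
)
```

In this toy case the first two datasets overlap the requested segment, so 40 seconds of livetime are kept and 640 rows would be preallocated; any unused rows can be trimmed after reading.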

The two other things I modified are in utils.py:

  • Changed the default columns in gstlal features. There were some columns we just don't output anymore (start_time, sigmasq), and I added a duration column.
  • start_dur2start_end(): added a type_ kwarg, which defaults to int. This allows type_ to be set to float when converting the dataset names in gstlal features to segments.
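To illustrate why the type_ kwarg matters, here is a hedged sketch of such a conversion. The function below is a stand-in written for this example (note the _sketch suffix); the real signature and return type of start_dur2start_end() in utils.py may differ.

```python
def start_dur2start_end_sketch(name, type_=int):
    """Illustrative stand-in: 'start_duration' string -> (start, end)."""
    start, duration = name.split("_")
    start = type_(start)
    return start, start + type_(duration)


# Integer-valued names work with the default.
int_span = start_dur2start_end_sketch("1000_20")

# Names with fractional values need type_=float, since int("1000.5")
# raises ValueError.
float_span = start_dur2start_end_sketch("1000.5_0.25", type_=float)

try:
    start_dur2start_end_sketch("1000.5_0.25")
    default_failed = False
except ValueError:
    default_failed = True
```

Keeping int as the default leaves existing callers unchanged, while callers that read gstlal feature datasets can opt in to float.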
Edited by Patrick Godwin
