Add kwargs to filter ClassifierData.triggers() by segments/columns
Implements reed.essick/iDQ#35.
In particular:
- Allows `ClassifierData.triggers()` to actually return the data requested, rather than the entire data store (a rough sketch of how this fits together with the caching changes follows this list).
- Modifies the `is_cached` method to make use of a private cache metadata property, `_cached_data`, which is based off of a `segmentlistdict`. It is a dictionary of segmentlists keyed by channel, and has nice properties for doing set logic across all members of that dictionary. It is already included as part of the `ligo-segments` package, so no new dependencies.
- In order to return subsets of the full data store, we actually need to return a copy of the data when `triggers()` is called. That addresses the `FIXME` you had there before out of necessity.
- Adds a new property, `_time_column`, to deal with filtering by segments; otherwise `ClassifierData` has no notion of how to filter by times. To be honest, this is probably where we should be storing backend-specific trigger information for time and significance columns anyway, since `ClassifierData` knows about the columns it's supposed to contain, but I'll leave the full propagation of that for another time so as not to rock the boat.
- Moves column information for gstlal-based features into `utils.py`, similar to what's done for KW-based features (also sketched below).
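
For concreteness, here's a rough sketch of how the filtering and caching pieces above could fit together. This is not the actual iDQ implementation: the internal store (`self._data`), the exact signatures, and the stand-in class name are all assumptions; only `_cached_data`, `_time_column`, `is_cached` and `triggers()` come from the list above.

```python
from ligo.segments import segmentlist, segmentlistdict


class ClassifierDataSketch(object):
    """illustrative stand-in for ClassifierData; only the names from the list above are real"""

    _time_column = 'time'  # backend-specific name of the column used to filter by time

    def __init__(self, channels):
        # cache metadata: one segmentlist per channel, with segmentlistdict giving
        # us set logic across all channels at once
        self._cached_data = segmentlistdict((channel, segmentlist()) for channel in channels)
        # hypothetical internal store: lists of {column: value} rows keyed by channel
        self._data = dict((channel, []) for channel in channels)

    def is_cached(self, channel, segs=None):
        """return True if the requested segments for this channel are already cached;
        segs is assumed to be a ligo.segments.segmentlist"""
        if channel not in self._cached_data:
            return False
        if segs is None:
            return len(self._cached_data[channel]) > 0
        # set logic on segmentlists: cached iff nothing requested is missing
        return not (segs - self._cached_data[channel])

    def triggers(self, channels=None, segs=None, columns=None):
        """return a filtered *copy* of the requested data rather than the full data store"""
        if channels is None:
            channels = list(self._data.keys())
        out = {}
        for channel in channels:
            rows = self._data[channel]
            if segs is not None:
                # keep only rows whose time falls inside the requested segments
                rows = [row for row in rows if row[self._time_column] in segs]
            if columns is not None:
                rows = [dict((col, row[col]) for col in columns) for row in rows]
            else:
                rows = [dict(row) for row in rows]  # copy so callers can't mutate the store
            out[channel] = rows
        return out
```

Under this sketch, a call like `data.triggers(channels=['H1:FOO'], segs=analysis_segs, columns=['time', 'snr'])` hands back just the requested slice while leaving the internal store untouched (channel name and columns here are placeholders).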
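
The `utils.py` move is mostly bookkeeping; as an illustration only, the gstlal column information could sit there as module-level constants alongside the existing KW definitions (the names and values below are placeholders, not the actual contents of `utils.py`):

```python
# utils.py (placeholder names/values, for illustration only)
GSTLAL_COLUMNS = ['time', 'frequency', 'snr', 'chisq']
GSTLAL_TIME_COLUMN = 'time'
GSTLAL_SIGNIFICANCE_COLUMN = 'snr'
```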
@reed.essick, I've assigned you to this since I am modifying some really base-level stuff in iDQ. Feel free to unassign/reassign yourself from some of the other merge requests if you feel it's getting a bit much.
Update:
- Fixes `KafkaClassifierData` to raise a `NoDataError` if no data results from querying the Kafka topic. This is handled better downstream in `StreamProcessor`, where it can catch these errors. The behavior is essentially the same as before, but it's a better way of handling missing data than the hack I had before (a minimal sketch follows).
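
As a sketch of the intended flow (the method names, the consumer call, and the locally defined exception are illustrative stand-ins, not the real `KafkaClassifierData` or `StreamProcessor` API):

```python
class NoDataError(Exception):
    """raised when a query returns no data"""


class KafkaClassifierDataSketch(object):
    """stand-in for KafkaClassifierData; the consumer/query details are assumptions"""

    def __init__(self, consumer, topic):
        self.consumer = consumer  # hypothetical, already-subscribed Kafka consumer
        self.topic = topic

    def query(self, timeout=1.0):
        msg = self.consumer.poll(timeout)  # hypothetical poll call
        if msg is None:
            # raise instead of quietly handing back an empty data store
            raise NoDataError('no data received from topic %s' % self.topic)
        return msg


# downstream, something like StreamProcessor can catch the error and decide what
# to do (e.g. skip this stride and keep streaming) instead of special-casing
# empty results
def process_stride(classifier_data):
    try:
        data = classifier_data.query()
    except NoDataError:
        return None  # nothing new this stride; move on to the next poll
    return data
```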