Dense quiver speedups
This merge request addresses reed.essick/iDQ#57 by doing the following:
- Extend SelectLoudest to accept multiple GPS times as well.
- Implement a DenseQuiver that handles the vectorize() step in bulk rather than delegating to each FeatureVector.
- Add a check in QuiverFactory.unlabeled() that determines which quiver to create based on how densely populated the requested times are.
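To illustrate the idea behind the list above, here is a minimal sketch, not the code in this merge request. It assumes triggers are plain arrays of times and SNRs, and the names `select_loudest_bulk` and `is_dense`, the `window` parameter, and the median-spacing heuristic are all hypothetical. It shows how many GPS times can share one sort and one pair of binary searches instead of each time scanning the triggers on its own, and one possible way to decide whether a request is "dense".

```python
import numpy as np


def select_loudest_bulk(trigger_times, trigger_snrs, gps_times, window, default=0.0):
    """For each GPS time, return the largest SNR among triggers within
    +/- window seconds, or `default` when no trigger falls in that range.

    Hypothetical helper; the real SelectLoudest may differ.
    """
    trigger_times = np.asarray(trigger_times, dtype=float)
    trigger_snrs = np.asarray(trigger_snrs, dtype=float)
    gps_times = np.asarray(gps_times, dtype=float)

    # Sort the triggers once so every requested time can reuse the ordering.
    order = np.argsort(trigger_times)
    t = trigger_times[order]
    snr = trigger_snrs[order]

    # One vectorized binary search per window edge, covering all GPS times.
    lo = np.searchsorted(t, gps_times - window, side="left")
    hi = np.searchsorted(t, gps_times + window, side="right")

    out = np.full(gps_times.shape, default, dtype=float)
    for i, (a, b) in enumerate(zip(lo, hi)):
        if b > a:  # at least one trigger inside the window
            out[i] = snr[a:b].max()
    return out


def is_dense(gps_times, window):
    """Assumed heuristic: call the request dense when the median spacing
    between requested times is smaller than the window."""
    gps_times = np.sort(np.asarray(gps_times, dtype=float))
    if gps_times.size < 2:
        return False
    return bool(np.median(np.diff(gps_times)) < window)


# Usage: three requested times against four triggers, 1 s window.
loudest = select_loudest_bulk([0.0, 1.0, 2.5, 10.0], [5, 8, 6, 20], [1.0, 3.0, 6.0], 1.0)
print(loudest.tolist())  # -> [8.0, 6.0, 0.0]
```

The remaining per-time loop only slices an already sorted array, so the cost of locating triggers is paid once for the whole batch rather than once per time.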
There are also a couple of other things I changed. In GstlalHDF5ClassifierData, there was an edge case in the object's initialization where I assumed self.segs wasn't empty; that's handled better now. Also, in sklearn.py, I now delegate to QuiverFactory.unlabeled() rather than calling QuiverFactory directly to produce unlabeled quivers.
I'm leaving this as a work in progress because there are probably edge cases that haven't been handled properly (I'm still doing some testing), and the exact way I went about a few things may still change, depending on what your thoughts are. There's also some commented-out code I still need to remove.
As for speed, I'm using a 20 second stride to compare speedups. Without any changes, I'm able to produce a 20 second stride of timeseries sampled at 16 Hz in about 12s. With these changes, the same 20 second stride sampled at 128 Hz (eight times as many samples) takes about 6-7s.
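Since the two timings use different sample rates, it can help to compare them as samples produced per second of wall time. The short calculation below does that using only the numbers quoted above; the helper name `samples_per_second` is just for illustration, and the result is a rough estimate because the timings themselves are approximate.

```python
def samples_per_second(stride_s, rate_hz, wall_s):
    """Timeseries samples produced per second of wall-clock time."""
    return stride_s * rate_hz / wall_s


# Before: 20 s stride at 16 Hz in about 12 s.
before = samples_per_second(20, 16, 12.0)

# After: 20 s stride at 128 Hz in about 6-7 s.
after_slow = samples_per_second(20, 128, 7.0)
after_fast = samples_per_second(20, 128, 6.0)

speedup_low = after_slow / before
speedup_high = after_fast / before
print(f"per-sample speedup: about {speedup_low:.1f}x to {speedup_high:.1f}x")
```

On these figures the per-sample throughput improves by roughly 14x to 16x.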
Closes reed.essick/iDQ#57.