
WIP: kw io optimization

Reed Essick requested to merge kw-io-optimization into master

After implementing the KW predictive ClassifierData objects, I ran several timing tests. The results are presented below, but it does not look like this will improve our I/O speed by even a factor of 2. Note that these timings are representative of the case where the filesystem has already cached the files (they were read recently). I have not checked how things scale when the files are not cached, but everything will almost certainly be slower across the board (though the new objects may be less slow).

While reading 1804 channels (all available channels) from 1000 seconds of data: (1186963840, 1186964840)

  • PredictiveKWMClassifierData: 24.850 +/- 0.719 sec
  • KWMClassifierData: 26.064 +/- 1.693 sec
  • PredictiveKWSClassifierData: 25.514 +/- 1.674 sec
  • KWSClassifierData: 24.610 +/- 0.721 sec

While reading 1 channel from 1000 seconds of data: (1186963840, 1186964840)

  • PredictiveKWMClassifierData: 11.256 +/- 0.477 sec
  • KWMClassifierData: 16.618 +/- 5.055 sec
  • PredictiveKWSClassifierData: 0.001 +/- (less than 0.001) sec
  • KWSClassifierData: 0.005 +/- 0.001 sec
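For reference, a minimal sketch of the kind of timing harness behind these numbers (the mean +/- stdev over repeated instantiate-and-read cycles). The `make_data` callable is a hypothetical stand-in for constructing one of the ClassifierData objects and reading its triggers; the real iDQ calls may differ.

```python
import time
import statistics

def time_reads(make_data, n_trials=10):
    """Time repeated calls to `make_data`; return (mean, stdev) in seconds.

    `make_data` is an illustrative placeholder for one construct-and-read
    cycle of a ClassifierData object; it is not part of the iDQ API.
    """
    durations = []
    for _ in range(n_trials):
        start = time.perf_counter()
        make_data()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)

# stand-in workload, just to show the output format
mean, stdev = time_reads(lambda: sum(range(10**5)), n_trials=5)
print('%.3f +/- %.3f sec' % (mean, stdev))
```

Note that because the filesystem cache warms up on the first read, the first trial can dominate the stdev; dropping it (or flushing caches between trials) would give a cleaner cold-cache comparison.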

It looks like this will not be a large gain, and we should look elsewhere. In particular, we should test

  • running the full idq-train pipeline with these ClassifierData objects and timing that
  • changing how FeatureVectors are instantiated so that each gets a much smaller ClassifierData object instead of one big shared one
  • looking at optimizations within FeatureVector.vectorize, such as relying on sorted triggers and memoization
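To make the last point concrete, here is a minimal sketch of what "sorted triggers and memoization" could look like: if each channel's trigger times are kept sorted, selecting the triggers in a window becomes two binary searches instead of a full scan, and repeated requests for the same window can hit a cache. The names (`select_window`, `times`) are illustrative, not FeatureVector.vectorize's actual internals.

```python
import bisect
from functools import lru_cache

def select_window(times, start, end):
    """Indices (lo, hi) such that times[lo:hi] are the triggers in [start, end).

    Assumes `times` is sorted; two binary searches replace a linear scan.
    """
    lo = bisect.bisect_left(times, start)
    hi = bisect.bisect_left(times, end)
    return lo, hi

# Memoization: identical (times, start, end) queries are answered from cache.
# `times` must be hashable (here a tuple) for lru_cache to work.
@lru_cache(maxsize=None)
def select_window_cached(times, start, end):
    return select_window(times, start, end)

times = (1.0, 2.5, 3.0, 4.2, 7.9)
lo, hi = select_window_cached(times, 2.0, 5.0)
print(list(times[lo:hi]))  # -> [2.5, 3.0, 4.2]
```

Whether the cache pays off depends on how often vectorize revisits overlapping windows; that is exactly what timing the full idq-train run would tell us.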
