known/possible issues for streaming pipeline
Below is a list of known/possible issues with the streaming pipeline that were identified as part of !67 (merged) but not addressed.
- padding `new_umbrella` returned by `StreamProcessor.poll` to avoid edge effects when delegating to `FeatureVector.vectorize` (see the sketch below)
  - calling `restrict_segs` when adding `new_umbrella` to the big `umbrella` may update shared references with feature vectors that would undo the padding, so we need to check that this is not the case
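A minimal sketch of the aliasing concern, using `ligo.segments`; `padded_segs` and the `PAD` value are hypothetical, and the real umbrella/`FeatureVector` internals may differ:

```python
from ligo.segments import segment, segmentlist

PAD = 1.0  # hypothetical padding, in seconds

def padded_segs(segs, pad=PAD):
    """return a *new* segmentlist with `pad` added on either side, leaving the
    original untouched so that a later restrict_segs on the umbrella cannot
    silently undo the padding through a shared reference"""
    return segmentlist(segment(s[0] - pad, s[1] + pad) for s in segs).coalesce()

# if new_umbrella and its feature vectors share the same segmentlist object,
# restricting one restricts the other; building a fresh list breaks the aliasing
segs = segmentlist([segment(0, 10)])
padded = padded_segs(segs)
assert padded is not segs
assert padded == segmentlist([segment(-PAD, 10 + PAD)])
```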
- `CalibrationMap` needs a concept of time (segments) to keep provenance of which samples are included
  - these segments should be used with the `Reporter` that writes `CalibrationMap`s (see the sketch below)
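A minimal sketch, assuming `ligo.segments`; the `segs` attribute, the `add` signature, and the toy class are hypothetical, not the current `CalibrationMap` API:

```python
from ligo.segments import segment, segmentlist

class CalibrationMap(object):
    """toy stand-in showing how a CalibrationMap could carry segments that
    record which stretches of time contributed samples"""

    def __init__(self):
        self.segs = segmentlist()  # provenance: spans of the included samples

    def add(self, rank, gps, dt=1.0):
        # alongside whatever bookkeeping the real object does with `rank`,
        # remember the span this sample came from
        self.segs |= segmentlist([segment(gps, gps + dt)])
        self.segs.coalesce()  # keep the provenance list coalesced
```

The `Reporter` that writes `CalibrationMap`s could then serialize `segs` alongside the map itself.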
- `idq-streaming_calibrate` may be systematically missing some of the data from `idq-streaming_evaluate` because of the way `DiskReporter`s manage their caches and `preferred` options
  - make calibrate's stride shorter?
  - muck with `CadenceManager.timestamp` to keep it in line with what was actually read?
  - confirm there is actually a problem...
  - change `DiskReporter`'s behavior to remove this issue (see the sketch below)
    - keep a counter of which line the "preferred" file is in the cache and increment it like `KafkaReporter` would, returning `None` as appropriate if we're off the end of the file
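A minimal sketch of the counter-based fix, not `DiskReporter`'s actual API; `PreferredCache` and its `poll` method are hypothetical names:

```python
class PreferredCache(object):
    """toy stand-in for the proposed DiskReporter bookkeeping"""

    def __init__(self, path):
        self.path = path
        self._lineno = 0  # how many lines we've already handed out

    def poll(self):
        """return the next unread line of the 'preferred' cache file, or None
        if we're off the end of the file (i.e. nothing new has shown up yet),
        mimicking how KafkaReporter consumes its stream"""
        with open(self.path) as obj:
            lines = obj.read().splitlines()
        if self._lineno >= len(lines):
            return None
        line = lines[self._lineno]
        self._lineno += 1
        return line
```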
- KW and GSTLAL ClassifierData need to raise `NoDataError` if they can't find any files within the requested period
- KW and GSTLAL ClassifierData need to raise `IncompleteDataError` (or just `BadSpanError`?) if there is only partial coverage (see the sketch below)
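A minimal sketch of the requested behavior; the exception names come from the items above (they may already exist elsewhere in the codebase), while `check_coverage` and the use of `ligo.segments` segmentlists are assumptions about how the coverage check might look:

```python
from ligo.segments import segment, segmentlist

class NoDataError(Exception):
    """raised when no files overlap the requested span"""

class IncompleteDataError(Exception):
    """raised when files only partially cover the requested span"""

def check_coverage(requested, covered):
    """both arguments are segmentlists: the queried span and the span actually
    covered by the files that were found"""
    if not covered:
        raise NoDataError('no files found within %s' % requested)
    if requested - covered:  # any part of the request left uncovered
        raise IncompleteDataError('only partial coverage of %s' % requested)

# e.g. files covering [0, 50) against a request of [0, 100) should raise
# IncompleteDataError; no files at all should raise NoDataError
```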
- change the order of nested iterations when writing timeseries so that all classifiers are written to disk for a given segment before moving on to the next segment (see the sketch below)
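A minimal sketch of the re-ordered iteration; `write_all_timeseries`, `write_timeseries`, and the container names are hypothetical stand-ins:

```python
def write_all_timeseries(classifiers, segs, write_timeseries):
    """iterate segments in the outer loop so every classifier's timeseries for
    a given segment lands on disk before the next segment is touched"""
    for seg in segs:                      # outer: segments
        for classifier in classifiers:    # inner: classifiers
            write_timeseries(classifier, seg)
```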
- improve provenance for all samples/inferences made. This could be as simple as making FeatureVectors store a unique identifier for the model used to evaluate them (this should be an attribute of the model, preferably searchable so that we can look up which model was used without iterating over all models) in addition to the ranks. Then, CalibrationMap can reference FeatureVectors instead of just ranks and maintain provenance for what went in (vectors already contain gps time, but we may want to add segments to CalibrationMaps as well).
  - we also want to check the "checksum" of the model to make sure we don't have any associated issues (see the sketch below)
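A minimal sketch of one way to get a searchable model identifier plus checksum; `model_hash`, the pickle-based serialization, and the toy `FeatureVector` are all assumptions, not the existing API:

```python
import hashlib
import pickle

def model_hash(model):
    """checksum of the serialized model, so inferences can be tied back to
    exactly the object that produced them"""
    return hashlib.sha256(pickle.dumps(model)).hexdigest()

class FeatureVector(object):
    """toy stand-in: store the rank together with an identifier of the model
    that produced it, so CalibrationMap can keep provenance per sample"""

    def __init__(self, gps, rank, model):
        self.gps = gps
        self.rank = rank
        self.model_id = model_hash(model)  # searchable without iterating models
```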