Batch round robin and segment queries

Reed Essick requested to merge batch_round_robin into master

This merge request contains a lot of changes, all of which are necessary to get the batch pipeline running with round-robin logic in place. Except for calibration (which will be updated shortly from the calibration_development branch), the batch pipeline should now work with all workflows.

I summarize the changes below:

  • etg_synchronizer

    • I appear to have picked up a bunch of changes when I rebased from master, but did not edit these files myself. Please check that I haven't goofed anything up!
  • idq.io

    • Again, I appear to have picked up some changes from master, which appear to be confined to replacing glob.glob with glob.iglob within the gstlal classifier data object. Please check this as well!
  • bin/*

    • I've added --exclude options to most of the batch executables. This is needed to support the round-robin logic within batch, and we've exposed it on the command line in case users find it useful independently of that.
    • I've mucked around with the options within idq-batch so that a few things can be specified independently of the config file. These are limited and only concern the workflow and logging for the batch job itself; i.e., the actual execution and results of train, evaluate, calibrate, and timeseries are independent of these command line options.
    • added idq-condor_batch, which is needed for batch_workflow=condor to function correctly.
  • etc/idq.ini

    • added a new section to specify how we select samples. These options used to be part of [general] but now live in a new section called [samples].
  • idq.utils

    • added segment query utility functions using SegDb's REST API
    • changed a bit of syntax (the name of check_segements)
    • guaranteed that ligolw.segmentlist is returned by segments_intersection
  • idq.batch

    • added segment queries to SegDb with the Python REST API
    • added a placeholder for causal_batch, which is suggested by #24 (closed)
    • implemented all batch workflows
    • added exclude kwarg to all batch functions, as needed.
    • modified how we select samples to reflect changes in INI format (see changes to etc/idq.ini)
  • idq.stream

    • changed how we read samples to reflect changes in INI format (see changes to etc/idq.ini)
  • idq.calibration

    • replaced the NotImplementedError with a pass statement within FixedBandwidth1DKDE.optimize. This should be quickly overwritten by changes in the calibration_development branch and allows the pipeline to run to completion.
  • idq.classifiers

    • a minor change in syntax regarding how ranks are assigned to vectors
  • idq.condor

    • fixed a few typos re: passing kwargs to delegation functions
    • added functionality needed for idq-batch with batch_workflow=condor
  • idq.logs

    • mucked around with how loggers are instantiated to prevent repeated print statements
    • exposed logger path API so things in idq.batch can reference where other jobs will write their logs easily
  • idq.names

    • added support needed for idq-batch
  • setup.py

    • a bit of cleanup (repeated executables)
    • added idq-condor_batch to install list
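
For reviewers unfamiliar with the round-robin idea behind the --exclude options and exclude kwargs, here is a minimal sketch of the general technique. It is not the code in idq.batch: the function names (round_robin_folds, round_robin_plan), the dictionary keys, and the use of plain (start, end) tuples for segments are all hypothetical and chosen only for illustration. The idea is that each fold is held out in turn, the remaining folds are used for training, and the held-out fold is used for evaluation.

```python
def round_robin_folds(segments, num_folds):
    """Distribute segments across folds in round-robin order.

    `segments` is a list of (start, end) tuples. This is a hypothetical
    stand-in for whatever segment type the real pipeline uses.
    """
    if num_folds < 2:
        raise ValueError("round-robin needs at least 2 folds")
    folds = [[] for _ in range(num_folds)]
    for index, segment in enumerate(segments):
        folds[index % num_folds].append(segment)
    return folds


def round_robin_plan(segments, num_folds):
    """Build one entry per fold describing what to exclude and where to train.

    Each entry holds out one fold: those segments are excluded from
    training and are the only ones evaluated. The dictionary keys are
    illustrative, not part of any real API.
    """
    folds = round_robin_folds(segments, num_folds)
    plan = []
    for held_out_index, held_out in enumerate(folds):
        # Train on every segment that does not belong to the held-out fold.
        train = [
            segment
            for fold_index, fold in enumerate(folds)
            if fold_index != held_out_index
            for segment in fold
        ]
        plan.append({"exclude": held_out, "train": train, "evaluate": held_out})
    return plan


if __name__ == "__main__":
    example_segments = [(0, 10), (10, 20), (20, 30), (30, 40)]
    for entry in round_robin_plan(example_segments, num_folds=2):
        print(entry)
```

Because no fold is ever evaluated by a model trained on that same fold, this kind of scheme lets every stretch of data be ranked without being reused for its own training.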
