BUGFIX: bugfixes from 1.2.1

Colm Talbot requested to merge remove-mandatory-authentication into master
  • The changes in 1.2.1 now require a scitoken for all jobs, rather than just ones that need to access proprietary data.
  • The default frame types don't cover every stretch of data.
  • No engineering runs are covered.
  • gwdatafind runs for all jobs and leaves confusing messages if datafind isn't needed.

This MR addresses the issues above.

EDIT 231108: This MR has sprawled.

Making these changes revealed many interlinked issues, especially around accessing data without LIGO authentication for testing.

File-by-file changes:

  • generation_node.py:
    • only create scitokens if we are looking at proprietary CVMFS repositories or using the OSDF file transfer.
    • use gwpy find_best_frametype. This doesn't work for engineering run data, so anyone looking for that data will need to manually specify the frame type.
  • bilby_pipe_dag_creator.py:
    • when using simulated data, the start_time is not set in generation_node.py.
  • node.py:
    • enable file transfer for generation jobs when analysis jobs use the OSG; this will just transfer files on the local cluster.
    • environment variables were being passed with quotes, e.g., "'GWDATAFIND_SERVER'", which meant they were not recognized.
  • dag.py:
    • pass the HTGETTOKEN environment variable to the dagman (see this issue). I think this is the most important change to make TimeSeries.get work.
  • data_generation.py:
    • move channel_dict to base Input.
    • since PSDs can be estimated for a subset of interferometers, we always set the psd options.
  • gracedb.py:
    • support reading data from GWOSC using fetch_open_data for previous observing runs.
    • remove the x509userproxy that wasn't doing anything.
    • if the PSDs aren't in the coinc.xml (as was the case in previous observing runs) more data is needed to estimate the PSD.
    • fix the flag to disable querying the low-latency kafka data stream.
  • input.py:
    • channel_dict moved from data_generation.
    • when transferring files, look for frame files in the local directory, since absolute paths are not preserved in the transfer.
    • allow a special None/"None" value for the PSD entry to indicate that it should be estimated from the data.
  • utils.py:
    • remove the function I added in 1.2.1 as we are using the more sophisticated gwpy version instead.
  • example ini files:
    • to make the datafind section work in the tests, the channels have been set to look for GWOSC data. I think we should encourage people to use this data anyway when available.
    • this change is also made in the gracedb tests.
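
To illustrate the quoting problem listed under node.py, here is a minimal sketch. It is not the bilby_pipe implementation: the helper name `format_condor_environment` and the example server value are hypothetical, and it assumes the variable names arrive as a list of strings. It shows why a name wrapped in an extra pair of quotes is never found in the environment, and one way to normalise it before the lookup.

```python
import os


def format_condor_environment(names, environ=None):
    """Build a space-separated NAME=value string for the variables that are set.

    Hypothetical helper for illustration; `environ` defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    pairs = []
    for name in names:
        # Drop stray single or double quotes around the name, so that
        # "'GWDATAFIND_SERVER'" is looked up as GWDATAFIND_SERVER.
        clean = name.strip("'\"")
        if clean in environ:
            pairs.append(f"{clean}={environ[clean]}")
    return " ".join(pairs)


# Example environment with a made-up server value.
example_env = {"GWDATAFIND_SERVER": "datafind.example.org"}

# The quoted name is not a key in the environment, which is the failure mode.
quoted_name_found = "'GWDATAFIND_SERVER'" in example_env

# After stripping the quotes the variable is picked up.
formatted = format_condor_environment(["'GWDATAFIND_SERVER'"], example_env)
```

Passing the environment as an argument keeps the sketch testable without touching the real process environment.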

The data finding changes are described in a separate MR that adds a new documentation page.
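
As a rough illustration of the frame type selection described for generation_node.py, the sketch below prefers a manually specified frame type (needed for engineering run data) and otherwise falls back to gwpy. The function `choose_frame_type` and its `finder` argument are hypothetical names used only for this example; the one real call is gwpy's `find_best_frametype`, imported lazily, and I am assuming it is called with just the channel and the start and end times.

```python
def choose_frame_type(channel, start, end, user_frame_type=None, finder=None):
    """Return the frame type to read `channel` from over [start, end).

    Hypothetical helper: a user-specified frame type always wins, otherwise
    the lookup is delegated to `finder` (gwpy's find_best_frametype by default).
    """
    if user_frame_type is not None:
        # Manual override, e.g. for engineering run data that the
        # automatic lookup does not cover.
        return user_frame_type
    if finder is None:
        # Imported lazily so the override path works without gwpy installed.
        from gwpy.io.datafind import find_best_frametype as finder
    return finder(channel, start, end)
```

A manual override would then look like `choose_frame_type(channel, start, end, user_frame_type="H1_EXAMPLE_TYPE")`, where the frame type string is a placeholder rather than a real frame type.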

I've created an internal repo for testing on proprietary data through condor.

Edited by Colm Talbot