BUGFIX: bugfixes from 1.2.1
- The changes in 1.2.1 now require a scitoken for all jobs, rather than just ones that need to access proprietary data.
- The default frame types don't cover every stretch of data.
- No engineering runs are covered.
- gwdatafind runs for all jobs and leaves confusing messages if datafind isn't needed.
This MR addresses the issues above.
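The scitoken fix amounts to making token creation conditional on where the data comes from. A minimal sketch of that decision follows; the function name `needs_scitoken`, the CVMFS prefix, and the use of an `osdf` URL scheme are assumptions made for illustration and are not taken from the bilby_pipe code.

```python
from urllib.parse import urlparse

# Assumption: proprietary frames are identified by a CVMFS path prefix.
# The prefix below is illustrative only.
PROPRIETARY_CVMFS_PREFIXES = ("/cvmfs/ligo.storage.igwn.org",)

# Assumption: OSDF transfers are identified by an "osdf" URL scheme.
OSDF_SCHEME = "osdf"


def needs_scitoken(frame_paths, transfer_urls=()):
    """Return True only when a job touches proprietary CVMFS data or uses OSDF.

    Hypothetical helper: jobs that read open or simulated data
    should not request a token at all.
    """
    uses_proprietary_cvmfs = any(
        str(path).startswith(PROPRIETARY_CVMFS_PREFIXES) for path in frame_paths
    )
    uses_osdf = any(urlparse(url).scheme == OSDF_SCHEME for url in transfer_urls)
    return uses_proprietary_cvmfs or uses_osdf
```

With this shape, a job that only reads open data or generates simulated data returns `False` and skips token creation.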
EDIT 231108: This MR has sprawled.
Making these changes revealed lots of interlacing issues, especially in accessing data without LIGO authentication for the testing.
File-by-file changes:
- `generation_node.py`:
  - only create scitokens if we are looking at proprietary CVMFS repositories or using the OSDF file transfer.
  - use `gwpy`'s `find_best_frametype`. This doesn't work for engineering run data, so anyone looking for that data will need to manually specify the frame type.
- `bilby_pipe_dag_creator.py`:
  - when using simulated data the `start_time` is not set in `generation_node.py`.
- `node.py`:
  - enable file transfer for generation jobs when analysis jobs use the OSG; this will just transfer on the local cluster.
  - environment variables were being passed with quotes, e.g., "'GWDATAFIND_SERVER'", which makes them not recognized.
- `dag.py`:
  - pass the `HTGETTOKEN` environment variable to the dagman (see this issue). I think this is the most important change to make `TimeSeries.get` work.
- `data_generation.py`:
  - move `channel_dict` to base `Input`.
  - since PSDs can be estimated for a subset of interferometers, we always set the psd options.
- `gracedb.py`:
  - support reading data from GWOSC using `fetch_open_data` for previous observing runs.
  - remove the `x509userproxy` that wasn't doing anything.
  - if the PSDs aren't in the `coinc.xml` (as was the case in previous observing runs) more data is needed to estimate the PSD.
  - fix the flag to disable querying the low-latency kafka data stream.
- `input.py`:
  - `channel_dict` moved from `data_generation`.
  - if using transfer files, look for frame files in the local directory as absolute paths in the transfer are not preserved.
  - allow a special `None`/`"None"` value for the PSD entry to indicate that it should be estimated from the data.
- `utils.py`:
  - remove the function I added in 1.2.1 as we are using the more sophisticated `gwpy` version instead.
- example ini files:
  - to make the datafind section work in the tests, the channels have been set to look for GWOSC data. I think we should encourage people to use this data anyway when available.
  - this change is also made in the gracedb tests.
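Because values read from an ini file arrive as strings, the special PSD value described for the input handling has to treat `None` and `"None"` the same way. A minimal sketch of that normalisation is below; `parse_psd_entry` and `parse_psd_dict` are hypothetical names used for illustration and are not the functions in bilby_pipe.

```python
def parse_psd_entry(value):
    """Map None or the string "None" to None, meaning "estimate from data".

    Hypothetical helper: any other value is returned as a stripped string,
    which is assumed here to be a path to a PSD file.
    """
    if value is None:
        return None
    text = str(value).strip()
    if text == "None":
        return None
    return text


def parse_psd_dict(psd_dict):
    """Apply parse_psd_entry to each interferometer entry in a PSD mapping."""
    return {ifo: parse_psd_entry(entry) for ifo, entry in psd_dict.items()}
```

For example, `parse_psd_dict({"H1": "None", "L1": "l1_psd.txt"})` leaves `L1` pointing at its file and marks `H1` for estimation, which matches the case where PSDs are supplied for only a subset of interferometers.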
The data finding changes are described in a separate MR that adds a new documentation page.
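One of the data finding changes in this MR is the local lookup of frame files when file transfer is enabled, since transferred files land in the job's working directory and lose their absolute paths. A minimal sketch of that idea follows; the function name `resolve_frame_path` and its arguments are assumptions for illustration only.

```python
import os


def resolve_frame_path(path, transfer_files, local_dir="."):
    """Return the path at which a frame file should be opened.

    Hypothetical helper: when files are transferred, only the file name
    survives, so the frame is looked up by basename in the local directory.
    Without transfer, the original (possibly absolute) path is kept.
    """
    if not transfer_files:
        return path
    return os.path.join(local_dir, os.path.basename(path))
```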
I've created an internal repo for testing on proprietary data through condor.
Edited by Colm Talbot