OSDF support for lalpulsar_MakeSFTDAG
The following discussion from !2359 (merged) should be addressed:
- @david-keitel started a discussion: (+3 comments) One more question @evan-goetz: I see the `transfer_input_files = $(cachefile)`, but is there any check that the actual input frame files will be accessible for the worker node (CVMFS/OSDF)? Probably we can't predict that in any reasonably confident way, and just have to hope that the error message from each node (both in the hardware and DAG sense of that word, now 😎) will be clear enough?
Reply from @evan-goetz:
My understanding is that `/archive`, `/ceph`, etc. will still be visible on LDG cluster execute points. Currently, and with this MR, `lalpulsar_MakeSFTDAG` does not support running on the OSG or accessing data that is only available on CVMFS/OSDF. I'm not sure at this point what it would take to make it so, but I think it would be possible. With this MR, we at least may have a pathway to do that in the future, but that would be beyond the scope of this change.
Reply from @duncanmmacleod :
Yes, `/archive`, `/ceph` etc. are currently planned to be available on the execute points at Caltech (those mounts do not exist in general anywhere else).

My understanding of best practice is as follows:
- query gwdatafind as part of the DAG-generation process (not as the first job in the DAG, it's overkill)
- specify `urltype="osdf"` when querying gwdatafind to get `osdf://...` URLs that work with HTCondor file transfer
- write the cache file for the job to contain only the name of each file (URL), without any paths
- include each OSDF URL in the `transfer_input_files` line for the job so that HTCondor manages the transfer to the EP into the job's working directory

This may require configuring a token for the job with the appropriate `read:/{kagra,ligo,virgo}` scope.
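
For illustration, the steps above might look roughly like the following sketch. This is not what `lalpulsar_MakeSFTDAG` does today; the helper names (`cache_names`, `transfer_input_files_line`), the cache file name and the example URLs are all made up for this example. The gwdatafind query is only shown as a comment so the snippet runs without network access, and token configuration is left out. Whether `lalpulsar_MakeSFTs` accepts bare file names in its cache file in exactly this form is also an assumption that would need checking.

```python
import posixpath
from urllib.parse import urlparse

# In the DAG generator the URLs would come from a gwdatafind query, e.g.
#   urls = gwdatafind.find_urls(site, frametype, gpsstart, gpsend, urltype="osdf")
# Here we use hypothetical URLs instead (paths are invented for illustration).
EXAMPLE_URLS = [
    "osdf:///igwn/example/H-H1_EXAMPLE-1000000000-4096.gwf",
    "osdf:///igwn/example/H-H1_EXAMPLE-1000004096-4096.gwf",
]


def cache_names(urls):
    """Return only the file name of each URL, without any path.

    HTCondor transfers each file into the job's working directory,
    so the job only needs the bare names.
    """
    return [posixpath.basename(urlparse(url).path) for url in urls]


def transfer_input_files_line(cachefile, urls):
    """Build a transfer_input_files line listing the cache file and every URL."""
    return "transfer_input_files = " + ", ".join([cachefile] + list(urls))


if __name__ == "__main__":
    # Entries to write into the per-job cache file (hypothetical name job.cache)
    for name in cache_names(EXAMPLE_URLS):
        print(name)
    # Line to put in the job's submit description
    print(transfer_input_files_line("job.cache", EXAMPLE_URLS))
```

In a DAG, the URL list would probably be passed per job (for example through a `VARS` macro, as is already done for `$(cachefile)`), rather than hard-coded in the submit file.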