Node input files are not translated from absolute path to relative path
When using transfer_input_files with HTCondor, files with absolute paths are copied into the base directory of the job sandbox; however, ezdag doesn't translate the absolute path used in transfer_input_files to a relative path for the job arguments.
Consider the following example:
from ezdag import (
    Argument,
    DAG,
    Layer,
    Node,
)

dag = DAG()

common_options = {
    "request_cpus": 1,
    "request_memory": 2000,
}

inputs = [
    "osdf://igwn/ligo/README",
]

touch_layer = Layer(
    "/bin/head",
    name="head",
    submit_description=common_options,
)

for i, osdf in enumerate(inputs):
    file = f"file_{i}.txt"
    touch_layer += Node(
        inputs=Argument("infile", [osdf]),
    )

dag.attach(touch_layer)
dag.write_dag("example.dag")
This results in the following DAG:
# BEGIN META
# END META
# BEGIN NODES AND EDGES
JOB head:00000 head.sub
VARS head:00000 nodename="head:00000" infile="osdf://igwn/ligo/README" input_infile="osdf://igwn/ligo/README"
RETRY head:00000 3
# END NODES AND EDGES
and the following head.sub file:
universe = vanilla
executable = /bin/head
arguments = $(infile)
request_cpus = 1
request_memory = 2000
should_transfer_files = YES
when_to_transfer_output = ON_SUCCESS
success_exit_code = 0
preserve_relative_paths = True
transfer_input_files = $(input_infile)
output = logs/$(nodename)-$(cluster)-$(process).out
error = logs/$(nodename)-$(cluster)-$(process).err
notification = never
queue
The $(infile) argument currently has the same value as $(input_infile), which means that the job ignores the file that HTCondor transferred into the sandbox and instead attempts to use the URL passed to transfer_input_files.
A simple solution might be to use the $BASENAME function macro to translate absolute paths dynamically, e.g.:
arguments = $BASENAME(infile)
However, this only works out-of-the-box for the case where $(infile) represents a single URL.
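For the multi-file case, one alternative would be to do the translation in Python when the node variables are written, rather than in the submit file. The following is only a minimal sketch of that idea, not ezdag's actual implementation: the helper names sandbox_name and sandbox_arguments are hypothetical, and it assumes POSIX-style paths, that URLs and absolute paths land in the sandbox base directory under their basename, and that relative paths are kept as-is because preserve_relative_paths = True is set.

```python
import posixpath
from urllib.parse import urlsplit


def sandbox_name(path_or_url: str) -> str:
    """Return the name an input is assumed to have inside the job sandbox.

    Hypothetical helper, not part of ezdag.
    """
    parts = urlsplit(path_or_url)
    # Assumption: URLs and absolute paths are flattened to their basename
    # in the sandbox base directory.
    if parts.scheme or posixpath.isabs(parts.path):
        return posixpath.basename(parts.path)
    # Assumption: relative paths are left unchanged, since the submit file
    # sets preserve_relative_paths = True.
    return path_or_url


def sandbox_arguments(inputs: list[str]) -> str:
    """Join the translated names into a single space-separated argument string."""
    return " ".join(sandbox_name(item) for item in inputs)


if __name__ == "__main__":
    files = ["osdf://igwn/ligo/README", "/data/frames/H1.gwf", "aux/config.ini"]
    print(sandbox_arguments(files))
```

With something along these lines, infile could carry the translated names while input_infile keeps the original URLs and paths for transfer_input_files.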