Re: [HTCondor-users] Access to renamed DAG node names
- Date: Mon, 12 Jan 2015 11:50:52 +0000
- From: Brian Candler <b.candler@xxxxxxxxx>
- Subject: Re: [HTCondor-users] Access to renamed DAG node names
Here's a rough idea of how I'm attempting to compose DAGs.
I want to write two similar dags using the same code; the output from
one dag is used as the input to another; and the two dags are spliced
into a parent dag.
Now, I tried doing this using a separate subdirectory for each dag, and
using "SPLICE .... DIR=<directory>". However it doesn't do what I
expected. Every submit file, every job output/error file still has to be
explicitly given relative to the top-level directory.
Could this perhaps be a bug? I'm using condor 8.2.4-281588.
A script to demonstrate:
---- 8< ----
#!/bin/bash
write_dag() {
cat <<EOS >"foo.dag"
JOB A $2/foo.sub
VARS A job="A" input="$1" arg1="aaa" arg2="AAA" prefix="$2/"
JOB B $2/foo.sub
VARS B job="B" input="$2/A.out" arg1="bbb" arg2="BBB" prefix="$2/"
SCRIPT POST B $2/deliver.sh $2/B.out
EOS
cat <<EOS >"foo.sub"
universe = vanilla
executable = $2/foo.sh
arguments = "'\$(arg1)' '\$(arg2)'"
output = \$(prefix)\$(job).out
error = \$(prefix)\$(job).err
queue
EOS
cat <<'EOS' >"foo.sh"
#!/bin/sh
echo $1
cat
echo $2
EOS
chmod +x foo.sh
cat <<'EOS' >"deliver.sh"
#!/bin/sh
cat "$1" >>/tmp/deliver.out 2>>/tmp/deliver.err
EOS
chmod +x deliver.sh
}
echo "hello world" >data.in
mkdir dag1
( cd dag1; write_dag data.in dag1 )
mkdir dag2
( cd dag2; write_dag dag1/B.out dag2 )
cat <<EOS >dag.dag
SPLICE dag1 dag1/foo.dag DIR=dag1
SPLICE dag2 dag2/foo.dag DIR=dag2
PARENT dag1 CHILD dag2
NODE_STATUS_FILE node.status
JOBSTATE_LOG jobstate.log
EOS
# condor_submit_dag dag.dag
# Final output is in dag2/B.out
---- 8< ----
To try this, cd into an empty directory, run the script, and then run
"condor_submit_dag dag.dag" to confirm it works.
Note that in the function write_dag(), I have to give an explicit path
to the submit file, and in the submit file I have to give explicit paths
to the executable and the input/output/error files relative to the
parent, not to the directory containing the dag node. It seems that the
SPLICE ... DIR=xxx option is doing nothing.
Any clues gratefully received, because otherwise the separate DIR per
splice would be quite a good approach I think.
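For comparison, here is roughly what write_dag() would look like if
DIR= made every path in a splice resolve relative to the splice's own
directory. This is only a sketch of the expected behaviour and has not
been checked against 8.2.4. The scratch directory and the "../" input
paths are assumptions made for the demo; foo.sh and deliver.sh would be
written as in the script above.

```shell
#!/bin/bash
# Sketch only: the layout one would expect if SPLICE ... DIR=<directory>
# made each splice's nodes run relative to that directory.
set -e
workdir=$(mktemp -d)   # scratch directory (assumption, for the demo only)
cd "$workdir"

write_dag() {
# $1 = input file for node A, given relative to the splice directory
cat <<EOS >"foo.dag"
JOB A foo.sub
VARS A job="A" input="$1" arg1="aaa" arg2="AAA"
JOB B foo.sub
VARS B job="B" input="A.out" arg1="bbb" arg2="BBB"
SCRIPT POST B deliver.sh B.out
EOS
cat <<EOS >"foo.sub"
universe = vanilla
executable = foo.sh
arguments = "'\$(arg1)' '\$(arg2)'"
output = \$(job).out
error = \$(job).err
queue
EOS
}

mkdir dag1 dag2
( cd dag1; write_dag ../data.in )
( cd dag2; write_dag ../dag1/B.out )
cat dag1/foo.dag
```

With this layout no "prefix" variable would be needed at all, since each
splice's A.out and B.out would land in that splice's own directory.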
Now, the other approach I tried (and which is where this question
originated) was to write everything into a flat directory.
condor_submit_dag takes care of making the dag node names unique in the
form "parentnode+node", but you still have to ensure that every
input/output file has a unique name, so it means passing around a
parameter to do this.
---- 8< ----
#!/bin/bash
write_dag() {
cat <<EOS >"$2.dag"
JOB A foo.sub
VARS A job="A" input="$1" arg1="aaa" arg2="AAA" prefix="$2+"
JOB B foo.sub
VARS B job="B" input="$2+A.out" arg1="bbb" arg2="BBB" prefix="$2+"
SCRIPT POST B deliver.sh $2+B.out
EOS
}
cat <<EOS >foo.sub
universe = vanilla
executable = foo.sh
arguments = "'\$(arg1)' '\$(arg2)'"
output = \$(prefix)\$(job).out
error = \$(prefix)\$(job).err
queue
EOS
cat <<'EOS' >foo.sh
#!/bin/sh
echo $1
cat
echo $2
EOS
chmod +x foo.sh
cat <<'EOS' >deliver.sh
#!/bin/sh
cat "$1" >>/tmp/deliver.out
EOS
chmod +x deliver.sh
echo "hello world" >data.in
write_dag data.in dag1
write_dag dag1+B.out dag2
cat <<EOS >dag.dag
SPLICE dag1 dag1.dag
SPLICE dag2 dag2.dag
PARENT dag1 CHILD dag2
NODE_STATUS_FILE node.status
JOBSTATE_LOG jobstate.log
EOS
# condor_submit_dag dag.dag
# Final output is in dag2+B.out
---- 8< ----
This does work, and each node name you see in jobstate.log or
node.status maps directly to the *.out and *.err file generated by that
node. However I was hoping I could pick up $(prefix) from some condor
setting, rather than having to pass it round explicitly and set a VARS
value on every DAG node.
There is the $JOB macro but it's only available in SCRIPT PRE/POST,
according to the documentation, so I can't use it in a submit file.
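Failing that, the repetition can at least be confined to one place in
the generator script. A minimal sketch, where emit_node is a
hypothetical shell helper (not anything provided by HTCondor) and the
scratch directory is only there for the demo:

```shell
#!/bin/bash
# Sketch only: generate the per-node JOB/VARS lines from one helper so
# the splice prefix is written in a single place.
set -e
workdir=$(mktemp -d)   # scratch directory (assumption, for the demo only)
cd "$workdir"

# emit_node SPLICE NODE INPUT ARG1 ARG2
# Prints the JOB and VARS lines for one node, with prefix="SPLICE+".
emit_node() {
    printf 'JOB %s foo.sub\n' "$2"
    printf 'VARS %s job="%s" input="%s" arg1="%s" arg2="%s" prefix="%s+"\n' \
        "$2" "$2" "$3" "$4" "$5" "$1"
}

# write_dag INPUT SPLICE, same arguments as in the script above
write_dag() {
    {
        emit_node "$2" A "$1" aaa AAA
        emit_node "$2" B "$2+A.out" bbb BBB
        printf 'SCRIPT POST B deliver.sh %s+B.out\n' "$2"
    } >"$2.dag"
}

write_dag data.in dag1
write_dag dag1+B.out dag2
cat dag2.dag
```

This writes the same dag1.dag and dag2.dag as the flat-directory script,
so it only tidies the generator; the prefix still has to be passed in
from outside.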
Regards,
Brian Candler.