Re: [HTCondor-users] Access to renamed DAG node names



On 12/01/2015 15:42, R. Kent Wenger wrote:
> Yes, this is pretty easy to do:
>
> In your DAG file:
>   Job NodeA A.sub
>   Vars NodeA nodename=$(JOB)
>
> In A.sub:
>   output = $(nodename).out
>   error = $(nodename).err
>
> $(nodename) will expand to the modified node name.

Thank you. I missed $(JOB) in "DAGMan Applications" - probably because this special macro can only be expanded in a VARS statement.

Incidentally, quotes are needed:
VARS NodeA nodename="$(JOB)"

> you're saying that for failed nodes you'd like to see the values of output and error put directly into the JOBSTATE.LOG or NODE_STATUS files, so you could find them without having to look at the submit file?

Not quite; I'm saying that if a particular node fails, I want to be able to read and display its associated output and error files (without having to parse the DAG file to locate the submit file, then parse the submit file, then expand all the nested macros and VARS).

I hadn't thought of putting output/error into the JOBSTATE.LOG or NODE_STATUS files; I am just planning to stick to the convention of naming the files <nodename>.out and <nodename>.err, so that they are trivial to find given the node name.
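
As a minimal sketch of how that convention could be used, the helper below prints a node's output and error files given only the node name. The function name show_node_files and its optional directory argument are hypothetical, and it assumes the files sit in one directory and are named exactly <nodename>.out and <nodename>.err.

```shell
#!/bin/sh
# Hypothetical helper, not part of HTCondor.
# Usage: show_node_files NODENAME [DIR]
# Assumes the <nodename>.out / <nodename>.err naming convention.
show_node_files() {
  node=$1
  dir=${2:-.}
  for ext in out err; do
    file="$dir/$node.$ext"
    echo "==> $file <=="
    if [ -r "$file" ]; then
      cat "$file"
    else
      # The node may not have run yet, or it wrote nothing to this stream.
      echo "(missing)"
    fi
  done
}
```

For a spliced node it would be called with the munged name, for example: show_node_files 'dag1+B'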

My difficulty was in finding a way to set the input and output attributes in the submit file to the right nodename, in the presence of DAGMan's node name munging - and you've given me the answer to that now.
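
To illustrate the munging: in the script below, node B of splice dag1 ends up with the name dag1+B, so its output file is dag1+B.out. The sketch here rebuilds such a name from its parts. The function name munged_node_name is hypothetical, and joining every level with "+" is an assumption based on the names seen in this example rather than a statement of DAGMan's full naming rules.

```shell
#!/bin/sh
# Hypothetical helper, not part of HTCondor.
# Usage: munged_node_name SPLICE [SPLICE...] NODE
# Joins the splice name(s) and the node name with "+", matching the
# names observed in this thread (e.g. dag1+B).
munged_node_name() {
  name=$1
  shift
  for part in "$@"; do
    name="$name+$part"
  done
  printf '%s\n' "$name"
}
```

With the names from this thread, munged_node_name dag1 B yields the prefix of the file dag1+B.out.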

Thanks again,

Brian.

P.S. For completeness, here's the final script for two DAGs, with all files in the top-level directory.

---- 8< ----
#!/bin/bash

write_dag() {
  cat <<EOS >"$2.dag"
JOB A foo.sub
VARS A job="A" input="$1" arg1="aaa" arg2="AAA" nodename="\$(JOB)"
JOB B foo.sub
VARS B job="B" input="$2+A.out" arg1="bbb" arg2="BBB" nodename="\$(JOB)"
SCRIPT POST B deliver.sh \$JOB
EOS
}


cat <<EOS >foo.sub
universe = vanilla
executable = foo.sh
arguments = "'\$(arg1)' '\$(arg2)'"
output = \$(nodename).out
error = \$(nodename).err
queue
EOS

cat <<'EOS' >foo.sh
#!/bin/sh
echo "$1"
cat
echo "$2"
EOS
chmod +x foo.sh

cat <<'EOS' >deliver.sh
#!/bin/sh -x
cat "$1.out" >>/tmp/deliver.out 2>>/tmp/deliver.err
EOS
chmod +x deliver.sh

echo "hello world" >data.in

write_dag data.in dag1
write_dag dag1+B.out dag2

cat <<EOS >dag.dag
SPLICE dag1 dag1.dag
SPLICE dag2 dag2.dag
PARENT dag1 CHILD dag2
NODE_STATUS_FILE node.status
JOBSTATE_LOG jobstate.log
EOS

# condor_submit_dag dag.dag
# Final output is in dag2+B.out
---- 8< ----