Hi Cole
That's interesting. I will think about whether changing would give us
a simplification that is worth pursuing.
Stefano
Hi Stefano,
Working legacy code is always an interesting time, but yes, a job can submit another job. In fact, that is exactly what DAGMan does: it will 'run' condor_submit on the job submit file of every node, and it actually runs condor_submit_dag for sub-DAG nodes. That said, the original job sticking around to do monitoring and bookkeeping argues against switching from direct execution to submission, since there is a limit on the number of running scheduler universe jobs. With two scheduler universe jobs per workflow (the setup job and the DAGMan job), you could only run half as many workflows.
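To make the arithmetic concrete, here is a rough Python sketch. The limit of 200 is purely an illustrative assumption (not a value taken from your pool), and `max_workflows` is a hypothetical helper, not an HTCondor API.

```python
def max_workflows(scheduler_job_limit: int, jobs_per_workflow: int) -> int:
    """Workflows that fit under a cap on running scheduler universe jobs."""
    if jobs_per_workflow < 1:
        raise ValueError("jobs_per_workflow must be at least 1")
    return scheduler_job_limit // jobs_per_workflow


# Illustrative cap only; the real value depends on the AP's configuration.
LIMIT = 200

# Today: the setup job itself becomes DAGMan, so 1 scheduler job per workflow.
print(max_workflows(LIMIT, 1))  # 200

# After switching: setup job + separately submitted DAGMan job, so 2 per workflow.
print(max_workflows(LIMIT, 2))  # 100
```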
-Cole Bollig
From: Stefano Belforte <stefano.belforte@xxxxxxx>
Sent: Friday, June 20, 2025 2:22 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: stefano.belforte@xxxxxxx <stefano.belforte@xxxxxxx>; Cole Bollig <cabollig@xxxxxxxx>
Subject: Re: [HTCondor-users] Halting a Dagman
Thanks Todd and Cole,
Indeed I had a "memory" of having read about <DAG file>.halt
but could not find it in the doc today, so I asked.
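For reference, the halt-file mechanism can be sketched in a few lines of Python. This is a minimal sketch, assuming the halt file must sit next to the DAG file and that removing it lets the DAG resume (my reading of the DAGMan documentation). The helper names are hypothetical, and the code has to run somewhere that can write to the DAG's directory, so it does not by itself solve creating the file from a remote machine.

```python
from pathlib import Path


def halt_file_for(dag_file) -> Path:
    """Return the path '<DAG file>.halt' in the same directory as the DAG file."""
    dag_path = Path(dag_file)
    return dag_path.with_name(dag_path.name + ".halt")


def halt_dag(dag_file) -> Path:
    """Create the (empty) halt file; assumed to make DAGMan stop starting new nodes."""
    halt_path = halt_file_for(dag_file)
    halt_path.touch(exist_ok=True)
    return halt_path


def resume_dag(dag_file) -> None:
    """Remove the halt file if present; assumed to let the DAG continue."""
    halt_file_for(dag_file).unlink(missing_ok=True)  # needs Python 3.8+
```

Usage would be something like `halt_dag("/path/to/workflow.dag")`, where the path is a placeholder for wherever the DAG file lives on the AP.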
I have to find a way to create that file remotely, since the action will
be initiated on a machine other than the AP. I guess I can't
"spool" files there anymore.
Our use case is the usual "very old thing which was never touched
since Brian wrote it 10+ years ago, maybe because at the time it
was the only/best way, and we haven't tried to improve it".
We remotely submit a scheduler universe job to the AP;
this job does some needed initializations (like unpacking a tarball, preparing
directories, reporting things back to the submitter, and doing some environment
configuration) and then unleashes condor_dagman as its last line. So eventually
DAGMan does run on the AP!
I do not know if we could use condor_submit_dag instead at that point:
can a job submit another job? The fact that the initial job does not exit,
but keeps running by executing condor_dagman, does help bookkeeping and
monitoring. Touching that would make the changes grow too much and too fast
for the current "it works, don't fix it" situation.
Thanks
Stefano