Hi Mark,
Thank you very much for your reply!
Michael's suggestion of using condor_wait is sufficient for me, at least for now. Its strength is that it does not require planning the whole workflow in advance, and it is very simple to implement.
I have been using it since yesterday and it has worked as expected. It can even mimic running condor_wait on two log files at once (if A and B are running and I want to start C once both A and B are done) by running the waits sequentially. I have also changed my
code to put all logs of a job cluster into the same log file, per his suggestion.
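The sequential-wait trick above can be sketched as a small shell script. This is only a sketch: clusterA.log, clusterB.log, and stepC.sub are placeholder names, and it assumes each upstream cluster writes a single shared log file, as Michael recommends.

```shell
#!/bin/bash
# Abort if any wait or submit fails.
set -e

# Wait for clusters A and B in turn. The order does not matter:
# if the second cluster already finished, its condor_wait returns
# immediately, so sequential waits behave like waiting on both at once.
condor_wait clusterA.log
condor_wait clusterB.log

# Both upstream clusters are done; submit the next step.
condor_submit stepC.sub
```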
Ideally, I would want to do something like this:
# on_the_fly_dag.dag
SCRIPT PRE NextStep CreateNextStepSub.py
JOB NextStep NextStep.sub
PARENT @1000 CHILD NextStep # @ indicates that it is a currently running cluster number
then
condor_submit_dag on_the_fly_dag.dag # if on_the_fly_dag.dag was not submitted yet => ID=1001
condor_submit_dag -update 1001 on_the_fly_dag.dag
# the command checks that the new dag does not contradict the dag job 1001 originally got and updates it
# in the simplest and easiest case, it would suffice to allow only addition of new vertices to the graph
This way I can keep adding to the DAG on the go, have some computing done before the whole workflow is finished, and, once all the steps are coded, on_the_fly_dag.dag can
easily be converted into the final DAG by replacing, e.g., @1000 with actual job names.
Thank you,
Siarhei.
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx]
On Behalf Of Mark Coatsworth
Sent: Wednesday, May 09, 2018 5:29 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] On-the-fly DAGs?
Hi Siarhei,
There are several different ways to do what you're asking for.
If Michael's suggestion of using condor_wait does what you need, that's great! I think you would need to run it manually, though, so it's a bit error-prone.
Another option would be to use POST scripts. If you put your original job into a single-node DAG, you could write a POST script which checks a certain condition. If the condition passes, your script would write out a new DAG file and then
run condor_submit_dag on it. If the condition fails, your script exits and the on-the-fly DAG is done.
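A minimal sketch of that POST-script pattern might look like the following. The file names (continue_workflow.sh, next_step_ready, next.dag, next_step.sub) and the readiness check are placeholders, not anything HTCondor defines.

```shell
#!/bin/bash
# continue_workflow.sh -- hooked into the DAG with a line like:
#   SCRIPT POST LastNode continue_workflow.sh
# DAGMan runs this script after LastNode finishes.

if [ -f next_step_ready ]; then
    # The next stage has been coded in the meantime: write a new
    # single-node DAG for it and submit that DAG. The new DAG carries
    # the same POST script, so the chain can keep extending itself.
    cat > next.dag <<'EOF'
JOB NextStep next_step.sub
SCRIPT POST NextStep continue_workflow.sh
EOF
    condor_submit_dag next.dag
fi

# If the condition fails, exit 0 and the on-the-fly DAG ends here.
exit 0
```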
On Wed, May 9, 2018 at 10:36 AM, Michael Pelletier <Michael.V.Pelletier@xxxxxxxxxxxx> wrote:
My upcoming HTCondor Week presentation goes over a few useful tricks with the newer submit description features which reduce the need for script-generated submit descriptions. Keep an eye out for it in the proceedings, or if you're attending
the conference I'll see you there! You might also find the HTCondor Python bindings to be useful for defining and submitting jobs.
Just to be sure I'm clear, we're not talking about the Error or Output parameters, but the Log parameter in the submit description. Generally you only ever want one log per cluster, since it logs the management of the entire cluster, or group of clusters,
from a single submission. It doesn't contain much information that is particularly useful in the context of a single job within a 1000-job cluster.
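In other words, a submit description along these lines (hypothetical file names) gives the whole cluster one log file that condor_wait can watch:

```
# jobs.sub -- all 1000 procs share a single per-cluster log
executable = process.sh
arguments  = $(Process)
log        = htcondor-$(Cluster).log
output     = out.$(Cluster).$(Process)
error      = err.$(Cluster).$(Process)
queue 1000
```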
As for using a cluster number instead of a log file, you could do a condor_wait wrapper like so:
#!/bin/bash
# Wait on the (single) UserLog of the cluster whose ID is given as $1
condor_wait "$(condor_q "$1" -af UserLog | head -1)"
You'd give this script the job ID as the argument, and it would wait until all the jobs in the specified cluster are done, assuming the cluster defines a UserLog.
-Michael Pelletier.
-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Vaurynovich, Siarhei
Sent: Wednesday, May 9, 2018 11:17 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [External] Re: [HTCondor-users] On-the-fly DAGs?
Thank you for your reply, Michael!
That sounds like what I want. I would just prefer not to give a log file as input, but only a cluster number, and let Condor figure out which log file to watch.
Currently, my submit files are generated programmatically and each job in a cluster gets its own log file. It seems I need to reconsider that.
Thank you,
Siarhei.
-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Michael Pelletier
Sent: Wednesday, May 09, 2018 10:20 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] On-the-fly DAGs?
Sounds like a good use for condor_wait.
When you give condor_wait a job's log file (log = htcondor-$(Cluster).log), it watches the file and exits only when all the jobs in that log have completed.
So what you'll want to do is write a little script which runs condor_wait on the pending job cluster and then submits your next job once condor_wait exits.
You could submit that script as a "local" universe job, so that the condor_wait sitting around doing nothing wouldn't tie up a CPU slot.
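The local-universe idea might look something like this, where wait_then_submit.sh is a placeholder script that runs the condor_wait wrapper on the cluster ID passed as its argument and then condor_submit's the next step:

```
# wait_then_submit.sub -- run the wrapper on the submit machine itself,
# so the idle condor_wait does not occupy an execute slot
universe   = local
executable = wait_then_submit.sh
arguments  = 1000
log        = wait_then_submit.log
queue
```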
-Michael Pelletier.
-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Vaurynovich, Siarhei
Sent: Tuesday, May 8, 2018 9:35 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [External] [HTCondor-users] On-the-fly DAGs?
Hello,
Could you please let me know if it is possible to create on-the-fly DAGs in HTCondor?
Here is an example: I work on some code and, when it is ready, I submit a number of jobs as job cluster 1000. After that I work on the next processing step and finish the needed code before the jobs in cluster 1000 have completed. I want to be able to say: start
this next set of jobs when and if all the jobs in cluster 1000 complete successfully, i.e., I want to create an "on-the-fly" DAG. The goal is to have some computing done on early steps of the workflow before the whole workflow's code is ready, and to
keep adding to the workflow on the fly.
Thank you,
Siarhei.
............................................................................
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to
htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
--
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin-Madison