Re: [Condor-users] Best Practices: how to handle a DAG with an unknown number of jobs in one step...
- Date: Thu, 25 Feb 2010 10:47:23 -0600 (CST)
- From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Best Practices: how to handle a DAG with an unknown number of jobs in one step...
On Thu, 25 Feb 2010, Daniel Pittman wrote:
G'day.
We have a regular set of jobs, and I am a bit puzzled about how best to
model them in Condor. Specifically, the model is this:
Step 1: Fetch data from our collection system.
Step 2: Process that data into an unknown number of "packs".
Step 3: Generate one report for each pack from step 2.
Now, steps one and two are pretty easy, but I would ideally like to have
a single DAG that would encapsulate the whole process.
What gives me trouble is working out how to get step 3 to generate one
Condor job for each pack, since they can run in parallel trivially,
and we generally only run one or two of the overall jobs at any given
time.[1]
I can think of two ways to approach this:
1) Use a single submit file for step 3 that submits however many jobs
you need. (This may not be possible depending on how much the jobs
have to differ in their arguments, etc., because for DAGMan all of the
jobs have to be in the same cluster.)
2) Use a nested DAG for step 3 with one node for each "pack".
Anyhow, here's the explanation of the two approaches (this is assuming
you're running a recent DAGMan (e.g., 7.4.1 or later) -- if you're using
something much older than that, this stuff will be harder to do).
For option 1, all you have to do is have the node job for the process
step (or its post script) write the submit file for the report step. In
recent versions of DAGMan, the submit file for a node job doesn't have to
exist until right before that job is actually submitted, so you can do
this. In an older version, you'll need a "place-holder" submit file in
existence ahead of time, and overwrite it (but you must use the
same log file). So your DAG would look like this:
Job fetch fetch.sub
Job process process.sub
Job report report.sub
Parent fetch Child process
Parent process Child report
and the process step would have to write report.sub.
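For example, if the process step found five packs, the report.sub it
writes could look roughly like this (the executable name, file names,
and the count of 5 are just placeholders for illustration; the key
points are that all the report jobs land in a single cluster and share
one log file):

universe   = vanilla
executable = make_report
arguments  = pack_$(Process).dat
output     = report_$(Process).out
error      = report_$(Process).err
log        = report.log
queue 5

The $(Process) macro expands to 0 through 4 across the queued jobs, so
each one picks up a different pack file.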
For option 2, your DAG file would look like this:
Job fetch fetch.sub
Job process process.sub
Subdag External report report.dag
Parent fetch Child process
Parent process Child report
and the process job would have to write report.dag. (For this to work,
you'll need a dummy report.dag file in place at submit time, but
you can overwrite it. This restriction should go away soon.) Unless
you can re-use the same submit file for all "packs" using the VARS feature
(I would guess you probably can), the process job would also have to write
the submit files for the report.dag nodes.
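For instance, if one submit file does cover every pack, the report.dag
that the process job writes might look something like this for three
packs (the node names, the packfile variable, and report_node.sub are
placeholder names, not anything required by DAGMan):

Job pack1 report_node.sub
Job pack2 report_node.sub
Job pack3 report_node.sub
Vars pack1 packfile="pack1.dat"
Vars pack2 packfile="pack2.dat"
Vars pack3 packfile="pack3.dat"

report_node.sub would then refer to $(packfile) on its arguments (and
output/error) lines, so the same submit file serves every node.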
Option 2 is a little more work, but is a more general solution.
Kent Wenger
Condor Team