[Condor-users] Dagman with a variable number of jobs
- Date: Tue, 01 Jun 2010 23:17:14 -0400
- From: Robert Sandilands <rsandila@xxxxxxxxxxxxx>
I am just fishing for some ideas on how to better handle the following
scenario:
Job 1 collects a list of data objects to process from a database. This
list can be of variable size and this size is unknown until the first
job finishes.
Jobs 2 .. N then process the data objects in batches of up to 1,000
items per job and update the database.
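A minimal sketch of the batching step, assuming the list produced by
Job 1 is available as an in-memory Python sequence. The function name
split_into_batches is hypothetical and is not part of Condor:

```python
def split_into_batches(items, batch_size=1000):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Example: 2,500 object ids become batches of 1,000, 1,000 and 500.
object_ids = list(range(2500))
batch_sizes = [len(batch) for batch in split_into_batches(object_ids)]
```

Each batch would then be handed to one processing job.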
As an added complication, this needs to run either once a day or
continuously, depending on the specific data objects.
My current attempt to solve this involves running a script that submits
a DAG (right terminology?) and waits for it to finish. It then sleeps
for the required amount of time and resubmits the DAG. Inside the
.dag file I use PRE scripts to determine which individual jobs need to
be submitted and which do not.
This works fine if there is a reasonable upper limit to the number of
data objects. The number of items in the list can be anything from 1,000
to 1,000,000.
If we assume that no single job should process more than 1,000 items,
then there is one collecting job plus between 1 and 1,000 processing
jobs, so there can be between 2 and 1,001 jobs in the DAG.
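The arithmetic behind those bounds can be sketched as follows, assuming
one collecting job plus one processing job per batch. The function name
dag_job_count is hypothetical:

```python
def dag_job_count(n_items, batch_size=1000):
    """Total jobs in the DAG: one collector plus one job per batch."""
    n_batches = -(-n_items // batch_size)  # ceiling division on integers
    return 1 + n_batches


smallest = dag_job_count(1_000)      # 1 batch + collector = 2
largest = dag_job_count(1_000_000)   # 1,000 batches + collector = 1,001
```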
Is it even possible to write a DAG with that number of dependencies,
especially since there is only 1 parent? I have tested up to 51 jobs and
that seems to run without any issues.
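For reference, the shape of such a DAG file (one parent, N children)
can be generated rather than written by hand. This is only a sketch
under assumptions: the node names and the submit file names collect.sub
and process.sub are hypothetical, and the number of children has to be
chosen up front (for example as an upper bound), since the real count is
unknown until the first job finishes. It shows the file layout only and
says nothing about how well DAGMan copes with that many children:

```python
def build_dag(n_children, collect_submit="collect.sub",
              process_submit="process.sub"):
    """Return DAG file text with one parent node and n_children child nodes."""
    lines = [f"JOB collect {collect_submit}"]
    for i in range(n_children):
        name = f"batch{i}"
        lines.append(f"JOB {name} {process_submit}")
        # Pass the batch index to the submit file as the macro $(batch).
        lines.append(f'VARS {name} batch="{i}"')
        # One dependency line per child keeps every line short.
        lines.append(f"PARENT collect CHILD {name}")
    return "\n".join(lines) + "\n"


dag_text = build_dag(3)
```

Writing dag_text to a file gives something that can be passed to
condor_submit_dag.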
And what do you do if the list suddenly grows to have 1,000,001 entries?
Any ideas would be appreciated.
Robert