
Re: [Condor-users] change dagman's maxjobs while dagman is running?



On Tue, 5 Oct 2010, Carsten Aulbert wrote:

> Hi,
>
> I've found an email thread from 2005 discussing this:
>
> https://www-auth.cs.wisc.edu/lists/condor-users/2005-February/msg00373.shtml
>
> Is this possible nowadays? I have a long-running DAGMan here which
> currently runs up to 500 jobs at once, but the file servers can probably
> go up to 1000 or beyond, and I would like to increase this number without
> restarting the 120k-node DAGMan.

There's not really a "clean" way to do this, but depending on how maxjobs 
was specified, there are some things you can do.

If your maxjobs limit is specified in a per-DAG config file, you can
edit the config file and then do condor_hold and condor_release on the
DAG job itself.  That will cause DAGMan to restart and go into recovery
mode, but in the meantime, any node jobs already in the queue will
continue running.
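
A rough sketch of that sequence, assuming the DAG was submitted with a
per-DAG config file named dagman.config and a DAGMan cluster ID of 123
(both placeholders), and that DAGMAN_MAX_JOBS_SUBMITTED is the config
knob behind maxjobs:

```shell
DAG_CLUSTER=123          # hypothetical cluster ID of the condor_dagman job
DAG_CONFIG=dagman.config # hypothetical per-DAG config file

# Raise the limit in the per-DAG config file; a later setting of the
# same macro overrides an earlier one.
echo "DAGMAN_MAX_JOBS_SUBMITTED = 1000" >> "$DAG_CONFIG"

# Bounce the DAGMan job so it restarts in recovery mode and rereads
# the config.  Echoed here as a dry run rather than executed:
echo "condor_hold $DAG_CLUSTER"
echo "condor_release $DAG_CLUSTER"
```

Node jobs already queued keep running across the hold/release, since
only the DAGMan scheduler job itself is held.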

If maxjobs is specified in the command-line arguments, I guess you could
do something like condor_hold, then condor_qedit to change the arguments,
and condor_release.
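
A dry-run sketch of that hold/qedit/release sequence. The cluster ID
and the Arguments string below are made-up placeholders; in practice you
would read the current value with condor_q -l first and change only the
-MaxJobs portion:

```shell
DAG_CLUSTER=123                                # hypothetical cluster ID
NEW_ARGS='-f -l . -Dag my.dag -MaxJobs 1000'   # illustrative arguments only

# Build the qedit command that rewrites the Arguments ClassAd attribute.
QEDIT_CMD="condor_qedit $DAG_CLUSTER Arguments \"$NEW_ARGS\""

# Echoed rather than executed, since this is only a sketch:
echo "condor_hold $DAG_CLUSTER"
echo "$QEDIT_CMD"
echo "condor_release $DAG_CLUSTER"
```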

Finally, you can do condor_rm of the DAGMan job, and re-submit it with a
different maxjobs setting to run the resulting rescue DAG; but that will
remove all running node jobs, so you'll waste some work.

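A sketch of the remove-and-resubmit route, with placeholder names. The
rescue DAG records which nodes already finished, so only in-flight work
is redone; in recent versions resubmitting the original DAG file should
pick up the rescue file automatically, while older versions need the
.rescue file submitted explicitly:

```shell
DAG_CLUSTER=123  # hypothetical DAGMan cluster ID
DAG_FILE=my.dag  # hypothetical DAG file

# Echoed as a dry run rather than executed:
echo "condor_rm $DAG_CLUSTER"
echo "condor_submit_dag -maxjobs 1000 $DAG_FILE"
```
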
Kent Wenger
Condor Team