Erik Paulson wrote:
On Fri, Jun 22, 2007 at 02:39:10PM +0300, Mark Silberstein wrote:
The only problem was killing the startd when the actual program terminates. At the moment we make the program that is started by the startd kill the startd right before it terminates, but that's awkward. If the startd had a parameter that made it shut itself down when the backfill executable exits, that would be fantastic. We have no problem fixing that ourselves, but we thought maybe this parameter could be added in the 6.9.x series. Any other ideas would be appreciated!
You could try STARTD_NOCLAIM_SHUTDOWN, which is the number of seconds the startd will stay unclaimed before shutting itself down. It may work with "Backfill" jobs as the "Claim", but I'm not sure if it does (or, actually, even if it should! :)
I checked the code. Backfill does count as a claim for this purpose, so your idea should work.
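
For reference, a minimal sketch of what that setting might look like in the local config file. The 600-second value is an arbitrary number picked for illustration, not a recommendation:

```
# Illustrative only: shut the startd down once it has gone
# 600 seconds (10 minutes) without a claim.
STARTD_NOCLAIM_SHUTDOWN = 600
```

Since backfill counts as a claim here, the timer should only matter once the backfill program has exited and the slot is sitting unclaimed.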
Another idea is to use the new DAEMON_SHUTDOWN expression, but you might have to wait for 6.9.4. In 6.9.4, there will be attributes in the startd ad advertising how much time it has spent in each state+activity, so you would be able to configure the shutdown expression to stop the startd if some backfill time has been used, but the slot is no longer in the backfill state.
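
A rough sketch of what such an expression could look like. The attribute name TotalTimeBackfillBusy is an assumption on my part about how the per-state time will be advertised, so check the names in the startd ad of whatever version you end up running before relying on it:

```
# Hypothetical sketch: TotalTimeBackfillBusy is an assumed attribute name.
# Shut the startd down once the slot has spent some time running backfill
# but is no longer in the Backfill state.
STARTD.DAEMON_SHUTDOWN = (State != "Backfill") && \
                         (TotalTimeBackfillBusy =!= UNDEFINED) && \
                         (TotalTimeBackfillBusy > 0)
```

The =!= UNDEFINED guard is there so the expression evaluates to false, rather than undefined, on a slot that has never advertised any backfill time.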
Another idea is to use MaxJobRetirementTime or condor_off -peaceful to let the startd run until the job boundary. Again, I'm not sure that the startd treats the "backfill" as a job, so it may not work either.
I checked the code for this too. MaxJobRetirementTime doesn't apply to backfill, so this one won't work.
--Dan