Re: [Condor-users] Maximum number of retries per job?
- Date: Fri, 22 Oct 2004 15:47:23 +0100
- From: Angel de Vicente <angelv@xxxxxx>
- Subject: Re: [Condor-users] Maximum number of retries per job?
Hi,
at last I'm going to implement this, but according to the documentation:
NumRestarts: A count of the number of restarts from a checkpoint attempted by
this job during its lifetime.
So I guess that if I want to apply this only to vanilla jobs, I had better use
JobRunCount. It does not seem to be documented in the manual, but I assume it
counts how many times the job has started. I was going to try:
condor_submit -a 'periodic_remove = JobRunCount > 10 && JobUniverse == 5' "$@"
Any issues with this?
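A minimal sketch of how the wrapper could be laid out, under a few assumptions that are not from the thread: the variable REAL_SUBMIT and the function submit_with_policy are hypothetical names, and REAL_SUBMIT would need to point at the real condor_submit binary (a site-specific path) so that a wrapper installed under the same name does not call itself. Arguments are forwarded with "$@" so that file names containing spaces survive intact.

```shell
#!/bin/sh
# Sketch of a condor_submit wrapper (hypothetical names, not an official tool).

# Assumption: set REAL_SUBMIT to the full path of the real binary at your
# site; the bare name only works if the wrapper itself is named differently.
REAL_SUBMIT="${REAL_SUBMIT:-condor_submit}"

# Remove a job once it has started more than 10 times, but only in the
# vanilla universe (JobUniverse code 5).
POLICY='periodic_remove = JobRunCount > 10 && JobUniverse == 5'

submit_with_policy() {
    # -a appends a line to the submit description; "$@" forwards every
    # original argument unchanged, one word per argument.
    "$REAL_SUBMIT" -a "$POLICY" "$@"
}

# In an installed wrapper, the last line of the script would be:
#   submit_with_policy "$@"
```

Keeping the policy in one quoted variable means the shell never interprets the ">" and "&&" inside the expression.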
Thanks a lot,
Angel de Vicente
Peter F. Couvares writes:
> Angel de Vicente wrote:
> >> periodic_remove = NumRestarts > 10
> >
> > 1. could it be possible then to automatically add this to every user's
> > job description?
>
> One kludgey-but-effective way to do this now is via a condor_submit
> wrapper which adds a "-a periodic_remove = NumRestarts > 10" argument
> to submits.
>
> But we're working on a SYSTEM_PERIODIC_REMOVE config expression which
> will allow the administrator to set a schedd-wide policy independent
> from that which the users set in their personal periodic_remove. It
> should be in an upcoming 6.7 series release (probably 6.7.3), but no
> promises.
>
> > 2. I wouldn't want to kill a standard universe job that has restarted
> > more than 10 times. Is there a way to differentiate between restarts in
> > the vanilla universe and restarts in the standard universe?
>
> The universe of a job is advertised in its "Universe" attribute. Just
> add that to the periodic_remove expression so it only becomes true for
> the job universes you want.
>
> -Peter
>
> --
> Peter Couvares University of Wisconsin-Madison
> Condor Project Research Department of Computer Sciences
> pfc@xxxxxxxxxxx 1210 W. Dayton St. Rm #4241
> (608) 265-8936 Madison, WI 53706-1685
--
----------------------------------
http://www.iac.es/galeria/angelv/
PostDoc Software Support
Instituto de Astrofisica de Canarias