Re: [HTCondor-users] Condor held jobs should retry/release after certain configured timeout automatically
- Date: Wed, 8 Apr 2015 08:42:32 -0500
- From: Brian Bockelman <bbockelm@xxxxxxxxxxx>
> On Apr 8, 2015, at 8:19 AM, Ben Cotton <ben.cotton@xxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Apr 8, 2015 at 7:26 AM, Sridhar Thumma <deadman.den@xxxxxxxxx> wrote:
>
>> SYSTEM_PERIODIC_RELEASE=((NumSystemHolds < 5 && (time() -
>> EnteredCurrentStatus) > 30) &&
>> (HoldReason.substr("InvalidAMIID.NotFound",0)!=""))
>>
> That's not how substr is called. I'm not sure substr would be all that
> helpful here anyway.
>
>> SYSTEM_PERIODIC_RELEASE=((NumSystemHolds < 5 && (time() -
>> EnteredCurrentStatus) > 30) && regexp("^.+InvalidAMIID.+$",HoldReason))
>>
> It looks like the regexp parsing doesn't like the use of ^ and $. You
> might try dropping that. I did a similar test for sleep jobs in my
> history (version 8.3.2):
>
> -bash-3.2$ condor_history -const 'regexp("^.+sleep.+$", Cmd)' | wc -l
> 1
> -bash-3.2$ condor_history -const 'regexp("sleep", Cmd)' | wc -l
> 5146
> -bash-3.2$
>
> Since you have a held job in the queue, you can use condor_q with a
> constraint to test your SYSTEM_PERIODIC_RELEASE expression before you
> set it.
>
Nah, the regexp is fine. See:
$ python
Python 2.6.6 (r266:84292, Jan 23 2014, 10:39:35)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import classad
>>> classad.ExprTree('regexp("^.+InvalidAMIID.+$",HoldReason)').eval({'HoldReason': 'I am an InvalidAMIID!'})
True
(Note that in your test, the regexp requires at least one character both before and after the 'sleep' string, so a Cmd that ends in "sleep", such as /bin/sleep, does not match.)
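To illustrate the difference between the two patterns in the condor_history test, here is a minimal sketch using Python's standard `re` module as a stand-in for the ClassAd `regexp()` function. That substitution is an assumption; for patterns this simple the two should behave the same way. The `cmds` list is made up for illustration and is not taken from anyone's job history.

```python
import re

# Hypothetical Cmd values, invented for this example.
cmds = ["/bin/sleep", "sleep", "/usr/bin/sleep.sh", "/home/user/sleeper"]

# ".+" means "one or more characters", so this pattern needs at least
# one character before AND after the literal "sleep".
anchored = re.compile(r"^.+sleep.+$")

# This pattern matches "sleep" anywhere in the string.
unanchored = re.compile(r"sleep")

anchored_hits = [c for c in cmds if anchored.search(c)]
unanchored_hits = [c for c in cmds if unanchored.search(c)]

print(len(anchored_hits), anchored_hits)
print(len(unanchored_hits), unanchored_hits)

# The same reasoning applies to the hold-reason pattern from the thread:
hold_pattern = re.compile(r"^.+InvalidAMIID.+$")
print(bool(hold_pattern.search("I am an InvalidAMIID!")))  # text on both sides
print(bool(hold_pattern.search("InvalidAMIID")))           # nothing on either side
```

Only the commands with text on both sides of "sleep" match the anchored pattern, which would explain a much smaller count without any parsing problem in ^ and $.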
Are you sure you don't have a hold/release loop?
Brian