Re: [HTCondor-users] Condor held jobs should retry/release after certain configured timeout automatically
- Date: Wed, 8 Apr 2015 09:19:15 -0400
- From: Ben Cotton <ben.cotton@xxxxxxxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Condor held jobs should retry/release after certain configured timeout automatically
On Wed, Apr 8, 2015 at 7:26 AM, Sridhar Thumma <deadman.den@xxxxxxxxx> wrote:
> SYSTEM_PERIODIC_RELEASE=((NumSystemHolds < 5 && (time() -
> EnteredCurrentStatus) > 30) &&
> (HoldReason.substr("InvalidAMIID.NotFound",0)!=""))
>
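That's not how substr is called. In ClassAd expressions it is a
function, substr(string, offset[, length]), not a method on the
attribute. I'm not sure substr would be all that helpful here anyway.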
> SYSTEM_PERIODIC_RELEASE=((NumSystemHolds < 5 && (time() -
> EnteredCurrentStatus) > 30) && regexp("^.+InvalidAMIID.+$",HoldReason))
>
It looks like the regexp parsing doesn't like the use of ^ and $. You
might try dropping them and matching on the bare substring. I did a
similar test for sleep jobs in my history (version 8.3.2):
-bash-3.2$ condor_history -const 'regexp("^.+sleep.+$", Cmd)' | wc -l
1
-bash-3.2$ condor_history -const 'regexp("sleep", Cmd)' | wc -l
5146
-bash-3.2$
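One possible reading of the counts above, not confirmed in the thread: the pattern "^.+sleep.+$" requires at least one character on both sides of "sleep", so a Cmd that ends in "sleep" cannot match it, while the bare "sleep" pattern matches anywhere in the string. The sketch below shows that behavior with Python's re module standing in for the regex engine HTCondor uses; the Cmd values are made up for illustration.

```python
import re

# Hypothetical Cmd values, invented for this illustration only.
cmds = ["/bin/sleep", "sleep", "/home/user/sleep.sh", "/usr/bin/sleeper"]

# Anchored pattern: needs one or more characters before AND after "sleep".
anchored = re.compile(r"^.+sleep.+$")
# Bare pattern: matches "sleep" anywhere in the string.
unanchored = re.compile(r"sleep")

anchored_hits = [c for c in cmds if anchored.search(c)]
unanchored_hits = [c for c in cmds if unanchored.search(c)]

print("anchored:  ", anchored_hits)
print("unanchored:", unanchored_hits)
```

Under this reading, "/bin/sleep" is rejected by the anchored pattern because nothing follows "sleep". The same would apply to a HoldReason that begins or ends with the error code.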
Since you have a held job in the queue, you can use condor_q with a
constraint to test your SYSTEM_PERIODIC_RELEASE expression before you
set it.
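As a rough sketch of that test, the snippet below assembles a condor_q command line from the candidate expression so it can be pasted into a shell. It assumes the anchors have been dropped from the regexp, and it adds a JobStatus == 5 (held) clause that is not in the original expression, so that only held jobs are considered.

```python
import shlex

# Candidate release expression from the thread, with ^ and $ dropped
# (an assumption; adjust to whatever pattern you settle on).
release_expr = (
    'NumSystemHolds < 5 && (time() - EnteredCurrentStatus) > 30 '
    '&& regexp("InvalidAMIID", HoldReason)'
)

# JobStatus == 5 means "held"; this clause is added for the test query only.
constraint = f"JobStatus == 5 && ({release_expr})"

# Quote each argument so the expression survives the shell intact.
command = ["condor_q", "-constraint", constraint]
command_line = " ".join(shlex.quote(part) for part in command)
print(command_line)
```

If the printed command lists the held job, the expression evaluates to true for it and should behave the same way once set as SYSTEM_PERIODIC_RELEASE.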
Thanks,
BC
--
Ben Cotton
main: 888.292.5320
Cycle Computing
Better Answers. Faster.
http://www.cyclecomputing.com
twitter: @cyclecomputing