Re: [Condor-users] Copying job attrs into slot attrs



I'm inclined to think that the only way to make what people expect to happen actually happen is to negotiate one slot at a time on an SMP machine, and to let the admin indicate that negotiation should not occur on that machine again until the negotiator spots an update from the machine indicating the start (or failure) of said job, or a timeout occurs.
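
The hold-off idea above could be sketched roughly like this. It is only an illustration in Python, not Condor code: the class `MachineHoldoff`, its method names and the 300 second default timeout are all hypothetical.

```python
import time


class MachineHoldoff:
    """Track machines that must not be matched again until they report back.

    Hypothetical sketch: none of these names exist in Condor.
    """

    def __init__(self, timeout_seconds=300.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self._held_since = {}  # machine name -> time of the last match

    def can_negotiate(self, machine):
        """True if the machine has no outstanding match or the hold timed out."""
        held_at = self._held_since.get(machine)
        if held_at is None:
            return True
        if self.clock() - held_at >= self.timeout_seconds:
            del self._held_since[machine]  # give up waiting for an update
            return True
        return False

    def record_match(self, machine):
        """Call after handing one slot on this machine to a job."""
        self._held_since[machine] = self.clock()

    def record_update(self, machine):
        """Call when a fresh ad shows the job started (or failed to start)."""
        self._held_since.pop(machine, None)
```

The negotiation loop would check `can_negotiate` before considering any slot on a machine, so at most one match per machine is ever in flight.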

The alternative is that all calculation currently done within the client machines must be pushed to the negotiator (with the associated split-brain issue, but that is inevitable given that current negotiation works on the collector's values, not the real ones). That means all 'dynamic' values like load, pushed in via cron/hawkeye, are unavailable for this purpose, which strikes me as less useful in reality.

Having to manually indicate the required 'shared' state strikes me as a source of no end of bugs (plus, what happens if someone specifies a 'dynamic' value in this list?).

Indicating that a job doesn't care about the dynamic state of the machine is possibly a reasonable optimization switch though, as it makes the job a candidate for inserting onto a machine already negotiated with. (Working this out from the requirements and rank may be a bit hard.)

This comes down to a wider problem, though, one that is only getting worse:

What values are valid in what context, and what guarantees (if any) are there about the reliability of the value provided?

Random (meandering, largely unstructured) thoughts on this below.

Providing some explicit NegotiatorKnown_ prefixed values, available only for RANK, Requirements etc., whereby these guarantees can be made explicit, might work.

Realistically I can't see a decent solution for anyone except writing their own backend used by the startd hooks (since inherently the startd has full knowledge of its own state at that stage). Really, for complex systems, I see the only meaningful solution as accepting that your selection process itself must be distributed to have strong awareness, but that the transactional nature of this must then be enforced by an external database-like system where scaling is proportional to the complexity of the locking requirements (so a simple 'only one machine gets to pick' is relatively cheap, but people wanting to ensure that only n such jobs run at once anywhere pay a heavier cost in synchronization).
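
To make the difference between those two locking requirements concrete, here is a minimal sketch using Python's standard `sqlite3` module. The `slots` table, its columns and both function names are assumptions made up for illustration; a real pool would need a shared database server rather than an in-memory one.

```python
import sqlite3


def make_db():
    """Create an in-memory table of slots (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE slots (name TEXT PRIMARY KEY, job_kind TEXT)")
    return db


def claim_slot(db, slot, job_kind):
    """Cheap case: exactly one claimant wins a given free slot."""
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "UPDATE slots SET job_kind = ? WHERE name = ? AND job_kind IS NULL",
            (job_kind, slot),
        )
    return cur.rowcount == 1


def claim_slot_with_limit(db, slot, job_kind, limit):
    """Costlier case: at most `limit` jobs of this kind anywhere in the pool."""
    with db:
        cur = db.execute(
            "UPDATE slots SET job_kind = ? "
            "WHERE name = ? AND job_kind IS NULL "
            "AND (SELECT COUNT(*) FROM slots WHERE job_kind = ?) < ?",
            (job_kind, slot, job_kind, limit),
        )
    return cur.rowcount == 1
```

The first claim only touches one row, while the second has to count across the whole table inside the same statement, which is where the heavier synchronization cost comes from.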

The alternative is not ClassAds but proper predicate filtering logic in some first-class programming language, which can run at the negotiator and look at general (unbounded) things like the set of other jobs already on/assigned to the machine, which simply don't play nice with the ClassAd structure. If your systems are utterly homogeneous and thus amenable to the hideous but effective

    (SLOT1_XX blah) && (SLOT2_xx blah) ... (SLOTN_xx blah) 

sort of thing, you can write stuff that works, but it is a mess to maintain and inherently unpleasant.
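
If you do go that way, generating the expression is at least less painful than writing it by hand. A small sketch in Python, where the helper name `all_slots_expr` and the attribute name `Foo` in the usage line are made up:

```python
def all_slots_expr(template, slot_count, joiner=" && "):
    """Repeat one per-slot clause for slots 1..slot_count.

    `template` uses {n} for the slot number, e.g. "(SLOT{n}_Foo =!= True)".
    The attribute name is whatever your pool advertises; "Foo" is made up.
    """
    if slot_count < 1:
        raise ValueError("slot_count must be at least 1")
    clauses = [template.format(n=n) for n in range(1, slot_count + 1)]
    return joiner.join(clauses)
```

For example, `all_slots_expr("(SLOT{n}_Foo =!= True)", 4)` yields the four-clause conjunction for a 4-slot machine. It only works because every machine has the same slot count, which is exactly the homogeneity caveat above.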

For the setup we currently use, none of the current 'intended' setups for negotiation work well:

 1. multiple schedds: the negotiator manages them playing nice, but we are forced to use machine RANK to manage the priority
   + retirement allows us to control pre-emption at least
   - cannot dynamically allocate slots to different roles as the negotiator/startd state disconnect problem kicks in
 2. single schedd runs the whole shebang as one user
   + just use job priority to manage things (separate process makes sure the queue is always in order)
   - simply won't scale (and is a massive single point of failure)
 3. job hooks
   + total control
   - quite a lot of effort
   - still not ready for prime time (at least on Windows) as far as I can tell

Really what I want is the ability to run my own negotiator that does its own thing without explicit schedds (thus centralising my logic and allowing efficient in-memory locks as opposed to complex distributed ones). Steady-state optimizations would follow: I maintain the (possibly multiple) queues in priority order and thus pay the cost only on insertion rather than on negotiation, which becomes pretty trivial. Negotiation then scales with the notional queues (of which I would realistically have two) or notional queue/user pairs (tops 20) multiplied by slots, rather than with any combinatorial aspect (with the associated attempts to see how such combinations can be reduced by spotting similarities automatically, as now happens).
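
A rough sketch of what "cost only on insertion" means, in Python with the standard `bisect` module. The class `OrderedQueue` and the convention that a larger number means higher priority are assumptions for illustration only.

```python
import bisect
import itertools


class OrderedQueue:
    """Keep jobs sorted at insertion time so matching can just walk the list."""

    def __init__(self):
        self._entries = []             # sorted list of (-priority, seq, job_id)
        self._seq = itertools.count()  # tie-breaker: earlier submissions first

    def insert(self, job_id, priority):
        # A binary search finds the position; the list insert itself is O(n).
        bisect.insort(self._entries, (-priority, next(self._seq), job_id))

    def in_order(self):
        """Job ids from highest to lowest priority, FIFO within a priority."""
        return [job_id for _, _, job_id in self._entries]
```

With the queue always sorted, the negotiator never has to sort or compare jobs against each other during a cycle.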

The existing databases used as intermediaries for working with condor queues become the actual queues, removing considerable additional complexity and effort (maintaining state in two different places), and I could even start to add Google-like behaviour whereby some long-running jobs are allowed to start more than once when there is extra capacity, to improve the chance of completion in the face of pre-emption or machine failure.
As a major benefit, there is absolutely no need in our system for the shadow processes. They bring no benefit in our setup: they are briefly used at start-up, then sit there consuming lots of contiguous RAM for their stacks and a few precious handles.

Substituting for the negotiator is simply unrealistic, however :(

In my case I would have to make use of the following invariant (quite likely common to others though) to optimise this:
Force clients to make sure their 'queue' is always in absolute order, i.e. the moment one job in the queue doesn't find a home, none will.
Separate streams should go on separate queues (or separate users).
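
Under that invariant the matching loop can give up on a queue at the first job that fails to find a slot. A minimal sketch in Python, where the function name `negotiate` and the `fits` callback are hypothetical stand-ins for whatever matching rule you actually use:

```python
def negotiate(queues, free_slots, fits):
    """Match each ordered queue against the free slots.

    queues:     dict of queue name -> jobs, already in absolute priority order
    free_slots: list of slot names still available
    fits:       callable(job, slot) -> bool, the caller's own matching rule
    """
    free = list(free_slots)
    matches = []
    for name, jobs in queues.items():
        for job in jobs:
            slot = next((s for s in free if fits(job, s)), None)
            if slot is None:
                break  # invariant: nothing later in this queue can match either
            free.remove(slot)
            matches.append((name, job, slot))
    return matches
```

The work per cycle is bounded by the number of queues times the number of slots, and a job whose stream needs different handling has to live on its own queue or it will be cut off by the `break`.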

/ramble

Matt

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Ian Chesal
Sent: 08 December 2009 15:19
To: Condor-Users Mail List
Subject: Re: [Condor-users] Copying job attrs into slot attrs

> Note that this will not play nice with SMP machines unless
> you make the negotiator only match one slot per machine per
> cycle as it will not deal with the empty state cleanly (it
> will happily pump 4 jobs at the machine in your above example).

Good point. We use a similar setup for jobs that want all of a machine, and we keep the special matching to just slot 1 on every machine.

Condor doesn't update the ClassAds at the negotiator, or in the other slots, fast enough for you to do any sort of reliable, synchronized decision making on a machine-wide basis.

> This is a real pain for asymmetric SMP utilization, I don't
> know if the condor team has any long term plans in this area.

I'd love to hear about the roadmap, if there is one, for this area.

Perhaps the new SMP support could deal with this? Where you can advertise a custom attribute on the machine's ad that jobs can then subtract from as they use the attribute? Much like they do for memory and disk, but this would be a user-assigned and configurable collection of attributes.
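
As a rough picture of that kind of subtract-as-you-go accounting, here is a sketch in Python. It does not describe any existing Condor feature; the class `MachineResources` and the attribute names in the usage note are invented for illustration.

```python
class MachineResources:
    """Machine-wide counters that running jobs subtract from.

    The attribute names are chosen by the admin; "licenses" below is made up.
    """

    def __init__(self, **totals):
        self.available = dict(totals)

    def try_claim(self, **wanted):
        """Subtract every requested amount, or change nothing if any is short."""
        for name, amount in wanted.items():
            if self.available.get(name, 0) < amount:
                return False
        for name, amount in wanted.items():
            self.available[name] -= amount
        return True

    def release(self, **returned):
        """Give amounts back when a job leaves the machine."""
        for name, amount in returned.items():
            self.available[name] = self.available.get(name, 0) + amount
```

A machine created as `MachineResources(licenses=2, scratch_gb=100)` would accept `try_claim(licenses=1, scratch_gb=60)` once and refuse the same request a second time, because the scratch space left over is too small.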

I really need to look into the new SMP support more... maybe this is in there already?

- Ian


