Re: [Condor-users] Schedd communication



Erik Paulson wrote:

It is a design decision to have separation of the update of the collector
and the request for the new negotiation cycle. It is not necessarily a
design decision for them to be out of order, but it is something that we
allow - the request for a reschedule is meant more as a hint. Occasionally
it can get there first, or sometimes not at all, but the system will
still eventually see the correct information, just maybe not until the
next negotiation cycle.

-Erik


Thank you. This works smoothly for "pure" Condor
(i.e., resources on which jobs are started by
condor_startd) but for Condor-G it may create
problems. This is because:

 1. For pure Condor, there is always a
    "resource claiming" step, so if the
    resource classad used by the negotiator
    is out-of-date, schedd will not claim
    the resource.

 2. For Condor-G, schedd submits the matched job
    (by way of condor_gridmanager) to the Globus
    resource and there is no true resource claiming,
    e.g., even if all the free CPUs on the Globus
    resource have been claimed by another job
    (between the time condor_advertise sent the
    classad and the time condor_gridmanager
    submitted the job), submission will succeed and
    the job will be pending in a queue at the resource.
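
To make the contrast between the two cases concrete, here is a
minimal Python sketch. The Resource class and its claim/submit
methods are hypothetical stand-ins invented for illustration; they
are not Condor or Globus APIs, and the model ignores everything
except the count of free CPUs.

```python
class Resource:
    """Toy resource with a fixed number of free CPUs (hypothetical model)."""

    def __init__(self, free_cpus):
        self.free_cpus = free_cpus
        self.pending = []  # jobs queued at the resource, waiting for a CPU

    def claim(self, job):
        # Case 1 (pure Condor): the claim is refused when no CPU is free,
        # so a match made from an out-of-date classad simply fails.
        if self.free_cpus == 0:
            return False
        self.free_cpus -= 1
        return True

    def submit(self, job):
        # Case 2 (Condor-G): submission always succeeds; if no CPU is free
        # the job waits in the resource's queue.
        if self.free_cpus > 0:
            self.free_cpus -= 1
            return "running"
        self.pending.append(job)
        return "pending"


def stale_match(use_claiming):
    """Another job takes the last CPU after the classad was sent."""
    resource = Resource(free_cpus=1)
    resource.submit("other-job")  # the advertised classad is now out of date
    if use_claiming:
        return resource.claim("my-job")
    return resource.submit("my-job")
```

With claiming, the stale match is rejected and the job stays with
schedd; without it, the job is accepted and sits pending at the
resource.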

The problem in case 2 is that, because matching
may occur on the next negotiation cycle, one has
to make the negotiation interval short, so that
the likelihood of matching with out-of-date
resource status is small. However, there is a lower
limit on how short the negotiation interval can be:
it must be at least as long as the longest
negotiation cycle, which in turn increases with
the number of jobs to be matched.

Now, if we assume that the Globus resource only
receives jobs from a single Condor-G machine,
the resource status is under the control of a
single schedd and negotiator. In this case,
missing a negotiation cycle is not harmful.
But the lack of resource claiming can cause
resource overloading even in this case, as follows:

1. In negotiation cycle 1, all the free
   resources of a Globus machine are matched
   to jobs at time t1. The negotiator sets
   CurMatches = 1 (assuming one job is matched).

2. Schedd submits the jobs to the resource
   (via condor_gridmanager) at time t2.

3. Jobs are created on the resource at time t3.

4. condor_advertise sends a resource classad to
   the collector at time t, where t1 < t < t3
   (so this classad does not reflect the
   matched jobs). This classad is received by
   the collector at time t4.

5. Negotiation cycle 2 occurs at time t1 + T,
   where T is the negotiation interval and
   t4 < t1 + T (so the negotiator sees a
   new resource classad and resets CurMatches).
   Assuming that there are still idle jobs requesting
   these resources, the negotiator will match the
   resources, without noticing that these resources
   are already matched.

6. The newly matched jobs will be submitted to the
   same Globus resource, but will be pending until
   the previously submitted jobs complete.
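
The sequence above can be replayed with a small Python sketch. It
is only a toy model under stated assumptions: ResourceAd, Negotiator
and simulate are hypothetical names, the free_cpus field stands in
for whatever the resource classad actually advertises, and the rule
"matchable while CurMatches is below the advertised free CPUs" is my
simplification of how CurMatches is used.

```python
from dataclasses import dataclass


@dataclass
class ResourceAd:
    free_cpus: int     # free CPUs reported in the classad (assumed field)
    sampled_at: float  # time at which condor_advertise built the classad


class Negotiator:
    """Toy negotiator mimicking the CurMatches behaviour described above."""

    def __init__(self):
        self.ad = None
        self.cur_matches = 0

    def receive_ad(self, ad):
        # Any new resource classad resets CurMatches, even a stale one.
        self.ad = ad
        self.cur_matches = 0

    def negotiate(self, idle_jobs):
        # Match idle jobs while CurMatches is below the advertised free CPUs.
        available = self.ad.free_cpus - self.cur_matches
        matched = max(0, min(idle_jobs, available))
        self.cur_matches += matched
        return matched


def simulate(stale_ad_arrives=True):
    neg = Negotiator()
    neg.receive_ad(ResourceAd(free_cpus=1, sampled_at=0.0))
    first = neg.negotiate(idle_jobs=2)  # cycle 1 at time t1
    if stale_ad_arrives:
        # Classad built at time t (t1 < t < t3): the matched job does not
        # exist on the resource yet, so one CPU is still reported free.
        # It reaches the collector at t4, before cycle 2.
        neg.receive_ad(ResourceAd(free_cpus=1, sampled_at=11.0))
    second = neg.negotiate(idle_jobs=1)  # cycle 2 at time t1 + T
    return first, second
```

In this model simulate() matches one job in each cycle although
the resource has a single CPU, while simulate(stale_ad_arrives=False)
matches nothing in the second cycle.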

This could be avoided if there were a mechanism
to instruct the negotiator not to match a
resource classad until this classad has a
certain value of a certain attribute.

Currently, the negotiator supports the resource
classad attribute CurMatches, which makes it
possible to disable matchmaking until the next
resource classad is received. However, the condition
"next resource classad received" is not strong
enough, as shown above.

Now, if the negotiator provided support
for building a protocol like this:

  1. when the negotiator matches a resource
     it creates a job classad attribute like

     ID = negotiator_url-resource_url-sequence_number

  2. the negotiator keeps track of the largest
     sequence_number for a given resource_url

  3. this ID attribute is inserted in all jobs
     matched in this negotiation cycle with the
     given resource

  4. the negotiator will not match that resource
     again until either the resource classad contains
     a certain attribute with the value of ID or
     until the resource classad sets another constraint
     attribute, e.g.,

     Ignore_sequence_number = True

then scheduling for Grid jobs would become
more robust. That would involve the negotiator
supporting a new constraint attribute and
keeping track of an attribute it generates.
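
A rough Python sketch of the bookkeeping this would require on the
negotiator side follows. Nothing here exists in Condor today:
SequencedNegotiator and its match method are hypothetical, and
"LastSeenID" is a placeholder name I made up for the resource
classad attribute that echoes the ID back. Classads are modelled as
plain dictionaries.

```python
class SequencedNegotiator:
    """Hypothetical negotiator-side state for the proposed ID protocol."""

    def __init__(self, negotiator_url):
        self.negotiator_url = negotiator_url
        self.last_seq = {}  # resource_url -> largest sequence number issued
        self.pending = {}   # resource_url -> ID not yet seen in a resource ad

    def match(self, resource_url, resource_ad):
        """Return the ID to insert in the matched jobs, or None if blocked."""
        pending_id = self.pending.get(resource_url)
        if pending_id is not None:
            # Step 4: wait until the resource classad echoes the ID
            # ("LastSeenID" is an assumed attribute name) or overrides it.
            acked = resource_ad.get("LastSeenID") == pending_id
            override = resource_ad.get("Ignore_sequence_number", False)
            if not (acked or override):
                return None
        # Steps 1 and 2: issue the next sequence number for this resource.
        seq = self.last_seq.get(resource_url, 0) + 1
        self.last_seq[resource_url] = seq
        match_id = f"{self.negotiator_url}-{resource_url}-{seq}"
        # Step 3: the caller inserts match_id in every job matched to this
        # resource during the current negotiation cycle.
        self.pending[resource_url] = match_id
        return match_id
```

A stale resource classad (one without the expected ID) then leaves
the resource unmatched instead of resetting the negotiator's state.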


Thank you. Gabriel