Re: [Condor-users] In Dynamic slot to match a new job to slot1.2 need condor_negotiator
- Date: Sun, 11 Jan 2009 15:01:55 +0530
- From: "Sateesh Potturu" <sateeshpnv@xxxxxxxxx>
- Subject: Re: [Condor-users] In Dynamic slot to match a new job to slot1.2 need condor_negotiator
Hi Matt,
That helped. Now it behaves as you said: the slot is split once every negotiation cycle.
Thanks,
Sateesh
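
The once-per-cycle behaviour discussed in this thread can be pictured with a small toy model. This is only a sketch under stated assumptions, not Condor code: `PartitionableSlot`, `split_for` and `run_cycles` are hypothetical names, every job is assumed to need one CPU, and job 54.0 is an invented second job next to the 53.0 seen in the quoted log.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionableSlot:
    """Toy stand-in for a partitionable slot such as slot1 (not Condor code)."""
    name: str
    free_cpus: int
    dynamic_slots: list = field(default_factory=list)

    def split_for(self, job_id):
        """Carve one dynamic slot for job_id, or return None if no CPU is free."""
        if self.free_cpus < 1:
            return None
        self.free_cpus -= 1
        child = f"{self.name}.{len(self.dynamic_slots) + 1}"
        self.dynamic_slots.append((child, job_id))
        return child


def run_cycles(slot, idle_jobs, cycles):
    """Assumption of the model: at most one split of the slot per cycle."""
    history = []
    queue = list(idle_jobs)
    for cycle in range(1, cycles + 1):
        matched = None
        if queue:
            child = slot.split_for(queue[0])
            if child is not None:
                matched = (queue.pop(0), child)
        history.append((cycle, matched))
    return history


if __name__ == "__main__":
    slot1 = PartitionableSlot("slot1", free_cpus=2)
    # Job ids are illustrative; 54.0 does not appear in the thread.
    for cycle, matched in run_cycles(slot1, ["53.0", "54.0"], cycles=3):
        print(cycle, matched)
```

In this model the second job can only start in the second cycle, which is why forcing an extra cycle (for example by restarting the negotiator) appears to "fix" the match.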
On Sat, Jan 10, 2009 at 9:53 PM, Matthew Farrellee <matt@xxxxxxxxxx> wrote:
> Is there a combination of these configs that makes it work for you?
>
> NEGOTIATOR_IGNORE_USER_PRIORITIES = TRUE
> NEGOTIATOR_MATCHLIST_CACHING = FALSE
>
> If there was a requirements mismatch, does condor_q -better-analyze give
> any hints?
>
> Best,
>
>
> matt
>
> Sateesh Potturu wrote:
>> Hi Matt,
>>
>> But why is this job not getting started, with "no match found" reported
>> in the negotiator log?
>>
>> I tested this feature too and faced the same problem, even though there
>> were a lot of negotiation cycles.
>>
>> 1/9 21:56:27 ---------- Started Negotiation Cycle ----------
>> 1/9 21:56:27 Phase 1: Obtaining ads from collector ...
>> 1/9 21:56:27 Getting all public ads ...
>> 1/9 21:56:27 Trying to query collector <192.168.2.100:9618>
>> 1/9 21:56:27 Sorting 5 ads ...
>> 1/9 21:56:27 Getting startd private ads ...
>> 1/9 21:56:27 Trying to query collector <192.168.2.100:9618>
>> 1/9 21:56:27 Got ads: 5 public and 1 private
>> 1/9 21:56:27 Public ads include 1 submitter, 1 startd
>> 1/9 21:56:27 Entering compute_significant_attrs()
>> 1/9 21:56:27 Leaving compute_significant_attrs() - result=JobUniverse,LastCheckp
>> 1/9 21:56:27 Phase 2: Performing accounting ...
>> 1/9 21:56:27 Phase 3: Sorting submitter ads by priority ...
>> 1/9 21:56:27 Phase 4.1: Negotiating with schedds ...
>> 1/9 21:56:27 NumStartdAds = 1
>> 1/9 21:56:27 NormalFactor = 1.000000
>> 1/9 21:56:27 MaxPrioValue = 0.557410
>> 1/9 21:56:27 NumScheddAds = 1
>> 1/9 21:56:27 Negotiating with sateesh@xxxx at <192.168.2.100:3538
>> 1/9 21:56:27 0 seconds so far
>> 1/9 21:56:27 Calculating schedd limit with the following parameters
>> 1/9 21:56:27 ScheddPrio = 0.557410
>> 1/9 21:56:27 ScheddPrioFactor = 1.000000
>> 1/9 21:56:27 scheddShare = 0.000000
>> 1/9 21:56:27 scheddAbsShare = 1.000000
>> 1/9 21:56:27 ScheddUsage = 3
>> 1/9 21:56:27 scheddLimit = 0
>> 1/9 21:56:27 userprioCrumbs = 0 (0)
>> 1/9 21:56:27 MaxscheddLimit = 0
>> 1/9 21:56:27 Socket to <192.168.2.100:35388> already in cache, reusing
>> 1/9 21:56:27 Over submitter resource limit (0) ... only consider startd rank
>> 1/9 21:56:27 Sending SEND_JOB_INFO/eom
>> 1/9 21:56:27 Getting reply from schedd ...
>> 1/9 21:56:27 Got JOB_INFO command; getting classad/eom
>> 1/9 21:56:27 Request 00129.00000:
>> 1/9 21:56:27 Rejected 129.0 sateesh@xxxx <192.168.2.100:35388
>> 1/9 21:56:27 Sending SEND_JOB_INFO/eom
>> 1/9 21:56:27 Getting reply from schedd ...
>> 1/9 21:56:27 Got NO_MORE_JOBS; done negotiating
>> 1/9 21:56:27 This schedd hit its scheddlimit.
>> 1/9 21:56:27 ---------- Finished Negotiation Cycle ----------
>>
>> --
>> Regards,
>> Sateesh
>>
>> On Fri, Jan 9, 2009 at 7:48 PM, Matthew Farrellee <matt@xxxxxxxxxx> wrote:
>>> Johnson koil Raj wrote:
>>>> Hi,
>>>>
>>>> I am using Condor 7.2.0 and have configured the system for dynamic slots.
>>>>
>>>> When I submit 2 jobs while the status shows Slot1@xxx, it matches only one
>>>> job, to Slot1.1@xxx, and for the second job it reports
>>>> 1 match but reject the job for unknown reasons
>>>> and the negotiator log says the following:
>>>>
>>>> 1/9 19:03:06 Socket to <192.168.111.5:9661> already in cache, reusing
>>>> 1/9 19:03:06 Over submitter resource limit (0) ... only consider startd ranks
>>>> 1/9 19:03:06 Sending SEND_JOB_INFO/eom
>>>> 1/9 19:03:06 Getting reply from schedd ...
>>>> 1/9 19:03:06 Got JOB_INFO command; getting classad/eom
>>>> 1/9 19:03:06 Request 00053.00000:
>>>> 1/9 19:03:06 Concurrency Limit: ccp is 3.000000
>>>> 1/9 19:03:06 Rejected 53.0 idealgrid@xxxxxxxxxxxxxxxxx <192.168.111.5:9661>: no match found
>>>> 1/9 19:03:06 Sending SEND_JOB_INFO/eom
>>>> 1/9 19:03:06 Getting reply from schedd ...
>>>> 1/9 19:03:06 Got NO_MORE_JOBS; done negotiating
>>>> 1/9 19:03:06 This schedd hit its scheddlimit.
>>>> 1/9 19:03:06 ---------- Finished Negotiation Cycle ----------
>>>>
>>>>
>>>> After restarting the negotiator, the second job matches and gets
>>>> executed on Slot1.2@xxx. This time the negotiator log says:
>>>>
>>>> 1/9 19:12:05 Socket to <192.168.111.5:9661> not in cache, creating one
>>>> 1/9 19:12:05 SocketCache: Found unused slot 0
>>>> 1/9 19:12:05 Sending SEND_JOB_INFO/eom
>>>> 1/9 19:12:05 Getting reply from schedd ...
>>>> 1/9 19:12:05 Got JOB_INFO command; getting classad/eom
>>>> 1/9 19:12:05 Request 00053.00000:
>>>> 1/9 19:12:05 Concurrency Limit: ccp is 3.000000
>>>> 1/9 19:12:05 Connecting to startd slot1@xxx at <192.168.111.200:9619>
>>>> 1/9 19:12:05 File descriptor limits: max 1024, safe 820
>>>> 1/9 19:12:05 Sending PERMISSION, claim id, startdAd to schedd
>>>> 1/9 19:12:05 Matched 53.0 idealgrid@xxxxxxxxxxxxxxxxx <192.168.111.5:9661> preempting none <192.168.111.200:9619> slot1@xxx
>>>>
>>>> Why is a negotiator restart required to match the second job? Please
>>>> help me with this.
>>>>
>>>> by
>>>> Johnson
>>> It's not required. Slot1 is only split once per negotiation cycle. So
>>> you'll get 1.1 after 1 cycle and 1.2 after a second. Your restart just
>>> forced a second cycle.
>>>
>>> Best,
>>>
>>>
>>> matt
>>>
>> _______________________________________________
>> Condor-users mailing list
>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/condor-users/
>
--
Regards,
Sateesh
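
When comparing negotiation cycles like the ones quoted above, it can help to pull out just the rejections and the "Over submitter resource limit" notices per cycle. The following is a minimal sketch, assuming the log lines look like those in this thread; `summarize` is a hypothetical helper, and the exact log wording may differ between Condor versions.

```python
import re

# Patterns taken from the log excerpts quoted in this thread.
REJECT_RE = re.compile(r"Rejected\s+(\d+\.\d+)\s")
CYCLE_START = "Started Negotiation Cycle"
OVER_LIMIT = "Over submitter resource limit"


def summarize(log_text):
    """Return one dict per negotiation cycle found in log_text.

    Each dict records the job ids rejected in that cycle and whether the
    submitter was reported as over its resource limit. Lines that appear
    before the first cycle marker are ignored.
    """
    cycles = []
    current = None
    for line in log_text.splitlines():
        if CYCLE_START in line:
            current = {"rejected": [], "over_limit": False}
            cycles.append(current)
        elif current is None:
            continue
        elif OVER_LIMIT in line:
            current["over_limit"] = True
        else:
            match = REJECT_RE.search(line)
            if match:
                current["rejected"].append(match.group(1))
    return cycles


if __name__ == "__main__":
    sample = "\n".join([
        "1/9 21:56:27 ---------- Started Negotiation Cycle ----------",
        "1/9 21:56:27 Over submitter resource limit (0) ... only consider startd rank",
        "1/9 21:56:27 Rejected 129.0 sateesh@xxxx <192.168.2.100:35388",
        "1/9 21:56:27 ---------- Finished Negotiation Cycle ----------",
    ])
    for number, cycle in enumerate(summarize(sample), start=1):
        print(number, cycle)
```

The substring checks also work on lines that still carry mail quoting prefixes such as `>> `, so excerpts pasted from a thread can be fed in unchanged.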