Re: [Condor-users] Fill nodes breadth-first
- Date: Fri, 06 Apr 2012 17:00:24 -0500
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Fill nodes breadth-first
On 4/6/2012 4:00 PM, Sarah Williams wrote:
> Hi,
> I was following this recipe to enable breadth-first filling of nodes on
> the cluster:
> https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToSteerJobs
> I added this to my condor_config.local files and ran condor_reconfig:
> NEGOTIATOR_POST_JOB_RANK = isUndefined(RemoteOwner) * (KFlops - SlotID)
> I can see in the Negotiator log that it took effect, but it is still
> filling all the slots on one host before moving to another. Any ideas why?
Hi Sarah,
Regards from Wisconsin. One possibility: do most of your submitted jobs
specify their own Rank? As explained in the HOWTO, the Rank specified
in the submit file will trump whatever NEGOTIATOR_POST_JOB_RANK says.
If you want your breadth-first rule to trump whatever your users
request, use NEGOTIATOR_PRE_JOB_RANK instead.
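To make the precedence concrete, here is a minimal Python sketch (a toy model, not Condor code) that assumes candidate slots are compared by NEGOTIATOR_PRE_JOB_RANK first, then by the job's own Rank, then by NEGOTIATOR_POST_JOB_RANK. The slot names, memory sizes, and the helper `pick_slot` are invented for illustration.

```python
# Toy model of the ordering described above. All slot data is hypothetical.

def pick_slot(slots, pre_rank, job_rank, post_rank):
    """Return the slot with the highest (pre, job, post) rank tuple."""
    return max(slots, key=lambda s: (pre_rank(s), job_rank(s), post_rank(s)))

# slot1@nodeA is assumed to be claimed already; these two are still free.
free = [
    {"name": "slot2@nodeA", "slot_id": 2, "memory": 2048},
    {"name": "slot1@nodeB", "slot_id": 1, "memory": 1024},
]

breadth_first = lambda s: 500 - s["slot_id"]   # the pool-wide policy
no_preference = lambda s: 0                    # knob left unset
prefers_memory = lambda s: s["memory"]         # like "Rank = Memory" in a submit file

# Policy in the POST knob: the job's own Rank is compared first and wins.
as_post = pick_slot(free, no_preference, prefers_memory, breadth_first)

# Policy in the PRE knob: the pool policy is compared first and wins.
as_pre = pick_slot(free, breadth_first, prefers_memory, no_preference)

print(as_post["name"], as_pre["name"])
```

With the policy in the POST position the memory-hungry job lands on the second slot of nodeA; moved to the PRE position, the same policy sends it to the empty nodeB.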
Another possibility: perhaps the machines in your pool report many
slightly different kflops values? I think the above expression will
only fill breadth-first across machines that report the same kflops,
since any kflops difference larger than the number of slots per machine
outweighs the SlotID term. Take a peek at the output from
condor_status -server -sort kflops
and see if the reported kflops value varies slightly every few
machines... and/or on Unix you could do
condor_status -format "%d\n" kflops | sort | uniq | wc -l
to see how many different "classes" of kflops machines you have. If
large, perhaps you'd prefer something like:
NEGOTIATOR_POST_JOB_RANK = isUndefined(RemoteOwner) * (500 - SlotID)
to simply ignore the kflops value.
(I know on a pool here at UW-Madison with 1951 slots, there are 163
different kflop values reported....)
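The effect of kflops variance can be seen in a small Python simulation. This is only a sketch under stated assumptions: the machine names and kflops numbers are made up, only unclaimed slots are modeled (so the isUndefined(RemoteOwner) factor is left out), and ties are broken by list order, which the real negotiator does not promise.

```python
# Toy simulation: hand out jobs one at a time to the highest-ranked free slot.
# Machine names and kflops values are hypothetical.

def fill_order(machines, slots_per_machine, n_jobs, rank):
    """Return the slot names in the order jobs would be placed on them."""
    free = [(name, kflops, slot_id)
            for name, kflops in machines
            for slot_id in range(1, slots_per_machine + 1)]
    order = []
    for _ in range(n_jobs):
        best = max(free, key=lambda s: rank(s[1], s[2]))  # first max wins ties
        free.remove(best)
        order.append(f"slot{best[2]}@{best[0]}")
    return order

machines = [("nodeA", 1_200_000), ("nodeB", 1_180_000), ("nodeC", 1_210_000)]

# Rank that includes kflops: the fastest machine outranks everything else.
with_kflops = fill_order(machines, 4, 4, lambda kflops, slot_id: kflops - slot_id)

# Rank that ignores kflops: every slot1 outranks every slot2, and so on.
without_kflops = fill_order(machines, 4, 4, lambda kflops, slot_id: 500 - slot_id)

print(with_kflops)
print(without_kflops)
```

In this model the kflops-based rank puts all four jobs on nodeC, while the constant-based rank places one job on each machine before using any second slot.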
hope the above makes sense,
regards,
Todd
p.s. Extra credit: for the real Condor geeks, another approach would be
to bucket the kflops value in the NEGOTIATOR_POST_JOB_RANK expression,
so this breadth-first recipe would still work even if the reported
kflops varies by some small value like 30k or so. In Condor v7.7.6 (to
be released next week) there is a spiffy quantize() ClassAd function to
assist in this sort of bucketing, so in Condor v7.7.6 you could do:
NEGOTIATOR_POST_JOB_RANK = isUndefined(RemoteOwner) * (quantize(kflops,{30000}) - SlotID)
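A rough Python model of the bucketing idea, assuming quantize() rounds its first argument up to the next multiple of 30000 (the helper `quantize_up` and the kflops values are invented for illustration; check the ClassAd documentation for the exact semantics):

```python
# Toy model of bucketing kflops before subtracting SlotID.
# quantize_up is a hypothetical stand-in for the ClassAd quantize() function.

def quantize_up(value, step):
    """Round value up to the nearest integer multiple of step."""
    return -(-value // step) * step  # integer ceiling division

# Hypothetical machines whose reported kflops differ by less than 30k.
kflops_by_machine = {"nodeA": 1_185_000, "nodeB": 1_199_000, "nodeC": 1_172_000}

buckets = {name: quantize_up(k, 30_000) for name, k in kflops_by_machine.items()}

# Rank of the first and second slot on each machine after bucketing.
slot1_ranks = [b - 1 for b in buckets.values()]
slot2_ranks = [b - 2 for b in buckets.values()]

print(buckets)
print(min(slot1_ranks) > max(slot2_ranks))
```

Because all three machines fall into the same bucket here, every first slot outranks every second slot, which is the breadth-first behavior. Two machines whose kflops straddle a multiple of 30000 would still land in different buckets, so the bucket size is worth choosing with the pool's actual spread in mind.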
Maybe I'll update this HOWTO recipe based on your feedback (or this is
open source, feel free to ask for a condor-wiki account by emailing
condor-admin@xxxxxxxxxxx, and then you could edit the recipe yourself!)...
> --Sarah
--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing Department of Computer Sciences
Condor Project Technical Lead 1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132 Madison, WI 53706-1685