Thanks Ian for the prompt and clear reply. A few more questions:

1. Is there an easy way to control and set the EUP for each user prior to the negotiation cycle? For example, I would like to ensure that all users have the same EUP.

2. How can the negotiation cycle be prevented from starting automatically (probably in the ../etc/condor_config file), and how can it be triggered manually at the command line?

Thanks,
Yuval.

--
Yuval Leader
Design Automation Engineer, Mellanox Technologies
mailto: leader@xxxxxxxxxxxx
Tel: +972-74-7236360
Fax: +972-4-9593245
Beit Mellanox, 6th Floor, R-620
P.O. Box 586, Yokneam Industrial Park, 20692 Israel

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Ian Chesal

Hi Yuval,

On Wednesday, 22 August, 2012 at 7:35 AM, Yuval Leader wrote:
Add one more assumption and yes, this is what you'll see. The assumption you need to add is that all four users have the exact same effective user priority (EUP). See:
http://research.cs.wisc.edu/condor/manual/v7.6/3_4User_Priorities.html#25902

If they all have the same EUP, they'll all get exactly 1/4 of the system after one negotiation cycle, assuming everything about their jobs is equal. This is easy enough to test. I queued up 10 sleep jobs from each of four users in a new pool that has four slots available in it. None of these users had accumulated any use history, so all had identical EUPs of 0. Before I queued up the jobs, I shut down the negotiator with:

condor_off -negotiator

You can see the jobs ready to go:

-bash-3.2# condor_status -submitter
Name                 Machine     Running IdleJobs HeldJobs
alice@.internal      domU-12-31        0       10        0
bob@.internal        domU-12-31        0       10        0
eve@.internal        domU-12-31        0       10        0
test.user@.internal  domU-12-31        0       10        0

                     RunningJobs IdleJobs HeldJobs
alice@.internal                0       10        0
bob@.internal                  0       10        0
eve@.internal                  0       10        0
test.user@.internal            0       10        0
Total                          0       40        0

I turned on the negotiator for one negotiation cycle and got one job from each user assigned to each of the four slots in my pool:

-bash-3.2# condor_q -const 'jobstatus == 2'

-- Submitter: Q1@domU-12-31-38-04-9C-A1 : <10.220.159.79:59831> : domU-12-31-38-04-9C-A1.compute-1.internal
 ID      OWNER        SUBMITTED     RUN_TIME ST PRI SIZE CMD
 2.0   test.user     8/22 11:12   0+00:05:44 R  0   0.0  sleeper.py --min=6
 3.0   alice         8/22 11:14   0+00:05:47 R  0   0.0  sleeper.py --min=6
 4.0   bob           8/22 11:14   0+00:05:47 R  0   0.0  sleeper.py --min=6
 5.0   eve           8/22 11:14   0+00:05:45 R  0   0.0  sleeper.py --min=6

Here's the negotiator log for that single cycle:

08/22/12 11:15:23 ---------- Started Negotiation Cycle ----------
08/22/12 11:15:23 Phase 1: Obtaining ads from collector ...
08/22/12 11:15:23   Getting all public ads ...
08/22/12 11:15:24   Sorting 17 ads ...
08/22/12 11:15:24   Getting startd private ads ...
08/22/12 11:15:24 Got ads: 17 public and 4 private
08/22/12 11:15:24 Public ads include 4 submitter, 4 startd
08/22/12 11:15:24 Phase 2: Performing accounting ...
08/22/12 11:15:24 Phase 3: Sorting submitter ads by priority ...
08/22/12 11:15:24 Phase 4.1: Negotiating with schedds ...
08/22/12 11:15:24   Negotiating with alice@.internal at <10.220.159.79:59831>
08/22/12 11:15:24   0 seconds so far
08/22/12 11:15:24     Request 00003.00000:
08/22/12 11:15:24       Matched 3.0 alice@.internal <10.220.159.79:59831> preempting none <10.123.7.99:57106> ip-10-123-7-99.ec2.internal
08/22/12 11:15:24       Successfully matched with ip-10-123-7-99.ec2.internal
08/22/12 11:15:24     Request 00003.00001:
08/22/12 11:15:24       Rejected 3.1 alice@.internal <10.220.159.79:59831>: fair share exceeded
08/22/12 11:15:24     Got NO_MORE_JOBS; done negotiating
08/22/12 11:15:24   Negotiating with bob@.internal at <10.220.159.79:59831>
08/22/12 11:15:24   0 seconds so far
08/22/12 11:15:24     Request 00004.00000:
08/22/12 11:15:24       Matched 4.0 bob@.internal <10.220.159.79:59831> preempting none <10.93.21.85:53716> ip-10-93-21-85.ec2.internal
08/22/12 11:15:24       Successfully matched with ip-10-93-21-85.ec2.internal
08/22/12 11:15:24     Request 00004.00001:
08/22/12 11:15:24       Rejected 4.1 bob@.internal <10.220.159.79:59831>: fair share exceeded
08/22/12 11:15:25     Got NO_MORE_JOBS; done negotiating
08/22/12 11:15:25   Negotiating with eve@.internal at <10.220.159.79:59831>
08/22/12 11:15:25   0 seconds so far
08/22/12 11:15:25     Request 00005.00000:
08/22/12 11:15:25       Matched 5.0 eve@.internal <10.220.159.79:59831> preempting none <10.127.163.251:50135> ip-10-127-163-251.ec2.internal
08/22/12 11:15:25       Successfully matched with ip-10-127-163-251.ec2.internal
08/22/12 11:15:25     Request 00005.00001:
08/22/12 11:15:25       Rejected 5.1 eve@.internal <10.220.159.79:59831>: fair share exceeded
08/22/12 11:15:25     Got NO_MORE_JOBS; done negotiating
08/22/12 11:15:25   Negotiating with test.user@.internal at <10.220.159.79:59831>
08/22/12 11:15:25   0 seconds so far
08/22/12 11:15:25     Request 00002.00000:
08/22/12 11:15:25       Matched 2.0 test.user@.internal <10.220.159.79:59831> preempting none <10.220.109.195:45947> domU-12-31-38-04-6E-39.compute-1.internal
08/22/12 11:15:25       Successfully matched with domU-12-31-38-04-6E-39.compute-1.internal
08/22/12 11:15:25     Reached submitter resource limit: 1.000000 ... stopping
08/22/12 11:15:25   negotiateWithGroup resources used scheddAds length 4
08/22/12 11:15:25 ---------- Finished Negotiation Cycle ----------

Condor determines the fair-share allotments at the outset of the negotiation cycle, so it stopped after each user got one machine -- their fair share.
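If you want to reproduce that single-cycle test yourself, the sequence is roughly the following. Treat it as a sketch: it assumes administrative access to the central manager, and the sleep only needs to be long enough for one cycle to finish. (NEGOTIATOR_INTERVAL, 60 seconds by default, controls how often the negotiator starts cycles on its own; condor_reschedule asks for a cycle to start right away.)

# stop automatic negotiation cycles
condor_off -negotiator

# submit the test jobs as each user, then confirm they're all idle
condor_status -submitter

# let one cycle run, then stop the negotiator again
condor_on -negotiator
sleep 60
condor_off -negotiator

# see which jobs were matched and started
condor_q -const 'jobstatus == 2'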
Yes. This is what will happen. Again, assuming their EUPs are all equal.
No, that's not what happens. The negotiator determines up front, using the EUP of each submitter, what each submitter's fair share of the machines should be for this negotiation cycle. Based on that, it moves through each submitter's list of idle jobs and tries to match them to slots. If the EUPs of your users aren't all identical, the allocations will not be equal. Some users will get more because they've used less in the recent past; some users will get less because they've used more in the recent past.
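If you want to see where everyone's EUP stands, or force them all to a common starting point before a cycle, condor_userprio is the tool to look at. A rough sketch -- the user name and values are just examples, and the full option list is in the v7.6 manual:

# show effective priority (EUP), priority factor, and usage for every submitter
condor_userprio -all -allusers

# EUP = real priority x priority factor; an administrator can set both
condor_userprio -setprio   alice@.internal 0.5   # real priority (0.5 is the floor)
condor_userprio -setfactor alice@.internal 1     # priority factor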
Only if you also add the assumption that all EUPs are identical for the users.
Accounting groups help ensure that, regardless of EUP, people get some minimum (and possibly maximum) number of slots in your pool when they have jobs in the queue. If you wanted each user to always get 50 machines, but be able to use more than 50 machines when other users aren't using theirs, you'd set up soft quotas for four different groups and put each user in a unique group. Condor will then attempt to fulfill the quotas first and, once all the quotas have been satisfied, it'll let the excess free resources be used, fair share, by anyone whose quota is soft.
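For illustration only, a minimal version of that setup might look something like the lines below in the central manager's condor_config, plus one line in each user's submit file. The group names, quota values, and the use of GROUP_AUTOREGROUP for the "soft" behaviour are assumptions for this sketch -- check the group quota section of the v7.6 manual for the exact semantics in your version:

# condor_config on the central manager
GROUP_NAMES = group_alice, group_bob, group_eve, group_yuval
GROUP_QUOTA_group_alice = 50
GROUP_QUOTA_group_bob   = 50
GROUP_QUOTA_group_eve   = 50
GROUP_QUOTA_group_yuval = 50
# allow groups to keep matching beyond their quota while idle slots remain,
# which is what makes these quotas "soft"
GROUP_AUTOREGROUP = True

# in each user's submit description file, before the queue statement
+AccountingGroup = "group_alice.alice"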
Regards,
- Ian

---
Ian Chesal

Cycle Computing, LLC
Leader in Open Compute Solutions for Clouds, Servers, and Desktops
Enterprise Condor Support and Management Tools