We use group accounting. In the negotiator D_FULLDEBUG output below I have
marked two lines with the word "HERE". At the first HERE I expected the
negotiator to say that group_MCprod is over quota and skip it, but instead it
reports a usage of 0. It then goes ahead and negotiates with group_MCprod,
even though at the second HERE you can see it knows the submitter is using
3591 slots against a quota of 520. The condor_userprio output at the bottom
also shows the slots in use (3579 when queried as group_MCprod, 0 when
queried as group_mcprod). Near the bottom of the debug output there is also a
matchmakingAlgorithm: line that again reports the usage as 0.
I've been fighting with this for a long time. Occasionally one of our groups
manages to take all of our slots even though it is over quota. Most of the
time the quotas appear to work.
Has anyone seen this before?
Thanks,
joe
03/22 11:56:06 group group_italy dynamic quota for 11106 slots = 188.000
03/22 11:56:06 Group Table : group group_italy quota 188.000 usage 115.000 prio 61.17
03/22 11:56:06 group group_japan dynamic quota for 11106 slots = 233.000
03/22 11:56:06 Group Table : group group_japan quota 233.000 usage 0.000 prio 0.00
03/22 11:56:06 group group_karlsruhe dynamic quota for 11106 slots = 55.000
03/22 11:56:06 Group Table : group group_karlsruhe quota 55.000 usage 0.000 prio 0.00
03/22 11:56:06 group group_mit dynamic quota for 11106 slots = 33.000
03/22 11:56:06 Group Table : group group_mit quota 33.000 usage 0.000 prio 0.00
03/22 11:56:06 group group_physmon dynamic quota for 11106 slots = 11.000
03/22 11:56:06 Group Table : group group_physmon quota 11.000 usage 0.000 prio 0.00
03/22 11:56:06 group group_prd dynamic quota for 11106 slots = 815.000
03/22 11:56:06 Group Table : group group_prd quota 815.000 usage 299.000 prio 36.69
03/22 11:56:06 group group_sam dynamic quota for 11106 slots = 277.000
03/22 11:56:06 Group Table : group group_sam quota 277.000 usage 0.000 prio 0.00
03/22 11:56:06 group group_fixedwntest dynamic quota for 11106 slots = 55.000
03/22 11:56:06 Group Table : group group_fixedwntest quota 55.000 usage 0.000 prio 0.00
03/22 11:56:06 group group_fnal dynamic quota for 11106 slots = 233.000
03/22 11:56:06 Group Table : group group_fnal quota 233.000 usage 173.000 prio 74.25
03/22 11:56:06 group group_highprio dynamic quota for 11106 slots = 888.000
03/22 11:56:06 Group Table : group group_highprio quota 888.000 usage 147.000 prio 16.55
03/22 11:56:06 group group_ntp dynamic quota for 11106 slots = 916.000
03/22 11:56:06 Group Table : group group_ntp quota 916.000 usage 567.000 prio 61.90
03/22 11:56:06 group group_mcprod dynamic quota for 11106 slots = 520.000
HERE --------> 03/22 11:56:06 Group Table : group group_mcprod quota 520.000 usage 0.000 prio 0.00
03/22 11:56:06 group group_btagging dynamic quota for 11106 slots = 222.000
03/22 11:56:06 Group Table : group group_btagging quota 222.000 usage 0.000 prio 0.00
03/22 11:56:06 group group_dbg dynamic quota for 11106 slots = 55.000
03/22 11:56:06 Group Table : group group_dbg quota 55.000 usage 0.000 prio 0.00
03/22 11:56:06 Group group_alignment - skipping, no submitters
03/22 11:56:06 Group group_calib - skipping, no submitters
03/22 11:56:06 Group group_dqm - skipping, no submitters
03/22 11:56:06 Group group_florida - skipping, no submitters
03/22 11:56:06 Group group_japan - skipping, no submitters
03/22 11:56:06 Group group_karlsruhe - skipping, no submitters
03/22 11:56:06 Group group_mit - skipping, no submitters
03/22 11:56:06 Group group_physmon - skipping, no submitters
03/22 11:56:06 Group group_sam - skipping, no submitters
03/22 11:56:06 Group group_fixedwntest - skipping, no submitters
03/22 11:56:06 Group group_mcprod - negotiating
03/22 11:56:06 Phase 3: Sorting submitter ads by priority ...
03/22 11:56:06 Phase 4.1: Negotiating with schedds ...
03/22 11:56:06 numSlots = 520
03/22 11:56:06 slotWeightTotal = 520.000000
03/22 11:56:06 pieLeft = 520.000
03/22 11:56:06 NormalFactor = 1.000000
03/22 11:56:06 MaxPrioValue = 25528.660156
03/22 11:56:06 NumSubmitterAds = 1
03/22 11:56:06 Negotiating with group_MCprod.vellidis@xxxxxxxx at <131.225.240.215:38554>
03/22 11:56:06 0 seconds so far
03/22 11:56:06 Calculating submitter limit with the following parameters
03/22 11:56:06 SubmitterPrio = 25528.660156
03/22 11:56:06 SubmitterPrioFactor = 20.000000
03/22 11:56:06 submitterShare = 1.000000
03/22 11:56:06 submitterAbsShare = 1.000000
03/22 11:56:06 submitterLimit = 520.000000
HERE ---------> 03/22 11:56:06 submitterUsage = 3591.000000
03/22 11:56:06 Socket to group_MCprod.vellidis@xxxxxxxx (<131.225.240.215:38554>) already in cache, reusing
03/22 11:56:06 Sending SEND_JOB_INFO/eom
03/22 11:56:06 Getting reply from schedd ...
03/22 11:56:06 Got JOB_INFO command; getting classad/eom
03/22 11:56:06 Request 17947890.00000:
03/22 11:56:06 matchmakingAlgorithm: limit 520.000000 used 0.000000 pieLeft 520.000000
03/22 11:56:06 Start of sorting MatchList (len=44)
03/22 11:56:06 Finished sorting MatchList
03/22 11:56:06 Connecting to startd glidein_5068@xxxxxxxxxxxxxxxxxxxx at <131.225.238.42:43337>
03/22 11:56:06 Sending PERMISSION, claim id, startdAd to schedd
03/22 11:56:06 Matched 17947890.0 group_MCprod.vellidis@xxxxxxxx <131.225.240.215:38554> preempting none <131.225.238.42:43337> glidein_5068@xxxxxxxxxxxxxxxxxxxx
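
For anyone who wants to look for the same mismatch in their own negotiator
log, here is a rough Python sketch. It compares the usage printed in each
"Group Table" line with the submitterUsage printed while negotiating for that
group, matching group names without regard to case. The regular expressions
are my own guesses based only on the lines shown in this post (with the
wrapped lines rejoined and the HERE markers dropped), and the function name
find_mismatches is made up for the example. This is not anything from Condor
itself.

```python
import re

# Sample lines copied from the negotiator log in this post.
LOG = """\
03/22 11:56:06 Group Table : group group_fnal quota 233.000 usage 173.000 prio 74.25
03/22 11:56:06 Group Table : group group_mcprod quota 520.000 usage 0.000 prio 0.00
03/22 11:56:06 Negotiating with group_MCprod.vellidis@xxxxxxxx at <131.225.240.215:38554>
03/22 11:56:06 submitterLimit = 520.000000
03/22 11:56:06 submitterUsage = 3591.000000
"""

# Patterns are assumptions inferred from the log format shown above.
GROUP_RE = re.compile(r"Group Table : group (\S+) quota ([\d.]+) usage ([\d.]+)")
NEGOTIATING_RE = re.compile(r"Negotiating with ([^.\s]+)\.\S+@")
USAGE_RE = re.compile(r"submitterUsage = ([\d.]+)")


def find_mismatches(log_text):
    """Return (group, quota, table_usage, submitter_usage) tuples for groups
    whose Group Table usage is lower than a submitterUsage logged for them."""
    table = {}            # lower-cased group name -> (quota, usage)
    submitter_usage = {}  # lower-cased group name -> largest submitterUsage seen
    current_group = None  # group of the submitter currently being negotiated

    for line in log_text.splitlines():
        m = GROUP_RE.search(line)
        if m:
            table[m.group(1).lower()] = (float(m.group(2)), float(m.group(3)))
            continue
        m = NEGOTIATING_RE.search(line)
        if m:
            current_group = m.group(1).lower()
            continue
        m = USAGE_RE.search(line)
        if m and current_group is not None:
            usage = float(m.group(1))
            previous = submitter_usage.get(current_group, 0.0)
            submitter_usage[current_group] = max(previous, usage)

    mismatches = []
    for group, usage in sorted(submitter_usage.items()):
        if group in table:
            quota, table_usage = table[group]
            if table_usage < usage:
                mismatches.append((group, quota, table_usage, usage))
    return mismatches


if __name__ == "__main__":
    for group, quota, table_usage, usage in find_mismatches(LOG):
        print(f"{group}: quota {quota}, group table usage {table_usage}, "
              f"submitterUsage {usage}")
```

On the sample lines this flags only group_mcprod. To run it on a real log,
read the file and pass its contents to find_mismatches() instead of LOG.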
[cdfcaf@fcdfhead10 /export/condor_local/spool] condor_userprio -getreslist group_MCprod.vellidis@xxxxxxxx | tail -1
Number of Resources Used: 3579
[cdfcaf@fcdfhead10 /export/condor_local/spool] condor_userprio -getreslist group_mcprod.vellidis@xxxxxxxx | tail -1
Number of Resources Used: 0
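
The two queries above differ only in the case of the group name, yet they
return different counts. As a small illustration, the Python sketch below
groups submitter names by their case-folded form and reports any name that
shows up under more than one spelling. The counts are the two values from the
commands above, and case_variants is a hypothetical helper name. Whether this
case difference is what causes the zero usage in the group table is only my
assumption.

```python
from collections import defaultdict

# Counts taken from the two condor_userprio queries above.
resources_used = {
    "group_MCprod.vellidis@xxxxxxxx": 3579,
    "group_mcprod.vellidis@xxxxxxxx": 0,
}


def case_variants(counts):
    """Map each case-folded name to its spellings, keeping only names that
    appear under more than one spelling."""
    by_folded = defaultdict(dict)
    for name, used in counts.items():
        by_folded[name.casefold()][name] = used
    return {folded: spellings
            for folded, spellings in by_folded.items()
            if len(spellings) > 1}


if __name__ == "__main__":
    for folded, spellings in case_variants(resources_used).items():
        print(folded)
        for name, used in spellings.items():
            print(f"  {name}: {used} resources")
```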