Hi,
thanks for the answer.
The cluster is running HTCondor 8.6.13. We have completed the migration from CREAM to HTCondor-CE, so we have 3 SCHEDDs at 8.8.10.
After a long debug session, I believe I have evidence that the issue is related to SLOT_WEIGHT. It seems that multicore jobs submitted through the HTCondor-CE are not evaluated in a weighted manner by the Negotiator.
We have scheduled an HTCondor upgrade.
[root@htc-ctl-01 ~]# condor_userprio
Last Priority Update: 11/23 08:09
Group                             Config    Use    Effective   Priority   Res   Total Usage  Time Since Requested
  User Name                       Quota   Surplus   Priority    Factor   In Use (wghted-hrs) Last Usage Resources
------------------------------- --------- ------- ------------ --------- ------ ------------ ---------- ----------
group_cms.grid                       0.80 ByQuota                1000.00   3288    241753.52      <now>        527
[root@htc-ctl-01 ~]# tail -f /var/log/condor/NegotiatorLog | grep cms
11/23/20 08:10:18 Group group_cms.locmcore - skipping, zero slots allocated
11/23/20 08:10:18 Group group_cms.grid - skipping, at or over quota (quota=538.023) (usage=3288) (allocation=527)
[root@ce-05 ~]# condor_q -global -c "x509UserProxyVOName == \"cms\"" | grep "Total for query"
Total for query: 176 jobs; 0 completed, 10 removed, 40 idle, 126 running, 0 held, 0 suspended
Total for query: 190 jobs; 1 completed, 9 removed, 40 idle, 140 running, 0 held, 0 suspended
Total for query: 196 jobs; 0 completed, 14 removed, 37 idle, 145 running, 0 held, 0 suspended
[root@ce-05 ~]#
[root@htc-ctl-01 ~]# condor_q -global -c "x509UserProxyVOName == \"cms\"" -af RequestCpus
8
8
8
8
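As a rough cross-check of the figures above, the small Python sketch below multiplies the running job counts by the requested cores and compares the result with the usage reported in the NegotiatorLog. It assumes that every running cms job requests 8 CPUs (the -af output shows only a sample) and that the three condor_q totals were taken close enough in time to the log line to be comparable. The variable names are only illustrative.

```python
# Figures copied from the command output above.
running_per_schedd = [126, 140, 145]  # "running" counts, one per SCHEDD
request_cpus = 8                      # assumption: same for every running job
reported_usage = 3288                 # usage= in the NegotiatorLog line
reported_quota = 538.023              # quota= in the same line
quota_fraction = 0.80                 # Config Quota shown by condor_userprio

running_jobs = sum(running_per_schedd)
weighted_usage = running_jobs * request_cpus

# Base that a 0.80 fractional quota of 538.023 would correspond to.
implied_base = reported_quota / quota_fraction

print("running jobs:", running_jobs)
print("jobs x cores:", weighted_usage)
print("matches reported usage:", weighted_usage == reported_usage)
print("base implied by quota:", round(implied_base, 2))
```

Under these assumptions the reported usage equals jobs times cores, so the usage appears to be counted in cores. The quota would correspond to a base of roughly 672.5, whether that is the parent group's share or a pool total is not clear from the output, so whether quota and usage are expressed in the same units remains a guess.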
Ale
The message that the job has "not been considered by the
matchmaker" is both misleading and unhelpful, and it has been removed from
our code base. Something else is causing the issue.
...Tim
On 11/16/20 7:58 AM, Alessandro Italiano wrote:
Hi
I have some jobs belonging to an AccountingGroup used to run multicore jobs which are piling up in Idle status.
condor_q -analyze reports the jobs as "not considered by the matchmaker".
Actually, in the Negotiator log file there are no messages reporting that the AccountingGroup is going to be negotiated.
condor_userprio -cleanall does not resolve the issue.
Debug info has not helped.
How can I understand why the Negotiator is not evaluating those jobs?
thanks in advance
Ale
--
Tim Theisen
Release Manager
HTCondor & Open Science Grid
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin - Madison
4261 Computer Sciences and Statistics
1210 W Dayton St
Madison, WI 53706-1685
+1 608 265 5736