Vikrant,
It looks like the schedd was still communicating with the negotiator and continuing to send jobs around the time it was failing. I would look at the NegotiatorLog at that timestamp to see if there is any helpful information there. Also, is "test.example.com" the real name, or did you sanitize the log?
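If it helps, a rough way to pull the NegotiatorLog lines around a given moment is sketched below. This is only an illustration: the function name `lines_near` and the two-minute window are made up for the example, and it assumes each log line starts with a timestamp in the `MM/DD/YY HH:MM:SS` layout, which may not match your logging configuration.

```python
from datetime import datetime, timedelta

# Assumption: each log line begins with a timestamp such as "MM/DD/YY HH:MM:SS".
# Adjust STAMP_FORMAT and STAMP_LENGTH if your daemons log in a different layout.
STAMP_FORMAT = "%m/%d/%y %H:%M:%S"
STAMP_LENGTH = 17  # number of characters in the timestamp above


def lines_near(lines, when, window_seconds=120):
    """Return the lines whose leading timestamp is within window_seconds of `when`."""
    window = timedelta(seconds=window_seconds)
    selected = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:STAMP_LENGTH], STAMP_FORMAT)
        except ValueError:
            # No parseable timestamp (continuation line or other format): skip it.
            continue
        if abs(stamp - when) <= window:
            selected.append(line.rstrip("\n"))
    return selected
```

To use it, open the negotiator's log file (its location can be printed with `condor_config_val NEGOTIATOR_LOG`) and pass the file object together with the time of the schedd message, for example `lines_near(open(path), datetime(2023, 10, 30, 16, 11, 0))`, where `path` and the time are placeholders.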
Best,
Joe Reuss
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Vikrant Aggarwal <ervikrant06@xxxxxxxxx>
Sent: Monday, October 30, 2023 4:11 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Scheduling delay in cluster mix of 8.8.5 and 9.0.17 version

Hello Experts,
Schedd running version 9.0.17.
HTCondor masters running version 8.8.5 (the primary pool and every pool in the flock_to list).
Special setup details: we dynamically modify the job requirements so that a job first gets a chance to run on the private (team-owned) pool and, failing that, on the public pool shared by multiple teams, while making sure we do not create too many autocluster IDs.
Despite cores being available in both the primary and flock pools, jobs stay in the queue indefinitely until we restart the condor service on the scheduler.
The schedd doesn't present the jobs for matchmaking.
The following message was reported in the schedd logs, but even after this message the schedd kept sending jobs to the negotiator for matchmaking.
Thanks & Regards,
Vikrant Aggarwal