It seems that changing the job priority with condor_prio makes the HTCondor matchmaker consider a job cluster for scheduling again.

From: Vaurynovich, Siarhei

Hello,

Please let me know if there is a way to force the HTCondor matchmaker to consider a job cluster for scheduling.
My jobs often sit unscheduled in the queue for many hours (indefinitely) if I use condor_qedit to adjust job requirements.
To make sure jobs have enough RAM to run, I sometimes restrict the allowed SlotID range in the requirements. There is probably a better way to do this, e.g. to declare RAM as a shared resource with a certain number of units available, but for now this is my quick hack. Setting ImageSize does not work, since my jobs are almost always bigger than the per-slot RAM, so if I gave a realistic job size my jobs would never start. Creating specialized slots is also a bad idea, since my jobs vary strongly in size.

The problem is that after such an adjustment my jobs often stop being scheduled: they sit in the queue indefinitely, and ‘condor_q -better-analyze clusterID’ gives “Job has not yet been considered by the matchmaker.”
while claiming that there are slots “available to run your job”. If I do not use condor_qedit, jobs run fine. If I kill the same jobs and then submit them again with new requirements, they also run fine.

Thank you very much for your help,
Siarhei.
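
For reference, the sequence discussed in this thread (edit the requirements with condor_qedit, then touch the priority with condor_prio so the cluster is looked at again) could be sketched roughly as follows. This is only an illustration under assumptions: the cluster ID 1234, the slot limit of 4, the priority value, and the helper names are hypothetical, and the script only prints the command lines rather than running them.

```python
import shlex


def slot_range_requirements(max_slot_id: int) -> str:
    """Build a Requirements clause limiting jobs to slots 1..max_slot_id.

    The attribute name SlotID is taken from the message above; the limit
    itself is a hypothetical value chosen by the caller.
    """
    if max_slot_id < 1:
        raise ValueError("max_slot_id must be at least 1")
    return f"(SlotID <= {max_slot_id})"


def qedit_then_prio(cluster_id: int, max_slot_id: int, priority: int) -> list[list[str]]:
    """Return the two command lines as argument lists, without running them."""
    requirements = slot_range_requirements(max_slot_id)
    return [
        # condor_qedit <cluster> <attribute-name> <attribute-value>
        ["condor_qedit", str(cluster_id), "Requirements", requirements],
        # condor_prio -p <priority> <cluster>
        ["condor_prio", "-p", str(priority), str(cluster_id)],
    ]


if __name__ == "__main__":
    # Hypothetical cluster ID, slot limit, and priority, for illustration only.
    for argv in qedit_then_prio(1234, 4, 1):
        print(shlex.join(argv))
```

Note that replacing Requirements wholesale, as in this sketch, drops whatever clauses the job already had; in practice the new clause would need to be combined with the existing expression.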