I notice that the collector shows ConcurrencyLimits attributes in the machine ads for slots which are running ConcurrencyLimited jobs:

    condor1$ condor_status -any -constraint '!isUndefined(ConcurrencyLimits)' -af MyType ConcurrencyLimits
    Machine matlab_dce
    Machine matlab_dce
    Machine matlab_dce
    Machine matlab_dce,testsys:2
    Machine matlab_dce,testsys:2
    condor1$

So maybe the trick to get the negotiator to recognize while-running limit claims is to update the machine ad rather than the job ad? Or update both?

The job can't update the machine (startd) ad with condor_chirp once it's ready for the licensed step, so maybe the job could set a flag attribute in its own ad, such as "NeedMachineConcurrencyLimitsUpdate = True", along with the job ad's ConcurrencyLimits. The Boolean could be monitored by schedd_cron on the Central Manager, which would update the machine ad ConcurrencyLimits attribute for the job's slot, as identified by the GlobalJobId attribute, to match the ConcurrencyLimits string in the job ad. It would then set "NeedMachineConcurrencyLimitsUpdate = False", which the waiting job would notice before proceeding with the licensed step. (Only after the next negotiation cycle?)

Since the concurrency limit would already be set before the licensed step begins, the only risk would be a non-Condor job grabbing the license before the job noticed that its concurrency limit update request had been accepted (and only when a FlexLM feature is shared between Condor and non-Condor users). The window for this issue could be a bit long, since you wouldn't want the job to spam queries of NeedMachineConcurrencyLimitsUpdate too often while waiting, but you could check the license count with an lmstat just before starting the licensed step and wait/poll if it showed zero.

That said, it's probably easier to just drag the users into condor_dagman for most things as you did. In continuous-integration build worker jobs, for example, you either have to do the licensed step within the framework of the build worker, or have the worker sitting around in a slot doing nothing at all while it waits for another job submission, which it generated with the license concurrency limit, to finish running the licensed step.

-Michael Pelletier.

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx]
On Behalf Of Edward Labao

Hi Michael!

We ran into the same issue a few years ago with user jobs tying up a particularly scarce license for hours before they were actually used. We tested the exact same thing you're thinking of by just running a condor_qedit on a long-running job to update its concurrency limit attribute, but it didn't look like the negotiator ever got an update of the concurrency limit.

Cheers!
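
For reference, the job-side half of the flag handshake proposed in the first message could be sketched roughly as below. This is only an illustration under several assumptions: the job has chirp access enabled (for example with +WantIOProxy = True in the submit file), something on the central manager actually resets NeedMachineConcurrencyLimitsUpdate (that script is hypothetical and not shown), and lmstat prints a "Users of <feature>: (Total of N licenses issued; Total of M licenses in use)" line. The function names and polling parameters are made up for the example, and nothing here establishes that the negotiator would honor the updated limit.

```python
import re
import subprocess
import time

# Attribute name proposed in the message; not a built-in HTCondor attribute.
FLAG = "NeedMachineConcurrencyLimitsUpdate"


def chirp_get(attr):
    """Read one attribute of the running job's own ad via condor_chirp."""
    out = subprocess.run(["condor_chirp", "get_job_attr", attr],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()


def chirp_set(attr, expr):
    """Set a job ad attribute; expr is a ClassAd expression, so strings need inner quotes."""
    subprocess.run(["condor_chirp", "set_job_attr", attr, expr], check=True)


def free_licenses(lmstat_text, feature):
    """Return issued minus in-use for a feature, or None if no matching line is found.

    The line format matched here is an assumption about lmstat output.
    """
    m = re.search(r"Users of %s:\s*\(Total of (\d+) licenses? issued;\s*"
                  r"Total of (\d+) licenses? in use\)" % re.escape(feature),
                  lmstat_text)
    return None if m is None else int(m.group(1)) - int(m.group(2))


def request_limit(limits, get_attr=chirp_get, set_attr=chirp_set,
                  sleep=time.sleep, poll_seconds=60, max_polls=30):
    """Publish the wanted limit, raise the flag, and poll until the flag is cleared.

    Returns True once the flag reads false, or False after max_polls attempts.
    """
    set_attr("ConcurrencyLimits", '"%s"' % limits)
    set_attr(FLAG, "True")
    for _ in range(max_polls):
        if get_attr(FLAG).strip().lower() == "false":
            return True
        sleep(poll_seconds)  # keep the interval long to avoid spamming queries
    return False
```

A job wrapper would call request_limit("matlab_dce") and, if it returns True, run lmstat and check free_licenses() on the output before starting the licensed step, waiting again if the count is zero.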