Assuming that you're sbatching a starter for the SLURM node, which then joins the pool and gets matched to the pending job, you can set up a starter attribute such as "ExpirationTime" indicating the UNIX timestamp at which the worker will terminate.

Say you sbatch an HTCondor starter SLURM job with --time=6:00:00 for a six-hour lifetime. You'd then calculate now plus six hours, for example via $(($(date +%s) + (6 * 3600))) in bash, which for me right now is 1694457722, and set that as the ExpirationTime for the SLURM-launched starter:

    ExpirationTime = 1694457722

You could also set a RuntimeRemaining expression based on the ExpirationTime value:

    RuntimeRemaining = ExpirationTime - time()

Then your requirements expression could easily match to machines with enough RuntimeRemaining to satisfy the job's EstimatedRuntime:

    EstimatedRuntime = 4 * 3600
    Requirements = TARGET.RuntimeRemaining > MY.EstimatedRuntime

If a machine doesn't have a RuntimeRemaining attribute, such as a dedicated node, you'd still want to be able to match to it, so you'd want to check for the attribute first:

    Requirements = isUndefined(TARGET.RuntimeRemaining) \
                     ? TRUE \
                     : TARGET.RuntimeRemaining > MY.EstimatedRuntime

Hopefully this proves helpful.

Michael Pelletier

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx>
On Behalf Of Seung-Jin Sul

Hi,

I am using SLURM nodes to create pools of HTCondor workers, and I am running a separate service that watches `condor_q` and executes `sbatch` or `scancel` on demand. What I am trying to do is pass a runtime constraint for a task to HTCondor so that it can schedule the task onto a SLURM node that has enough life left (enough wallclock time remaining). For example, if a task has an estimated runtime of more than 1 hr, I want HTCondor to schedule it only on SLURM nodes with more than 1 hr of lifetime left. Has anyone done this? Any ideas will be appreciated.

Thank you!

Best regards,
Seung
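To make the ExpirationTime suggestion above concrete, here is a minimal sketch of what the sbatch-launched starter script might do before starting the HTCondor daemons: compute the expiration timestamp and write it into a local config snippet so the startd advertises it. The file name glidein_runtime.conf is a hypothetical example, and you should confirm the STARTD_ATTRS mechanism against your own HTCondor configuration; this assumes the six-hour --time limit from the example above.

```shell
#!/bin/bash
# Sketch: compute ExpirationTime for a SLURM-launched HTCondor starter.
# Assumes sbatch --time=6:00:00; adapt WALLTIME_SECS to your actual limit.

WALLTIME_SECS=$((6 * 3600))                      # six-hour lifetime in seconds
EXPIRATION=$(( $(date +%s) + WALLTIME_SECS ))    # now + six hours, UNIX time

# Write a local config snippet for the starter's condor daemons to read.
# (glidein_runtime.conf is a hypothetical path; point LOCAL_CONFIG_FILE or
# a config.d directory at it in your glidein setup.)
cat > glidein_runtime.conf <<EOF
ExpirationTime = ${EXPIRATION}
RuntimeRemaining = ExpirationTime - time()
STARTD_ATTRS = \$(STARTD_ATTRS) ExpirationTime RuntimeRemaining
EOF

echo "Worker will expire at UNIX time ${EXPIRATION}"
```

With those attributes advertised, the job-side Requirements expression shown above can match on TARGET.RuntimeRemaining.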