
Re: [HTCondor-users] Drain HTCondor worker by setting instance metadata value



Thank you all for your replies. I have some ideas to work with, but I have not yet arrived at the solution I'm looking for.

I will give a bit more insight into my use case to clarify things.

I'm creating an extension to an OpenStack component that should be able to autonomously scale the number of workers in an HTCondor cluster up and down. The setup consists of a central manager with a publicly accessible IP and a set of workers with private IPs only. I need a mechanism to have HTCondor temporarily stop sending jobs to the workers.

In this setup, giving the cloud component I'm extending full access to the central manager would require uploading an SSH private key into the cloud infrastructure, which is not ideal. I would prefer to work only by interfacing with HTCondor. Using shared-secret authentication and handling everything through the central manager could also be an option, if that is possible.

The solution should not affect running jobs, only cause the workers to temporarily stop accepting new jobs.

I would like to avoid having any daemons running constantly on all the workers, as this would take up resources. I tried running a cron job every 10 seconds, and this increased CPU usage by 2%. Increasing the interval reduces the overhead but makes the solution less responsive. Is it possible to make cron jobs stop running when a worker is busy?
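
To make the polling idea concrete, here is a minimal sketch of what such a periodic job could look like. It is not a tested solution. It reads the instance metadata document from the standard OpenStack metadata endpoint and prints one ClassAd-style attribute line. The metadata key `htcondor_drain` and the attribute name `AcceptNewJobs` are hypothetical names chosen for illustration. The sketch assumes the script is run as a startd cron job (the `STARTD_CRON_*` configuration knobs) so that its output is merged into the machine ad, and that the worker's START expression refers to the attribute, for example `START = (AcceptNewJobs =!= False)`.

```python
import http.client
import json
import urllib.request

# Standard OpenStack metadata endpoint; user-set instance metadata appears under "meta".
METADATA_URL = "http://169.254.169.254/openstack/latest/meta_data.json"
DRAIN_KEY = "htcondor_drain"   # hypothetical metadata key, set by the scaling component
ATTRIBUTE = "AcceptNewJobs"    # hypothetical machine ClassAd attribute name


def drain_requested(document, key=DRAIN_KEY):
    """Return True if the instance metadata asks this worker to stop taking new jobs."""
    meta = document.get("meta", {}) if isinstance(document, dict) else {}
    if not isinstance(meta, dict):
        meta = {}
    return str(meta.get(key, "")).strip().lower() in ("1", "true", "yes")


def classad_line(document):
    """Format the attribute line the periodic job prints on stdout."""
    return f"{ATTRIBUTE} = {'False' if drain_requested(document) else 'True'}"


def fetch_metadata(url=METADATA_URL, timeout=2.0):
    """Fetch and parse the metadata document; return {} if unreachable or malformed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Fail open: keep accepting jobs if the metadata service cannot be read.
        return {}


if __name__ == "__main__":
    print(classad_line(fetch_metadata()))
```

Because the START expression is only consulted when a new job is matched, a job that is already running should be left alone as long as the worker's policy does not preempt it. The script does a single short HTTP request per run, so its cost is mostly the interpreter start-up, and the polling period remains a trade-off between overhead and responsiveness.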


Steve C Timm: yes, you can update OpenStack metadata on running instances