Hi folks,

I’m wondering if there’s some sort of trick I can use to provide machine attributes through some mechanism other than the startd. For example, one of the constraints on certain jobs might be the amount of disk space available on the output NFS filesystem – that is, I don’t want to start a job unless there’s at least a certain amount of disk space available on that filesystem. Similarly, monitoring a FlexLM license server to maintain an attribute for license counts would be useful if it’s not feasible to dedicate a fixed set of licenses to a concurrency limit.

The obvious way to do this is a startd_cron job to check the available space on that filesystem, but the trouble is that this doesn’t scale well – each machine winds up making its own query for a value which will be identical across all machines in the pool at any given time. That is not much of a concern for a value like this one, which is queried relatively infrequently and changes slowly, but the scaling problem grows when you want a shorter query interval for a more dynamic value.

The alternative would be a schedd_cron job, since that would only run on the scheduler, but then the question is how to get the attributes it generates into a position where they can be evaluated for matching. Perhaps doing something with
condor_advertise in a startd_cron job would be the right approach? Or perhaps there’s something in the Python bindings that could handle this from a central point?

Thanks for any suggestions.

Michael V. Pelletier
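
For concreteness, here is a minimal sketch of the per-machine probe described above. It assumes the usual cron-hook convention that the script prints "Name = Value" lines on standard output. The attribute name OutputFsFreeMB and the path "/" are hypothetical placeholders, not anything from an actual pool configuration.

```python
#!/usr/bin/env python3
"""Sketch of a cron-style probe that reports free space on one filesystem."""
import shutil

# Hypothetical placeholders: substitute the real NFS mount point and
# whatever attribute name the pool should advertise.
DEFAULT_PATH = "/"
DEFAULT_ATTR = "OutputFsFreeMB"


def free_mb(path):
    """Return the free space on the filesystem holding `path`, in whole MiB."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def format_attr(name, value):
    """Format one attribute as a 'Name = Value' line."""
    return f"{name} = {value}"


def main(path=DEFAULT_PATH, attr=DEFAULT_ATTR):
    # One line per attribute on standard output.
    print(format_attr(attr, free_mb(path)))


if __name__ == "__main__":
    main()
```

Run from every startd, this repeats the same NFS query once per machine, which is exactly the scaling concern raised above. The open question is whether the same output could instead be produced once, centrally, and still be visible at match time.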