[HTCondor-users] Running a mixed pool from a jobs perspective (WNs with Docker and pilots without Docker)



Hi guys,

We are trying to run jobs in a pool that has both local worker nodes and pilots coming in from off site. Our local workers have Docker, and we want every job that runs on the local cluster to run inside a container. To do this, we add the following ClassAd attributes to the job JDL:

WantDocker = TRUE
DockerImage = "xyz"
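
For context, here is a minimal sketch of how these attributes might sit in a submit file. It assumes a vanilla universe job and the '+' prefix for inserting raw attributes into the job ClassAd; the executable name is hypothetical and "xyz" is the same placeholder as above.

# Sketch only: run_job.sh is a hypothetical executable name.
universe     = vanilla
executable   = run_job.sh
# Raw job ClassAd attributes, as listed above.
+WantDocker  = True
+DockerImage = "xyz"
queue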

The general idea with this setup is for jobs to run wherever they can find resources, whether off site or on site. The jobs run fine on the local workers with Docker, but they are kicked off when they try to run on pilots without Docker support.

Is there a way to force all jobs running locally to run in Docker from the startd side alone? That way the jobs would not have to specify that they want Docker and could run on pilots too, while we would still be able to force every job that runs on the local cluster to use Docker.

So far I have tried using 'JobMachineAttrs' to fetch 'HasDocker' into the job ClassAd, but by the time the corresponding 'MachineAttrHasDocker0' gets populated the job is already running. I also tried referencing HasDocker directly from WantDocker in the job ClassAd, but that expression isn't evaluated against the startd ClassAd. I know this could probably be done by putting the local on-site workers behind a CE, but that is something we are trying to get away from.
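
To make the two attempts concrete, they looked roughly like the sketch below. This is an illustration only: the exact expressions are assumptions, and the two attempts were tried separately, not in the same submit file.

# Attempt 1 (sketch): have HasDocker from the matched machine recorded in the job ad.
# Observed problem: MachineAttrHasDocker0 is only populated once the job is already running.
job_machine_attrs = HasDocker
+WantDocker = (MachineAttrHasDocker0 =?= True)

# Attempt 2 (sketch): refer to the machine's HasDocker directly from WantDocker.
# Observed problem: the attribute is not evaluated against the startd ClassAd.
+WantDocker = (TARGET.HasDocker =?= True)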

Any other ideas or suggestions are welcome.

Best regards,
Farrukh