Hi guys,
We are trying to run jobs in a pool that has both local worker nodes and
pilots coming in from off site. Our local workers have Docker, and we
want every job that runs on the local cluster to run inside a container.
To do this, we add the following ClassAd attributes to the job JDL:
WantDocker = TRUE
DockerImage = "xyz"
Our general idea with this setup is for the jobs to be able to run
wherever they can find resources, whether off site or on site. These
jobs run fine on the local workers with Docker, but they are kicked off
when they try to run on pilots without Docker support.
Is there a way for us to force all locally running jobs to run in
Docker from the startd side alone? That way the jobs would not have to
specify that they want Docker and could run on pilots too, while we
would still be able to force every job that runs on the local cluster
to use Docker.
So far I have tried using 'JobMachineAttrs' to fetch 'HasDocker'
into the job ClassAd, but by the time the corresponding
'MachineAttrHasDocker0' gets populated the job is already running. I
also tried referencing HasDocker directly from WantDocker in the job
ClassAd, but that expression isn't evaluated against the startd
ClassAd. I know this can probably be done by putting the local on-site
workers behind a CE, but that is something we are trying to get away
from.
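Roughly, the 'JobMachineAttrs' attempt amounts to something like the
following in the submit description. This is only a sketch: the
executable name is a placeholder, and the exact WantDocker expression
is illustrative rather than copied from our real submit files.

```
# Sketch of the JobMachineAttrs attempt (illustrative, not our exact file)
universe          = vanilla
executable        = job.sh          # placeholder name

# Ask the schedd to record the matched machine's HasDocker attribute
# in the job ad as MachineAttrHasDocker0
job_machine_attrs = HasDocker

+DockerImage      = "xyz"
# Intended to become true only on machines that advertise Docker;
# in practice MachineAttrHasDocker0 is filled in too late to matter
+WantDocker       = MachineAttrHasDocker0 =?= true

queue
```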
Any other ideas or suggestions are welcome.
Best regards,
Farrukh