Re: [HTCondor-devel] Inclusion of additional arguments to docker run


Date: Thu, 03 Sep 2015 10:40:29 -0500
From: Greg Thain <gthain@xxxxxxxxxxx>
Subject: Re: [HTCondor-devel] Inclusion of additional arguments to docker run
On 09/03/2015 03:42 AM, Matthew Hinton wrote:
Hi,

I am working with the docker universe at the moment, and moving a lot of our current processes over to using this code. However, as part of this process, we required volume sharing with the container, since the input / output files used can be several GB, and therefore we don't want to be doing file copies.

I have therefore patched the source in the following, fairly hacky way (basically copying the code for docker_image). (I've attached a file, let me know if that doesn't work here). This source is now building and working as expected, but obviously there is no check performed or ability to add multiple volumes etc... for now.
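
For illustration only, here is a minimal Python sketch of what handling several volumes with a basic check might look like. It is not the attached patch and not HTCondor code; the function name build_docker_run_args and the (host_path, container_path) pair format are assumptions made for this example. The only thing taken from docker itself is the -v host:container option of docker run.

```python
def build_docker_run_args(image, volumes, command):
    """Return an argv list for `docker run` with one -v flag per volume.

    `volumes` is a list of (host_path, container_path) pairs.
    This helper is hypothetical and only illustrates the idea.
    """
    args = ["docker", "run"]
    for host_path, container_path in volumes:
        # Basic sanity check: both sides must be absolute paths.
        if not host_path.startswith("/") or not container_path.startswith("/"):
            raise ValueError(
                "volume paths must be absolute: %s:%s" % (host_path, container_path)
            )
        args += ["-v", "%s:%s" % (host_path, container_path)]
    args.append(image)
    args += list(command)
    return args
```

Calling build_docker_run_args("debian", [("/data", "/data")], ["ls", "/data"]) yields an argument list with a single -v /data:/data pair placed before the image name; extra volumes simply add more -v pairs.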

In future it would be useful to be able to specify quite a few of the docker run arguments in the condor submit file. I therefore wanted to check if work along these lines is being done already, and therefore might be released soon, before I continue work along this branch to fulfil our requirements.


Good Morning:

We are happy to see HTCondor's new docker universe getting attention and interest.

The current set of docker run options exposed by HTCondor is by no means set in stone. We have no immediate plans to expand them, but I expect that we will over time, based on new docker features and user experience. Specifically, I would expect the new networking support in docker to eventually be exposed by condor, so that containers have more options than just NAT-ed network access.

For your use case, it seems that you have machines with large amounts of data pre-loaded on them, that you want your containers to be able to access -- is this correct? If so, that means that jobs that request docker_volume = foo can only run on certain hosts? When working on HTCondor, we try to think about what the responsibility of the job is versus the responsibility of the machine. If a machine has a special capability, we like to have it advertise that fact in the startd classad, and allow jobs to match against it. Perhaps for this use case, we could add a knob to the startd that allows the administrator to configure one or more filesystems that it will volume mount into docker containers that request them. That way, jobs can only match to machines that have the data they need, and admins can be more assured that containers are contained.
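
To make the matching idea concrete, here is a rough Python sketch of the job-versus-machine split. It is a toy model, not HTCondor code: the DockerVolumes attribute, the docker_volumes job key, and the helper names are all hypothetical, and no such startd knob exists as of this message.

```python
def machine_ad(name, docker_volumes):
    """Toy stand-in for a startd ad that advertises admin-approved volumes.

    "DockerVolumes" is a hypothetical attribute name used for illustration.
    """
    return {"Name": name, "HasDocker": True, "DockerVolumes": set(docker_volumes)}


def job_matches(job, machine):
    """A job matches only if the machine offers every volume it requests."""
    wanted = set(job.get("docker_volumes", []))
    return bool(machine.get("HasDocker", False)) and wanted <= machine.get(
        "DockerVolumes", set()
    )


def matching_machines(job, machines):
    """Return the names of machines the job could run on, in input order."""
    return [m["Name"] for m in machines if job_matches(job, m)]


machines = [
    machine_ad("node-a", ["foo", "bar"]),
    machine_ad("node-b", ["bar"]),
    machine_ad("node-c", []),
]
job = {"docker_image": "debian", "docker_volumes": ["foo"]}
print(matching_machines(job, machines))
```

In this model the administrator decides which filesystems a machine offers, and the job only names the ones it needs, so a job asking for foo would land only on node-a.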

-Greg
