Re: [HTCondor-devel] Inclusion of additional arguments to docker run


Date: Thu, 03 Sep 2015 10:40:29 -0500
From: Greg Thain <gthain@xxxxxxxxxxx>
Subject: Re: [HTCondor-devel] Inclusion of additional arguments to docker run
On 09/03/2015 03:42 AM, Matthew Hinton wrote:
> Hi,
>
> I am working with the docker universe at the moment, and moving a lot of our current processes over to using this code. However, as part of this process, we require volume sharing with the container, since the input/output files used can be several GB, and we don't want to be doing file copies.
>
> I have therefore patched the source in the following, fairly hacky way
> (basically copying the code for docker_image). (I've attached a file;
> let me know if that doesn't work here.) The patched source now builds
> and works as expected, but obviously there is no checking performed and
> no ability to add multiple volumes, etc., for now.
>
> In future it would be useful to be able to specify quite a few of the
> docker run arguments in the condor submit file. I therefore wanted to
> check whether work along these lines is already being done, and
> might be released soon, before I continue work along this branch to
> fulfil our requirements.

Good Morning:

We are happy to see HTCondor's new docker universe getting attention and interest.

The current set of docker run options exposed by HTCondor is by no means
set in stone.  We have no immediate plans to expand them,
but I would expect that we will over time, based on new docker features
and user experience.  Specifically, I would expect the new networking
support in docker to eventually be exposed by condor, so that containers
have more options than just NAT-ed network access.

For your use case, it seems that you have machines with large amounts of
data pre-loaded on them, that you want your containers to be able to 
access -- is this correct?  If so, that means that jobs that request 
docker_volume = foo can only run on certain hosts? When working on 
HTCondor, we try to think about what the responsibility of the job is 
versus the responsibility of the machine.  If a machine has a special 
capability, we like to have it advertise that fact in the startd 
classad, and allow jobs to match against it.  Perhaps for this use case, 
we could add a knob to the startd that allows the administrator to 
configure one or more filesystems that it will volume mount into docker 
containers that request them.  That way, jobs can only match to machines 
that have the data they need, and admins can be more assured that 
containers are contained.
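
To make the matching idea concrete, here is a toy sketch in Python. It is
not HTCondor code: the Machine and Job classes, the advertised_volumes and
requested_volumes fields, and the host names are all made up for
illustration, and real matchmaking evaluates ClassAd expressions rather
than comparing sets. It only shows the split of responsibility: the machine
advertises what it is willing to mount, and the job matches only where
every volume it asks for is on offer.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    name: str
    # Hypothetical: volumes the admin has allowed to be mounted into containers.
    advertised_volumes: set = field(default_factory=set)


@dataclass
class Job:
    image: str
    # Hypothetical: volumes the job asks for (e.g. docker_volume = foo).
    requested_volumes: set = field(default_factory=set)


def matches(job, machine):
    """A job matches only if the machine advertises every requested volume."""
    return job.requested_volumes <= machine.advertised_volumes


def eligible_machines(job, machines):
    """Return the names of machines the job could run on, in input order."""
    return [m.name for m in machines if matches(job, m)]


machines = [
    Machine("node1", {"foo"}),
    Machine("node2"),                 # advertises no volumes
    Machine("node3", {"foo", "bar"}),
]
job = Job("ubuntu", {"foo"})

print(eligible_machines(job, machines))  # ['node1', 'node3']
```

A job that requests no volumes matches every machine in this sketch, and a
job that requests a volume no machine advertises matches none.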
-Greg
