[HTCondor-users] HTCondor and Docker
- Date: Tue, 07 Apr 2015 16:02:12 +0100
- From: Brian Candler <b.candler@xxxxxxxxx>
- Subject: [HTCondor-users] HTCondor and Docker
I'd like to know the current state of HTCondor integration with Docker. I see
some notes at the end of
https://indico.cern.ch/event/320819/session/3/contribution/56/material/slides/0.pptx
but as far as I can tell these may just be a "wish list".
There are three different things I'm thinking of.
(1) Running an HTCondor worker node as a Docker container.
This should be straightforward. All the jobs would run within the same
container and therefore have an enforced limit on total resource usage.
This would be a quick way to add HTCondor execution capability to an
existing Docker-aware server, just by
"docker run -d htcondor-worker"
or somesuch.
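For example, something like this (a sketch only - "htcondor-worker" is a
made-up image name, and CONDOR_HOST assumes an entrypoint that writes it
into the condor config):

    # -m caps the container's total memory; -c sets its CPU-shares weight.
    docker run -d -m 8g -c 512 \
        -e CONDOR_HOST=central-manager.example.com \
        --name htcondor-worker htcondor-worker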
(2) A "docker universe" where each job instance launches within a new
Docker container, from a chosen container template.
When the job starts, a container is created, and when the job terminates
the container is destroyed (except perhaps on failure, in which case we
could keep it around for a post-mortem?)
condor_exec would need to fire off "docker run" (preferably via the
docker API) and track it until the container terminated. Plumbing for
stdin/stdout and file transfer would also be required. Hence maybe part
of condor_exec itself should run within the container?
Note: in principle it should be possible to combine (1) and (2)
https://blog.docker.com/2013/09/docker-can-now-run-within-docker/
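To sketch what the plumbing in (2) might look like from the execute side
(rough shell sketch; "job-image" and "./job.sh" are placeholders, and
$_CONDOR_SCRATCH_DIR is the per-job scratch directory HTCondor already
exports):

    #!/bin/sh
    # Run the job in a fresh container with the scratch dir as its workdir.
    CID=$(docker run -d -v "$_CONDOR_SCRATCH_DIR":/work -w /work job-image ./job.sh)
    RC=$(docker wait "$CID")        # blocks until the container exits
    # Without a tty, "docker logs" keeps stdout and stderr separate.
    docker logs "$CID" > job.out 2> job.err
    # Destroy the container on success; keep it for post-mortem on failure.
    [ "$RC" -eq 0 ] && docker rm "$CID"
    exit "$RC"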
(3) Docker containers on the submit host
A Docker container would be a convenient abstraction to use on the
submission host. Normally when you start an HTCondor DAG you need to
create an empty working directory, run a script to create the DAG and/or
SUB files, run condor_submit_dag, monitor progress to wait for
completion, check the exit status to see whether all DAG nodes completed
successfully, fix and restart if necessary, and then tidy up the work
directory.
Docker on the submission host could handle this lifecycle: the container
would be the work directory, it would run the scripts you want, submit
the DAG, and remain visible as a running container until the DAG has
completed; the container itself would then have an exit status, visible
under "docker ps -a", showing whether the DAG completed successfully or not.
https://docs.docker.com/reference/commandline/cli/#filtering_2
When you are finished with the results, you would destroy the container.
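For example, using the "exited" filter described on that page (assuming
one container per DAG):

    # Workflows that completed successfully...
    docker ps -a --filter 'exited=0'
    # ...or that failed with a particular exit code (one code per filter):
    docker ps -a --filter 'exited=1'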
This one might be a bit tricky to implement, as I don't see any way to
have condor_submit_dag or condor_submit run in the foreground. I think
it would be necessary to run "condor_dagman -f" directly as the process
within the container.
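One possible trick (untested): condor_submit_dag has a -no_submit option
that writes the .condor.sub file without queueing anything, and the
dagman arguments recorded there could then be replayed in the foreground,
roughly:

    condor_submit_dag -no_submit workflow.dag
    # Argument list cribbed from the generated workflow.dag.condor.sub;
    # the exact flags vary between HTCondor versions.
    condor_dagman -f -l . -Lockfile workflow.dag.lock -Dag workflow.dag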
The container also needs to communicate with the condor schedd, and I'm
not sure whether it needs access to bits of the filesystem as well (e.g.
condor_config). If necessary, /etc/condor/ can be bind-mounted as a
volume within the container.
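e.g. something like:

    # Share the host's condor config read-only; host networking is one way
    # to let the container talk to the schedd directly.
    docker run --net=host -v /etc/condor:/etc/condor:ro ... my-dag-image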
The user-provided scripts need to be available (e.g. by including them
inside a custom docker image from "docker build") and they need to have
parameters passed to them - this could be via environment variables
(docker run -e). If the container is restarted, it should re-submit the
existing DAG, not run the scripts to create a new DAG.
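Putting the pieces together, the entrypoint might look roughly like this
(everything here - the make_dag.sh script, the DAG_ARGS variable and the
file names - is made up, and the restart/recovery behaviour would need
testing):

    #!/bin/sh
    cd /work
    # Generate the DAG only on first start; on a restart the DAG file from
    # the previous run is already present and must not be regenerated.
    if [ ! -f workflow.dag ]; then
        make_dag.sh "$DAG_ARGS"    # user script; parameters via "docker run -e"
    fi
    condor_submit_dag -no_submit workflow.dag
    # exec so dagman is PID 1 and its exit status becomes the container's;
    # -AutoRescue lets a restart pick up any rescue DAG from a failed run.
    exec condor_dagman -f -l . -AutoRescue 1 \
        -Lockfile workflow.dag.lock -Dag workflow.dag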
If anyone has done any of the above, I'd be very interested to hear
about your experiences.
Regards,
Brian.