
Re: [HTCondor-users] Docker nvidia runtime support



Sorry, I don't follow you. How does this change end up adding runtime=nvidia to the docker command line?

 

-tj

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of João Baúto
Sent: Friday, July 20, 2018 4:43 AM
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] Docker nvidia runtime support

 

Hi,

 

We have been running HTCondor for a while, mainly for Python/MATLAB workloads, and we want to start packaging our applications into container images. However, some of them depend on access to NVIDIA GPUs.

 

NVIDIA has released a container runtime for docker that gives containers direct access to the GPU without having to pass the device to the container explicitly. Besides installing this runtime, docker has to be called with --runtime=nvidia.

 

We could allow users to run their jobs in the vanilla universe and call a job wrapper that eventually calls docker, but this opens our servers to security vulnerabilities that we want to avoid. The docker universe already does everything we need in terms of restricting user permissions and mounting volumes automatically, but it lacks a way to pass additional arguments to docker.

 

Do you guys think it is possible or feasible to add this option to the docker universe?

 

If I read the source code correctly, something similar to this existing block might work:

 

    // drop unneeded Linux capabilities
    if (param_boolean("DOCKER_DROP_ALL_CAPABILITIES", true /*default*/,
                      true /*do_log*/, &machineAd, &jobAd)) {
        runArgs.AppendArg("--cap-drop=all");

        // --no-new-privileges flag appears in docker 1.11
        if (DockerAPI::majorVersion > 1 ||
            DockerAPI::minorVersion > 10) {
            runArgs.AppendArg("--no-new-privileges");
        }
    }
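
To make the idea concrete, here is a rough, self-contained sketch of how a runtime option could be appended in the same style. It is not HTCondor code: the ArgList struct is a minimal stand-in that only models AppendArg, buildRunArgs is a made-up helper, and the runtime string is assumed to come from an admin-controlled setting (called DOCKER_RUNTIME here purely for illustration; no such knob is implied to exist). Only docker's --runtime flag itself is real.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for HTCondor's ArgList; only AppendArg is modeled.
struct ArgList {
    std::vector<std::string> args;
    void AppendArg(const std::string &a) { args.push_back(a); }
};

// Hypothetical helper that assembles "docker run" arguments.
// `runtime` is assumed to come from an admin-controlled setting
// (named DOCKER_RUNTIME here for illustration only); empty means
// "use docker's default runtime".
ArgList buildRunArgs(const std::string &image, const std::string &runtime,
                     bool dropAllCaps, int dockerMajor, int dockerMinor) {
    assert(!image.empty());  // an image name is required

    ArgList runArgs;
    runArgs.AppendArg("run");

    // Mirrors the capability-dropping block quoted above.
    if (dropAllCaps) {
        runArgs.AppendArg("--cap-drop=all");
        if (dockerMajor > 1 || dockerMinor > 10) {
            runArgs.AppendArg("--no-new-privileges");
        }
    }

    // The proposed addition: pass the runtime only when one is configured.
    if (!runtime.empty()) {
        runArgs.AppendArg("--runtime=" + runtime);
    }

    runArgs.AppendArg(image);
    return runArgs;
}
```

Calling buildRunArgs with runtime set to "nvidia" would add --runtime=nvidia just before the image name, while an empty string leaves the command line unchanged. Keeping the value in the machine configuration rather than in the job would let the admin, not the user, decide which runtimes are allowed.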

 

 

 

 

Thanks!

 

João Baúto

---------------

Scientific Computing and Software Platform

Champalimaud Research
Champalimaud Center for the Unknown
Av. Brasília, Doca de Pedrouços
1400-038 Lisbon, Portugal

fchampalimaud.org