Sorry, I don't follow you. How does this change end up adding --runtime=nvidia to the docker command line?

-tj

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx>
On Behalf Of João Baúto

Hi,

We have been running HTCondor for a while, mainly for Python/MATLAB workloads, and we want to start packaging our applications into container images. However, some of them depend on access to NVIDIA GPUs.

NVIDIA has released a container runtime for docker that allows direct access to the GPU without having to pass the device to the container. Besides installing this runtime, docker has to be called with --runtime=nvidia.

We could let users run their jobs in the vanilla universe and call a job wrapper that eventually calls docker, but this opens our servers to security vulnerabilities that we want to avoid. The docker universe already does everything we
need in terms of restricting user permissions and mounting volumes automatically, but it lacks a way to pass additional arguments to docker.

Do you guys think it is possible or feasible to add this option to the docker universe? If I read the source code correctly, something similar to this might work:

    // drop unneeded Linux capabilities
    if (param_boolean("DOCKER_DROP_ALL_CAPABILITIES", true /*default*/, true /*do_log*/, &machineAd, &jobAd)) {
        runArgs.AppendArg("--cap-drop=all");

        // --no-new-privileges flag appears in docker 1.11
        if (DockerAPI::majorVersion > 1 || DockerAPI::minorVersion > 10) {
            runArgs.AppendArg("--no-new-privileges");
        }
    }

More info: https://github.com/NVIDIA/nvidia-docker

Thanks!

João Baúto
---------------
Scientific Computing and Software Platform
Champalimaud Research
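
P.S. To make the idea more concrete, here is a minimal standalone sketch of what the analogous change might look like. It assumes a hypothetical configuration knob named DOCKER_RUNTIME, and it uses std::map and std::vector as stand-ins for HTCondor's param lookup and ArgList so that it compiles on its own. None of these names are taken from the HTCondor source, so treat it as an illustration of the pattern rather than a patch.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-ins for HTCondor's configuration lookup and ArgList.
// Both are hypothetical simplifications so the sketch is self-contained.
using Config = std::map<std::string, std::string>;
using ArgList = std::vector<std::string>;

// Append --runtime=<name> only when the hypothetical DOCKER_RUNTIME knob
// is present and non-empty; otherwise leave the argument list untouched.
void appendRuntimeArg(const Config& config, ArgList& runArgs) {
    auto it = config.find("DOCKER_RUNTIME");
    if (it != config.end() && !it->second.empty()) {
        runArgs.push_back("--runtime=" + it->second);
    }
}

// Build the leading part of a docker command line, mirroring the order in
// the quoted snippet: capability restrictions first, then the runtime.
ArgList buildRunArgs(const Config& config) {
    ArgList runArgs = {"docker", "run"};
    runArgs.push_back("--cap-drop=all");
    appendRuntimeArg(config, runArgs);
    return runArgs;
}
```

With DOCKER_RUNTIME set to nvidia in the stand-in config, buildRunArgs would produce docker run --cap-drop=all --runtime=nvidia. With the knob unset, the command line is unchanged, so existing docker universe jobs would not be affected. Keeping the knob on the admin side (rather than letting jobs pass arbitrary arguments) is what would preserve the security properties mentioned above.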