Hi Greg,
> This error looks like a recursive mount. If you remove the getenv =
> true and the SINGULARITY_PWD, does it work?
This was it -- removing those parameters allowed the container to run. Thanks for the help!
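For reference, here is a sketch of what the submit file looks like with those two settings removed. It assumes nothing else in the original test.sub (quoted below) changed; the paths are the ones from that file.

```
# test.sub with "getenv = True" and the SINGULARITY_PWD environment
# line removed (sketch; assumes no other changes to the original file)
universe = vanilla
executable = /ligo/home/ligo.org/bryce.cousins/workflows/singularity_condor/containerInfo.sh
+SingularityImage = "/ligo/home/ligo.org/bryce.cousins/workflows/singularity_condor/gstlal.simg"
error = $(cluster)-$(process).err
queue 1
```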
Bryce
-----
Bryce Cousins
LIGO R&D Engineer
Penn State Institute for Computational
and Data Sciences
bfc5288@xxxxxxx
814-867-3035

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of htcondor-users-request@xxxxxxxxxxx <htcondor-users-request@xxxxxxxxxxx>
Sent: Wednesday, September 23, 2020 09:39
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: HTCondor-users Digest, Vol 82, Issue 26

Date: Tue, 22 Sep 2020 14:50:21 -0500
From: Greg Thain <gthain@xxxxxxxxxxx>
To: htcondor-users@xxxxxxxxxxx
Subject: Re: [HTCondor-users] Singularity container creation failed: can't remount /user/path: no such file or directory
Message-ID: <ab763418-4283-faf1-fcad-499d985274bc@xxxxxxxxxxx>
Content-Type: text/plain; charset="windows-1252"; Format="flowed"

Bryce:

I don't think this is the problem, but be aware that we've seen problems with singularity when we've explicitly set the :rw permissions for a mount. Some versions seem to only work with either :ro or nothing (to mean :rw).

This error looks like a recursive mount. If you remove the getenv = true and the SINGULARITY_PWD, does it work?

-greg

On 9/18/20 9:04 AM, Cousins, Bryce S wrote:
> Hello,
>
> I run a HTCondor pool v8.8.10 and would like to enable user-defined
> Singularity images for submitted jobs, but when submitting test jobs
> I'm running into issues of the form:
>
>     FATAL:   container creation failed: mount ->/user/path error: can't remount /user/path: no such file or directory
>
> The Singularity image itself is fine, and executes without issues on
> the login or compute nodes; the error occurs only when submitting an
> HTCondor Singularity job.
>
> I set up the Singularity compute nodes with the following
> configuration, based on the Singularity Support docs:
>
> # /etc/condor/config.d/70-singularity.conf
> SINGULARITY_JOB = !isUndefined(TARGET.SingularityImage)
> SINGULARITY_IMAGE_EXPR = TARGET.SingularityImage
> SINGULARITY_TARGET_DIR = /srv
> SINGULARITY_BIND_EXPR = "/cvmfs,/ligo/home/ligo.org:/ligo/home/ligo.org:rw,/localscratch:/localscratch:rw"
> SINGULARITY_IS_SETUID = false
>
> HAS_SINGULARITY = HasSingularity
> STARTD_ATTRS = $(STARTD_ATTRS),HAS_SINGULARITY
>
> A test submit file is:
>
> # /ligo/home/ligo.org/bryce.cousins/workflows/singularity_condor/test.sub
> universe = vanilla
> executable = /ligo/home/ligo.org/bryce.cousins/workflows/singularity_condor/containerInfo.sh
> getenv = True
> environment = "SINGULARITY_PWD=/ligo/home/ligo.org/bryce.cousins/git.ligo/gstlal/tacc"
> +SingularityImage = "/ligo/home/ligo.org/bryce.cousins/workflows/singularity_condor/gstlal.simg"
> error = $(cluster)-$(process).err
> queue 1
>
> Submitting this job leads to an error:
>
>     FATAL:   container creation failed: mount ->/ligo/home/ligo.org/bryce.cousins/workflows/singularity_condor error: can't remount /ligo/home/ligo.org/bryce.cousins/workflows/singularity_condor: no such file or directory
>
> I'm not sure the root cause, since the `/ligo/home/ligo.org/` NFS
> directory is bound in the compute node config. Other changes I have
> tried that still cause the same FATAL error:
>
>   * binding the full path on the compute node, which warns
>     "destination is already in the mount point list"
>   * removing the environment variables from the submit file
>
> Is there some other configuration change (either in the .sub file or
> on the compute node) that would work?
>
> Thank you for any guidance.
>
> Bryce