Re: [HTCondor-users] Bug: interactive jobs + custom job attributes + singularity
- Date: Tue, 06 Mar 2018 16:54:45 +0100
- From: Peter Wienemann <wienemann@xxxxxxxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Bug: interactive jobs + custom job attributes + singularity
Dear all,
the P. S. part of my email is solved thanks to Christoph.
Does anybody have an idea how to address the remaining issues?
Cheers, Peter
On 01.03.2018 18:35, Peter Wienemann wrote:
> Dear HTCondor experts,
>
> we are observing unexpected behaviour in the following situation
> (inspired by
> http://research.cs.wisc.edu/htcondor/manual/v8.6/3_17Singularity_Support.html):
>
> 1. All jobs run in singularity containers (SINGULARITY_JOB = true)
>
> 2. Users can choose the desired OS using a custom job attribute
> "+DesiredOS". The relevant part of the used HTCondor configuration is:
>
> -----------------------------------------------------------------------
> DEFAULT_CENTOS7_IMAGE = /cvmfs/example.com/singularity/CentOS7/default
>
> DEFAULT_SL6_IMAGE = /cvmfs/example.com/singularity/SL6/default
>
> DEFAULT_UBUNTU1604_IMAGE = /cvmfs/example.com/singularity/Ubuntu1604/default
>
> CHOSEN_IMAGE = ifThenElse(TARGET.DesiredOS is "Ubuntu1604", \
>   "$(DEFAULT_UBUNTU1604_IMAGE)", \
>   ifThenElse(TARGET.DesiredOS is "CentOS7", \
>     "$(DEFAULT_CENTOS7_IMAGE)", "$(DEFAULT_SL6_IMAGE)"))
>
> SINGULARITY_IMAGE_EXPR = $(CHOSEN_IMAGE)
> -----------------------------------------------------------------------
>
> 3. Users can start interactive jobs and should obtain the desired
> runtime environment using
>
> condor_submit -i consel.jdl
>
> where the contents of consel.jdl are
>
> -----------------------------------------------------------------------
> Universe = vanilla
> +DesiredOS = "Ubuntu1604"
> Queue
> -----------------------------------------------------------------------
>
> Unfortunately this does not work. The users always end up in the default
> container OS (SL6 in the above example) as if "DesiredOS" was not defined.
>
> With non-interactive jobs the above configuration works as expected.
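
The fallback to SL6 is what the CHOSEN_IMAGE expression yields whenever "DesiredOS" is missing from the ad it is evaluated against, because the ClassAd "is" operator returns false (never undefined) when one side is undefined. The following is only a minimal Python sketch of that selection logic, not HTCondor code; the names chosen_image, IMAGES and UNDEFINED are illustrative assumptions, and the job ad is modelled as a plain dict.

```python
# Illustrative model of the nested ifThenElse() in CHOSEN_IMAGE.
# Assumption: a job ad is represented as a dict; a missing key plays the
# role of the ClassAd value `undefined`.
UNDEFINED = None

IMAGES = {
    "Ubuntu1604": "/cvmfs/example.com/singularity/Ubuntu1604/default",
    "CentOS7": "/cvmfs/example.com/singularity/CentOS7/default",
    "SL6": "/cvmfs/example.com/singularity/SL6/default",
}


def chosen_image(job_ad):
    """Return the image path the configuration would select for job_ad."""
    desired = job_ad.get("DesiredOS", UNDEFINED)
    # ClassAd `is` is a strict, case-sensitive comparison that always gives
    # true or false, so an undefined attribute simply fails both tests.
    if desired == "Ubuntu1604":
        return IMAGES["Ubuntu1604"]
    if desired == "CentOS7":
        return IMAGES["CentOS7"]
    return IMAGES["SL6"]  # default branch, also taken when DesiredOS is absent


if __name__ == "__main__":
    print(chosen_image({"DesiredOS": "Ubuntu1604"}))
    print(chosen_image({}))
```

Under this reading, ending up in SL6 is consistent with the expression being evaluated against an ad that does not carry "DesiredOS", rather than with the attribute having a wrong value.
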
>
> Checking the process tree on the execute node, the situation looks like
> this:
>
> -----------------------------------------------------------------------
> [...]
> condor 1676 0.0 0.0 98568 7680 ? Ss Feb25 0:07 /usr/sbin/condor_master -f
> root 2640 0.1 0.0 28376 8100 ? S Feb25 6:16 \_ condor_procd -A /var/run/condor/procd_pipe -L /var/log/condor/ProcLog -R 1000000 -S 6
> condor 2658 0.0 0.0 78628 6888 ? Ss Feb25 0:07 \_ condor_shared_port -f -p 9618
> condor 2921 0.1 0.0 84240 10892 ? Ss Feb25 6:48 \_ condor_startd -f
> condor 45979 0.3 0.0 88388 7916 ? Ss 18:15 0:00 \_ condor_starter -f -a slot1_1 submit.example.com
> user1 46001 0.0 0.0 19944 796 ? SNs 18:15 0:00 \_ /usr/libexec/singularity/bin/action-suid /bin/sleep 180
> user1 46008 0.0 0.0 4360 356 ? SN 18:15 0:00 | \_ /bin/sleep 180
> user1 46022 0.0 0.0 19944 800 ? SNs 18:15 0:00 \_ /usr/libexec/singularity/bin/action-suid /usr/sbin/sshd -i -e -f /pool/condor
> user1 46029 0.0 0.0 70936 2636 ? SN 18:15 0:00 \_ sshd: user1 [priv]
> user1 46031 0.0 0.0 70936 1212 ? SN 18:15 0:00 \_ sshd: user1@pts/0
> user1 46032 0.5 0.0 15124 3360 pts/0 SNs+ 18:15 0:00 \_ -/bin/bash
> [...]
> -----------------------------------------------------------------------
>
> Obviously there are two different containers running: one running
> "sleep" and the other one executing sshd. Checking the file descriptors
> of the corresponding processes yields the following output:
>
> -----------------------------------------------------------------------
> # ls -l /proc/46001/fd
> [...]
> lr-x------. 1 root root 64 1. Mär 18:15 5 -> /cvmfs/example.com/singularity/Ubuntu1604/default
> [...]
> # ls -l /proc/46022/fd
> [...]
> lr-x------. 1 root root 64 1. Mär 18:16 5 -> /cvmfs/example.com/singularity/SL6/default
> [...]
> -----------------------------------------------------------------------
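
The same check can be scripted instead of reading "ls -l /proc/<pid>/fd" by hand. The following is a rough Python sketch under a few assumptions: the helper names open_file_targets and image_paths are made up for illustration, the "/cvmfs/" prefix is taken from the example paths above, and inspecting another user's process typically needs root, as in the listing.

```python
import os


def open_file_targets(pid, proc_root="/proc"):
    """Map each file descriptor number of `pid` to the path it points at."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except (OSError, ValueError):
            # The descriptor may have been closed in the meantime, or the
            # entry is not a numeric fd link; skip it.
            continue
    return targets


def image_paths(pid, prefix="/cvmfs/", proc_root="/proc"):
    """Return the sorted set of open paths below `prefix` for `pid`."""
    targets = open_file_targets(pid, proc_root)
    return sorted({t for t in targets.values() if t.startswith(prefix)})


if __name__ == "__main__":
    # Hypothetical usage with the PIDs from the process tree above.
    for pid in (46001, 46022):
        try:
            print(pid, image_paths(pid))
        except OSError as exc:
            print(pid, "not readable:", exc)
```

Comparing the result for the PID of the "sleep" container with that of the sshd container shows directly whether both were started from the same image.
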
>
> From this information, it is obvious that there are two surprising
> phenomena:
>
> 1. There are *two* containers started.
> 2. The two containers use *different* images indicating that the
> container running sshd ignores the custom job attribute "DesiredOS".
>
> Is there a way to make interactive jobs with the possibility to choose
> singularity images work?
>
> Cheers, Peter
>
> P. S.: Is there a reason why the following command does not work (it
> would be very convenient):
>
> $ condor_submit -i '+DesiredOS = "Ubuntu1604"'
> condor_submit: invalid attribute name '+DesiredOS' for attrib=value
> assigment
>