Hi Benoit,
it looks like your wrapper binds the job dir as /var/lib/condor/execute/dir_67539:/srv. It might be that the ssh session does an nsenter into the job and gets the startd's environment and HOME, but ends up in the namespace inside the container, where only /srv exists?
As for the exit: does the wrapper script maybe interact with the session when it closes and catch the exit?
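To illustrate the path mapping I have in mind, here is a rough sketch. The helper name to_container_path is made up for this mail, and it assumes a plain "src:dest" bind spec without extra options. It only shows why a host-side path such as the job dir would have to be addressed as /srv once you are inside the container's mount namespace; it is not what condor_ssh_to_job does internally.

```shell
#!/bin/sh
# Hypothetical helper: translate a host path into the path visible inside
# the container, given one "-B src:dest" bind specification.
# Assumption: the spec has exactly the form src:dest (no ":ro" etc.).
to_container_path() {
    bind_spec=$1   # e.g. /var/lib/condor/execute/dir_67539:/srv
    host_path=$2   # path as seen on the host / by the starter

    src=${bind_spec%%:*}   # part before the first colon
    dest=${bind_spec#*:}   # part after the first colon

    case "$host_path" in
        "$src")
            # The bind source itself maps to the bind destination.
            printf '%s\n' "$dest"
            ;;
        "$src"/*)
            # Anything below the source keeps its relative part.
            rest=${host_path#"$src"}
            printf '%s\n' "$dest$rest"
            ;;
        *)
            # Not covered by this bind mount: no such path in the container.
            return 1
            ;;
    esac
}
```

For example, `to_container_path /var/lib/condor/execute/dir_67539:/srv /var/lib/condor/execute/dir_67539` prints /srv, while a chdir to the untranslated host path inside the container would fail as in your message.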
Cheers,
 Thomas
Otherwise, there was a recent bugfix [1] that affected condor_ssh_to_job, but since ssh'ing itself works, it's probably something else?
[1]
https://opensciencegrid.atlassian.net/browse/HTCONDOR-1245
On 04/11/2022 14.06, Benoit Roland wrote:
Dear all,
we are running the HTCondor EP daemons in an apptainer container and submitting jobs that themselves run in a container.
I would like to ask you some questions about the behavior of condor_ssh_to_job.
1) At the beginning of the session, I got the following message.
Welcome to slot1@test-condor@c4p-login-dev!
Your condor job is running with pid(s) 67581.
Cannot chdir to */var/lib/condor/execute/dir_67539*: No such file or directory
Singularity>
Is the message "Cannot chdir to /var/lib/condor/execute/dir_67539: No such file or directory" expected?
The directory exists, the job is executed properly, and I can see in the StarterLog:
11/03/22 17:14:33 (pid:67539) (D_ALWAYS) Using wrapper /scratch/etc/condor/config.git/master/repo/jobwrapper.sh to exec /usr/bin/singularity exec -W /var/lib/condor/execute/dir_67539 --pwd /srv -B */var/lib/condor/execute/dir_67539*:/srv -B /cvmfs:/cvmfs -B /etc/hosts -B /etc/localtime --no-home -C --userns --env SINGULARITY_BIND= --env APPTAINER_BIND= --env APPTAINER_BINDPATH= /cvmfs/unpacked.cern.ch/gitlab-p4n.aip.de:5005/compute4punch/container-stacks/wlcg-wn:latest /srv//condor_exec.exe
11/03/22 17:14:33 (pid:67539) (D_ALWAYS) Create_Process succeeded, pid=67581
2) At the end of the session, I am not able to close it properly:
Singularity> exit
exit
logout
read returned, exiting
After that, the exit hangs, and a CTRL-C is needed to close the session.
3) Is there some way to optimise the environment?
    I was not able to get tab completion, the delete key, or navigation in the command history to work.
    I guess I am missing something in my configuration of condor_ssh_to_job.
  The sshd is started with:
  /usr/sbin/sshd -i -e -E /tmp/condor_sshd.log -f /var/lib/condor/execute/dir_67539/.condor_ssh_to_job_1/sshd_config
and I can see in /tmp/condor_sshd.log:
Starting session: forced-command (key-option) '/usr/libexec/condor/condor_ssh_to_job_shell_setup /var/lib/condor/execute/dir_67539/.condor_ssh_to_job_1/env.sh' for benoit_roland from 2a00:139c:3:2e5::12 port 24035 id 0
Is there a way to further configure the environment that is set up by the above command line?
Thanks a lot in advance for your help and reply!
Cheers,
Benoit
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/