
[HTCondor-users] condor_submit -addr doesn't work when sched is behind a shared port



Hi,

I can't get condor_submit -addr to work when condor_schedd is behind a condor_shared_port.

The output from condor_submit is below:

sh-4.2$ condor_submit -debug -addr "<172.1.3.3:9618>" job.sub
Submitting job(s)07/04/18 10:06:49 condor_read() failed: recv(fd=4) returned -1, errno = 104 Connection reset by peer, reading 5 bytes from schedd at <172.1.3.3:9618>.
07/04/18 10:06:49 IO: Failed to read packet header
07/04/18 10:06:49 SECMAN: no classad from server, failing

ERROR: Failed to connect to local queue manager
SECMAN:2007:Failed to end classad message.

The error message written to /var/log/condor/SharedPortLog in the schedd container is:

07/04/18 10:06:49 SharedPortServer: server was busy, failed to connect collector as requested by <172.1.3.3:46528>: primary (7d2cc1f5fc7f6a4e2eb39facb9bb27877fdd809e4b7fa28fd830cd99c77172ee/collector): Connection refused (111); alt (/var/lock/condor/daemon_sock/collector): Connection refused (111)

Nothing is written to /var/log/condor/SchedLog

Why is condor_submit even trying to access the collector when -addr is meant to tell it to connect straight to the schedd? Is there a bug in condor_submit that makes it ask the condor_shared_port daemon to connect to the collector, not the schedd, even when the -addr option is set?

Everything works fine when the schedd isn't running behind a condor_shared_port, so I've worked around this issue by simply not using a shared port.

The relevant versions are:

sh-4.2$ condor_version
$CondorVersion: 8.6.11 May 10 2018 BuildID: 440910 $
$CondorPlatform: x86_64_RedHat7 $

The relevant files are:

sh-4.2$ cat job.sub
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
Universe = vanilla
Executable = /bin/bash
Arguments = test.sh
Log = job.log
Output = job.out
Error = job.error
transfer_input_files = test.sh
Queue
sh-4.2$
sh-4.2$
sh-4.2$ cat test.sh
echo Starting test.sh
whoami
id
hostname
/usr/sbin/ip a
echo Ending test.sh
sh-4.2$

I'm running HTCondor in a container on Kubernetes, but I doubt that is relevant to this problem.

Thanks,

Rob