We are having problems running jobs submitted from a Linux submit host
on a Windows lab behind a gateway. On the Windows machine, the starter
log shows the following errors:
10/3 19:10:35 Communicating with shadow <129.128.125.15:37473>
10/3 19:10:35 Submitting machine is "opteron-cluster.nic.ualberta.ca"
10/3 19:12:34 condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from <129.128.125.15:55548>.
10/3 19:12:34 ERROR "Assertion ERROR on (result)" at line 113 in file ..\src\condor_starter.V6.1\NTsenders.C
10/3 19:12:34 ERROR "LocalUserLog::logStarterError() called before init()" at line 205 in file ..\src\condor_starter.V6.1\local_user_log.C
On the submit node, the shadow log shows:
10/3 19:16:58 Initializing a VANILLA shadow for job 85.0
10/3 19:17:18 (85.0) (13769): condor_read(): timeout reading 5 bytes from <129.128.237.81:1050>.
10/3 19:17:18 (85.0) (13769): Request to run on <129.128.237.81:1050> was ACCEPTED
10/3 19:18:06 (85.0) (13769): condor_read(): timeout reading 5 bytes from <129.128.237.81:1050>.
10/3 19:19:16 (85.0) (13769): condor_read(): recv() returned -1, errno = 104, assuming failure reading 5 bytes from unknown source.
10/3 19:19:16 (85.0) (13769): ERROR "Can no longer talk to condor_starter <129.128.237.81:1050>" at line 123 in file NTreceivers.C
We have opened holes in the gateway to allow communication between the
lab and both the submit host and the central manager. We can ping
between these machines without any problems, and the collector gathers
information about the available machines. However, something specific
to the submit-execute communication seems to be blocked by the
gateway. If the gateway is opened up, everything works fine.
Is there anything we can change in Condor or on the gateway to make
this work?
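In case it helps anyone reproduce this: ping only exercises ICMP, so it
does not show whether a given TCP port passes the gateway. Below is a
minimal sketch of a TCP check, assuming Python is available on the
submit host. The function name tcp_reachable is my own (hypothetical),
and the ports in the logs above look dynamically chosen, so the port to
test would have to be read from the current log.

```python
import socket


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration; not part of Condor.
    """
    try:
        # create_connection performs a full TCP handshake, unlike ping (ICMP).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, calling tcp_reachable("129.128.237.81", 1050) from the
submit host while the starter is still alive would indicate whether that
particular port is reachable through the gateway.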
Thanks for your time.
Masao
--
Masao Fujinaga
Research Computing Support
Academic Information and Communication Technologies (AICT)
University of Alberta, Edmonton, Alberta, CANADA T6G 2H1