Hello there. I used the Ubuntu 14.04 deb file to install
HTCondor on two machines in my cluster: one sits inside the cluster's
network and has no NIC on an external network, and the other has a
NIC on an external network. HTCondor runs with no problems on the
first machine but not on the second.
Executing condor_status on the second machine gives me this:
Error: communication error
CEDAR:6001:Failed to connect to <192.168.1.7:9618>
No machine on the external network has that IP. I read section
3.7.3 of the tutorial regarding multi-NIC environments and made some
modifications to the condor_config file, setting BIND_ALL_INTERFACES to
false and NETWORK_INTERFACE to 192.168.0.*. After restarting HTCondor I
still get the same output from condor_status. I don't know why it keeps
trying to connect to that IP.
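For reference, the two settings I changed look roughly like this in condor_config. This is only a sketch of the lines mentioned above, not the whole file; the capitalization of the boolean is from memory, and nothing else in the file is shown here.

```
# Do not listen on every interface of this machine
BIND_ALL_INTERFACES = False

# Use the interface whose address matches this pattern
NETWORK_INTERFACE = 192.168.0.*
```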
StartLog:
10/22/15 10:13:56 ERROR: SECMAN:2003:TCP connection to collector ---- failed.
10/22/15 10:13:56 Failed to start non-blocking update to <192.168.1.7:9618>.
10/22/15 10:13:59 attempt to connect to <192.168.1.7:9618> failed: No route to host (connect errno = 113).
CollectorLog:
10/22/15 10:08:47 stats: Inserting new hashent for 'Collector':'My Pool - ----@----':'192.168.1.135'
10/22/15 10:08:50 attempt to connect to <192.168.1.7:9618> failed: No route to host (connect errno = 113).
10/22/15 10:08:50 Failed to send update to collector godzilla.ica.luz.edu.ve.
10/22/15 10:08:50 Unable to send UPDATE_COLLECTOR_AD to all configured collectors
Also, the condor service doesn't show up when I run nmap localhost. I do
see it on the other machine, which can run the quickstart tutorial with
no problems.