[HTCondor-users] Problem with REPLICATION_USE_SHARED_PORT
- Date: Fri, 10 Mar 2017 17:07:14 +0100
- From: Andrea Sartirana <sartiran@xxxxxxxxxxxx>
Hi,
at GRIF we are currently testing HAD and replication.
Everything works fine when we declare dedicated ports for HAD and
REPLICATION. However, when REPLICATION_USE_SHARED_PORT is set to TRUE, the
replication daemon fails to start, and I see errors like these in the
Master log on both master servers:
03/10/17 17:01:26 ERROR: SharedPortEndpoint: failed to bind to 15f287e5db818c2dbce9638b70a6dc044992f0be80d2dc43848c983c1fc43fa5/MASTER: Address already in use
03/10/17 17:01:26 ERROR: Create_Process failed trying to start /usr/sbin/condor_replication
03/10/17 17:01:26 restarting /usr/sbin/condor_replication in 265 seconds
My HAD/REPLICATION configuration is below [1].
What am I doing wrong?
Thanks,
Andrea
[1]
HAD_USE_SHARED_PORT = TRUE
REPLICATION_USE_SHARED_PORT = TRUE
REPLICATION_LIST = lpnhe-gs9088.in2p3.fr:$(SHARED_PORT_PORT) llrmpicream.in2p3.fr:$(SHARED_PORT_PORT)
HAD_LIST = lpnhe-gs9088.in2p3.fr:$(SHARED_PORT_PORT) llrmpicream.in2p3.fr:$(SHARED_PORT_PORT)
HAD_CONTROLLEE = NEGOTIATOR
HAD_CONNECTION_TIMEOUT = 10
HAD_USE_PRIMARY = true
DAEMON_LIST = $(DAEMON_LIST) HAD REPLICATION
HAD_USE_REPLICATION = true
STATE_FILE = $(SPOOL)/Accountantnew.log
REPLICATION_INTERVAL = 300
MAX_TRANSFER_LIFETIME = 300
HAD_UPDATE_INTERVAL = 300
MASTER_NEGOTIATOR_CONTROLLER = HAD
MASTER_HAD_BACKOFF_CONSTANT = 360
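
For comparison, a dedicated-port setup of the kind mentioned at the top (the one that works) would look roughly like the sketch below. It is an illustration only: the port numbers are placeholders, and the HAD_PORT, REPLICATION_PORT, HAD_ARGS and REPLICATION_ARGS lines are assumed from the pattern in the HTCondor manual's sample high-availability configuration, not copied from the settings actually in use here. Only the lines that differ from [1] are shown.

```
# Sketch under assumptions: placeholder ports, any free ports would do.
HAD_PORT = 51450
REPLICATION_PORT = 41450

# Keep both daemons off the shared port in this variant.
HAD_USE_SHARED_PORT = FALSE
REPLICATION_USE_SHARED_PORT = FALSE

# Assumed from the manual's sample: pass each daemon its own port.
HAD_ARGS = -p $(HAD_PORT)
REPLICATION_ARGS = -p $(REPLICATION_PORT)

# Same two hosts as in [1], but pointing at the dedicated ports
# (written comma-separated, as the manual describes these lists).
HAD_LIST = lpnhe-gs9088.in2p3.fr:$(HAD_PORT), llrmpicream.in2p3.fr:$(HAD_PORT)
REPLICATION_LIST = lpnhe-gs9088.in2p3.fr:$(REPLICATION_PORT), llrmpicream.in2p3.fr:$(REPLICATION_PORT)
```

The only intended difference from [1] is where the two daemons listen. With the shared port enabled, the failing bind in the log is on an endpoint named MASTER rather than on a TCP port.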