[Condor-users] Struggling with Condor Flocking
- Date: Tue, 21 Jun 2005 20:49:03 +0530
- From: "Aln Sai Srinivas - CTD, Chennai" <alnsai@xxxxxxxxxxx>
- Subject: [Condor-users] Struggling with Condor Flocking
Hi,
I'm trying to set up flocking between two Condor pools. I'm getting this error
in the CollectorLog: "DC_AUTHENTICATE: attempt to open invalid session
cm.mygrid:27485:1119363742:5, failing", where cm is the host name of the
central manager of the pool that flocks to the other pool.
Flocking never happens.
Here is the scenario:
I'm using Red Hat Linux and Condor 6.6.9.
There are two central managers, cm.mygrid.com and cm1.mygrid.com, one for
each of the two Condor pools.
The condor_config file is on a shared file system.
Here is the configuration:
======================================================================
$LOCAL_DIR/condor_config.local for cm.mygrid.com
=======================================================================
COLLECTOR = $(SBIN)/condor_collector
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD
COLLECTOR_NAME = Collector at alpha
CONTINUE = True
FILESYSTEM_DOMAIN = cm.mygrid.com
FLOCK_FROM = *.hclgrid.com
FLOCK_TO = cm1.mygrid.com
PREEMPT = FALSE
SUSPEND = FALSE
LOCK = /tmp/condor-lock.$(HOSTNAME)0.885447545050742
UID_DOMAIN = cm.mygrid.com
NEGOTIATOR = $(SBIN)/condor_negotiator
VACATE = FALSE
CONDOR_ADMIN = root@xxxxxxxxxxxxx
START = TRUE
MAIL = /bin/mail
CONDOR_IDS = 504.504
RELEASE_DIR = /usr/local/condor
CONDOR_HOST = cm.mygrid.com
LOCAL_DIR = /usr/local/condor/local.$(HOSTNAME)
======================================================================
$LOCAL_DIR/condor_config.local for cm1.mygrid.com
=======================================================================
COLLECTOR = $(SBIN)/condor_collector
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD
COLLECTOR_NAME = Collector at microgrid
CONTINUE = True
FILESYSTEM_DOMAIN = cm1.mygrid.com
FLOCK_FROM = *.hclgrid.com
FLOCK_TO = Not defined
PREEMPT = FALSE
SUSPEND = FALSE
LOCK = /tmp/condor-lock.$(HOSTNAME)0.505710791637288
UID_DOMAIN = cm1.mygrid.com
NEGOTIATOR = $(SBIN)/condor_negotiator
VACATE = FALSE
CONDOR_ADMIN = root@xxxxxxxxxxxxxx
START = TRUE
MAIL = /bin/mail
CONDOR_IDS = 504.504
RELEASE_DIR = /usr/local/condor
CONDOR_HOST = cm1.mygrid.com
LOCAL_DIR = /usr/local/condor/local.$(HOSTNAME)
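
One thing that can be checked from the two configurations above is whether the FLOCK_FROM pattern covers the host that is trying to flock. The following is a minimal sketch, not Condor's own matching code: it assumes the host wildcards behave like simple shell-style globs, and the helper name `allowed` is hypothetical. The host names in this post may also have been altered before posting, so a mismatch reported here could be an artifact of that.

```python
from fnmatch import fnmatch

# Values copied from the configuration files above.
flock_from = "*.hclgrid.com"
pool_hosts = ["cm.mygrid.com", "cm1.mygrid.com"]


def allowed(host, pattern_list):
    """Return True if host matches any comma-separated glob pattern.

    Assumption: shell-style globbing approximates Condor's host wildcards.
    """
    patterns = [p.strip() for p in pattern_list.split(",")]
    return any(fnmatch(host, p) for p in patterns if p)


# Check each central manager against the FLOCK_FROM setting.
results = {host: allowed(host, flock_from) for host in pool_hosts}
for host, ok in results.items():
    status = "matches" if ok else "does not match"
    print(f"{host}: {status} FLOCK_FROM = {flock_from}")
```

If the real FLOCK_FROM value and the real host names disagree in the same way, that would be worth correcting before looking further at the logs.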
======================================================================
CollectorLog at cm1.mygrid.com
======================================================================
6/21 20:03:42 WARNING: No master ad for < cm.mygrid.com >
6/21 20:03:42 ScheddAd : Inserting ** "< cm.mygrid.com , 10.100.207.10 >"
6/21 20:03:42 stats: Inserting new hashent for
'Schedd':'cm.mygrid.com':'10.100.207.10'
6/21 20:03:42 SubmittorAd : Inserting ** "< condor@xxxxxxxxxxxxx ,
10.100.207.10 >"
6/21 20:03:42 stats: Inserting new hashent for
'Submittor':'condor@xxxxxxxxxxxxx':'10.100.207.10'
6/21 20:04:09 Got QUERY_STARTD_ADS
6/21 20:04:09 (Sent 1 ads in response to query)
6/21 20:06:33 (Sent 5 ads in response to query)
6/21 20:06:33 Got QUERY_STARTD_PVT_ADS
6/21 20:06:33 (Sent 1 ads in response to query)
======================================================================
CollectorLog at cm.mygrid.com
======================================================================
6/21 20:02:21 DC_AUTHENTICATE: attempt to open invalid session
alpha:27485:1119363742:5, failing.
6/21 20:03:21 SubmittorAd : Inserting ** "< condor@xxxxxxxxxxxxx ,
10.100.207.10 >"
6/21 20:03:21 stats: Inserting new hashent for
'Submittor':'condor@xxxxxxxxxxxxx':'10.100.207.10'
6/21 20:03:21 (Sent 4 ads in response to query)
6/21 20:03:21 Got QUERY_STARTD_PVT_ADS
6/21 20:03:21 (Sent 1 ads in response to query)
6/21 20:03:41 (Sent 4 ads in response to query)
6/21 20:03:41 Got QUERY_STARTD_PVT_ADS
6/21 20:03:41 (Sent 1 ads in response to query)
6/21 20:03:49 DC_AUTHENTICATE: attempt to open invalid session
alpha:27485:1119363829:6, failing.
======================================================================
ScheddLog at cm.mygrid.com
======================================================================
6/21 20:03:20 DaemonCore: Command received via UDP from host
<10.100.207.10:32990>
6/21 20:03:20 DaemonCore: received command 421 (RESCHEDULE), calling handler
(reschedule_negotiator)
6/21 20:03:21 Sent ad to central manager for condor@xxxxxxxxxxxxx
6/21 20:03:21 Called reschedule_negotiator()
6/21 20:03:21 DaemonCore: Command received via TCP from host
<10.100.207.10:41493>
6/21 20:03:21 DaemonCore: received command 416 (NEGOTIATE), calling handler
(negotiate)
6/21 20:03:21 Negotiating for owner: condor@xxxxxxxxxxxxx
6/21 20:03:21 Checking consistency running and runnable jobs
6/21 20:03:21 Tables are consistent
6/21 20:03:21 Out of jobs - 1 jobs matched, 0 jobs idle, flock level = 0
6/21 20:03:21 DaemonCore: Command received via UDP from host
<10.100.207.10:32991>
6/21 20:03:21 DaemonCore: received command 421 (RESCHEDULE), calling handler
(reschedule_negotiator)
6/21 20:03:21 Called reschedule_negotiator()
6/21 20:03:23 Started shadow for job 558.0 on "<10.100.207.10:41475>",
(shadow pid = 27559)
6/21 20:03:25 Sent ad to central manager for condor@xxxxxxxxxxxxx
6/21 20:03:42 Activity on stashed negotiator socket
6/21 20:03:42 Negotiating for owner: condor@xxxxxxxxxxxxx
6/21 20:03:42 Checking consistency running and runnable jobs
6/21 20:03:42 Tables are consistent
6/21 20:03:42 Out of servers - 0 jobs matched, 1 jobs idle, 1 jobs rejected
6/21 20:03:42 Increasing flock level for condor@xxxxxxxxxxxxx to 1.
6/21 20:03:42 Sent ad to central manager for condor@xxxxxxxxxxxxx
6/21 20:04:34 Shadow pid 27559 for job 558.0 exited with status 100
6/21 20:04:34 Started shadow for job 559.0 on "<10.100.207.10:41475>",
(shadow pid = 27569)
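
To line up the interesting events across the logs above, a small script can pull out just the authentication failures and the flock-level changes. This is only a sketch: the regular expressions are written against the exact messages quoted in this post, the sample mixes lines from the CollectorLog and ScheddLog at cm.mygrid.com with the mail-wrapped lines rejoined, and the names `scan` and `EVENTS` are hypothetical.

```python
import re

# Lines copied from the CollectorLog and ScheddLog at cm.mygrid.com above,
# with the line wrapping introduced by the mail client undone.
SAMPLE = """\
6/21 20:02:21 DC_AUTHENTICATE: attempt to open invalid session alpha:27485:1119363742:5, failing.
6/21 20:03:21 Out of jobs - 1 jobs matched, 0 jobs idle, flock level = 0
6/21 20:03:42 Out of servers - 0 jobs matched, 1 jobs idle, 1 jobs rejected
6/21 20:03:42 Increasing flock level for condor@xxxxxxxxxxxxx to 1.
6/21 20:03:49 DC_AUTHENTICATE: attempt to open invalid session alpha:27485:1119363829:6, failing.
"""

# Timestamp prefix followed by the message text.
LINE = re.compile(r"^(\d+/\d+ \d+:\d+:\d+) (.*)$")

# Message patterns of interest, keyed by a short event name.
EVENTS = {
    "auth_failure": re.compile(
        r"DC_AUTHENTICATE: attempt to open invalid session (\S+), failing"
    ),
    "flock_level": re.compile(r"Increasing flock level for (\S+) to (\d+)"),
}


def scan(text):
    """Return (timestamp, event name, captured fields) for matching lines."""
    found = []
    for raw in text.splitlines():
        m = LINE.match(raw)
        if not m:
            continue  # skip lines without a timestamp prefix
        stamp, message = m.groups()
        for name, pattern in EVENTS.items():
            hit = pattern.search(message)
            if hit:
                found.append((stamp, name, hit.groups()))
    return found


events = scan(SAMPLE)
for stamp, name, fields in events:
    print(stamp, name, fields)
```

Running the same scan over the full log files from both central managers would show whether each DC_AUTHENTICATE failure falls close in time to a flock-level change.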
Could you please help me figure out what I'm missing?
Regards,
Sai