Yes, thanks, that was it.
Since I had copied the content of the user manual straight into
the local file, I had not checked that the definitions had the proper extensions:
## The location of executable files
HAD = $(SBIN)/condor_had
REPLICATION = $(SBIN)/condor_replication
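With the extension added, the two definitions would presumably look like the sketch below on a Windows central manager. This assumes the $(SBIN) macro from the manual is kept as-is; the exact macro and path separator depend on the local install.

```
## The location of executable files (Windows: note the .exe suffix)
HAD = $(SBIN)/condor_had.exe
REPLICATION = $(SBIN)/condor_replication.exe
```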
All the errors are gone now, though at some point the replication
process died and restarted on the 2nd central manager. Except for that,
it looks all right.
Thanks
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Douglas Clayton
Sent: Tuesday, September 01, 2009 1:28 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Errors with HAD setup
It may not be this, but I suspect your configuration files
for HAD/etc. are missing a ".exe" at the end of them.
Run "condor_config_val -verbose HAD" to see where HAD is defined, then
change that definition to HAD = $(BIN)\condor_had.exe in the file and line it reports.
That may not do it, but good luck.
===================================
Douglas Clayton
phone: 919.647.9648
Cycle Computing, LLC
Leader in Condor Grid Solutions
Enterprise Condor Support and Management Tools
http://www.cyclecomputing.com
On Aug 27, 2009, at 8:26 PM, Fabrice Bouye wrote:
Hi,
I am in the process of setting up two central managers using HAD and
replication under Condor 7.2.4, in order to test the procedure before we set up
our entire flock using HAD.
Both central managers run Windows XP SP2 32-bit, and the test clients are
a mix of Windows XP and Linux computers.
On both central managers, I've copied over and modified the configuration files
from http://www.cs.wisc.edu/condor/manual/v7.0/3_10High_Availability.html#SECTION004102400000000000000
But I get lots of errors related to HAD and replication in the log files.
For example, in the MasterLog file on the 1st central manager:
8/28 08:05:50 C:\condor/bin/condor_had: Cannot execute
8/28 08:05:50 restarting C:\condor/bin/condor_had in 3600 seconds
8/28 09:01:39 C:\condor/bin/condor_replication: Cannot execute
8/28 09:01:39 restarting C:\condor/bin/condor_replication in 3600 seconds
8/28 09:05:50 C:\condor/bin/condor_had: Cannot execute
8/28 09:05:50 restarting C:\condor/bin/condor_had in 3600 seconds
The second central manager MasterLog file exhibits similar errors.
Is that normal?
Except for that, everything seems to be working OK so far (client slots are listed by
condor_status and other condor commands seem to work OK).
----
Fabrice Bouyé (http://fabricebouye.cv.fm/)
Fisheries IT Specialist
Tel: +687 26 20 00 (Ext 411)
Oceanic Fisheries, Pacific Community
http://www.spc.int/