On Mon March 7 2005 4:18 pm, Prakash Velayutham wrote:
Ian Chesal wrote:
Hi,
I understand that failover is a feature added in the
condor-6.7.x versions, but I don't understand how to enable
it and configure the pool to work with this setup. Can
anyone help? As far as I know, there is nothing in the
documentation. Please correct me if I am wrong about that.
See:
http://www.cs.wisc.edu/condor/manual/v6.7.5/8_2Development_Release.html#SECTION00924000000000000000
The second bullet under "New Features" describes how to define multiple
collectors for failover.
- Ian
Hi Ian,
Thanks. What does the "High Availability" service under the "New Features"
section on the same page mean (8.2.6 Version 6.7.0)? It says:
Added a new "High Availability" service to the condor_master. You
can now specify a daemon which can have "fail over" capabilities (i.e.
the master on another machine can start a matching daemon if the first
one fails). Currently, this is only available over a shared file system
(i.e. NFS), and has only been tested for the condor_schedd.
I was looking to implement that. Is that the same as multiple collectors?
These are separate mechanisms, at least for now. :-( The feature that you
describe above is currently just for schedd fail-over. Separately, in recent
6.7 Condor releases, your pool can now have redundant collectors.
A feature that we very much hope will make it into the next 6.7 release of
Condor will provide a fail-over mechanism for negotiators. This is, again, a
different mechanism.
-Nick
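For reference, a minimal sketch of what the redundant-collector setup discussed above might look like in condor_config. It assumes that COLLECTOR_HOST accepts a comma-separated list of collectors, which should be checked against the 6.7.5 release notes linked earlier; the hostnames are hypothetical placeholders.

```
# Sketch of a condor_config fragment for redundant collectors.
# ASSUMPTION: COLLECTOR_HOST accepts a comma-separated list of collectors.
# The hostnames below are hypothetical placeholders, not real machines.
COLLECTOR_HOST = cm1.example.org, cm2.example.org
```

Presumably every machine in the pool needs the same value, not only the host running the second collector, since each daemon reads COLLECTOR_HOST to decide where to send its updates.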
So the only place I need to change is still the
$CONDOR_HOME/etc/condor_config file, right? Here I added the IP of the
second collector to the COLLECTOR_HOST variable. Would it be enough to
just restart Condor on the second server after doing this? I get some
errors of this kind when I do this...