
[Condor-users] Capabilities of schedd HA



I've been playing with schedd HA. I haven't quite gotten the configuration right, but before I put any more time into it, I want to make sure that it can do what I'm hoping it can.
Those who have read/replied to my earlier posts will recall that my 
Condor setup must have no single point of failure. I'm currently working 
on the schedd. It now runs on the CMs (central managers) and only needs to 
accept submissions from the active CM. I tested CM failover while a job was 
executing: the negotiator did fail over, but the job could not complete 
until failback. Is there any way around this (e.g. a shared file system between 
CMs)? Or am I misunderstanding these mechanisms? I would really like to 
be able to implement CM and schedd failover that is transparent to job 
completion.
Thanks,
Janzen