Those who have read or replied to my earlier posts will recall that my Condor setup must have no single point of failure. I'm currently working on the schedd. The schedd now runs on the CMs (central managers) and only needs to accept submissions from the active CM. I tested CM failover while a job was executing: the negotiator did fail over, but the job could not complete until failback. Is there any way around this (e.g. a shared file system between the CMs)? Or am I misunderstanding these mechanisms? I would really like to implement CM and schedd failover that is transparent to job completion.
Thanks, Janzen