Date: | Wed, 2 Feb 2005 13:05:46 -0500 |
---|---|
From: | "Ian Chesal" <ICHESAL@xxxxxxxxxx> |
Subject: | [Condor-users] Should the schedd/startd's tolerate schedd machine reboots? |
With an appropriately long ALIVE_INTERVAL (the default of 300 seconds seems fine) and MAX_CLAIM_ALIVES_MISSED (the default of 6 seems fine), I expected startds to tolerate a reasonably fast reboot of a schedd machine and continue to run jobs. I expected the startd to tolerate losing contact with the schedd for up to 30 minutes (6 missed keepalives at 300 seconds each) before terminating running jobs. I'm not observing this behaviour, though: I'm seeing startds vacate running jobs as soon as the schedd machine goes down. This is on WinXP to WinXP machines with 6.7.3. Is it perhaps due to a shutdown routine in the schedd? As the service is brought down, does it reach out to the startds to tell them to terminate running jobs? Can I prevent this so reboots are tolerated? Reboots are a necessary evil in our Windows development environment, unfortunately. - Ian |
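
For reference, the 30-minute figure in the message is just the product of the two settings quoted there. The short Python sketch below reproduces that arithmetic. The helper name `expected_tolerance_seconds` is hypothetical, and the assumption that the startd's timeout is exactly the interval multiplied by the missed-keepalive count is the poster's expectation, not confirmed Condor behaviour.

```python
# Values as quoted in the message (described there as the defaults).
ALIVE_INTERVAL = 300          # seconds between schedd keepalives
MAX_CLAIM_ALIVES_MISSED = 6   # keepalives the startd may miss


def expected_tolerance_seconds(alive_interval, max_missed):
    """Hypothetical helper: outage length the poster expected to be tolerated.

    Assumes the tolerance is simply interval * missed count; the real
    startd logic may differ.
    """
    return alive_interval * max_missed


window = expected_tolerance_seconds(ALIVE_INTERVAL, MAX_CLAIM_ALIVES_MISSED)
print(f"{window} s = {window / 60:.0f} min")  # prints "1800 s = 30 min"
```

Under that assumption, a reboot that finishes well inside 1800 seconds would be expected to leave running jobs untouched, which is why the immediate vacate described above is surprising.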