Re: [condor-users] condor_shadow timeout when loosing contact with startd
- Date: Mon, 26 Jan 2004 13:43:56 -0600
- From: Zachary Miller <zmiller@xxxxxxxxxxx>
- Subject: Re: [condor-users] condor_shadow timeout when loosing contact with startd
On Mon, Jan 26, 2004 at 01:26:45PM -0600, Geoff Lovett wrote:
> Hello, I've noticed that when condor_shadow loses contact with
> condor_startd on an execute machine, it typically takes roughly 2 hours
> for the shadow to notice that the startd is gone and cause an exception,
> thereby putting the job back into the queue. My question is, can this
> timeout be configured?
i think you mean the condor_starter and not the condor_startd. the
starter is the daemon which launches and manages the job on the execute
machine.
by default it sends an update every 20 minutes, and the shadow should
except after 3 missed updates, i.e. one hour. i'm not sure why it is
taking 2 hours for you... maybe i'm wrong about something.
anyhow, this is configurable via the condor_config:
SHADOW_UPDATE_INTERVAL = 300
# 300 seconds == 5 minutes
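
as a rough sanity check on the arithmetic, here is a small sketch of how the
timeout would scale with the update interval. it assumes the rule described
above (the shadow excepts after 3 consecutive missed updates) actually holds;
the function name and the missed_updates parameter are made up for
illustration and are not part of condor.

```python
def shadow_timeout_seconds(update_interval_s, missed_updates=3):
    """Estimate how long until the shadow gives up on a silent starter.

    Assumption (from the message, not verified against the condor source):
    the shadow excepts after `missed_updates` consecutive missed updates,
    each expected every `update_interval_s` seconds.
    """
    if update_interval_s <= 0 or missed_updates <= 0:
        raise ValueError("interval and missed_updates must be positive")
    return update_interval_s * missed_updates


# the default described above: one update every 20 minutes
default_timeout = shadow_timeout_seconds(20 * 60)

# the example setting above: one update every 300 seconds
tuned_timeout = shadow_timeout_seconds(300)

print(f"default: {default_timeout // 60} minutes")
print(f"tuned:   {tuned_timeout // 60} minutes")
```

under that assumption, the 300-second setting would bring the expected
timeout down from about an hour to about 15 minutes. it does not explain the
2 hours you are seeing.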
cheers,
-zach