Re: [Condor-users] VMs being cleaned up/removed
- Date: Tue, 26 Apr 2005 09:22:25 -0500
- From: Alain Roy <roy@xxxxxxxxxxx>
- Subject: Re: [Condor-users] VMs being cleaned up/removed
> I am running Condor version 6.7.2 on Scientific Linux 3.0.3
> with 11 dual-cpu worker nodes with 4 VMs each. There are three schedulers
> and the CM is using kerberos authentication.
> I notice that fairly often, VMs will be "cleaned up" during housecleaning
Try a newer version of Condor.
Condor 6.7.3 has:
This release contains all the bug fixes from the 6.6 stable series up to
and including version 6.6.7, and some of the fixes that will be included
in version 6.6.8. The bug fixes in version 6.6.8 that were not included in
version 6.7.3 are listed in a separate section of the 6.6.8 version history.
Condor 6.6.8 has:
Fixed issues that would cause condor_startd to "disappear" from the
pool because of dropped machine ad updates. This fix applies to all
platforms, but the symptoms were exhibited predominantly on Windows machines.
And this is one of the bug fixes included in 6.7.3.
So there is a decent shot that this problem will be fixed by upgrading to
Condor 6.7.6, which is the most recent Condor release in the 6.7.x series.
The condor_startd advertises each virtual machine by sending a UDP update
to the collector. In some busy networks, these updates can be lost. If
upgrading doesn't work for you, you can tell Condor to use TCP instead. We
don't use this as a default in order to avoid having hundreds of
simultaneous open TCP connections on large pools, but it's certainly
reasonable for your small pool. You can learn how to configure this in the
manual:
http://www.cs.wisc.edu/condor/manual/v6.7/3_11Setting_Up.html#sec:tcp-collector-update
Basically, you set "UPDATE_COLLECTOR_WITH_TCP = TRUE" in your config file.
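
As a rough sketch, the change might look like the fragment below. Only the UPDATE_COLLECTOR_WITH_TCP setting comes from this message; the file name in the comment is an assumption, so use whichever config file your site normally edits, and check the manual section linked above for whether the collector side needs any matching setting in your version.

```
# Assumed location: the local config file on each worker node
# (for example condor_config.local; adjust for your installation).

# Send the startd's machine ad updates to the collector over TCP
# instead of UDP, so updates are not silently dropped on a busy network.
UPDATE_COLLECTOR_WITH_TCP = TRUE
```

After editing the file, reconfigure or restart Condor on the affected machines so the daemons pick up the new value, then watch whether the VMs still drop out of condor_status.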
I hope this helps. If it doesn't, please do let us know. It's not a feature
that machines disappear from your pool!
-alain