Re: [Condor-users] Strange scheduling behavior in 6.8.0
- Date: Wed, 16 Aug 2006 18:16:05 -0500
- From: Erik Paulson <epaulson@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Strange scheduling behavior in 6.8.0
On Wed, Aug 16, 2006 at 10:40:45AM -0700, Michael S. Root wrote:
>
> If I run "condor_reschedule -all", it will
> send the "Reschedule" command to only those 10 or so machines that are
> actually running jobs.
>
condor_reschedule doesn't talk to execute machines - it only talks to the
schedd. Furthermore, condor_reschedule -all is not useful.

condor_reschedule sends a command to the schedd that says "please start
a matchmaking cycle in the pool." The schedd in turn contacts the central
manager and says "please start a matchmaking cycle in the pool." So
'condor_reschedule -all' means "send a message to all the schedds in the pool
to ask them all to contact the central manager and ask for another matchmaking
cycle in the pool." Therefore, if you have N schedds and you use
'condor_reschedule -all', your central manager gets N-1 more requests than it
needs to start a new matchmaking cycle in the pool.
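
To make the N-1 count concrete, here is a rough toy model of the message
flow described above. It is only a sketch: the class and function names are
made up for illustration and are not part of Condor, and it assumes that each
schedd forwards exactly one request per reschedule command and that the pool
has at least one schedd.

```python
# Toy model only: these classes are hypothetical and are not Condor APIs.

class CentralManager:
    """Counts how many 'start a matchmaking cycle' requests arrive."""

    def __init__(self):
        self.requests = 0

    def request_cycle(self):
        self.requests += 1


class Schedd:
    """Forwards a reschedule command to the central manager."""

    def __init__(self, manager):
        self.manager = manager

    def reschedule(self):
        # Assumption: one reschedule command produces one request.
        self.manager.request_cycle()


def redundant_requests(num_schedds, use_all):
    """Return how many requests beyond the one needed reach the manager.

    Assumes num_schedds >= 1.
    """
    manager = CentralManager()
    schedds = [Schedd(manager) for _ in range(num_schedds)]
    # With -all every schedd is contacted; without it, only one is.
    targets = schedds if use_all else schedds[:1]
    for schedd in targets:
        schedd.reschedule()
    return manager.requests - 1


if __name__ == "__main__":
    print(redundant_requests(5, use_all=True))
    print(redundant_requests(5, use_all=False))
```

In this model a pool of 5 schedds yields 4 surplus requests with -all and
none without it, which is the N-1 figure above.
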
How long are you waiting while jobs are "stuck"? If, after 5 minutes, you
run a single 'condor_reschedule' (without the -all), do they get unstuck?
-Erik