[Condor-users] 7.4.2 / 7.4.4: condor_q trouble when pool PCs suddenly are powered off !?!
- Date: Mon, 22 Nov 2010 05:19:00 -0800 (PST)
- From: Rob <spamrefuse@xxxxxxxxx>
- Subject: [Condor-users] 7.4.2 / 7.4.4: condor_q trouble when pool PCs suddenly are powered off !?!
Hi,
I have a Linux (Fedora 12) condor master running condor version 7.4.2.
The Windows XP pool PCs are all running condor version 7.4.4.
The condor master has trouble producing condor_q output around the time
the pool PCs are switched off:
==============
$ condor_q
11/22 22:04:05 condor_read(): timeout reading 5 bytes from schedd at <115.105.120.71:60614>.
11/22 22:04:05 IO: Failed to read packet header
11/22 22:04:05 SECMAN: reconnected to schedd at <115.105.120.71:60614> from port 52251 to send unauthenticated command 1111 QMGMT_CMD
11/22 22:04:26 condor_read(): timeout reading 5 bytes from schedd at <115.105.120.71:60614>.
11/22 22:04:26 IO: Failed to read packet header
11/22 22:04:46 condor_read(): timeout reading 5 bytes from schedd at <115.105.120.71:60614>.
11/22 22:04:46 IO: Failed to read packet header
-- Failed to fetch ads from: <115.105.120.71:60614> : condor.dns.org
==============
The pool consists of over 300 public library Windows XP PCs, which are all
centrally powered off at the same time in the evening. For a while the condor
master hangs on to the status the PCs had before the poweroff (understandably,
as it has no clue what has happened to the "disappeared" PCs). After a while the
condor master abandons whatever was running on the PCs. During this
transition period, the condor_q command seems to have trouble producing useful
output.
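To see how long the transition period lasts, one option is to scan captured
condor_q output for the timeout lines. The following is only a minimal sketch:
the regular expression is an assumption based on the message format quoted
above (other condor versions may word it differently), and the function name
find_schedd_timeouts is hypothetical, not part of condor.

```python
import re

# Pattern assumed from the log lines quoted in this message; not an official format.
TIMEOUT_RE = re.compile(
    r"condor_read\(\): timeout reading (\d+) bytes from (\w+) at\s*<([\d.]+):(\d+)>"
)


def find_schedd_timeouts(output):
    """Return a (daemon, host, port) tuple for every read timeout in the text."""
    # \s* also matches a newline, so lines wrapped before the address still match.
    return [
        (m.group(2), m.group(3), int(m.group(4)))
        for m in TIMEOUT_RE.finditer(output)
    ]


# Sample text copied from the output shown above.
sample = """11/22 22:04:05 condor_read(): timeout reading 5 bytes from schedd at
<115.105.120.71:60614>.
11/22 22:04:05 IO: Failed to read packet header"""

print(find_schedd_timeouts(sample))
```

An empty list would then suggest that condor_q got through without a read
timeout, at least as far as this one message type is concerned.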
Is this a "feature" or a bug?
Thanks,
Rob.