[HTCondor-users] Windows XP computer matched but idle
- Date: Mon, 19 Nov 2012 12:52:29 -0600
- From: brad.32@xxxxxxxxxxx
- Subject: [HTCondor-users] Windows XP computer matched but idle
Last week I added a dual-CPU Windows XP computer to the HTCondor pool,
but I have not successfully run jobs on it. The output of condor_status
shows its state as Matched, but its activity is Idle while the other CPUs
are Busy. The output of condor_q lists a number of CPUs that "reject your
job because of their own requirements"; the two CPUs of this WinXP
computer account for two of them, and the others are the controller's,
which are not available. The controller's SchedLog shows entries like:
11/19/12 12:22:51 (pid:2128) condor_read() failed: recv(fd=484)
returned -1, errno = 10054 , reading 5 bytes from startd
slot1@xxxxxxxxxx <remote ip:1074> for me.
11/19/12 12:22:51 (pid:2128) IO: Failed to read packet header
11/19/12 12:22:51 (pid:2128) Response problem from startd when
requesting claim slot1@xxxxxxxxxx <remote ip:1074> for me 706.420.
11/19/12 12:22:51 (pid:2128) Failed to send REQUEST_CLAIM to startd
slot1@xxxxxxxxxx <remote ip:1074> for me: CEDAR:6004:failed reading
from socket
11/19/12 12:22:51 (pid:2128) Match record (slot1@xxxxxxxxxx <remote
ip:1074> for me, 706.420) deleted
I didn't see any references to that error number, 10054, in the list's
archives. What does the error mean, and what is going on here? The
computer is apparently communicating its presence to the pool's
controller, or it wouldn't be listed in the first place.
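
For reference, errno 10054 on Windows is the Winsock code WSAECONNRESET ("connection reset by peer"), meaning the remote end closed the connection abruptly; it does not by itself say why. A minimal sketch for pulling errno values out of SchedLog-style text and labelling them follows. The lookup table is a small hand-picked subset of Winsock codes, and the function and variable names are hypothetical, not part of HTCondor.

```python
import re

# Hand-picked subset of Winsock error codes (not exhaustive; an assumption
# about which codes are worth labelling when reading daemon logs).
WINSOCK_ERRORS = {
    10053: "WSAECONNABORTED: software caused connection abort",
    10054: "WSAECONNRESET: connection reset by peer",
    10060: "WSAETIMEDOUT: connection timed out",
    10061: "WSAECONNREFUSED: connection refused",
}

# Matches fragments such as "errno = 10054" in a log line.
ERRNO_PATTERN = re.compile(r"errno\s*=\s*(\d+)")


def explain_errno(log_text):
    """Return a list of (code, description) pairs for each errno in log_text."""
    results = []
    for match in ERRNO_PATTERN.finditer(log_text):
        code = int(match.group(1))
        description = WINSOCK_ERRORS.get(code, "unknown (not in this table)")
        results.append((code, description))
    return results


# Sample line copied from the SchedLog excerpt above, joined onto one line.
sample = (
    "11/19/12 12:22:51 (pid:2128) condor_read() failed: recv(fd=484) "
    "returned -1, errno = 10054 , reading 5 bytes from startd"
)

for code, description in explain_errno(sample):
    print(code, description)
```

A reset at this point only shows that the startd side dropped the connection while the schedd was requesting the claim; the log on the WinXP machine would be the place to look for the reason.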
These computers have the Windows firewall disabled but are running MS
Endpoint Protection; I haven't had any problems with the other four
WinXP computers in the pool. I also have a, hopefully separate, problem
getting a Win7 computer to be recognized by HTCondor (well, it
appeared once following a condor restart).