
[HTCondor-users] Windows XP computer matched but idle



Last week I added a dual-CPU Windows XP computer to the HTCondor pool, but I have not yet managed to run jobs on it. The output of condor_status shows its state as Matched, but its activity is Idle while the other CPUs are Busy. The output from condor_q lists a number of CPUs that "reject your job because of their own requirements"; the CPUs of this WinXP computer account for two of them, and the others are the controller's and not available. The controller's SchedLog shows entries like:
11/19/12 12:22:51 (pid:2128) condor_read() failed: recv(fd=484) returned -1, errno = 10054 , reading 5 bytes from startd slot1@xxxxxxxxxx <remote ip:1074> for me.
11/19/12 12:22:51 (pid:2128) IO: Failed to read packet header
11/19/12 12:22:51 (pid:2128) Response problem from startd when requesting claim slot1@xxxxxxxxxx <remote ip:1074> for me 706.420.
11/19/12 12:22:51 (pid:2128) Failed to send REQUEST_CLAIM to startd slot1@xxxxxxxxxx <remote ip:1074> for me: CEDAR:6004:failed reading from socket
11/19/12 12:22:51 (pid:2128) Match record (slot1@xxxxxxxxxx <remote ip:1074> for me, 706.420) deleted
I didn't see any references to that error number, 10054, in the list's  
archives. What does the error mean or what is going on here? The  
computer is apparently communicating its presence to the pool's  
controller or it wouldn't be listed in the first place.
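For reference, errno 10054 on Windows is the Winsock code WSAECONNRESET (connection reset by the peer). Below is a minimal Python sketch for pulling such codes out of SchedLog lines and naming them. The function name find_socket_errors and the hand-written lookup table are hypothetical, not part of HTCondor, and the table covers only a few common codes.

```python
import re

# Hand-written subset of Winsock error codes; not an HTCondor API.
WINSOCK_ERRORS = {
    10053: "WSAECONNABORTED",
    10054: "WSAECONNRESET",
    10060: "WSAETIMEDOUT",
    10061: "WSAECONNREFUSED",
}

# Matches the "errno = 10054" fragment as it appears in the SchedLog excerpt.
ERRNO_RE = re.compile(r"errno\s*=\s*(\d+)")


def find_socket_errors(log_text):
    """Return (code, name) pairs for each 'errno = N' found in log_text."""
    results = []
    for match in ERRNO_RE.finditer(log_text):
        code = int(match.group(1))
        results.append((code, WINSOCK_ERRORS.get(code, "unknown")))
    return results


sample = (
    "11/19/12 12:22:51 (pid:2128) condor_read() failed: recv(fd=484) "
    "returned -1, errno = 10054 , reading 5 bytes from startd"
)
print(find_socket_errors(sample))  # [(10054, 'WSAECONNRESET')]
```

A reset at this point suggests the remote side (or something between the two machines) closed the connection while the schedd was waiting for the claim response, though the log excerpt alone does not show which.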
These computers have the Windows firewall disabled but are running MS
Endpoint Protection; even so, I haven't had any problems with the other
four WinXP computers in the pool. I also have a (hopefully separate)
problem getting a Win7 computer to be recognized by HTCondor (it
appeared once, following a condor restart).