[Condor-users] Idle jobs on Windows 2003 server
- Date: Sun, 09 Jul 2006 13:57:37 +0200
- From: Guy Tel-Zur <tel-zur@xxxxxxxxxxxx>
- Subject: [Condor-users] Idle jobs on Windows 2003 server
Jobs that I submit from a Microsoft Windows 2003 Server running Condor 6.7.16
(configured as a submit-only node) stay idle and never start running.
Below is the tail of the SchedLog and MasterLog output.
Any comments would be appreciated!
-Guy
SchedLog:
=======
7/9 13:42:31 (pid:476) Sent RELEASE_CLAIM to startd on <132.72.69.87:1049>
7/9 13:42:31 (pid:476) Match record (<132.72.69.87:1049>, 749, 2) deleted
7/9 13:42:31 (pid:476) condor_read(): recv() returned -1, errno = 10054, assuming failure.
7/9 13:42:31 (pid:476) IO: Failed to read packet header
7/9 13:42:31 (pid:476) Response problem from startd on <132.72.69.83:1048> (match <132.72.69.83:1048>#2215658461).
7/9 13:42:31 (pid:476) Sent RELEASE_CLAIM to startd on <132.72.69.83:1048>
7/9 13:42:31 (pid:476) Match record (<132.72.69.83:1048>, 749, 1) deleted
7/9 13:42:31 (pid:476) condor_read(): recv() returned -1, errno = 10054, assuming failure.
7/9 13:42:32 (pid:476) IO: Failed to read packet header
7/9 13:42:32 (pid:476) Response problem from startd on <132.72.69.88:1040> (match <132.72.69.88:1040>#1082933760).
7/9 13:42:32 (pid:476) Sent RELEASE_CLAIM to startd on <132.72.69.88:1040>
7/9 13:42:32 (pid:476) Match record (<132.72.69.88:1040>, 749, 4) deleted
7/9 13:42:34 (pid:476) Sent ad to central manager for gtelzur@xxxxxxxxx
7/9 13:42:34 (pid:476) Sent ad to 1 collectors for gtelzur@xxxxxxxxx
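For anyone triaging a similar log, the SchedLog excerpt above can be summarized with a short script. This is only a sketch and not part of Condor: the function name `summarize`, the regular expressions, and the `WINSOCK_NAMES` table are assumptions based solely on the line formats shown in this excerpt, with the mail-wrapped lines rejoined. It counts the socket error codes and the startd addresses that report a response problem. Error 10054 is the Winsock code WSAECONNRESET, meaning the remote side reset the connection.

```python
import re
from collections import Counter

# Lines copied from the SchedLog excerpt in this post, with wrapped lines rejoined.
SAMPLE = """\
7/9 13:42:31 (pid:476) condor_read(): recv() returned -1, errno = 10054, assuming failure.
7/9 13:42:31 (pid:476) IO: Failed to read packet header
7/9 13:42:31 (pid:476) Response problem from startd on <132.72.69.83:1048> (match <132.72.69.83:1048>#2215658461).
7/9 13:42:31 (pid:476) Sent RELEASE_CLAIM to startd on <132.72.69.83:1048>
7/9 13:42:31 (pid:476) condor_read(): recv() returned -1, errno = 10054, assuming failure.
7/9 13:42:32 (pid:476) IO: Failed to read packet header
7/9 13:42:32 (pid:476) Response problem from startd on <132.72.69.88:1040> (match <132.72.69.88:1040>#1082933760).
7/9 13:42:32 (pid:476) Sent RELEASE_CLAIM to startd on <132.72.69.88:1040>
"""

# Hypothetical patterns, inferred only from the lines above.
ERRNO_RE = re.compile(r"errno = (\d+)")
PROBLEM_RE = re.compile(r"Response problem from startd on <([\d.]+):(\d+)>")

# Only the code that appears in this log is listed.
WINSOCK_NAMES = {10054: "WSAECONNRESET (connection reset by peer)"}


def summarize(log_text):
    """Count socket error codes and startd hosts with response problems."""
    errnos = Counter()
    startds = Counter()
    for line in log_text.splitlines():
        match = ERRNO_RE.search(line)
        if match:
            errnos[int(match.group(1))] += 1
        match = PROBLEM_RE.search(line)
        if match:
            startds[match.group(1)] += 1
    return errnos, startds


if __name__ == "__main__":
    errnos, startds = summarize(SAMPLE)
    for code, count in errnos.items():
        print(code, WINSOCK_NAMES.get(code, "unknown"), count)
    for host, count in startds.items():
        print(host, count)
```

If every failing match shows the same error code across several startd hosts, as it does here, that suggests the problem lies on the path between the submit node and the execute nodes rather than on one particular machine.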
MasterLog:
========
7/4 13:36:44 ** Condor (CONDOR_MASTER) STARTING UP
7/4 13:36:44 ** C:\condor\bin\condor_master.exe
7/4 13:36:44 ** $CondorVersion: 6.7.16 Feb 2 2006 $
7/4 13:36:44 ** $CondorPlatform: INTEL-WINNT50 $
7/4 13:36:44 ** PID = 436
7/4 13:36:44 ******************************************************
7/4 13:36:44 Using config file: C:\condor\condor_config
7/4 13:36:44 Using local config files: C:\condor/condor_config.local
7/4 13:36:44 DaemonCore: Command Socket at <132.72.55.51:1039>
7/4 13:36:45 Started DaemonCore process "C:\condor/bin/condor_schedd.exe", pid and pgroup = 476
7/4 14:36:45 Preen pid is 3924
7/4 14:36:45 DaemonCore: Command received via UDP from host <132.72.55.51:1513>
7/4 14:36:45 DaemonCore: received command 60011 (DC_NOP), calling handler (handle_nop())
7/4 14:36:45 Child 3924 died, but not a daemon -- Ignored
7/5 14:36:45 Preen pid is 27532
7/5 14:36:46 DaemonCore: Command received via UDP from host <132.72.55.51:1112>
7/5 14:36:46 DaemonCore: received command 60011 (DC_NOP), calling handler (handle_nop())
7/5 14:36:46 Child 27532 died, but not a daemon -- Ignored
7/6 14:36:45 Preen pid is 53016
7/6 14:36:45 DaemonCore: Command received via UDP from host <132.72.55.51:3871>
7/6 14:36:45 DaemonCore: received command 60011 (DC_NOP), calling handler (handle_nop())
7/6 14:36:45 Child 53016 died, but not a daemon -- Ignored
7/7 14:36:45 Preen pid is 49396
7/7 14:36:46 DaemonCore: Command received via UDP from host <132.72.55.51:4737>
7/7 14:36:46 DaemonCore: received command 60011 (DC_NOP), calling handler (handle_nop())
7/7 14:36:46 Child 49396 died, but not a daemon -- Ignored
7/8 14:36:45 Preen pid is 40632
7/8 14:36:45 DaemonCore: Command received via UDP from host <132.72.55.51:2785>
7/8 14:36:45 DaemonCore: received command 60011 (DC_NOP), calling handler (handle_nop())
7/8 14:36:45 Child 40632 died, but not a daemon -- Ignored