[Condor-users] Condor on XP stops talking to pool manager
- Date: Thu, 1 Sep 2005 11:21:01 +1200
- From: Andrew Mellanby <A.Mellanby@xxxxxxxxxxxxx>
- Subject: [Condor-users] Condor on XP stops talking to pool manager
Hi,
I've installed Condor 6.6.10 on 28 Windows XP machines.
After the first day of running, 11 of them have stopped showing up in the
condor_status list.
The MasterLog on some of the affected machines ends with this:
> 8/30 18:17:33 Child 2424 died, but not a daemon -- Ignored
When I check the processes running on an affected system, the Condor daemons
do appear to be active, but I can't contact them from the pool manager:
> condor_restart -direct 130.195.109.42
> 9/1 10:04:19 TOOL_TIMEOUT_MULTIPLIER is undefined, using default value of 0
> Can't find address for master 130.195.109.42
Is there anything else I can do to get these machines back online without
actually rebooting them?
Thanks,
Mel.
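To pin down exactly which machines have dropped out of the pool, the list of installed hosts can be compared with the names the collector currently reports. The following is only a minimal sketch in Python: it assumes the machine names from condor_status have already been saved as text, one name per line, and the host names used in the example are hypothetical placeholders, not the real pool.

```python
def missing_machines(expected_hosts, reported_text):
    """Return the expected hosts that do not appear in the reported names.

    reported_text is assumed to hold one machine name per line. A slot
    prefix such as "vm1@" is stripped, and matching is case-insensitive.
    """
    reported = set()
    for line in reported_text.splitlines():
        name = line.strip()
        if not name:
            continue
        # Keep only the host part of names like "vm1@pc03".
        reported.add(name.split("@")[-1].lower())
    return sorted(h for h in expected_hosts if h.lower() not in reported)


# Hypothetical example: three installed hosts, two still reporting.
example = missing_machines(["pc01", "pc02", "pc03"], "pc01\nvm1@PC03\n")
```

The result is a sorted list of hosts to investigate, which avoids checking all 28 machines by hand.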
8/30 17:17:32 ******************************************************
8/30 17:17:32 ** Condor (CONDOR_MASTER) STARTING UP
8/30 17:17:32 ** C:\Condor\bin\condor_master.exe
8/30 17:17:32 ** $CondorVersion: 6.6.10 Jun 22 2005 $
8/30 17:17:32 ** $CondorPlatform: INTEL-WINNT50 $
8/30 17:17:32 ** PID = 2196
8/30 17:17:32 ******************************************************
8/30 17:17:32 Using config file: C:\Condor\condor_config
8/30 17:17:32 Using local config files: C:\Condor/condor_config.local
8/30 17:17:32 DaemonCore: Command Socket at <130.195.109.42:3022>
8/30 17:17:32 Started DaemonCore process "C:\Condor/bin/condor_startd.exe", pid and pgroup = 916
8/30 18:17:32 Preen pid is 2424
8/30 18:17:33 DaemonCore: Command received via UDP from host <130.195.109.42:3182>
8/30 18:17:33 DaemonCore: received command 60001 (DC_PROCESSEXIT), calling handler (HandleProcessExitCommand())
8/30 18:17:33 Child 2424 died, but not a daemon -- Ignored
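
The log above records the master's command socket address, so one possible check is to pull that address out of the MasterLog and test whether the port still accepts connections from the pool manager. The Python sketch below is an illustration only: it assumes the command port accepts TCP connections, and the function names are made up for this example. A successful connect shows only that the port is reachable, not that the master is answering commands.

```python
import re
import socket

# Matches lines such as:
#   DaemonCore: Command Socket at <130.195.109.42:3022>
SOCKET_RE = re.compile(r"Command Socket at <(\d{1,3}(?:\.\d{1,3}){3}):(\d+)>")


def command_socket(log_text):
    """Return (host, port) from the last 'Command Socket' line, or None."""
    matches = SOCKET_RE.findall(log_text)
    if not matches:
        return None
    host, port = matches[-1]
    return host, int(port)


def is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from the pool manager, `is_reachable(*command_socket(log_text))` returning False would point to a network or firewall problem between the two machines rather than a dead daemon.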