[Condor-users] Upgrade problem
- Date: Mon, 12 Jun 2006 17:13:35 -0400
- From: Cliff Padgett <cwpadget@xxxxxxxx>
- Subject: [Condor-users] Upgrade problem
Hi, I just upgraded a Windows cluster from Condor 6.6.11 to 6.7.19
(trying to use the new groups abilities). All the jobs on the cluster
use a dedicated scheduler and are MPI. Any idea why jobs will not run
now? They just constantly cycle from running to idle. The shadow log
has the following errors, but I'm not sure what they mean:
6/12 17:21:29 Using config file: C:\Condor\condor_config
6/12 17:21:29 Using local config files: C:\Condor/condor_config.local
6/12 17:21:29 DaemonCore: Command Socket at <10.0.0.1:4373>
6/12 17:21:29 Initializing a MPI shadow for job 8666.0
6/12 17:21:29 (8663.0) (4896): condor_read(): recv() returned -1, errno = 10054, assuming failure.
6/12 17:21:29 (8663.0) (4896): IO: Failed to read packet header
6/12 17:21:29 (8663.0) (4896): ERROR "Can no longer talk to condor_starter <10.0.0.10:1040>" at line 93 in file ..\src\condor_shadow.V6.1\NTreceivers.C
6/12 17:21:29 (8662.0) (4532): condor_read(): timeout reading buffer.
6/12 17:21:29 (8662.0) (4532): IO: Failed to read packet header
6/12 17:21:29 (8665.0) (1624): condor_read(): recv() returned -1, errno = 10054, assuming failure.
6/12 17:21:29 (8665.0) (1624): IO: Failed to read packet header
6/12 17:21:29 (8665.0) (1624): ERROR "Can no longer talk to condor_starter <10.0.0.2:1040>" at line 93 in file ..\src\condor_shadow.V6.1\NTreceivers.C
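
Since several shadows write to the same log, it may help to group the messages by job ID to see which job hit which error. Below is a rough Python sketch, not anything shipped with Condor. It assumes each log entry sits on one line in the form "date time (job) (number): message", as in the excerpt above; the function name group_by_job and the SAMPLE text are made up for illustration, and the meaning of the second parenthesized number is not assumed.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the excerpt above:
#   6/12 17:21:29 (8662.0) (4532): condor_read(): timeout reading buffer.
LINE_RE = re.compile(r"^(\d+/\d+ \d+:\d+:\d+) \((\d+\.\d+)\) \((\d+)\): (.*)$")


def group_by_job(log_text):
    """Return {job_id: [message, ...]} for lines that carry a job ID.

    Lines without a "(cluster.proc)" field, such as the startup
    lines, are skipped.
    """
    jobs = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            _timestamp, job_id, _number, message = match.groups()
            jobs[job_id].append(message)
    return dict(jobs)


# Hypothetical sample built from lines in the excerpt above.
SAMPLE = """\
6/12 17:21:29 DaemonCore: Command Socket at <10.0.0.1:4373>
6/12 17:21:29 (8663.0) (4896): IO: Failed to read packet header
6/12 17:21:29 (8662.0) (4532): condor_read(): timeout reading buffer.
6/12 17:21:29 (8662.0) (4532): IO: Failed to read packet header
"""

if __name__ == "__main__":
    for job_id, messages in sorted(group_by_job(SAMPLE).items()):
        print(job_id)
        for message in messages:
            print("   ", message)
```

Entries that a mail client has wrapped across several lines would need to be rejoined before being passed to this function.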