[Condor-users] Other machines don't accept jobs!
- Date: Thu, 12 Oct 2006 22:39:22 +0800 (PHT)
- From: leo@xxxxxxxxxxxxxxxxxxxxx
- Subject: [Condor-users] Other machines don't accept jobs!
Hi all,
I have 3 machines in my pool: node1 (central manager), node2 and node3
(all can execute and submit jobs).
Here's what happened:
Jobs submitted from node3 can be executed on all 3 machines.
Jobs submitted from node1 can NOT be executed on node3, but run okay on node1
& node2.
Jobs submitted from node2 can NOT be executed on node3, but also run okay on
node1 and node2.
Log file has the following:
###########################################
022 (067.000.000) 10/12 21:58:15 Job disconnected, attempting to reconnect
Socket between submit and execute hosts closed unexpectedly
Trying to reconnect to node3
<10.0.40.112:32772>
...
024 (067.000.000) 10/12 21:58:15 Job reconnection failed
Job not found at execution machine
Can not reconnect to node3, rescheduling job
###########################################
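
For anyone who wants to pull the disconnect/reconnect events out of a longer job log, here is a minimal Python sketch. It assumes each event starts with a header line shaped like the ones above (3-digit event code, job ID in parentheses, date, time, message); the helper name find_events and the SAMPLE text are only illustrative, not part of Condor.

```python
import re

# Assumed header shape, taken from the excerpt above:
#   022 (067.000.000) 10/12 21:58:15 Job disconnected, attempting to reconnect
EVENT_RE = re.compile(
    r"^(?P<code>\d{3}) \((?P<cluster>\d+)\.(?P<proc>\d+)\.(?P<sub>\d+)\) "
    r"(?P<date>\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<text>.*)$"
)


def find_events(log_text, codes=("022", "024")):
    """Return (code, job_id, timestamp, text) for each matching event header.

    Hypothetical helper: only header lines are inspected; the detail lines
    that follow each header are ignored.
    """
    events = []
    for line in log_text.splitlines():
        m = EVENT_RE.match(line.strip())
        if m and m.group("code") in codes:
            job_id = f"{int(m.group('cluster'))}.{int(m.group('proc'))}"
            stamp = f"{m.group('date')} {m.group('time')}"
            events.append((m.group("code"), job_id, stamp, m.group("text")))
    return events


# Sample text copied from the log excerpt in this message.
SAMPLE = """\
022 (067.000.000) 10/12 21:58:15 Job disconnected, attempting to reconnect
    Socket between submit and execute hosts closed unexpectedly
    Trying to reconnect to node3
    <10.0.40.112:32772>
...
024 (067.000.000) 10/12 21:58:15 Job reconnection failed
    Job not found at execution machine
    Can not reconnect to node3, rescheduling job
"""

for event in find_events(SAMPLE):
    print(event)
```

Running the same function over the logs of jobs submitted from node1 and from node2 would show whether every failed attempt involves the same pair of events.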
condor_q -analyze: (shows that...)
###########################
node3 matches the job but rejects it for unknown reasons
############################
I would very much appreciate your help.
Leo