[Condor-users] jobs fail to run, with "Warning: Found no submitters"
- Date: Tue, 16 Aug 2005 12:44:08 -0400
- From: Jamie Rollins <jrollins@xxxxxxxxxxxxxxxxx>
- Subject: [Condor-users] jobs fail to run, with "Warning: Found no submitters"
Hello. I've been struggling with a problem that is basically identical to the
one described in this post from last year:
https://lists.cs.wisc.edu/archive/condor-users/pre-2004-June/msg01340.shtml
The problem is that I can submit jobs, but whatever jobs are submitted are
rejected by all available nodes.
My cluster consists of one dual-cpu head node, and three diskless client nodes:
------------------------
~> condor_status
Name           OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime
node1.cluster  LINUX  X86_64  Unclaimed  Idle      0.950   435[?????]
node2.cluster  LINUX  X86_64  Unclaimed  Idle      1.120   435   0+00:53:42
node3.cluster  LINUX  X86_64  Unclaimed  Idle      1.000   435   0+01:00:47
vm1@xxxxxxxxx  LINUX  X86_64  Owner      Idle      1.000   1002  4+20:07:37
vm2@xxxxxxxxx  LINUX  X86_64  Unclaimed  Idle      0.210   1002  0+00:00:00
              Machines  Owner  Claimed  Unclaimed  Matched  Preempting
X86_64/LINUX         5      1        0          4        0           0
       Total         5      1        0          4        0           0
------------------------
The Condor setup is very simple, pretty much default. The head node has the
following condor_config.local file:
------------------------
NETWORK_INTERFACE = 10.0.0.1
DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD, STARTD
------------------------
and the other nodes are using the
<release_dir>/etc/examples/condor_config.local.dedicated.resource file which
specifies the DedicatedScheduler as the head node.
I have made a single executable to calculate pi to 10000 digits (which works
fine normally), which I am trying to submit with the following command file:
------------------------
Executable = pi2
output = pi2.out
Log = pi2.log
Universe = vanilla
Queue
------------------------
The result is the following:
------------------------
~> condor_q -analyze
Warning: Found no submitters
-- Submitter: zajos.cluster : <10.0.0.1:44160> : zajos.cluster
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
---
012.000: Run analysis summary. Of 5 machines,
0 are rejected by your job's requirements
3 reject your job because of their own requirements
0 match but are serving users with a better priority in the pool
2 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
1 jobs; 1 idle, 0 running, 0 held
------------------------
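To make the summary easier to read at a glance, here is a minimal Python sketch that tallies the analysis lines quoted above. The helper name parse_analysis is hypothetical and not part of Condor; it only reads the text pasted into it and assumes the "Of N machines" header and "<count> <reason>" line layout shown in this output.

```python
import re

# Summary text copied from the condor_q -analyze output quoted above.
ANALYSIS = """\
012.000: Run analysis summary. Of 5 machines,
0 are rejected by your job's requirements
3 reject your job because of their own requirements
0 match but are serving users with a better priority in the pool
2 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
"""


def parse_analysis(text):
    """Hypothetical helper: return (total_machines, {reason: count})."""
    total = None
    reasons = {}
    for line in text.splitlines():
        line = line.strip()
        # Header line carries the pool size, e.g. "Of 5 machines,".
        header = re.search(r"Of (\d+) machines", line)
        if header:
            total = int(header.group(1))
            continue
        # Every other line is assumed to be "<count> <reason>".
        row = re.match(r"(\d+)\s+(.+)", line)
        if row:
            reasons[row.group(2)] = int(row.group(1))
    return total, reasons


total, reasons = parse_analysis(ANALYSIS)
# Keep only the categories that actually account for machines.
blocking = {reason: n for reason, n in reasons.items() if n > 0}

if __name__ == "__main__":
    print("machines in pool:", total)
    for reason, n in blocking.items():
        print(n, reason)
```

For this output the per-reason counts add up to the 5 machines reported, and only two categories are non-zero: machines rejecting the job through their own requirements, and machines that match but reject for unknown reasons.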
Does anyone have any idea what's going wrong? I'm wondering what types of
misconfigurations to look for, or ways in which I can more specifically debug
what's going on. Unfortunately, the thread mentioned above ended with a phone
call instead of a posting to the list. Any help would be most appreciated.
Thanks.
jamie.