Re: [Condor-users] jobs stuck in queue
- Date: Fri, 19 Aug 2011 18:31:57 -0400
- From: "David J. Herzfeld" <herzfeldd@xxxxxxxxx>
- Subject: Re: [Condor-users] jobs stuck in queue
Hi:
On Fri, 2011-08-19 at 17:36 -0300, Fabricio Cannini wrote:
> *nodes:*
> CONDOR_HOST = master
> UID_DOMAIN = internal.domain
> FILESYSTEM_DOMAIN = internal.domain
> SEC_DEFAULT_NEGOTIATION = OPTIONAL
> ALLOW_READ = $(CONDOR_HOST),172.17.8.*
> ALLOW_WRITE = $(CONDOR_HOST),172.17.8.*
> ALLOW_NEGOTIATOR = $(CONDOR_HOST)
> ALLOW_CONFIG = $(CONDOR_HOST),$(FULL_HOSTNAME)
> ENABLE_RUNTIME_CONFIG = True
> ENABLE_PERSISTENT_CONFIG = True
> PERSISTENT_CONFIG_DIR = /etc/condor/config.d
> SETTABLE_ATTRS_CONFIG = *
> USE_NFS = True
> DEFAULT_DOMAIN_NAME = internal.domain
> ALLOW_DAEMON = *@$(CONDOR_HOST)
> SOFT_UID_DOMAIN = TRUE
> START = TRUE
> TRUST_UID_DOMAIN = TRUE
> STARTD_EXPRS=$(STARTD_EXPRS), DedicatedScheduler, ParallelSchedulingGroup
> SCHEDD_NAME = $(CONDOR_HOST)
> Any tips to what may (not) be going on are very, very, veeeeery welcome.
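It doesn't look like you defined DedicatedScheduler on your execute
nodes. Your STARTD_EXPRS line advertises the attribute, but nothing in
the config you posted sets its value. It likely needs to look like: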
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxxxxxxxxxx"
Without this attribute, your scheduler will not match parallel jobs with
dedicated execute nodes.
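As a rough sketch only, the relevant part of an execute node's local
configuration might look like the fragment below. The host name
submit.example.org is a made-up placeholder, not the real submit machine;
replace it with the full host name of the machine running the dedicated
scheduler. The RANK line is an optional policy choice, assumed here to
make the node prefer jobs from that scheduler.

```
# Hypothetical host name; use the full host name of your submit machine.
DedicatedScheduler = "DedicatedScheduler@submit.example.org"

# Advertise the attribute in the machine ClassAd so the dedicated
# scheduler can find this node (STARTD_EXPRS is the older name for this).
STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler

# Optional: prefer jobs that come from the dedicated scheduler.
RANK = Scheduler =?= $(DedicatedScheduler)
```

After changing the file, running condor_reconfig on the node should make
the startd pick up the new value.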
Take a look at
http://www.cs.wisc.edu/condor/manual/v7.6/3_13Setting_Up.html#SECTION0041310100000000000000
for more information.
Best of luck,
DJH