Jason,
>Assuming you are running a recent version of condor, "condor_q" will
>not show jobs from all users, but "condor_status -schedd" will show
>totals from all users. Does the output of "condor_q -all" show more
>jobs?
[root@rocks7 examples]# condor_status -schedd
Name                     Machine                  RunningJobs IdleJobs HeldJobs
rocks7.vbtestcluster.com rocks7.vbtestcluster.com           0        2        0
      TotalRunningJobs TotalIdleJobs TotalHeldJobs
Total                0             2             0
[root@rocks7 examples]# condor_q -all
-- Schedd: rocks7.vbtestcluster.com : <10.0.3.15:48687> @ 01/19/18 07:45:13
OWNER   BATCH_NAME                    SUBMITTED   DONE   RUN  IDLE TOTAL JOB_IDS
mahmood CMD: /opt/openmpi/bin/mpirun  1/17 03:04     _     _     1     1 5.0
1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended
I followed the steps described in the manual and uncommented the policy, but the job is still idle. Should I kill it and resubmit, or have I missed some configuration?
[root@rocks7 examples]# cat condor_config.local.dedicated.resource
######################################################################
##
## condor_config.local.dedicated.resource
##
## This is the default local configuration file for any resources
## that are going to be configured as dedicated resources in your
## Condor pool. If you are going to use Condor's dedicated MPI
## scheduling, you must configure some of your machines as dedicated
## resources, using the settings in this file.
##
## PLEASE READ the discussion on "Configuring Condor for Dedicated
## Scheduling" in the "Setting up Condor for Special Environments"
## section of the Condor Manual for more details.
##
## You should copy this file to the appropriate location and
## customize it for your needs. The file is divided into three main
## parts: settings you MUST customize, settings regarding the policy
## of running jobs on your dedicated resources (you must select a
## policy and uncomment the corresponding expressions), and settings
## you should leave alone, but that must be present for dedicated
## scheduling to work. Settings that are defined here MUST BE
## DEFINED, since they have no default value.
##
######################################################################
######################################################################
######################################################################
## Settings you MUST customize!
######################################################################
######################################################################
## What is the name of the dedicated scheduler for this resource?
## You MUST fill in the correct full hostname where you're running
## the dedicated scheduler, and where users will submit their
## dedicated jobs. The "DedicateScheduler@" part should not be
## changed, ONLY the hostname.
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxxxxxxxxxxxx"
######################################################################
######################################################################
## Policy Settings (You MUST choose a policy and uncomment it)
######################################################################
######################################################################
## There are three basic options for the policy on dedicated
## resources:
## 1) Only run dedicated jobs
## 2) Always run jobs, but prefer dedicated ones
## 3) Always run dedicated jobs, but only allow non-dedicated jobs to
## run on an opportunistic basis.
## You MUST uncomment the set of policy expressions you want to use
## at your site.
##--------------------------------------------------------------------
## 1) Only run dedicated jobs
##--------------------------------------------------------------------
#START = Scheduler =?= $(DedicatedScheduler)
#SUSPEND = False
#CONTINUE = True
#PREEMPT = False
#KILL = False
#WANT_SUSPEND = False
#WANT_VACATE = False
#RANK = Scheduler =?= $(DedicatedScheduler)
##--------------------------------------------------------------------
## 2) Always run jobs, but prefer dedicated ones
##--------------------------------------------------------------------
#START = True
#SUSPEND = False
#CONTINUE = True
#PREEMPT = False
#KILL = False
#WANT_SUSPEND = False
#WANT_VACATE = False
#RANK = Scheduler =?= $(DedicatedScheduler)
##--------------------------------------------------------------------
## 3) Always run dedicated jobs, but only allow non-dedicated jobs to
## run on an opportunistic basis.
##--------------------------------------------------------------------
## Allowing both dedicated and opportunistic jobs on your resources
## requires that you have an opportunistic policy already defined.
## These are the only settings that need to be modified from your
## existing policy expressions to allow dedicated jobs to always run
## without suspending, or ever being preempted (either from activity
## on the machine, or other jobs in the system).
SUSPEND = Scheduler =!= $(DedicatedScheduler) && ($(SUSPEND))
PREEMPT = Scheduler =!= $(DedicatedScheduler) && ($(PREEMPT))
RANK_FACTOR = 1000000
RANK = (Scheduler =?= $(DedicatedScheduler) * $(RANK_FACTOR)) + $(RANK)
START = (Scheduler =?= $(DedicatedScheduler)) || ($(START))
## Note: For everything to work, you MUST set RANK_FACTOR to be a
## larger value than the maximum value your existing rank expression
## could possibly evaluate to. RANK is just a floating point value,
## so there's no harm in having a value that's very large.
######################################################################
######################################################################
## Settings you should leave alone, but that must be defined
######################################################################
######################################################################
## Path to the special version of rsh that's required to spawn MPI
## jobs under Condor. WARNING: This is not a replacement for rsh,
## and does NOT work for interactive use. Do not use it directly!
MPI_CONDOR_RSH_PATH = $(LIBEXEC)
## Path to OpenSSH server binary
## Condor uses this to establish a private SSH connection between execute
## machines. It is usually in /usr/sbin, but may be in /usr/local/sbin
CONDOR_SSHD = /usr/sbin/sshd
## Path to OpenSSH keypair generator.
## Condor uses this to establish a private SSH connection between execute
## machines. It is usually in /usr/bin, but may be in /usr/local/bin
CONDOR_SSH_KEYGEN = /usr/bin/ssh-keygen
## This setting puts the DedicatedScheduler attribute, defined above,
## into your machine's classad. This way, the dedicated scheduler
## (and you) can identify which machines are configured as dedicated
## resources.
## Note: as of 8.4.1 this setting is automatic
#STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler
[root@rocks7 examples]# rocks sync host condor rocks7
[root@rocks7 examples]# condor_status -af:h Machine DedicatedScheduler@xxxxxxxxxxxxxxxxxxxxxxxx
Error: Parse error of: DedicatedScheduler@xxxxxxxxxxxxxxxxxxxxxxxx
[root@rocks7 examples]# condor_status -af:h Machine rocks7.vbtestcluster.com
Machine           rocks7.vbtestcluster.com
compute-0-0.local undefined
compute-0-0.local undefined
[root@rocks7 examples]# condor_q
-- Schedd: rocks7.vbtestcluster.com : <10.0.3.15:48687> @ 01/19/18 05:22:37
OWNER   BATCH_NAME                    SUBMITTED   DONE   RUN  IDLE TOTAL JOB_IDS
mahmood CMD: /opt/openmpi/bin/mpirun  1/17 03:04     _     _     1     1 5.0
1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended
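
The checks in the transcript above can be collected into one small shell helper. This is only a sketch under stated assumptions. The function name check_dedicated_job is hypothetical and not part of HTCondor. It assumes the standard HTCondor command-line tools are on the PATH. It also assumes the attribute to print with condor_status is DedicatedScheduler itself, that is, the attribute name rather than the "DedicatedScheduler@host" value that produced the parse error.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative, not an HTCondor tool).
# Prints three views that help explain why a dedicated job stays idle.
check_dedicated_job() {
    job_id="$1"

    # Value of DedicatedScheduler as the local configuration sees it.
    condor_config_val DedicatedScheduler

    # Per-slot view: pass the attribute NAME, not its value.
    condor_status -af:h Machine DedicatedScheduler

    # Matchmaking analysis for the idle job.
    condor_q -better-analyze "$job_id"
}
```

For the job shown above this would be called as "check_dedicated_job 5.0". If the second command prints "undefined" for every slot, that suggests the execute nodes are not advertising the attribute, so the dedicated-resource settings may not have reached them yet.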